International Journal of Information Technology & Computer Science (IJITCS)
Information retrieval is the process of sorting, searching, and retrieving information that matches a user query. It aims to find documents or information relevant to a user request. Many researchers have found that the central problem in information retrieval is matching queries with documents. In our approach, information is therefore retrieved with the help of an edge index graph, by applying semantic relatedness and similarity between terms in the corpus. Every word in the document is pre-processed and stemmed with the well-known Porter Stemmer algorithm to reduce the data size. We then propose a semantic similarity measure to compute the similarity between words in the document. This measure is used to retrieve information that is both highly similar and relevant to the query term. Our approach provides an efficient technique for responding to user queries, and it also optimizes the generation of Set Y during pointwise mutual information (PMI) computation. We report the results of the computations performed using the semantic similarity measure and the retrieval process.
Keywords: Edge Index, Semantic Similarity, Information Retrieval, PMI.
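As a rough illustration of the PMI computation mentioned in the abstract, the sketch below computes the standard pointwise mutual information, PMI(a, b) = log2(p(a, b) / (p(a) p(b))), from document-level co-occurrence counts. It is not the semantic similarity measure proposed in the paper, and it does not model Set Y or the edge index graph. The function name `pmi_table`, the whitespace tokenization, and the choice of a whole document as the co-occurrence window are all assumptions made for the example. Stemming is omitted.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return {(w1, w2): PMI} for word pairs that co-occur in at least one document.

    Assumption: two words "co-occur" when they appear in the same document.
    Probabilities are estimated as document frequencies divided by corpus size.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for doc in documents:
        vocab = sorted(set(doc.lower().split()))  # sorted so pair keys are canonical
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    # p(a,b) / (p(a) p(b)) simplifies to n_ab * N / (n_a * n_b)
    return {
        (a, b): math.log2((n_ab * n_docs) / (word_df[a] * word_df[b]))
        for (a, b), n_ab in pair_df.items()
    }


if __name__ == "__main__":
    corpus = [
        "graph index retrieval",
        "graph index",
        "query retrieval",
        "query document",
    ]
    for pair, score in sorted(pmi_table(corpus).items()):
        print(pair, round(score, 3))
```

Pairs that appear together more often than their individual frequencies would predict get a positive score, and pairs that co-occur at chance level score zero. Pairs that never co-occur are left out of the table rather than assigned negative infinity.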
- Fuji Ren and David Bracewell, “Advanced Information Retrieval”, Dept. of Information Science and Intelligent System, University of Tokushima, Japan, Electronic Notes in Theoretical Computer Science, 303-317, 2009.
- R. Dhanapal, “Intelligent information retrieval agent”, Knowl. Based Systems, 21, 466-470, 2008.
- Yi-Chun Liao, “A weight-based approach to information retrieval and relevance feedback”, Elsevier, Expert Systems with Applications, 35, 254-261, 2008.
- Amir Karshhenas, Kamil Dimiller, “PIRS: An Information Retrieval System based on the Vector Space Model”, IEEE, 978-1-4244-2881-6, 2008.
- Antonio Jimeno-Yepes, Rafael Berlanga-Liavori and Dietrich Rebholz-Schumann, “Ontology refinement for improved information retrieval”, Department of Computer System and Languages, Universitat Jaume I, Spain, 2007.
- Yu-Chunan Chang and Shyi-Ming Chen, “A New Query Method for Document Retrieval Based on Genetic Algorithm”, IEEE Transactions on Evolutionary Computation, Vol. 10, No. 5, October 2006.
- M. F. Porter, “An algorithm for suffix stripping”, Proc. Computer Laboratory, Corn Exchange Street, Cambridge, July 1980.
- Jiawei Han, Micheline Kamber, “Data Mining: Concepts and Techniques”, 3rd Edition, 2011.
- Mehamet Ali Salahli, “An Approach for Measuring Semantic Relatedness between Words via Related Terms”, Mathematical and Computational Applications, Vol. 14, No. 1, pp. 55-63, 2009.
- Lushan Han, Tim Finin, “Improving Word Similarity by Augmenting PMI with Estimates of Word Polysemy”, IEEE Transactions on Knowledge and Data Engineering, 2012.
- Elias, Alexandros, “Unsupervised Semantic Similarity Computation between Terms Using Web Documents”, IEEE Trans. on Knowl. and Data Engg., Vol. 22, No. 11, Nov. 2010.
- David Sanchez, Montserrat Batet, David Isern, Aida Valls, “Ontology-based semantic similarity: A new feature-based approach”, Expert Systems with Applications, 39:7718–7728, 2012.
- Zhenjiang Lin, Michael R. Lyu and Irwin King, “MatchSim: a novel similarity measure based on Maximum neighborhood matching”, Knowledge Information System, 32:141–166, 2012.
- Qiming Luo, Enhong Chen, Hui Xiong, “A semantic term weighting scheme for text Categorization”, Expert Systems with Applications, 38:12708–12716, 2011.
- Islam, A. and Inkpen, “Second Order Co-occurrence PMI for Determining the Semantic Similarity of Words”, in Proc. Inter. Conf. LREC 2006.