International Journal of Information Technology & Computer Science (IJITCS)
Semantic similarity measures play a vital role in information retrieval and natural language processing. Despite their usefulness in various applications, accurately measuring the semantic similarity between two words remains a challenging task. We propose three semantic similarity measures that use information available on the Web to measure similarity between words and between sentences. The proposed methods exploit page counts and text snippets returned by a Web search engine. In addition to direct associations between words, we use indirect associations to estimate their similarity. Evaluation results on different data sets show that our methods outperform several competing methods.
Keywords: Semantic Similarity, Web Search Engine, Higher Order Association Mining, Support Vector Machine.