
Comparative Analysis of Various Semantic Relatedness Measures

Shirin Chandna, Shalini Batra

Abstract


Measures of Semantic Relatedness (MSRs) quantify the degree to which words or concepts are related, considering not only similarity but any possible semantic relationship between them. Relatedness computation is of great interest in areas such as Natural Language Processing, Information Retrieval, and the Semantic Web. Many methods have been proposed; however, current relatedness measures lack some properties desirable for a new generation of Semantic Web applications: maximum coverage, domain independence, and universality. This paper explores semantic relatedness between words using various MSRs and compares measures that use WordNet as the knowledge corpus with measures that use search engines as the knowledge corpus. The results clearly indicate that search-engine-based measures have an advantage over WordNet-based measures.
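As an illustration of a search-engine-based measure, the Normalized Google Distance listed among the keywords can be sketched from page hit counts, following the definition given by Cilibrasi and Vitányi (2007). This is a minimal sketch and not necessarily how the authors computed it: the function name, the parameter names, and the handling of zero hit counts are assumptions made for illustration, and the hit counts themselves would have to come from a search engine.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Sketch of NGD from hit counts (hypothetical helper, not from the paper).

    fx, fy -- number of pages containing each term on its own
    fxy    -- number of pages containing both terms
    n      -- total number of pages indexed by the search engine
    """
    # Assumption: a term pair with no hits at all is treated as unrelated.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    # NGD(x, y) = (max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy))
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

Smaller values mean the two terms tend to occur on the same pages; a value of 0 means they always co-occur.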

Keywords


Measures of Semantic Relatedness, Normalized Compression Distance, Normalized Google Distance, Normalized Similarity Score, WordNet.


