"Measuring the Strength of the Semantic Relationship Between Words" by Lubomir Stanchev

Computer Science and Software Engineering

Title

Measuring the Strength of the Semantic Relationship Between Words

Author Info

Lubomir Stanchev, California Polytechnic State University, San Luis Obispo

Recommended Citation

Postprint version. Published in International Journal on Artificial Intelligence Tools, Volume 24, Issue 2, April 1, 2015, pages 1540011-1-1540011-30.

The definitive version is available at https://doi.org/10.1142/S0218213015400114.

Abstract

We propose a novel way of extracting the strength of the semantic relationship between words from semi-structured sources, such as WordNet. Unlike existing approaches that explore only the structured information (e.g., the hypernym relationship in WordNet), we present a framework that allows us to utilize all available information, including natural text descriptions. Our approach constructs a similarity graph that stores the strength of the semantic relationship between words. Specifically, an edge between two words describes the probability that someone who is interested in resources about the first word will also be interested in resources about the second word. Note that the graph is asymmetric because the probability that someone is interested in the second word, given that they are interested in the first, is not the same as the probability that they are interested in the first word, given that they are interested in the second. The similarity between any two words can be computed as a function of the directed paths between the two nodes that represent the words in the graph.

We evaluate the quality of the data in the similarity graph by comparing the word-pair similarities computed by our software, which uses the graph, with the results of studies performed with human subjects. To the best of our knowledge, our software produces better correlation with the results of both the Miller and Charles study and the WordSimilarity-353 study than any other published research. We also present an extended evaluation section that describes how the different heuristics that we use affect the correlation score.
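The similarity graph described in the abstract can be illustrated with a small sketch. The abstract does not specify which function of the directed paths is used, so everything below is an assumption made for illustration only: the names `directed_score` and `similarity`, the `max_edges` bound, the choice to sum the products of edge weights along simple paths (capped at 1.0), the averaging of the two directions, and the toy edge weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet

# Adjacency map: graph[u][v] is the weight of the directed edge u -> v,
# read as "probability that someone interested in u is also interested in v".
Graph = Dict[str, Dict[str, float]]


def directed_score(graph: Graph, source: str, target: str, max_edges: int = 3) -> float:
    """Assumed path function: sum, over all simple directed paths from
    source to target with at most max_edges edges, of the product of the
    edge weights on the path. The result is capped at 1.0."""
    if source == target:
        return 1.0

    def walk(node: str, weight: float, visited: FrozenSet[str], edges_left: int) -> float:
        if node == target:
            return weight
        if edges_left == 0:
            return 0.0
        total = 0.0
        for nxt, w in graph.get(node, {}).items():
            if nxt not in visited:  # keep paths simple (no repeated nodes)
                total += walk(nxt, weight * w, visited | {nxt}, edges_left - 1)
        return total

    return min(1.0, walk(source, 1.0, frozenset({source}), max_edges))


def similarity(graph: Graph, a: str, b: str, max_edges: int = 3) -> float:
    """Assumed symmetric combination: average of the two directed scores."""
    forward = directed_score(graph, a, b, max_edges)
    backward = directed_score(graph, b, a, max_edges)
    return (forward + backward) / 2


# Toy graph with made-up weights, purely for illustration.
toy_graph: Graph = {
    "cat": {"feline": 0.5, "pet": 0.25},
    "feline": {"cat": 0.25, "animal": 0.5},
    "pet": {"animal": 0.5},
    "animal": {},
}

print(directed_score(toy_graph, "cat", "feline"))  # 0.5
print(directed_score(toy_graph, "feline", "cat"))  # 0.25
print(similarity(toy_graph, "cat", "animal"))      # 0.1875
```

The first two printed values differ, which mirrors the asymmetry noted in the abstract: the edge from "cat" to "feline" carries a different weight than the edge in the opposite direction. The third value combines the two directed paths from "cat" to "animal" (0.5 × 0.5 and 0.25 × 0.5) with the empty reverse direction.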

Disciplines

Computer Sciences

Copyright

2015 World Scientific Publishing Company.

Included in

Computer Sciences Commons

URL: https://digitalcommons.calpoly.edu/csse_fac/245
