Abstract

The paper shows how to create a probabilistic graph for WordNet. A node is created for every word and phrase in WordNet. An edge between two nodes is labeled with the probability that a user who is interested in the source concept will also be interested in the destination concept. For example, an edge with weight 0.3 between “canine” and “dog” indicates that there is a 30% probability that a user who searches for “canine” will be interested in results that contain the word “dog”. We refer to the graph as probabilistic because we enforce the constraint that the weights of all the edges leaving a node sum to one. Structural data (e.g., the word “canine” is a hypernym, i.e., a more general term, of the word “dog”) and textual data (e.g., the word “canine” appears in the textual definition of the word “dog”) from WordNet are used to create a Markov logic network, that is, a set of first-order formulas with probabilities. The Markov logic network is then used to compute the weights of the edges in the probabilistic graph. We experimentally validate the quality of the data in the probabilistic graph on two independent benchmarks: Miller and Charles and WordSimilarity-353.
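The normalization constraint described in the abstract can be illustrated with a minimal sketch. The association scores below are hypothetical (the paper derives them from a Markov logic network over WordNet's structural and textual links); the sketch only shows how raw scores are rescaled so that each node's outgoing edge weights sum to one.

```python
from collections import defaultdict

# Hypothetical raw association scores between concepts; in the paper these
# would come from the Markov logic network, not from hand-picked numbers.
raw_scores = {
    ("canine", "dog"): 3.0,
    ("canine", "wolf"): 5.0,
    ("canine", "tooth"): 2.0,
    ("dog", "canine"): 1.0,
}

def normalize_outgoing(scores):
    """Rescale each node's outgoing edge weights so they sum to one,
    enforcing the probabilistic-graph constraint from the abstract."""
    totals = defaultdict(float)
    for (src, _dst), w in scores.items():
        totals[src] += w
    return {(src, dst): w / totals[src] for (src, dst), w in scores.items()}

graph = normalize_outgoing(raw_scores)
# With these illustrative scores, the edge ("canine", "dog") gets
# weight 3.0 / (3.0 + 5.0 + 2.0) = 0.3, matching the abstract's example.
```

After normalization, each node's outgoing weights form a probability distribution, so the edge weight can be read directly as the probability that a user interested in the source concept is also interested in the destination concept.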

Disciplines

Computer Sciences

Number of Pages

12

Publisher Statement

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM, Inc., fax +1 (212) 869-0481, or permissions@acm.org.


URL: https://digitalcommons.calpoly.edu/csse_fac/269