"Semantic Search Using a Similarity Graph" by Lubomir Stanchev

Computer Science and Software Engineering

Title

Semantic Search Using a Similarity Graph

Author Info

Lubomir Stanchev, Indiana University - Purdue University Fort Wayne

Recommended Citation

Postprint version. Published in 9th IEEE International Conference on Semantic Computing Proceedings: Anaheim, CA, February 7, 2015, pages 93-100.

NOTE: At the time of publication, the author Lubomir Stanchev was not yet affiliated with Cal Poly.

The definitive version is available at https://doi.org/10.1109/ICOSC.2015.7050785.

Abstract

Given a set of documents and an input query that is expressed in a natural language, the problem of document search is to retrieve the most relevant documents. Unlike most existing systems, which perform document search based on keyword matching, we propose a search method that considers the meaning of the words in the query and the document. As a result, our algorithm can return documents that have no words in common with the input query as long as the documents are relevant. For example, a document that contains the words “Ford”, “Chrysler” and “General Motors” multiple times is surely relevant for the query “car” even if the word “car” does not appear in the document. Our semantic search algorithm is based on a similarity graph that contains the degree of semantic similarity between terms, where a term can be a word or a phrase. We experimentally validate our algorithm on the Cranfield benchmark, which contains 1400 documents and 225 natural language queries. The benchmark also contains the relevant documents for every query as determined by human judgment. We show that our semantic search algorithm produces a higher mean average precision (MAP) score than a keyword matching algorithm. This shows that our approach can improve the quality of the result because the meaning of the words and phrases in the documents and the queries is taken into account.
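
The two ideas in the abstract, scoring a document through term similarity rather than exact keyword overlap, and evaluating a ranking with MAP, can be illustrated with a minimal Python sketch. The abstract does not give the paper's scoring formula or edge weights, so everything here is an assumption made for illustration: the `SIMILARITY` dictionary, its weights, and the `score` function are hypothetical and only use direct edges of a toy graph. The `average_precision` and `mean_average_precision` functions follow the standard definition of MAP.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy similarity graph. The weights are invented for this
# example and are NOT taken from the paper.
SIMILARITY: Dict[str, Dict[str, float]] = {
    "car": {"ford": 0.8, "chrysler": 0.8, "general motors": 0.7},
}


def term_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical terms, else the direct edge weight (0.0 if none)."""
    if a == b:
        return 1.0
    forward = SIMILARITY.get(a, {}).get(b, 0.0)
    backward = SIMILARITY.get(b, {}).get(a, 0.0)
    return max(forward, backward)


def score(query_terms: List[str], doc_terms: List[str]) -> float:
    """Illustrative score: for each query term, take its best match in the document."""
    if not doc_terms:
        return 0.0
    return sum(max(term_similarity(q, d) for d in doc_terms) for q in query_terms)


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Standard average precision of one ranked result list."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Mean of average precision over (ranking, relevant set) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

With this toy graph, a document containing only "ford" and "chrysler" receives a nonzero score for the query "car", whereas pure keyword matching would give it zero. On a benchmark such as Cranfield, each query's ranking and its human-judged relevant set would form one entry in `runs`.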

Disciplines

Computer Sciences

Copyright

2015 IEEE.

Publisher statement

Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Included in

Computer Sciences Commons

URL: https://digitalcommons.calpoly.edu/csse_fac/262
