College - Author 1
College of Science and Mathematics
Department - Author 1
Mathematics Department
Advisor
Lubomir Stanchev, CENG, Computer Science Department
Funding Source
Barbara J. Van Ness and the College of Engineering
Date
10-2024
Abstract/Summary
Semantic search plays a critical role in many domains, and numerous algorithms have been developed to address it. A common approach uses sentence transformers to generate embeddings for both search queries and documents, so that their vectors can be compared. While many embedding models are widely used, our approach integrates these models with human-crafted knowledge in a novel way, which improves Mean Average Precision (MAP) scores. Traditional embeddings often rely heavily on the specific words used in a query or document. Our technique mitigates this dependency by refining the vectors to capture the overall semantic meaning, shifting the focus from individual words to the broader concepts they represent. This approach highlights the importance of semantic understanding in search tasks. In experiments with 23 different sentence embedding models, we achieved a statistically significant improvement in MAP scores (p = 0.047).
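The baseline pipeline described in the abstract (embed queries and documents, compare the vectors, score the ranking with MAP) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the two-dimensional vectors stand in for real sentence-transformer embeddings, cosine similarity is assumed as the comparison measure, and all function names, document ids, and relevance judgments are hypothetical. The knowledge-based refinement step is not shown, because the abstract does not specify it.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(query_vecs, doc_vecs, relevance):
    """Mean of the per-query average precision over all queries."""
    scores = [
        average_precision(rank_documents(q_vec, doc_vecs), relevance[q_id])
        for q_id, q_vec in query_vecs.items()
    ]
    return sum(scores) / len(scores)


# Toy stand-ins for embeddings (assumed data, not from the study).
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
query_vecs = {"q1": [1.0, 0.1], "q2": [0.0, 1.0]}
relevance = {"q1": {"d1", "d2"}, "q2": {"d2"}}

map_score = mean_average_precision(query_vecs, doc_vecs, relevance)
```

In a setting like the one the abstract describes, a refinement step would adjust the query and document vectors before `rank_documents` is called, and the MAP scores with and without that step would then be compared across embedding models.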
October 1, 2024.
URL: https://digitalcommons.calpoly.edu/ceng_surp/71