Optimization of some factors affecting the performance of query expansion
Authors:Young Mee Chung  Jae Yun Lee
Institution:Department of Library and Information Science, Yonsei University, 134 Shinchondong, Sodaemunku, Seoul, South Korea
Abstract:This paper examines the factors affecting the performance of global query expansion based on term co-occurrence data and suggests a way to maximize retrieval effectiveness. The major parameters optimized through experiments are the term similarity measure and the weighting scheme for additional terms. The evaluation of four similarity measures tested in query expansion reveals that mutual information and Yule's Y, which emphasize low-frequency terms, achieve better performance than the cosine and Jaccard coefficients, which have the reverse tendency. In the evaluation of three weighting schemes, the similarity weight performs well only with short queries, whereas fixed weights of approximately 0.5 and similarity rank weights are effective with queries of any length. Furthermore, the optimal similarity rank weight, which achieves the best overall performance, seems to be the least affected by the test collection and the number of additional terms. For retrieval efficiency, the number of additional terms need not exceed 70 in our test collections, but the optimal number may vary according to the characteristics of the similarity measure employed.
Keywords:Query expansion  Similarity measures  Query term weighting  Term weights
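The four similarity measures named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions over a 2x2 document contingency table (a = documents containing both terms, b and c = documents containing only one of them, d = documents containing neither), with mutual information taken in its pointwise form. The abstract does not give the formulas, so the paper's exact formulations may differ; the function names and the toy counts below are hypothetical.

```python
import math


def contingency(n_docs, df_x, df_y, df_xy):
    """Build the 2x2 table (a, b, c, d) from document frequencies."""
    a = df_xy                  # documents containing both terms
    b = df_x - df_xy           # documents containing only term x
    c = df_y - df_xy           # documents containing only term y
    d = n_docs - a - b - c     # documents containing neither term
    return a, b, c, d


def cosine(a, b, c, d):
    denom = math.sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


def jaccard(a, b, c, d):
    denom = a + b + c
    return a / denom if denom else 0.0


def mutual_information(a, b, c, d):
    # Pointwise mutual information (assumed form): log2 of P(x, y) / (P(x) P(y)).
    n = a + b + c + d
    if a == 0:
        return float("-inf")
    return math.log2(a * n / ((a + b) * (a + c)))


def yules_y(a, b, c, d):
    # Yule's Y (coefficient of colligation).
    ad, bc = math.sqrt(a * d), math.sqrt(b * c)
    denom = ad + bc
    return (ad - bc) / denom if denom else 0.0


# Hypothetical counts in a 1000-document collection: a rare pair and a
# frequent pair whose terms overlap in the same proportion.
rare = contingency(1000, df_x=10, df_y=10, df_xy=5)
frequent = contingency(1000, df_x=200, df_y=200, df_xy=100)

for measure in (cosine, jaccard, mutual_information, yules_y):
    print(f"{measure.__name__:20s} rare={measure(*rare):.3f} "
          f"frequent={measure(*frequent):.3f}")
```

With these toy counts, cosine and Jaccard give the rare and the frequent pair identical scores, while mutual information and Yule's Y score the rare pair higher, which is in line with the abstract's remark that the latter two emphasize low-frequency terms.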
This article is indexed in ScienceDirect and other databases.