

The Dilution/Concentration conditions for cross-language information retrieval models
Authors:Bo Li, Eric Gaussier, Dan Yang
Institution:1. School of Computer Science, Central China Normal University, Wuhan, China; 2. CNRS-LIG/AMA, Université Grenoble Alpes, Grenoble, France; 3. China Electric Power Research Institute, Wuhan, China
Abstract:Experimental results in cross-language information retrieval (CLIR) do not indicate why a model fails or how it could be improved. A basic research question is thus whether one can provide conditions under which any existing or new CLIR strategy can be evaluated analytically and the design of CLIR models can be improved. Inspired by the heuristics used in monolingual IR, we introduce in this paper Dilution/Concentration (D/C) conditions that characterize good CLIR models based on direct intuitions under artificial settings. The conditions, derived from first principles in CLIR, generalize the idea behind the query structuring approach. Empirical results with state-of-the-art CLIR models show that when a condition is not satisfied, this often indicates that the method is not optimal. In general, we find that the empirical performance of a retrieval formula is closely related to how well it satisfies the conditions. Lastly, following the D/C conditions, we propose several novel CLIR models built on the information-based models, which again shows that the D/C conditions are effective at characterizing good CLIR models.
Keywords:Cross-language information retrieval; D/C condition; Information retrieval heuristic
This article is indexed in ScienceDirect and other databases.