A risk minimization framework for information retrieval |
| |
Authors: | ChengXiang Zhai, John Lafferty |
| |
Institution: | 1. Department of Computer Science, University of Illinois at Urbana-Champaign, Illinois, United States; 2. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States |
| |
Abstract: | This paper presents a probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they relax the traditional assumption of independent relevance of documents. |
| |
Keywords: | Retrieval models; Risk minimization; Statistical language models; Bayesian decision theory |
This article is indexed in ScienceDirect and other databases. |
|