Query reformulation mining: models, patterns, and applications |
| |
Authors: | Paolo Boldi, Francesco Bonchi, Carlos Castillo, Sebastiano Vigna |
| |
Institution: | 1. DSI, Università degli Studi di Milano, Milan, Italy; 2. Yahoo! Research, Barcelona, Spain |
| |
Abstract: | Understanding query reformulation patterns is a key task towards next-generation web search engines. If we can do that, then
we can build systems able to understand and possibly predict user intent, providing the needed assistance at the right time,
and thus helping users locate information more effectively and improving their web-search experience. As a step in this direction,
we build a very accurate model for classifying user query reformulations into broad classes (generalization, specialization,
error correction or parallel move), achieving 92% accuracy. We then apply the model to automatically label two very large
query logs sampled from different geographic areas, and containing a total of approximately 17 million query reformulations.
We study the resulting reformulation patterns, matching some results from previous studies performed on smaller manually annotated
datasets, and discovering new interesting reformulation patterns, including connections between reformulation types and topical
categories. We annotate two large query-flow graphs with reformulation type information, and run several graph-characterization
experiments on these graphs, extracting new insights about the relationships between the different query reformulation types.
Finally, we study query recommendations based on short random walks on the query-flow graphs. Our experiments show that these
methods can match in precision, and often improve on, recommendations based on query-click graphs, without requiring users’
clicks. Our experiments also show that considering transition-type labels on edges is important for obtaining recommendations
of good quality. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|