Analyzing imbalance among homogeneous index servers in a web search system
Authors:CS Badue, R Baeza-Yates, B Ribeiro-Neto, A Ziviani, N Ziviani
Institution:1. Department of Computer Science, Federal University of Minas Gerais, Av. Antônio Carlos 6627, 31.270-010, Belo Horizonte, Brazil; 2. Yahoo! Research, Spain & Chile; 3. Google Engineering Belo Horizonte, Belo Horizonte, Brazil; 4. National Laboratory for Scientific Computing (LNCC), Petrópolis, Brazil
Abstract:The performance of parallel query processing in a cluster of index servers is crucial for modern web search systems. In such a scenario, the response time basically depends on the execution time of the slowest server to generate a partial ranked answer. Previous approaches investigate performance issues in this context using simulation, analytical modeling, experimentation, or a combination of them. Nevertheless, these approaches simply assume balanced execution times among homogeneous servers (by uniformly distributing the document collection among them, for instance)—a scenario that we did not observe in our experimentation. On the contrary, we found that even with a balanced distribution of the document collection among index servers, correlations between the frequency of a term in the query log and the size of its corresponding inverted list lead to imbalances in query execution times at these same servers, because these correlations affect disk caching behavior. Further, the relative sizes of the main memory at each server (with regard to disk space usage) and the number of servers participating in the parallel query processing also affect imbalance of local query execution times. These are relevant findings that have not been reported before and that, we understand, are of interest to the research community.
Keywords:Parallel query processing  Imbalance  Search engines  Performance analysis
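The abstract's central observation, that response time is governed by the slowest index server, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the sample execution times, the max-to-mean ratio used as an imbalance measure, and the simplification that network and result-merging costs are ignored.

```python
from statistics import mean


def query_response_time(local_times):
    """Response time when results from every index server are needed.

    Simplifying assumption: network and merge costs are ignored, so the
    query finishes when the slowest server finishes.
    """
    if not local_times:
        raise ValueError("need at least one server execution time")
    return max(local_times)


def imbalance_ratio(local_times):
    """Hypothetical imbalance measure: slowest time over mean time.

    A value of 1.0 means perfectly balanced local execution times.
    """
    return query_response_time(local_times) / mean(local_times)


# Two made-up clusters with the same mean local time (10.0) per server.
balanced = [10.0, 10.0, 10.0, 10.0]
skewed = [6.0, 8.0, 10.0, 16.0]

for name, times in (("balanced", balanced), ("skewed", skewed)):
    print(name, query_response_time(times), imbalance_ratio(times))
```

Both sample clusters do the same average amount of work per server, yet the skewed one answers more slowly because one server lags behind. This is the kind of effect the abstract attributes to differences in disk caching behavior across servers.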
This article is indexed in ScienceDirect and other databases.