

A statistics-based approach to incrementally update inverted files
Abstract: Many information retrieval systems use the inverted file as their indexing structure. The inverted file, however, requires costly reorganization when new documents are added to an existing collection. Most studies suggest dealing with this problem by reserving free space in the inverted file for incremental updates. In this paper, we propose a run-time, statistics-based approach to allocating this spare space. The approach estimates the space requirements of an inverted file using only a small amount of the most recent statistical data on space usage and on the rate of document update requests. To achieve the best indexing speed and space efficiency, the amount of spare space to allocate is determined by adaptively balancing the trade-off between reducing reorganization and maintaining space utilization. Experimental results show that the proposed space-sparing approach avoids a significant share of the reorganization incurred when updating an inverted file, while unused free space is kept under control so that file access speed is not affected.
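The abstract does not give the estimation formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a smoothed per-term growth rate (an exponential moving average, chosen here for illustration) as the "recent statistics", and the names `SpareSpaceIndex`, `horizon`, `alpha`, and `max_slack` are hypothetical. When a posting list outgrows its reserved space, it is relocated with spare room proportional to the predicted growth, capped so that unused space stays bounded.

```python
from dataclasses import dataclass, field


@dataclass
class PostingList:
    postings: list = field(default_factory=list)
    capacity: int = 0          # slots currently reserved for this term
    growth_rate: float = 0.0   # smoothed postings added per update batch
    relocations: int = 0       # times the list outgrew its reserved space


class SpareSpaceIndex:
    """Toy inverted index that sizes spare space from recent update statistics.

    All parameters are illustrative assumptions:
      horizon   - how many future batches the spare space should absorb
      alpha     - weight of the newest batch in the moving average
      max_slack - cap on spare space as a fraction of the space needed
    """

    def __init__(self, horizon=4, alpha=0.5, max_slack=1.0):
        self.horizon = horizon
        self.alpha = alpha
        self.max_slack = max_slack
        self.lists = {}

    def add_batch(self, batch):
        """Apply one update batch, given as {term: [doc_id, ...]}."""
        for term in batch:
            self.lists.setdefault(term, PostingList())
        for term, pl in self.lists.items():
            new_ids = batch.get(term, [])
            # Terms absent from the batch contribute 0, so their rate decays.
            pl.growth_rate = (self.alpha * len(new_ids)
                              + (1 - self.alpha) * pl.growth_rate)
            needed = len(pl.postings) + len(new_ids)
            if needed > pl.capacity:
                # Relocate: reserve room for predicted growth, but cap it
                # so that unused free space stays bounded.
                predicted = int(pl.growth_rate * self.horizon)
                cap = int(needed * self.max_slack)
                pl.capacity = needed + min(predicted, cap)
                pl.relocations += 1
            pl.postings.extend(new_ids)

    def total_relocations(self):
        return sum(pl.relocations for pl in self.lists.values())

    def utilization(self):
        """Fraction of reserved slots that actually hold postings."""
        reserved = sum(pl.capacity for pl in self.lists.values())
        used = sum(len(pl.postings) for pl in self.lists.values())
        return used / reserved if reserved else 1.0


def simulate(horizon, batches=10, per_batch=2):
    """Feed a steady stream of postings for one term and return the index."""
    index = SpareSpaceIndex(horizon=horizon)
    doc_id = 0
    for _ in range(batches):
        ids = list(range(doc_id, doc_id + per_batch))
        doc_id += per_batch
        index.add_batch({"a": ids})
    return index
```

Calling `simulate(horizon=0)` reserves no spare space, so every batch forces a relocation; a positive `horizon` trades some unused slots for fewer relocations, which is the trade-off the abstract describes. Comparing `total_relocations()` and `utilization()` across the two runs makes that balance visible.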
This article is indexed in ScienceDirect and other databases.