A New Approach to Clustering Records in Information Retrieval Systems
Authors: IAR Moghrabi (1); RA Makholian (2)
Institution: (1) Natural Science Division, Lebanese American University, P.O. Box 13-5053, Beirut, Lebanon; (2) Natural Science Division, Lebanese American University, P.O. Box 13-5053, Beirut, Lebanon
Abstract: This work introduces a new approach to record clustering: a hybrid algorithm that clusters records based on threshold values and on the query patterns made to a particular database. The Hamming Distance of a file is used as a measure of space density. The objective of the algorithm is to minimize the Hamming Distance of the file while giving weight to the most frequently asked queries. Simulation experiments showed a large reduction in response time after a file is restructured. We study the space density properties of a file and how they affect retrieval time before and after clustering, as a means of predicting file performance and choosing appropriate parameters. Factors that affect clustering and response time, such as block size, threshold value, and the percentage of records satisfying a given set of queries, are also studied.
Keywords: information retrieval; clustering; statistical analysis; database systems
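
The abstract does not define the Hamming Distance of a file, so the following is only a minimal sketch under stated assumptions: each record is represented as a fixed-length binary descriptor, and the file's Hamming Distance is taken to be the sum of Hamming distances between consecutive records. The greedy nearest-neighbour reordering is a stand-in for illustration and is not the authors' hybrid algorithm (it ignores thresholds and query frequencies). All function names are hypothetical.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length binary descriptors differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def file_hamming_distance(records: Sequence[Sequence[int]]) -> int:
    """ASSUMPTION: file distance = sum of distances between consecutive records."""
    return sum(hamming(r, s) for r, s in zip(records, records[1:]))


def greedy_reorder(records: Sequence[Sequence[int]]) -> List[Sequence[int]]:
    """Illustrative stand-in, not the paper's algorithm: keep the first record,
    then repeatedly append the remaining record closest to the last one placed."""
    if not records:
        return []
    ordered = [records[0]]
    remaining = list(records[1:])
    while remaining:
        last = ordered[-1]
        idx = min(range(len(remaining)), key=lambda i: hamming(last, remaining[i]))
        ordered.append(remaining.pop(idx))
    return ordered


if __name__ == "__main__":
    records = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 1], [0, 1, 0, 0]]
    print("before:", file_hamming_distance(records))
    print("after: ", file_hamming_distance(greedy_reorder(records)))
```

Under this assumed definition, a lower file distance means that similar records sit next to each other, which is one plausible reading of a denser, better clustered file. A greedy pass like this one does not guarantee a minimal total distance.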
This article is indexed in SpringerLink and other databases.