Large-scale similarity data management with distributed Metric Index
Authors: David Novak, Michal Batko, Pavel Zezula
Institution: Masaryk University, Brno, Czech Republic
Abstract: Metric space is a universal and versatile model of similarity that can be applied in various areas of non-text information retrieval. However, a general, efficient, and scalable solution for metric data management remains an open research challenge. In this work, we try to take an important step towards a management system able to scale to data collections of billions of objects. We propose a distributed index structure for similarity data management called the Metric Index (M-Index), which can answer queries in both a precise and an approximate manner. The technique can use any distributed hash table that supports interval queries as its underlying index. We have performed numerous experiments to test various settings of the M-Index structure, and we have demonstrated its usability by developing a full-featured, publicly available Web application.
Keywords: Distributed data structures; Performance tuning; Similarity search; Scalability; Peer-to-peer structured networks; Metric space
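
Illustrative sketch: the abstract states that the index can sit on top of any distributed hash table that supports interval queries. The following is not the M-Index algorithm from the paper. It is a simplified, single-machine illustration of why interval queries can suffice: a pivot-based mapping assigns each object a one-dimensional key, and the triangle inequality turns a metric range query into one key interval per pivot. The class name `PivotKeyIndex`, the parameter `d_max` (assumed to be strictly greater than any distance in the data), the Euclidean distance, and the sorted in-memory list standing in for the distributed store are all assumptions made for this example.

```python
import bisect
import math


class PivotKeyIndex:
    """Hypothetical single-machine stand-in for an interval-queryable store."""

    def __init__(self, pivots, d_max):
        self.pivots = list(pivots)
        self.d_max = d_max      # assumed strict upper bound on all distances
        self._keys = []         # sorted one-dimensional keys
        self._objs = []         # objects, kept in the same order as _keys

    def key(self, obj):
        # Key = index of the closest pivot + normalized distance to it,
        # so objects assigned to pivot i get keys in [i, i + 1).
        dists = [math.dist(p, obj) for p in self.pivots]
        i = min(range(len(dists)), key=dists.__getitem__)
        return i + dists[i] / self.d_max

    def insert(self, obj):
        k = self.key(obj)
        pos = bisect.bisect_left(self._keys, k)
        self._keys.insert(pos, k)
        self._objs.insert(pos, obj)

    def interval(self, lo, hi):
        # The only operation required from the underlying store.
        a = bisect.bisect_left(self._keys, lo)
        b = bisect.bisect_right(self._keys, hi)
        return self._objs[a:b]

    def range_query(self, q, r):
        # Any o with d(q, o) <= r satisfies |d(p, o) - d(p, q)| <= r for
        # every pivot p, so one key interval per pivot covers all answers.
        result = []
        for i, p in enumerate(self.pivots):
            dq = math.dist(p, q)
            lo = i + max(0.0, dq - r) / self.d_max
            hi = i + min(dq + r, self.d_max) / self.d_max
            # Candidates are verified with the real distance function.
            result.extend(o for o in self.interval(lo, hi)
                          if math.dist(o, q) <= r)
        return result
```

In this sketch every pivot's key range is probed and candidates are filtered by computing the true distance. A distributed setting would route each interval to the nodes responsible for that key range.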
This article is indexed in ScienceDirect and other databases.