A framework for understanding Latent Semantic Indexing (LSI) performance
Authors: April Kontostathis, William M. Pottenger
Institution: 1. Ursinus College, PO Box 1000, 601 Main Street, Collegeville, PA 19426, United States; 2. Lehigh University, 19 Memorial Drive West, Bethlehem, PA 18015, United States
Abstract: In this paper we present a theoretical model for understanding the performance of Latent Semantic Indexing (LSI) search and retrieval applications. Many models for understanding LSI have been proposed; ours is the first to study the values that LSI produces in the term-by-dimension vectors. The framework presented here is based on term co-occurrence data. We show a strong correlation between second-order term co-occurrence and the values produced by the Singular Value Decomposition (SVD) algorithm that forms the foundation of LSI. We also present a mathematical proof that the SVD algorithm encapsulates term co-occurrence information.
Keywords: Latent Semantic Indexing; Term co-occurrence; Singular Value Decomposition; Information retrieval theory
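
The relationship the abstract describes between second-order term co-occurrence and SVD values can be illustrated on a toy example. The sketch below is not the authors' experiment or proof: the corpus, the variable names, and the choice of k = 2 are all assumptions made for illustration, and NumPy is assumed to be available. It relies only on the standard identity that, for a term-by-document matrix A = U S V^T, the term-term matrix A A^T equals U S^2 U^T, so truncating the SVD yields a reduced term-term matrix that can be compared with co-occurrence counts.

```python
import numpy as np

# Hypothetical toy corpus (not taken from the paper): three tiny documents.
terms = ["car", "engine", "motor", "fruit", "apple"]
docs = [["car", "engine"], ["engine", "motor"], ["fruit", "apple"]]

# Term-by-document matrix A: A[i, j] = 1 if term i occurs in document j.
A = np.array([[1.0 if t in d else 0.0 for d in docs] for t in terms])

# First-order co-occurrence: C[i, j] counts documents containing both terms.
C = A @ A.T

# Second-order co-occurrence: two distinct terms that never share a document
# but both co-occur with at least one common third term.
not_self = ~np.eye(len(terms), dtype=bool)
F = (C > 0) & not_self                      # first-order links, no self-links
shared = F.astype(int) @ F.astype(int)      # number of common neighbours
second_order = (shared > 0) & ~F & not_self

# Thin SVD of A; singular values come back in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# With all singular values kept, U S^2 U^T reproduces A A^T exactly.
full_term_term = (U * s**2) @ U.T

# Rank-k truncation (k = 2 is an arbitrary choice for this sketch).
k = 2
T_k = (U[:, :k] * s[:k] ** 2) @ U[:, :k].T

i, j, f = terms.index("car"), terms.index("motor"), terms.index("fruit")
print("first-order  car/motor:", C[i, j])
print("second-order car/motor:", bool(second_order[i, j]))
print("rank-2 value car/motor:", round(float(T_k[i, j]), 3))
print("rank-2 value car/fruit:", round(float(T_k[i, f]), 3))
```

In this toy corpus "car" and "motor" never appear in the same document, yet both co-occur with "engine". The rank-2 term-term value for that pair is nonzero, while the value for the unrelated pair "car" and "fruit" stays at zero, which is the kind of behaviour the framework sets out to explain.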
This article is indexed in ScienceDirect and other databases.