

A reliable FAQ retrieval system using a query log classification technique based on latent semantic analysis
Authors: Harksoo Kim, Hyunjung Lee, Jungyun Seo
Institution: 1. Program of Computer and Communications Engineering, College of Information Technology, Kangwon National University, 192-1 Hyoja 2(i)-dong, Chuncheon-si, Gangwon-do 200-701, Republic of Korea; 2. Natural Language Processing Laboratory, Department of Computer Science, Sogang University, Sinsu-dong 1, Seoul 121-742, Republic of Korea; 3. Department of Computer Science and Interdisciplinary Program of Integrated Biotechnology, Sogang University, 1 Sinsu-dong, Mapo-gu, Seoul 121-742, Republic of Korea
Abstract: To achieve high performance, previous work on FAQ retrieval relied on high-level knowledge bases or handcrafted rules. However, constructing these knowledge bases and rules is time-consuming and labor-intensive, and the work must be repeated whenever the application domain changes. To overcome this problem, we propose a high-performance FAQ retrieval system that uses only users' query logs as knowledge sources. At indexing time, the proposed system efficiently clusters users' query logs using classification techniques based on latent semantic analysis. At retrieval time, the proposed system smooths FAQs using the query log clusters. In our experiments, the proposed system outperformed conventional information retrieval systems in FAQ retrieval. Based on various experiments, we found that the proposed system could alleviate the critical lexical disagreement problems that arise in short document retrieval. In addition, we believe that the proposed system is more practical and reliable than previous FAQ retrieval systems because it uses only data-driven methods without high-level knowledge sources.
Keywords: FAQ retrieval; Lexical disagreement problem; Query log clusters; Latent semantic analysis
This article is indexed in ScienceDirect and other databases.
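
The two stages named in the abstract (grouping query logs in a latent semantic space at indexing time, then smoothing FAQs with those groups at retrieval time) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the nearest-FAQ assignment rule, the linear-interpolation smoothing, the weight `lam`, the toy data, and all function names below are assumptions chosen for illustration. NumPy is assumed to be installed.

```python
from collections import Counter

import numpy as np

# Hypothetical toy data; stopwords are assumed to be removed already.
FAQS = ["reset account password", "shipping fee order"]
QUERY_LOGS = [
    "forgot login password",
    "change account password",
    "delivery cost order",
    "shipping charge order",
]


def term_matrix(texts, vocab):
    """Build a documents-by-terms count matrix."""
    counts = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for word in text.split():
            counts[row, vocab[word]] += 1
    return counts


def classify_logs(faqs, logs, k=2):
    """Assign each query log to the FAQ closest to it in a k-dim LSA space.

    Assumption: each FAQ acts as one class, and the latent space is
    built from the FAQs and the logs together.
    """
    texts = faqs + logs
    words = sorted({w for t in texts for w in t.split()})
    vocab = {w: i for i, w in enumerate(words)}
    counts = term_matrix(texts, vocab)
    # Truncated SVD: the top-k right singular vectors span the latent space.
    _, _, vt = np.linalg.svd(counts, full_matrices=False)
    latent = counts @ vt[:k].T
    # Normalise rows so that dot products are cosine similarities.
    latent /= np.linalg.norm(latent, axis=1, keepdims=True) + 1e-12
    faq_vecs, log_vecs = latent[: len(faqs)], latent[len(faqs):]
    clusters = {i: [] for i in range(len(faqs))}
    for log, vec in zip(logs, log_vecs):
        clusters[int(np.argmax(faq_vecs @ vec))].append(log)
    return clusters


def smoothed_score(query, faq, cluster_logs, lam=0.5):
    """Query likelihood with the FAQ model interpolated with its cluster model.

    Assumption: P(w) = lam * P(w | FAQ) + (1 - lam) * P(w | cluster).
    Setting lam=1.0 disables smoothing.
    """
    faq_tf = Counter(faq.split())
    clu_tf = Counter(w for log in cluster_logs for w in log.split())
    faq_len = sum(faq_tf.values())
    clu_len = sum(clu_tf.values()) or 1  # avoid division by zero
    score = 1.0
    for word in query.split():
        p_faq = faq_tf[word] / faq_len
        p_clu = clu_tf[word] / clu_len
        score *= lam * p_faq + (1 - lam) * p_clu
    return score


def retrieve(query, faqs, clusters, lam=0.5):
    """Return the index of the FAQ with the highest smoothed score."""
    scores = [smoothed_score(query, f, clusters[i], lam)
              for i, f in enumerate(faqs)]
    return max(range(len(faqs)), key=scores.__getitem__)


if __name__ == "__main__":
    clusters = classify_logs(FAQS, QUERY_LOGS)
    print(clusters)
    print(FAQS[retrieve("delivery cost", FAQS, clusters)])
```

In this toy setup the query "delivery cost" shares no word with the FAQ "shipping fee order", so an unsmoothed match (`lam=1.0`) scores zero. The query log cluster attached to that FAQ contains both words, which is how smoothing lets the FAQ be retrieved despite the lexical disagreement.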