Table extraction for answer retrieval
Authors: Xing Wei, Bruce Croft, Andrew McCallum
Institution: Center for Intelligent Information Retrieval, University of Massachusetts Amherst, 140 Governors Drive, Amherst, MA 01003, USA
Abstract: The ability to find tables and extract information from them is a necessary component of many information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content presents difficulties for traditional retrieval techniques. This paper describes techniques for extracting tables from text and retrieving answers from the extracted information. We compare machine learning methods (in particular, Conditional Random Fields) with heuristic methods for table extraction. To retrieve answers, our approach creates a cell document for each table cell, containing the cell and its metadata (headers, titles); the retrieval model then ranks the cells of the extracted tables using a language-modeling approach. Performance is tested on government statistical Web sites and news articles, and errors are analyzed in order to improve the system.
Keywords: Table extraction, Conditional random fields, Question answering, Information extraction
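
The cell-document idea in the abstract can be illustrated with a minimal sketch. It assumes the table has already been extracted into a title, column headers, row headers, and rows; the function names (`build_cell_documents`, `rank_cells`), the whitespace tokenizer, and the choice of Dirichlet-smoothed query likelihood with `mu=10.0` are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def build_cell_documents(title, column_headers, row_headers, rows):
    """Create one 'cell document' per table cell: the cell text plus its
    metadata (table title, row header, column header)."""
    docs = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            text = " ".join([title, row_headers[r], column_headers[c], cell])
            docs.append({"row": r, "col": c, "cell": cell,
                         "tokens": tokenize(text)})
    return docs


def rank_cells(query, docs, mu=10.0):
    """Rank cell documents by query log-likelihood with Dirichlet smoothing.
    The smoothing method and mu are assumptions made for this sketch."""
    collection = Counter(t for d in docs for t in d["tokens"])
    total = sum(collection.values())
    query_terms = tokenize(query)
    scored = []
    for d in docs:
        tf = Counter(d["tokens"])
        doc_len = len(d["tokens"])
        score = 0.0
        for t in query_terms:
            p_coll = collection[t] / total
            if p_coll == 0:
                continue  # term never occurs in any cell document; skip it
            score += math.log((tf[t] + mu * p_coll) / (doc_len + mu))
        scored.append((score, d))
    # Highest log-likelihood first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical toy table, invented for illustration only.
docs = build_cell_documents(
    title="Corn production 2001",
    column_headers=["Acres planted", "Yield"],
    row_headers=["Iowa", "Kansas"],
    rows=[["11700", "146"], ["3400", "127"]],
)
best_score, best_doc = rank_cells("Iowa yield", docs)[0]
print(best_doc["cell"])
```

Because each cell carries its headers and title, a query that mentions a row label and a column label can match the cell at their intersection even though the cell itself holds only a number.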
This article is indexed in SpringerLink and other databases.