Introducing structure management in automatic reference resolution: An XML-based approach
Authors: M. Mercedes Martínez-González, Pablo de la Fuente
Institution: Universidad de Valladolid, Edificio TIT, Campus “Miguel Delibes” s/n, 47011 Valladolid, Spain
Abstract: References to parts of structured documents rely on the document structure to locate the fragment that is the reference target. XML, meanwhile, has become an increasingly important language for structured documents, and one of its most important related languages is XPath, which permits fragments of XML documents to be selected. In this article we present a methodology, and an application case, for automatically extracting and resolving references to fragments of structured documents. The approach combines structure manipulation and information extraction to enhance reference extraction tools by improving the precision of the references extracted. We take advantage of XML markup to locate the position within the structure at which each reference is found. The use of XPath for reference resolution is original: the resolution tool automatically builds XPath expressions. This proposal is inspired by, and implemented in, our work with legislative documents.
Keywords: Information extraction; XML; Structured documents; Reference extraction; Reference resolution; Legislative documents
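
To make the idea in the abstract concrete, here is a minimal Python sketch of reference resolution via automatically built XPath expressions. It is not the authors' tool: the element names (`law`, `article`, `paragraph`), the `num` attribute, the reference pattern `REF`, and the helper names `build_xpath` and `resolve_references` are all assumptions made for illustration. It shows one way a reference found in the text could be completed with the position in the structure where it occurs, then resolved with the limited XPath subset supported by the standard library.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative only.
DOC = """
<law>
  <article num="1">
    <paragraph num="1">Definitions apply as stated in paragraph 2.</paragraph>
    <paragraph num="2">Terms are defined here.</paragraph>
  </article>
  <article num="2">
    <paragraph num="1">Subject to article 1, paragraph 2.</paragraph>
  </article>
</law>
"""

# Assumed reference pattern: "article N", "article N, paragraph M",
# or a relative "paragraph M" with no article given.
REF = re.compile(
    r"article (?P<art>\d+)(?:, paragraph (?P<par>\d+))?|paragraph (?P<rel>\d+)"
)


def build_xpath(match, current_article):
    """Build an XPath expression, relative to the root, for one reference."""
    if match.group("rel"):
        # Relative reference: complete it with the article it was found in.
        return (
            f"./article[@num='{current_article}']"
            f"/paragraph[@num='{match.group('rel')}']"
        )
    path = f"./article[@num='{match.group('art')}']"
    if match.group("par"):
        path += f"/paragraph[@num='{match.group('par')}']"
    return path


def resolve_references(xml_text):
    """Return (reference text, XPath, target text) for each reference found."""
    root = ET.fromstring(xml_text)
    results = []
    for article in root.findall("article"):
        for paragraph in article.findall("paragraph"):
            for m in REF.finditer(paragraph.text or ""):
                xpath = build_xpath(m, article.get("num"))
                target = root.find(xpath)
                text = target.text if target is not None else None
                results.append((m.group(0), xpath, text))
    return results


if __name__ == "__main__":
    for ref, xpath, text in resolve_references(DOC):
        print(f"{ref!r} -> {xpath} -> {text!r}")
```

In this sketch the relative reference "paragraph 2" inside article 1 is resolved against the enclosing article, which is the role the abstract assigns to structure: the markup tells the resolver where the reference sits, so an incomplete reference can still be turned into a complete path.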
This article is indexed in ScienceDirect and other databases.