Introducing structure management in automatic reference resolution: An XML-based approach |
| |
Authors: | M. Mercedes Martínez-González, Pablo de la Fuente |
| |
Institution: | Universidad de Valladolid, Edificio TIT, Campus “Miguel Delibes” s/n, 47011 Valladolid, Spain |
| |
Abstract: | References to parts of structured documents rely on document structure to locate the fragment that is the reference target. XML, meanwhile, has become an increasingly important language for structured documents, and one of its most important related languages is XPath, which allows fragments of XML documents to be selected. In this article we present a methodology, and an application case, for automatically extracting and resolving references to fragments of structured documents. The approach combines structure manipulation and information extraction to enhance reference extraction tools by improving the precision of the references extracted. We take advantage of XML markup to locate the position within the structure at which each reference is found. The use of XPath for reference resolution is original: the resolution tool automatically builds XPath expressions. This proposal is inspired by, and implemented in, our work with legislative documents. |
| |
Keywords: | Information extraction; XML; Structured documents; Reference extraction; Reference resolution; Legislative documents |
This article is indexed in ScienceDirect and other databases. |
|
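The abstract describes extracting textual references from an XML document, noting where in the structure each one occurs, and building an XPath expression that resolves it. The following is a minimal sketch of that idea, not the authors' tool: the sample document, the element and attribute names (`law`, `article`, `paragraph`, `n`), the reference pattern, and the helper functions `build_xpath` and `resolve_references` are all assumptions made for illustration. It uses only the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical legislative document; all element and attribute names are assumptions.
DOC = """
<law id="L1">
  <article n="1">
    <paragraph n="1">General provisions.</paragraph>
    <paragraph n="2">Scope as defined in article 2, paragraph 1.</paragraph>
  </article>
  <article n="2">
    <paragraph n="1">Definitions.</paragraph>
  </article>
</law>
"""

# Toy pattern for references such as "article 2" or "article 2, paragraph 1".
REF_PATTERN = re.compile(r"article (\d+)(?:, paragraph (\d+))?")


def build_xpath(article, paragraph=None):
    """Build an XPath expression from the structural parts of a reference."""
    path = f".//article[@n='{article}']"
    if paragraph is not None:
        path += f"/paragraph[@n='{paragraph}']"
    return path


def resolve_references(root):
    """Return (source location, XPath, target text) for each reference found."""
    results = []
    for art in root.iter("article"):
        for par in art.iter("paragraph"):
            for match in REF_PATTERN.finditer(par.text or ""):
                xpath = build_xpath(match.group(1), match.group(2))
                target = root.find(xpath)  # None if the reference does not resolve
                source = f"article {art.get('n')}, paragraph {par.get('n')}"
                target_text = target.text if target is not None else None
                results.append((source, xpath, target_text))
    return results


root = ET.fromstring(DOC)
results = resolve_references(root)
for source, xpath, target_text in results:
    print(f"{source} -> {xpath} -> {target_text}")
```

Recording the source location alongside each generated expression mirrors the abstract's point that the markup is used both to place the reference within the structure and to resolve its target. A real system would need a far richer extraction grammar and would have to handle relative references (for example "the previous article"), which this sketch does not attempt.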