
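Construction Techniques and Methods for a Large-Scale Toponym Ontology Database System
Cite this article: Yu Jingsong, Wang Huilin, Yang Jie. Construction Techniques and Methods for a Large-Scale Toponym Ontology Database System[J]. Library and Information Service, 2016, 60(8): 126-131.
Authors: Yu Jingsong  Wang Huilin  Yang Jie
Affiliations: 1. Department of Information Management, Peking University, Beijing 100871; 2. School of Software and Microelectronics, Peking University, Beijing 100871
Abstract: [Purpose/significance] A practical large-scale toponym ontology database system has important application value in natural language processing, information retrieval, and intelligence analysis. This study aims to automatically recognize and process complex toponym text, including abbreviated place names, colloquial names, and names that change over time, while reducing manual intervention. [Method/process] Starting from large-scale name and address data obtained by multiple methods, we simplify the complex relationships among toponym elements. On top of a toolkit developed for address element segmentation, attribute and relationship analysis, and reasoning, we use the Neo4j graph database to build a practical toponym ontology database system. [Result/conclusion] The system built with the techniques and methods described here has good fault tolerance and can be updated continuously. Its toponym analysis and its reasoning over relationships among toponym elements reach the expected precision, and it performs well in several natural language processing tasks, such as news topic tracking and toponym matching in financial credit investigation.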

Keywords: natural language processing  toponym ontology database  address analysis  relationship reasoning
Received: 2016-02-19
Revised: 2016-04-09

Research on Large-scale Toponym Ontology Database Construction Techniques and Methods
Yu Jingsong, Wang Huilin, Yang Jie. Research on Large-scale Toponym Ontology Database Construction Techniques and Methods[J]. Library and Information Service, 2016, 60(8): 126-131.
Authors:Yu Jingsong  Wang Huilin  Yang Jie
Institution: 1. Department of Information Management, Peking University, Beijing 100871; 2. School of Software and Microelectronics, Peking University, Beijing 100871
Abstract: [Purpose/significance] Geographic ontologies have great value in natural language processing (NLP), information retrieval, and intelligence analysis tasks. The purpose of this study is to analyze complicated address text automatically with less manual processing, including abbreviations, nonstandard names, and names that change over time. [Method/process] Previous studies have primarily focused on rigorous ontology building, using languages such as OWL to create standardized statements. In this study, we take a different approach: we simplify the relationship set and emphasize obtaining and using massive data from different types of resources. By developing a software toolkit for address text segmentation, attribute annotation, and relationship reasoning, we build a large-scale geographic ontology database on the Neo4j graph database. [Result/conclusion] The system built with the methods and technologies introduced in this paper is fault-tolerant and supports continuous data growth and renewal. The precision of toponym analysis and of relationship reasoning among geographic elements met the desired requirements and contributed to good results on several NLP tasks, such as news topic tracking and matching addresses against blacklists for credit investigation.
Keywords: natural language processing  geographic ontologies  toponym information analysis  relationship reasoning
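
The abstract names three steps (address text segmentation, a simplified relationship set, and relationship reasoning) without giving implementation details. The following is only a minimal sketch of how such a pipeline could look, not the authors' toolkit. It assumes a tiny hypothetical suffix table (`LEVEL_SUFFIXES`), a single "located in" relation, and plain Python dictionaries standing in for the Neo4j graph; the class `ToponymGraph` and all its methods are invented for illustration.

```python
import re

# Hypothetical, deliberately tiny suffix table (province, city, district,
# county, road, street, house number). A real system would need far more.
LEVEL_SUFFIXES = "省市区县路街号"

# One address element = a run of non-suffix characters plus one suffix.
_ELEMENT = re.compile("[^" + LEVEL_SUFFIXES + "]+[" + LEVEL_SUFFIXES + "]")


def segment_address(text):
    """Split an address into elements; trailing text without a suffix is dropped."""
    return _ELEMENT.findall(text)


class ToponymGraph:
    """In-memory stand-in for a graph store with one 'located in' relation."""

    def __init__(self):
        self.parent = {}  # element -> element that contains it
        self.alias = {}   # abbreviation or informal name -> canonical name

    def add_address(self, text):
        """Segment an address and link each element to the one before it."""
        elements = segment_address(text)
        for outer, inner in zip(elements, elements[1:]):
            # Assumption: names are unique. A real system would key nodes
            # by full path, since many cities share the same road names.
            self.parent[inner] = outer
        return elements

    def add_alias(self, alias, canonical):
        self.alias[alias] = canonical

    def resolve(self, name):
        """Map an alias to its canonical name; unknown names pass through."""
        return self.alias.get(name, name)

    def ancestors(self, name):
        """Follow 'located in' links upward, nearest container first."""
        node = self.resolve(name)
        seen = {node}
        chain = []
        while node in self.parent:
            node = self.parent[node]
            if node in seen:  # guard against cycles in noisy data
                break
            seen.add(node)
            chain.append(node)
        return chain

    def is_within(self, inner, outer):
        """Transitive containment check, with aliases resolved on both sides."""
        return self.resolve(outer) in self.ancestors(inner)


graph = ToponymGraph()
graph.add_address("广东省深圳市南山区科技路1号")
graph.add_alias("深圳", "深圳市")
print(graph.ancestors("南山区"))
print(graph.is_within("科技路", "深圳"))
```

Keeping a single containment relation plus an alias table mirrors the abstract's emphasis on a simplified relationship set; in a graph database the same structure would map to nodes for elements and edges for containment and aliases.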