Named entity recognition with multiple segment representations
Authors: Han-Cheol Cho, Naoaki Okazaki, Makoto Miwa, Jun’ichi Tsujii
Institutions: 1. Suda Lab., Dept. of Computer Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, Japan; 2. Inui & Okazaki Lab., Dept. of System Information Sciences, Tohoku University, 6-3-09 Aramakiaza-Aoba, Aoba-ku, Sendai 980-8579, Japan; 3. National Centre for Text Mining, Manchester Interdisciplinary Biocentre, 131 Princess Street, Manchester M1 7DN, UK; 4. Microsoft Research Asia, New West Campus, 3rd Floor, Tower 2, No. 5, Dan Ling Street, Haidian District, Beijing 100080, People’s Republic of China
Abstract: Named entity recognition (NER) is mostly formalized as a sequence labeling problem in which segments of named entities are represented by label sequences. Although considerable effort has gone into investigating sophisticated features that encode textual characteristics of named entities (e.g. PEOPLE, LOCATION), little attention has been paid to segment representations (SRs) for multi-token named entities (e.g. the IOB2 notation). In this paper, we investigate the effects of different SRs on NER tasks and propose a feature generation method that uses multiple SRs. The proposed method allows a model to exploit not only the highly discriminative features of complex SRs but also the features of simple SRs, which are robust against the data sparseness problem. Since it incorporates different SRs as feature functions of Conditional Random Fields (CRFs), we can use the well-established procedure for training. In addition, the tagging speed of a model integrating multiple SRs can be accelerated to match that of a model using only the most complex SR in the integrated model. Experimental results demonstrate that incorporating multiple SRs into a single model improves both the performance and the stability of NER. We also provide a detailed analysis of the results.
Keywords: Named entity recognition; Machine learning; Conditional random fields; Feature engineering
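To make the notion of a segment representation concrete, the following minimal Python sketch encodes the same entity segments under three labeling schemes. Only IOB2 is named in the abstract; IO and IOBES are assumed here as illustrative examples of a simpler and a more complex SR, and the function `encode`, the span format, and the example sentence are hypothetical. The sketch illustrates label encoding only and is not the authors' feature generation method.

```python
from typing import List, Tuple

# A segment is (start, end_exclusive, entity_type); this format is an assumption.
Span = Tuple[int, int, str]


def encode(n_tokens: int, spans: List[Span], scheme: str) -> List[str]:
    """Turn entity segments into one label per token under the given scheme."""
    labels = ["O"] * n_tokens
    for start, end, etype in spans:
        for i in range(start, end):
            if scheme == "IO":
                # Simplest scheme: every entity token is "inside".
                prefix = "I"
            elif scheme == "IOB2":
                # The first token of every segment is marked "B".
                prefix = "B" if i == start else "I"
            elif scheme == "IOBES":
                # Also marks segment ends and single-token segments.
                if end - start == 1:
                    prefix = "S"
                elif i == start:
                    prefix = "B"
                elif i == end - 1:
                    prefix = "E"
                else:
                    prefix = "I"
            else:
                raise ValueError(f"unknown scheme: {scheme}")
            labels[i] = f"{prefix}-{etype}"
    return labels


if __name__ == "__main__":
    tokens = ["Makoto", "Miwa", "visited", "New", "York", "City", "and", "Tokyo"]
    spans = [(0, 2, "PER"), (3, 6, "LOC"), (7, 8, "LOC")]
    for scheme in ("IO", "IOB2", "IOBES"):
        print(scheme, encode(len(tokens), spans, scheme))
```

The more complex the scheme, the more distinct labels it has per entity type (one for IO, two for IOB2, four for IOBES), which is one way to see the trade-off the abstract describes between discriminative power and data sparseness.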
This article is indexed in ScienceDirect and other databases.