
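Current Conditions and Trends on Studies of Automatic Identification of Modern Chinese Abbreviations
Cite this article: DING Jun-miao. Current Conditions and Trends on Studies of Automatic Identification of Modern Chinese Abbreviations[J]. Journal of Weinan Teachers College, 2008, 23(6): 39-43.
Author: DING Jun-miao (丁俊苗)
Affiliation: Department of Chinese Language and Literature, Chaohu College, Chaohu, Anhui 238000; College of Chinese Language and Literature, Shaanxi Normal University, Xi'an 710062
Funding: Anhui Provincial Research Project for Young University Teachers; Chaohu College Research Start-up Fund
Abstract: The automatic identification of abbreviations matters: it helps raise the accuracy of automatic word segmentation and tagging, and it supports the quick and timely compilation of abbreviation dictionaries. The work mainly covers automatic extraction, automatic restoration of abbreviations to their full forms, a classification system oriented toward Chinese information processing, and the construction of abbreviation knowledge bases. Methodologically, research draws on corpora and on the mechanisms of abbreviation, and deliberately combines rule-based and statistical methods. Considerable progress has been made: the research goals are clear; experiments and some engineering work have been carried out, with precision and recall both reaching a certain level; and high-quality abbreviation knowledge bases have been built. Problems remain, however: most studies are still preliminary, the experiments are small in scale, precision and recall are still not very high, and the results remain some distance from practical use.

Keywords: abbreviation  unknown words  Chinese information processing  automatic identification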

Current Conditions and Trends on Studies of Automatic Identification of Modern Chinese Abbreviations
DING Jun-miao. Current Conditions and Trends on Studies of Automatic Identification of Modern Chinese Abbreviations[J]. Journal of Weinan Teachers College, 2008, 23(6): 39-43.
Authors:DING Jun-miao
Institution: DING Jun-miao (1. Department of Chinese Language and Literature, Chaohu College, Chaohu 238000, China; 2. College of Chinese Language and Literature, Shaanxi Normal University, Xi'an 710062, China)
Abstract: The automatic identification of abbreviations is of great significance for automatic word segmentation and tagging in Chinese information processing, and for the quick and timely compilation of abbreviation dictionaries. The main research topics are the automatic extraction and restoration of abbreviations, a classification system of abbreviations for Chinese information processing, and the construction of abbreviation knowledge bases. Drawing on Chinese corpora and the mechanism of abbreviation, researchers consciously combine rule-based and statistical methods. So far the field has made considerable progress: the research goals are clear; many experiments have been carried out, and closed tests show relatively high recall and precision; and high-quality abbreviation knowledge bases have been established. Problems remain, however: most studies are still preliminary, the experiments are small in scale, recall and precision are not yet high enough, and the results have not yet been put into practical use.
Keywords:abbreviation  unknown words  Chinese information processing  automatic identification
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).