Study of Self-adaptive Matching Method in Chinese Segmentation Based on Decided Vocabulary

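Cite this article:	Huang Shuiqing, Cheng Chong. Study of Self-adaptive Matching Method in Chinese Segmentation Based on Decided Vocabulary[J]. New Technology of Library and Information Service, 2006, 1(5): 13-17.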

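Authors:	Huang Shuiqing, Cheng Chong

Institution:	College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095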

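Abstract:	This paper proposes a Chinese word segmentation algorithm that, when segmenting text against a given segmentation vocabulary, not only correctly segments the words already in the vocabulary but also automatically recognizes words absent from it, i.e. unlisted words. Comparative tests against the reverse maximum matching method and other unlisted-word recognition algorithms show that the algorithm effectively solves the recognition problem for most unlisted words, reduces segmentation errors, and has essentially no impact on segmentation efficiency.
Keywords:	New word identification; Unlisted words
Received:	2005-12-01
Revised:	2006-02-07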
Study of Self-adaptive Matching Method in Chinese Segmentation Based on Decided Vocabulary

Huang Shuiqing, Cheng Chong. Study of Self-adaptive Matching Method in Chinese Segmentation Based on Decided Vocabulary[J]. New Technology of Library and Information Service, 2006, 1(5): 13-17.

Authors:	Huang Shuiqing, Cheng Chong

Institution:	College of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China

Abstract:	This paper presents a self-adaptive matching algorithm for Chinese word segmentation. Working from a given vocabulary, the algorithm not only segments the words listed in the vocabulary but also automatically identifies unlisted words, that is, words the vocabulary does not contain. Tests comparing it with the Reverse Maximum Matching Method and with other unlisted-word identification methods show that it identifies most unlisted words effectively, reduces segmentation errors, and has essentially no effect on segmentation efficiency.

Keywords:	Automatic segmentation; New word identification; Unlisted words
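
The abstract names the Reverse Maximum Matching Method as the baseline but does not give the details of the authors' own algorithm. As a rough illustration of that baseline only, the following Python sketch segments text from right to left, preferring the longest vocabulary match. The function names, the `max_len` parameter, and the `merge_unlisted` step (which simply joins adjacent out-of-vocabulary single characters into one candidate unlisted word) are assumptions made for this example and are not the method proposed in the paper.

```python
def reverse_maximum_match(text, vocabulary, max_len=4):
    """Segment text right to left, preferring the longest vocabulary match."""
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest window first and shrink it until a word fits.
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()
    return tokens


def merge_unlisted(tokens, vocabulary):
    """Hypothetical heuristic: join runs of out-of-vocabulary single characters."""
    merged, run = [], ""
    for token in tokens:
        if len(token) == 1 and token not in vocabulary:
            run += token  # extend the current candidate unlisted word
            continue
        if run:
            merged.append(run)
            run = ""
        merged.append(token)
    if run:
        merged.append(run)
    return merged


vocab = {"研究", "研究生", "生命", "起源", "提出", "算法"}
print(reverse_maximum_match("研究生命起源", vocab, max_len=3))
# ['研究', '生命', '起源']
print(merge_unlisted(reverse_maximum_match("黄水清提出算法", vocab, max_len=3), vocab))
# ['黄水清', '提出', '算法']
```

In the second call, the plain baseline leaves the name 黄水清 as three single characters because it is not in the vocabulary; the merge step groups them into one candidate. A real unlisted-word recognizer would need stronger evidence than adjacency, which is the problem the paper addresses.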
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.