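Jlppeccz: A Chinese Word Segmentation Algorithm Based on Neighbor Matching
Cite this article: 耿新青 (GENG Xin-qing), 陶凤梅 (TAO Feng-mei), 黄宏光 (HUANG Hong-guang). Jlppeccz: A Chinese word segmentation algorithm based on neighbor matching [J]. 鞍山师范学院学报 (Journal of Anshan Teachers College), 2010, 12(4): 46-48.
Authors: 耿新青 (GENG Xin-qing), 陶凤梅 (TAO Feng-mei), 黄宏光 (HUANG Hong-guang)
Affiliation: Department of Mathematics, Anshan Normal University, Anshan, Liaoning 114007, China
Funding: Project supported by the National Natural Science Foundation of China
Abstract: This paper proposes Jlppeccz, a new word segmentation algorithm based on neighbor matching. The algorithm first splits an article into sentences, using punctuation marks as boundaries. It then applies neighbor matching to cut each sentence into words of 1 to 4 characters. By searching the lexicon, it recombines the segmented words, merging small words into larger ones, and stores the processed words in a temporary lexicon so that later sentences can look them up. It also supports adding new words to the lexicon. Compared with the classical MM algorithm and word-frequency statistical methods, the proposed algorithm is a considerable improvement.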

Keywords: Chinese word segmentation; neighbor matching; word segmentation system
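
The steps named in the abstract (split at punctuation, cut into words of 1 to 4 characters, merge small words into larger ones via lexicon lookup, keep a temporary lexicon for later sentences) can be illustrated with a minimal sketch. The abstract does not define the neighbor-matching rule, so the pairwise merge below is only an assumed stand-in, not the authors' method. The names `LEXICON`, `split_sentences`, `merge_neighbors`, and `segment` are hypothetical, and the toy lexicon is invented for illustration.

```python
import re

# Hypothetical toy lexicon; the paper's real dictionary is not given in the abstract.
LEXICON = {"中文", "分词", "算法", "中文分词", "近邻", "匹配"}
MAX_WORD_LEN = 4  # the abstract states that words are 1 to 4 characters long

# Assumed set of sentence boundaries: common Chinese/ASCII punctuation and whitespace.
PUNCTUATION = r"[，。！？；：、,.!?;:\s]+"


def split_sentences(text):
    """Split text into fragments at punctuation marks, dropping empty pieces."""
    return [s for s in re.split(PUNCTUATION, text) if s]


def merge_neighbors(sentence, lexicon, cache):
    """Assumed merge rule: start from single characters and repeatedly join
    two adjacent units when the joined string is at most MAX_WORD_LEN long
    and appears in the lexicon or in the temporary cache."""
    units = list(sentence)
    merged = True
    while merged:
        merged = False
        out = []
        i = 0
        while i < len(units):
            if i + 1 < len(units):
                candidate = units[i] + units[i + 1]
                known = candidate in lexicon or candidate in cache
                if len(candidate) <= MAX_WORD_LEN and known:
                    out.append(candidate)
                    i += 2
                    merged = True
                    continue
            out.append(units[i])
            i += 1
        units = out
    return units


def segment(text, lexicon=LEXICON):
    """Segment each sentence and remember multi-character words in a
    temporary cache that later sentences can consult."""
    cache = set()
    result = []
    for sentence in split_sentences(text):
        words = merge_neighbors(sentence, lexicon, cache)
        cache.update(w for w in words if len(w) > 1)
        result.append(words)
    return result


if __name__ == "__main__":
    for words in segment("中文分词算法，近邻匹配。"):
        print(" / ".join(words))
```

Because this sketch only joins adjacent pairs, it cannot form a three-character word unless one of its two-character parts is already known. The published algorithm may handle such cases differently.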

Jlppeccz: A New Word Segmentation Algorithm Based on Neighboring Match
GENG Xin-qing, TAO Feng-mei, HUANG Hong-guang. Jlppeccz: A New Word Segmentation Algorithm Based on Neighboring Match [J]. Journal of Anshan Teachers College, 2010, 12(4): 46-48.
Authors: GENG Xin-qing, TAO Feng-mei, HUANG Hong-guang
Institution: Department of Mathematics, Anshan Normal University, Anshan, Liaoning 114007, China
Abstract: This paper presents Jlppeccz, a new Chinese word segmentation algorithm based on neighboring match. The traditional MM algorithm, which easily produces ambiguity, depends strongly on the dictionary. The Jlppeccz algorithm divides an article into sentences at punctuation marks, then cuts each sentence into single-character or multi-character words by neighboring match. The word database is searched; the words that have been divided are recombined; the small phrases are combined into larger ones; the words are ...
Keywords: Chinese word segmentation; neighboring match; word segmentation system
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.