Research on the Chinese Text Categorization of Multi-Classification and Multi-Label (多类多标签汉语文本自动分类的研究)


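Cite this article:	Shi Tongnian, Lu Zhongliang, Rong Rong, Wang Jiayun. Research on the Chinese Text Categorization of Multi-Classification and Multi-Label [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2003, 22(3): 306-309.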

Authors:	Shi Tongnian (施彤年), Lu Zhongliang (卢忠良), Rong Rong (荣融), Wang Jiayun (王家云)

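Affiliations:	1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200032; 2. School of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073; 3. PLA Unit 61587, Shanghai 200336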

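Abstract (from the Chinese):	This paper proposes an efficient method for classifying Chinese text, which performed well in experiments. Because of the particular characteristics of Chinese text, the training texts are automatically segmented into words and reduced in dimensionality before training. Since many texts can belong to several classes, the classifier uses an improved Boosting algorithm. Experiments show that, for feature extraction and document classification of multi-class, multi-label Chinese text, the algorithm converges quickly, is accurate, and gives good overall results.
Keywords:	multi-class multi-label; word segmentation; dimensionality reduction; weak hypothesis; weak learning
Revised:	April 21, 2002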
Research on the Chinese Text Categorization of Multi-Classification and Multi-Label

Shi Tongnian. Research on the Chinese Text Categorization of Multi-Classification and Multi-Label [J]. Journal of the China Society for Scientific and Technical Information, 2003, 22(3): 306-309.

Authors:	Shi Tongnian

Abstract:	This paper proposes an efficient method for Chinese text categorization that achieved good experimental results. Because of the particular characteristics of Chinese text, the training texts are first preprocessed with automatic word segmentation and dimensionality reduction. Many texts can belong to more than one class, so the classifier uses an improved Boosting algorithm. The experiments show that, for feature extraction and document classification of multi-class, multi-label Chinese text, the algorithm converges quickly, is accurate, and performs well overall.

Keywords:	multi-class multi-label; word segmentation; dimensionality reduction; weak hypothesis; weak learner
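
The abstract does not say how the Boosting algorithm was improved, so the following is only a minimal sketch of the general idea of multi-label boosting, not the authors' method. It assumes a standard AdaBoost.MH-style scheme in which each weak hypothesis tests whether one term occurs in a document and votes for or against every class. It also assumes word segmentation and dimensionality reduction have already been done, so each document arrives as a set of tokens. All names (`train_adaboost_mh`, `predict`, the toy tokens and classes) are hypothetical.

```python
import math


def _sign(x):
    # +1 / -1 vote, or 0 (abstain) when the weighted evidence is exactly balanced.
    return 1 if x > 0 else (-1 if x < 0 else 0)


def train_adaboost_mh(docs, labels, classes, rounds=2):
    """docs: list of token sets (already segmented); labels: list of label sets."""
    m, k = len(docs), len(classes)
    vocab = sorted(set().union(*docs))
    # Y[i][j] is +1 if document i carries class j, otherwise -1.
    Y = [[1 if c in labels[i] else -1 for c in classes] for i in range(m)]
    # D is a weight distribution over (document, class) pairs, initially uniform.
    D = [[1.0 / (m * k)] * k for _ in range(m)]
    model = []
    for _ in range(rounds):
        best = None
        for term in vocab:
            # Weighted label sums for documents without (0) / with (1) the term.
            sums = [[0.0] * k for _ in range(2)]
            for i in range(m):
                b = 1 if term in docs[i] else 0
                for j in range(k):
                    sums[b][j] += D[i][j] * Y[i][j]
            # r is the weighted correlation of the best votes for this term.
            r = sum(abs(s) for row in sums for s in row)
            if best is None or r > best[0]:
                best = (r, term, [[_sign(s) for s in row] for row in sums])
        r, term, signs = best
        r = min(r, 1 - 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        model.append((alpha, term, signs))
        # Raise the weight of misclassified pairs, lower the rest, renormalize.
        z = 0.0
        for i in range(m):
            b = 1 if term in docs[i] else 0
            for j in range(k):
                D[i][j] *= math.exp(-alpha * Y[i][j] * signs[b][j])
                z += D[i][j]
        for i in range(m):
            for j in range(k):
                D[i][j] /= z
    return model


def predict(model, classes, doc):
    """Return every class whose weighted vote is positive (possibly several)."""
    scores = [0.0] * len(classes)
    for alpha, term, signs in model:
        b = 1 if term in doc else 0
        for j in range(len(classes)):
            scores[j] += alpha * signs[b][j]
    return {c for c, s in zip(classes, scores) if s > 0}


# Toy data: English placeholder tokens stand in for segmented Chinese words.
classes = ["finance", "sports"]
docs = [{"stock", "bank"}, {"match", "team"}, {"stock", "match"}, {"match", "bank"}]
labels = [{"finance"}, {"sports"}, {"finance", "sports"}, {"sports"}]
model = train_adaboost_mh(docs, labels, classes, rounds=2)
print(predict(model, classes, {"stock", "match"}))
```

Because the weights are kept over (document, class) pairs rather than over documents, a single document can end up with several labels, which is the multi-label setting the abstract describes.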
This article is indexed in CNKI, Wanfang Data, and other databases.
