Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM


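Cite this article:	Zhang Yufeng, He Chao. Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM [J]. Library and Information Service (图书情报工作), 2010, 54(10): 109-113.

Authors:	Zhang Yufeng, He Chao

Affiliation:	Center for Studies of Information Resources, Wuhan University

Funding:	Major project of the Key Research Base of Humanities and Social Sciences, Ministry of Education

Abstract:	To improve the accuracy and efficiency of text classification, this paper proposes a text classification model based on latent semantic analysis and an improved hyper-sphere support vector machine (HS-SVM). The model uses latent semantic analysis for feature extraction, eliminating the bias that synonyms and polysemous words introduce into text representation and reducing the dimensionality of the text vectors. For texts that fall in regions where hyper-spheres overlap, a new decision method is designed: a density-based decision strategy. Experimental results show that the model classifies well when the number of categories is small and that the improved algorithm is effective and feasible.
Keywords:	text classification; latent semantic analysis; improved hyper-sphere support vector machine; text in overlapping regions
Received:	2010-01-06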
Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM

Zhang Yufeng, He Chao. Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM [J]. Library and Information Service, 2010, 54(10): 109-113.

Authors:	Zhang Yufeng, He Chao

Institution:	Center for Studies of Information Resources, Wuhan University

Abstract:	To improve the accuracy and efficiency of text classification, a text classification model based on latent semantic analysis and an improved hyper-sphere support vector machine (HS-SVM) is proposed. The model uses latent semantic analysis for feature extraction, which eliminates the effect of synonymy and polysemy on text representation and reduces the dimension of the text vectors. For texts that fall in the overlapping regions of hyper-spheres, a new decision method based on density is designed. Experimental results show that the model gives good classification results when the number of classes is small, and that the improved algorithm is effective and feasible.

Keywords:	text classification; latent semantic analysis; improved hyper-sphere support vector machine; text in overlapping regions
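
The feature-extraction step named in the abstract, latent semantic analysis, is commonly realized as a truncated singular value decomposition of a term-document matrix. The sketch below illustrates only that general technique with NumPy. It is not the authors' implementation: the toy matrix, the choice of k = 2, and the function names `lsa_fit`, `lsa_fold_in` and `cosine` are all assumptions made for illustration, and the improved HS-SVM stage with its density-based decision strategy is not reproduced here because the abstract does not specify it.

```python
import numpy as np


def lsa_fit(term_doc, k):
    """Truncated SVD of a (terms x docs) matrix; keeps the top-k latent dimensions."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    doc_vecs = Vt[:k, :].T * s_k  # one k-dimensional row per document
    return U_k, s_k, doc_vecs


def lsa_fold_in(term_vec, U_k):
    """Project a term-count vector into the same latent space as doc_vecs."""
    return term_vec @ U_k


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy corpus. Rows are terms, columns are documents d0..d3.
# d0 = {car, engine}, d1 = {automobile, engine}, d2 = {flower, petal}, d3 = {flower}
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

U_k, s_k, doc_vecs = lsa_fit(A, k=2)

# d0 and d1 use different words for the same thing and share only "engine".
raw_sim = cosine(A[:, 0], A[:, 1])
latent_sim = cosine(doc_vecs[0], doc_vecs[1])
print("raw cosine(d0, d1):   ", round(raw_sim, 3))
print("latent cosine(d0, d1):", round(latent_sim, 3))

# A query containing only "car" still matches d1, which never mentions "car".
query = np.array([1, 0, 0, 0, 0], dtype=float)
query_sim = cosine(lsa_fold_in(query, U_k), doc_vecs[1])
print("latent cosine(query, d1):", round(query_sim, 3))
```

In this toy setting the five-dimensional term vectors are reduced to two latent dimensions, and the two documents that differ only by a synonym move much closer together than they are in raw term space. A downstream classifier would then be trained on `doc_vecs` rather than on the raw term counts.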
This article is indexed in Wanfang Data and other databases.