
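Research on a Text Classification Model Based on Latent Semantic Analysis and an Improved HS-SVM
Cite this article:Zhang Yufeng, He Chao. Research on a Text Classification Model Based on Latent Semantic Analysis and an Improved HS-SVM[J]. Library and Information Service, 2010, 54(10): 109-113.
Authors:Zhang Yufeng  He Chao
Institution:Center for Studies of Information Resources, Wuhan University
Funding:Major Project of the Key Research Base of Humanities and Social Sciences, Ministry of Education
Abstract:To improve the accuracy and efficiency of text classification, a text classification model based on latent semantic analysis and an improved hyper-sphere support vector machine is proposed. The model uses latent semantic analysis for feature extraction, which eliminates the bias that synonyms and polysemous words introduce into text representation and reduces the dimension of the text vectors. For the problem of classifying texts that fall in regions where hyper-spheres overlap, a new decision method, a density-based decision strategy, is designed. Experimental results show that the model classifies well when the number of classes is small and that the improved algorithm is effective and feasible.

Keywords:text classification  latent semantic analysis  improved hyper-sphere support vector machine  text in overlapping regions
Received:2010-01-06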

Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM
Zhang Yufeng,He Chao.Research of Text Classification Model Based on Latent Semantic Analysis and Improved of HS-SVM[J].Library and Information Service,2010,54(10):109-113.
Authors:Zhang Yufeng  He Chao
Institution:Center for Studies of Information Resources, Wuhan University
Abstract:To improve the accuracy and efficiency of text classification, a model based on latent semantic analysis (LSA) and an improved hyper-sphere support vector machine (HS-SVM) is proposed. The model uses LSA for feature extraction, which eliminates the bias that synonymy and polysemy introduce into text representation and reduces the dimension of the text vectors. For texts that fall in regions where hyper-spheres overlap, a new decision method, a density-based decision strategy, is designed. Experimental results show that the model classifies well when the number of classes is small and that the improved algorithm is effective and feasible.
Keywords:text classification  latent semantic analysis  improved hyper-sphere support vector machine  text in overlapping regions  
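
The abstract states that latent semantic analysis removes synonym-induced bias and reduces the dimension of the text vectors. The following is a minimal sketch of that general idea only, not the authors' implementation: the toy corpus, the function names (`truncated_svd`, `project`) and the choice of k = 2 are all assumptions made for illustration, and the HS-SVM stage and the density-based decision strategy are not shown because the abstract does not specify them. To stay dependency-free it finds the leading singular vectors by power iteration; a numerical library's SVD routine would normally be used instead.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def truncated_svd(X, k, iters=300):
    """Top-k singular values and right singular vectors of X (documents x terms),
    found by power iteration with deflation (illustrative, not optimized)."""
    residual = [row[:] for row in X]
    n_docs, n_terms = len(X), len(X[0])
    sigmas, V = [], []
    for _component in range(k):
        # Fixed, non-symmetric start vector keeps the sketch deterministic.
        v = [float(j + 1) for j in range(n_terms)]
        for _step in range(iters):
            Rv = [dot(row, v) for row in residual]                  # R v
            w = [sum(residual[i][j] * Rv[i] for i in range(n_docs))
                 for j in range(n_terms)]                           # R^T (R v)
            w_norm = norm(w)
            v = [x / w_norm for x in w]
        Rv = [dot(row, v) for row in residual]
        sigma = norm(Rv)
        u = [x / sigma for x in Rv]
        # Deflate: remove the component just found before searching for the next.
        residual = [[residual[i][j] - sigma * u[i] * v[j] for j in range(n_terms)]
                    for i in range(n_docs)]
        sigmas.append(sigma)
        V.append(v)
    return sigmas, V

def project(doc, V):
    """Fold a term-count vector into the k-dimensional latent space."""
    return [dot(doc, v) for v in V]

# Hypothetical toy corpus. Columns: car, automobile, engine, flower, petal.
X = [
    [1, 0, 1, 0, 0],   # "car engine"
    [0, 1, 1, 0, 0],   # "automobile engine"
    [0, 0, 0, 1, 1],   # "flower petal"
    [0, 0, 0, 1, 2],   # "flower petal petal"
]
sigmas, V = truncated_svd(X, k=2)

car = [1, 0, 0, 0, 0]
automobile = [0, 1, 0, 0, 0]
flower = [0, 0, 0, 1, 0]

raw_sim = cosine(car, automobile)                          # no shared terms
lsa_sim = cosine(project(car, V), project(automobile, V))  # shared latent topic
cross_sim = cosine(project(car, V), project(flower, V))    # different topics
print(raw_sim, lsa_sim, cross_sim)
```

In this toy setting the two synonyms never co-occur, so their raw term vectors are orthogonal, yet both appear alongside "engine" and therefore map to the same latent direction. The 5-dimensional term vectors are also reduced to 2 dimensions, which is the kind of compact representation a downstream classifier would receive.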
This article is indexed in Wanfang Data and other databases.