
Research on a Particle Swarm-Based Fuzzy C-Means Text Clustering Algorithm
Original title: 基于粒子群的模糊C均值文本聚类算法研究
Cite as: Gao Jinsong, Zhang Junli. The Algorithm Research on Particle Swarm Based on Fuzzy C-Means Text Clustering [J]. Library and Information Service (图书情报工作), 2010, 54(6): 57-65.
Authors: Gao Jinsong (高劲松), Zhang Junli (张俊丽)
Institutions: 1. Department of Information Management, HuaZhong Normal University; 2. Department of Information Management, Nanjing University
Abstract: When the fuzzy c-means algorithm is applied to text clustering, randomly chosen initial cluster centers and cluster numbers lead to different clustering results, and the algorithm easily becomes trapped in a local optimum. This paper proposes using particle swarm optimization to determine the initial cluster centers for fuzzy c-means. Chinese documents are first represented with the vector space model and reduced by feature extraction, and are then clustered with fuzzy c-means. Experiments show that the particle swarm-based fuzzy c-means algorithm needs fewer iterations, overcomes the classical algorithm's sensitivity to initial values and its tendency to fall into local minima, and clearly improves clustering speed, accuracy, and overall clustering quality.
Keywords: fuzzy c-means; particle swarm; text clustering
Received: 2009-08-30
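
The pipeline summarized in the abstract (particle swarm search for initial centers, followed by fuzzy c-means) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the particle encoding, the fitness function, or any parameter values, so the sketch assumes each particle encodes one full set of cluster centers, uses the fuzzy c-means objective as the fitness, and picks common textbook values for the inertia weight `w`, the acceleration coefficients `c1` and `c2`, and the fuzzifier `m`. All function names are hypothetical, and small dense synthetic vectors stand in for the feature-reduced document vectors.

```python
import numpy as np


def memberships(X, centers, m=2.0):
    """Fuzzy membership matrix U (n_samples x n_clusters) for given centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # floor distances to avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def objective(X, centers, m=2.0):
    """FCM objective: sum over samples and clusters of u^m * squared distance."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(((memberships(X, centers, m) ** m) * d2).sum())


def pso_initial_centers(X, c, n_particles=20, n_iter=50,
                        w=0.7, c1=1.5, c2=1.5, m=2.0, seed=0):
    """Search for good initial centers; each particle holds c candidate centers."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: particles start from randomly drawn data points.
    pos = np.stack([X[rng.choice(n, size=c, replace=False)]
                    for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(X, p, m) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    gbest_val = pbest_val.min()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(X, p, m) for p in pos])
        better = vals < pbest_val
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        if pbest_val.min() < gbest_val:
            gbest_val = pbest_val.min()
            gbest = pbest[pbest_val.argmin()].copy()
    return gbest


def fuzzy_c_means(X, centers, m=2.0, tol=1e-5, max_iter=100):
    """Classical FCM iterations starting from the supplied centers."""
    X = np.asarray(X, dtype=float)
    for it in range(1, max_iter + 1):
        um = memberships(X, centers, m) ** m
        new_centers = (um.T @ X) / um.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m), it


if __name__ == "__main__":
    # Synthetic stand-in for document vectors: three well-separated groups.
    rng = np.random.default_rng(42)
    means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
    X = np.vstack([mu + 0.3 * rng.standard_normal((30, 2)) for mu in means])
    init = pso_initial_centers(X, c=3)
    centers, U, n_iter = fuzzy_c_means(X, init)
    print("FCM iterations after PSO initialization:", n_iter)
    print("hard labels:", U.argmax(axis=1))
```

In a real text-clustering setting, `X` would be the document-term matrix produced by the vector space model after feature extraction, and the swarm size and iteration budget would need tuning, because every fitness evaluation costs one full pass over the documents.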
This article is indexed in Wanfang Data (万方数据) and other databases.