
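Research on Topic Identification of Document Clusters Based on C-value and TF-IDF
Cite this article: Chen Shiji, Wang Xiaomei. Research on Topic Identification of Document Clusters Based on C-value and TF-IDF [J]. Journal of the China Society for Scientific and Technical Information, 2009, 28(6).
Authors: Chen Shiji  Wang Xiaomei
Affiliations: 1. China Agricultural University Library, Beijing 100193; National Science Library, Chinese Academy of Sciences, Beijing 100190; Graduate University of the Chinese Academy of Sciences, Beijing 100049
2. National Science Library, Chinese Academy of Sciences, Beijing 100190
Abstract: Citation analysis is an important method and technique in scientific and technical information analysis. In particular, citation cluster analysis based on bibliographic coupling and co-citation has gradually become one of the most active research areas in the field. Citation cluster analysis produces a series of document clusters composed of scientific and technical papers, but it does not directly reveal the topics of those clusters, so their content characteristics need to be identified. This paper analyzes typical methods for identifying document cluster topics in citation analysis, along with their limitations, and proposes a topic identification method that combines the C-value and TF-IDF algorithms. Experiments show that the method makes full use of the strengths of both algorithms and remedies their shortcomings, and is therefore better suited to identifying document cluster topics in citation analysis.

Keywords: citation analysis  topic identification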

The Topic Recognition Research of Paper Cluster by Combining C-value and TF-IDF
Chen Shiji, Wang Xiaomei. The Topic Recognition Research of Paper Cluster by Combining C-value and TF-IDF [J]. Journal of the China Society for Scientific and Technical Information, 2009, 28(6).
Authors:Chen Shiji  Wang Xiaomei
Abstract: Citation analysis is an important method in scientific and technical information analysis, and citation cluster analysis based on bibliographic coupling or co-citation in particular has become one of the most active research areas. Citation cluster analysis forms a series of paper clusters consisting of scientific and technical documents, but it does not directly reveal their topics, so the topics of these clusters need to be recognized. This paper analyzes some typical approaches to topic recognition in citation analysis and their drawbacks. A new method that combines the C-value algorithm with the TF-IDF algorithm for topic recognition is then proposed. Our experimental results show that the proposed approach can exploit the merits of both the C-value algorithm and the TF-IDF algorithm, and can thus be better applied to topic recognition of paper clusters.
Keywords:C-value  TF-IDF  CV-IDF
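
The abstract does not state how the two algorithms are combined. As a rough illustration only, the sketch below assumes that the "CV-IDF" keyword denotes a C-value termhood score multiplied by an inverse document frequency computed across clusters. That multiplication, the function names (`contains`, `c_value`, `cv_idf`), and the toy data are all assumptions made for this example, not the authors' published method. The C-value part follows the commonly cited definition, under which single-word terms score zero because log2(1) = 0.

```python
import math
from collections import Counter


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(term_freq):
    """C-value for each candidate term of one cluster.

    term_freq maps a term (tuple of words) to its frequency in the cluster.
    A term nested in longer candidates has the mean frequency of those
    longer candidates subtracted before weighting by log2 of its length.
    """
    scores = {}
    for a, f_a in term_freq.items():
        nests = [f_b for b, f_b in term_freq.items()
                 if len(b) > len(a) and contains(b, a)]
        adjusted = f_a - sum(nests) / len(nests) if nests else f_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores


def cv_idf(clusters):
    """Assumed combination: C-value within a cluster times IDF across clusters."""
    n_clusters = len(clusters)
    # Number of clusters in which each candidate term appears.
    cluster_freq = Counter(term for tf in clusters for term in tf)
    results = []
    for tf in clusters:
        cv = c_value(tf)
        results.append({t: cv[t] * math.log(n_clusters / cluster_freq[t])
                        for t in tf})
    return results


# Toy data (hypothetical): candidate-term frequencies for two clusters.
clusters = [
    {("neural", "network"): 5, ("deep", "neural", "network"): 2},
    {("neural", "network"): 3, ("gene", "expression"): 4},
]
scores = cv_idf(clusters)
```

Under this assumed weighting, a term that appears in every cluster gets an IDF of zero and so cannot serve as a distinguishing topic label, while a term confined to one cluster keeps its C-value scaled by log(N / 1).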
This article is indexed in Wanfang Data and other databases.