
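Improved TF-IDF Feature Selection Method Based on Ontology Relative Degree
Cite this article: XU Jian-min, WANG Jin-hua, MA Wei-yu. Improved TF-IDF Feature Selection Method Based on Ontology Relative Degree [J]. Information Science, 2011(2).
Authors: XU Jian-min  WANG Jin-hua  MA Wei-yu
Affiliations: School of Industry and Commerce, Hebei University; School of Management, Hebei University; School of Mathematics and Computer Science, Hebei University
Funding: China Postdoctoral Science Foundation (20070420700)
Abstract: The traditional TF-IDF method does not consider the relations between words when it extracts text feature words. To address this weakness, the paper proposes a feature word extraction method improved with ontology relative degree. The method first uses traditional TF-IDF to build a set of candidate feature words and a set of non-candidate feature words. It then uses domain ontology knowledge to extract, from the non-candidate set, the ontology relative terms of each candidate feature word. Finally, it adjusts the weight of each candidate feature word using two quantities: the ontology relative degree between the candidate and its ontology relative terms, and the weights of those relative terms themselves. The result is a new weight ranking of the candidate feature words. Experiments show that the method effectively improves the accuracy of text feature word extraction.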

Keywords: text feature word extraction  TF-IDF  ontology relative term  ontology relative degree
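
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the adjustment formula, so the additive rule used here (candidate weight plus the sum of relative degree times relative-term weight) is an assumption, as are the function names, the smoothed IDF variant, and the `relatedness` dictionary that stands in for a domain ontology.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """TF-IDF weights for one tokenized document against a tokenized corpus.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is one common
    variant and an assumption here; the paper's exact formula is not given.
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        weights[term] = (count / total) * idf
    return weights


def split_candidates(weights, n):
    """Top-n terms by weight are candidates; the rest are non-candidates."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:n], ranked[n:]


def adjust_weights(weights, candidates, non_candidates, relatedness):
    """Re-rank candidates using their ontology relative terms.

    relatedness maps (candidate, other_term) -> relative degree in [0, 1].
    It is a hypothetical stand-in for a lookup against a domain ontology.
    ASSUMED rule: new_weight = weight + sum(degree * weight_of_relative_term),
    counting only relative terms found in the non-candidate set.
    """
    non_cand = set(non_candidates)
    adjusted = {}
    for cand in candidates:
        boost = sum(
            degree * weights[term]
            for (c, term), degree in relatedness.items()
            if c == cand and term in non_cand
        )
        adjusted[cand] = weights[cand] + boost
    return sorted(adjusted.items(), key=lambda kv: kv[1], reverse=True)


# Toy usage with made-up weights and relative degrees.
toy_weights = {"a": 0.5, "b": 0.4, "c": 0.3, "d": 0.2}
cands, non_cands = split_candidates(toy_weights, 2)
toy_relatedness = {("b", "c"): 1.0, ("b", "d"): 0.5, ("a", "b"): 0.9}
print(adjust_weights(toy_weights, cands, non_cands, toy_relatedness))
```

In the toy run, the pair ("a", "b") contributes nothing because "b" is itself a candidate, while "b" is boosted by its two relative terms in the non-candidate set and so moves ahead of "a". Other combination rules (for example a normalized or multiplicative adjustment) would fit the abstract's description equally well.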

Improved TF-IDF Feature Selection Method Based on Ontology Relative Degree
XU Jian-min, WANG Jin-hua, MA Wei-yu. Improved TF-IDF Feature Selection Method Based on Ontology Relative Degree [J]. Information Science, 2011(2).
Authors:XU Jian-min  WANG Jin-hua  MA Wei-yu
Institution: XU Jian-min (1), WANG Jin-hua (2), MA Wei-yu (3). 1. School of Industry and Commerce, Hebei University, Baoding 071002, China; 2. School of Management; 3. School of Mathematics and Computer, China
Abstract: An improved ontology-based feature extraction method is proposed to compensate for a weakness of traditional TF-IDF, namely that it does not consider the relations between words. The method uses traditional TF-IDF to obtain a set of candidate feature words, made up of the top n words, and a set of non-candidate feature words, and obtains a set of ontology associated concepts using the ontology relative degree. Finally, it adjusts the weights of the feature words by the ontology relative degree and the weights...
Keywords:feature extraction  TF-IDF  ontology relative term  ontology relative degree  
This article is indexed in CNKI and other databases.