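范例推理在文本自动分类中的应用研究 (Research on the Application of Case-based Reasoning in Text Auto-categorization)
Cite this article: Geng Huantong, Li Jie. Research on the Application of Case-based Reasoning in Text Auto-categorization [J]. Information Studies: Theory & Application (情报理论与实践), 2007, 30(6): 837-840.
Authors: Geng Huantong, Li Jie
Affiliations: 1. School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044; Department of Computer Science, Anhui Normal University, Wuhu, Anhui 241000
2. Department of Computer Science, Anhui Normal University, Wuhu, Anhui 241000
Funding: Soft Science Fund of the Anhui Provincial Department of Science and Technology; Anhui Provincial Research Project for Young University Teachers
Abstract: Text auto-categorization is a foundational task in text information processing. This article applies case-based reasoning to text categorization: it uses term co-occurrence information to extract topic words and frequent term co-occurrence itemsets from text, and uses a clustering algorithm to index the case base, yielding a text auto-categorization system based on case-based reasoning. Experiments show that, compared with the TFIDF-based text representation and the nearest-neighbor classification algorithm, the co-occurrence-based text representation and the clustering index of the case base effectively improve both the accuracy and the efficiency of categorization, thereby broadening the application range of case-based reasoning.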

Keywords: inference  text classification  clustering
Revised manuscript received: 2007-05-25

Research on the Application of Case-based Reasoning in Text Auto-categorization
Geng Huantong, Li Jie. Research on the Application of Case-based Reasoning in Text Auto-categorization [J]. Information Studies: Theory & Application, 2007, 30(6): 837-840.
Authors: Geng Huantong, Li Jie
Abstract: Text auto-categorization is a foundational task in text information processing. This article discusses how to apply case-based reasoning to text categorization. It uses term co-occurrence information to extract topic words and frequent term co-occurrence itemsets from text, and a clustering algorithm to index the case base, thus constructing a text auto-categorization system based on case-based reasoning. The experimental results show that, compared with the TFIDF-based text representation and the nearest-neighbor categorization algorithm, the text representation based on term co-occurrence information and the clustering index of the case base effectively improve the accuracy and efficiency of categorization, thereby widening the application range of case-based reasoning.
Keywords:inference  text classification  clustering
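
The pipeline summarized in the abstract (co-occurrence features, a cluster-indexed case base, retrieval of the most similar case) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the clustering algorithm, or how co-occurrence is counted, so the sliding token window, Jaccard similarity, single-pass leader clustering, all function names, and the toy training texts below are assumptions chosen for brevity.

```python
from collections import Counter


def cooccurrence_features(text, window=3, min_count=1):
    """Return word pairs that co-occur within `window` tokens (assumed scheme)."""
    tokens = text.lower().split()
    pairs = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                pairs[tuple(sorted((word, other)))] += 1
    return {pair for pair, count in pairs.items() if count >= min_count}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_index(cases, threshold=0.1):
    """Single-pass leader clustering; each cluster is (leader_features, members)."""
    clusters = []
    for features, label in cases:
        best = max(clusters, key=lambda c: jaccard(features, c[0]), default=None)
        if best is not None and jaccard(features, best[0]) >= threshold:
            best[1].append((features, label))
        else:
            clusters.append((features, [(features, label)]))
    return clusters


def classify(text, clusters):
    """Pick the closest cluster, then reuse the label of its most similar case.

    Assumes `clusters` is non-empty.
    """
    query = cooccurrence_features(text)
    _, members = max(clusters, key=lambda c: jaccard(query, c[0]))
    _, label = max(members, key=lambda m: jaccard(query, m[0]))
    return label


# Hypothetical toy case base, for illustration only.
training = [
    ("stock market prices fell sharply", "finance"),
    ("stock market prices rose today", "finance"),
    ("football team won the match", "sport"),
    ("football team lost the match", "sport"),
]
cases = [(cooccurrence_features(text), label) for text, label in training]
index = build_index(cases)
print(classify("stock market prices were stable", index))
```

Because a query is compared first against one leader per cluster and only then against the cases inside the chosen cluster, retrieval touches fewer cases than a full nearest-neighbor scan, which is the kind of efficiency gain the abstract attributes to the clustering index.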
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.