
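Study of Statistics-rule Based Hierarchical Web Page Classification (original title: 基于统计-规则方法的网页层次分类技术研究)
Cite this article: Tan Jinbo, Yang Xiaojiang, Li Yi. Study of Statistics-rule Based Hierarchical Web Page Classification[J]. New Technology of Library and Information Service, 2007, 2(8): 59-62.
Authors: Tan Jinbo (谭金波)  Yang Xiaojiang (杨晓江)  Li Yi (李艺)
Affiliations: 1. Department of Educational Technology, Shandong Normal University, Jinan 250014
2. Department of Educational Technology, Nanjing Normal University, Nanjing 210097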
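Abstract: Statistics-based automatic classification is a commonly used technique in hierarchical web page classification, but it has a shortcoming: when features overlap heavily among subcategories, classification precision drops sharply. Yet the nature of hierarchical web page classification means that subcategories under the same parent category share many features. To address this limitation, and drawing on the strengths of rule-based automatic classification, the paper proposes a statistics-rule based hierarchical web page classification technique. Experiments show that this technique achieves fairly satisfactory classification results.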

Keywords: statistical classification; hierarchical web page classification; statistics-rule based classification
Received: 2007-06-11
Revised: 2007-06-11

Study of Statistics-rule Based Hierarchical Web Page Classification
Tan Jinbo, Yang Xiaojiang, Li Yi. Study of Statistics-rule Based Hierarchical Web Page Classification[J]. New Technology of Library and Information Service, 2007, 2(8): 59-62.
Authors: Tan Jinbo  Yang Xiaojiang  Li Yi
Institution: 1. Educational Technology Department, Shandong Normal University, Jinan 250014, China; 2. Educational Technology Department, Nanjing Normal University, Nanjing 210097, China
Abstract: Statistics-based classification methods are commonly used in hierarchical Web classification. However, their precision often drops when categories are very similar to each other, because the categories' features overlap. By the nature of hierarchical Web classification, categories that share the same parent (e.g., leaf categories in the hierarchy) are often very similar to each other. To improve classification precision, the paper proposes applying rule-based classification methods on top of statistics-based methods in hierarchical Web classification. Experiments show that the proposed methods perform well on our education Web collections.
Keywords: Statistics-based classification; Rule-based classification; Hierarchical Web classification; Statistics-rule based classification
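
Illustrative sketch (not from the paper): the abstract only states that rule-based classification is applied on top of statistics-based classification, without giving details. The Python below is one possible reading under stated assumptions: a small naive Bayes model picks the top-level category, and hand-written keyword rules then choose among its sibling subcategories, where overlapping features make the statistical model unreliable. The class `NaiveBayes`, the function `classify`, the rule table `SUBCATEGORY_RULES`, the category names, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over whitespace tokens."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical rules: for each parent category, an ordered list of
# (distinctive keywords, subcategory). The first rule that fires wins.
SUBCATEGORY_RULES = {
    "education": [
        ({"exam", "admission"}, "education/exams"),
        ({"courseware", "lesson"}, "education/teaching-resources"),
    ],
    "sports": [
        ({"football", "goal"}, "sports/football"),
    ],
}


def classify(text, top_level, rules=SUBCATEGORY_RULES):
    """Statistical step chooses the parent; rule step chooses the subcategory."""
    parent = top_level.predict(text)
    tokens = set(text.lower().split())
    for keywords, subcategory in rules.get(parent, []):
        if tokens & keywords:
            return subcategory
    return parent  # no rule fired: stay at the parent level


# Toy training data, assumed purely for illustration.
model = NaiveBayes().fit(
    [
        "school students exam admission teacher",
        "teacher lesson courseware school students",
        "football match goal team player",
        "basketball team player score match",
    ],
    ["education", "education", "sports", "sports"],
)

print(classify("students prepare for the admission exam", model))
print(classify("the team won the match", model))
```

In this sketch the rules act only within the parent chosen by the statistical step, so a rule for one branch can never pull a page into a different branch; whether the paper combines the two stages this way is not stated in the abstract.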
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.