
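Research on the RG-HITS Algorithm for Specific Topics on the Web
Cite this article: Ding Yi. Research on the RG-HITS Algorithm for Specific Topics on the Web[J]. New Technology of Library and Information Service, 2005, 21(6): 26-29.
Author: Ding Yi
Affiliation: Department of Computer Science and Technology, Hubei Normal University, Huangshi 435000
Abstract: Web information retrieval (IR) research applies the results of text retrieval research and combines them with ideas from Web graph theory to study information retrieval on the Web; it is an effective route to Web knowledge discovery. The traditional HITS method yields results of rather low precision, while PageRank, as a general-purpose search method, cannot be applied to retrieving information on a specific topic. Building on a thorough analysis of existing algorithms such as PageRank and HITS and of methods for computing the similarity of Web documents, this paper proposes the RG-HITS algorithm for discovering information relevant to a queried topic on the Web. It combines Web hyperlinks, the information relevance derived from the knowledge representation of Web pages, and the HITS method to search for knowledge related to a specific topic on the Web.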

Keywords: knowledge discovery; Web page search; similarity computation; information retrieval
Received: 2005-02-08

on the Specific Topic on Web
Ding Yi. on the Specific Topic on Web[J]. New Technology of Library and Information Service, 2005, 21(6): 26-29.
Authors:Ding Yi
Institution:(Department of Computer Science and Technology, Hubei Normal University, Huangshi 435000, China)
Abstract: Information retrieval (IR) on the Web is the automatic retrieval of all relevant documents, that is, finding the intended Web documents while retrieving as few non-relevant ones as possible. Web IR is currently a very active research area. It concentrates on applying traditional text IR methods to the Internet and on exploiting the properties of the Web graph. This research focuses on how to obtain relevant Web pages and content effectively and broadly, filter Web pages, and assign proper labels to them. Accurately finding user-specific information on the Web is very difficult. Traditional Web search engines take a query as input and produce a set of (hopefully) relevant pages that match the query terms. While useful in many circumstances, search engines have the disadvantage that users must formulate queries that specify their information need, which is error-prone. Based on a discussion of PageRank, HITS and the similarity between Web texts, a new algorithm called RG-HITS (Resemblance Graph-HITS) for finding relevant documents on the Web is introduced.
Keywords: Web mining; Web search; Similarity scoring; Information retrieval
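
The abstract does not give the formulas of RG-HITS, so the following is only a minimal sketch of the general idea it describes: standard HITS hub/authority iteration in which each link is scaled by a content-relevance score for its target page. The function names `cosine_similarity` and `weighted_hits`, the bag-of-words similarity, and the choice to weight a link by the target's relevance are all assumptions made for illustration, not the author's method.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term counts (assumed measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def _normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
    return {page: x / norm for page, x in scores.items()}


def weighted_hits(links, relevance, iterations=50):
    """HITS iteration where link u -> v carries weight relevance[v].

    links: dict mapping a page to the list of pages it links to.
    relevance: dict mapping a page to its topic relevance in [0, 1]
               (hypothetical input, e.g. from cosine_similarity).
    Returns (hub, authority) score dictionaries.
    """
    pages = set(links) | {v for targets in links.values() for v in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: weighted sum of hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for u, targets in links.items():
            for v in targets:
                new_auth[v] += hub[u] * relevance.get(v, 0.0)
        auth = _normalize(new_auth)
        # Hub: weighted sum of authority scores of pages linked to.
        new_hub = {p: 0.0 for p in pages}
        for u, targets in links.items():
            for v in targets:
                new_hub[u] += auth[v] * relevance.get(v, 0.0)
        hub = _normalize(new_hub)
    return hub, auth
```

With all relevance values set to 1.0 this reduces to plain HITS; lowering the relevance of off-topic pages reduces the authority they receive and the hub credit earned by linking to them.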
This article is indexed in CNKI, VIP, Wanfang Data and other databases.