Web Resource Automatic Discovery Based on Hyperlink Analysis (基于超链分析的Web资源自动发现技术)



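Cite this article:	Chen Dingquan (陈定权). Web Resource Automatic Discovery Based on Hyperlink Analysis [J]. Library and Information Service (图书情报工作), 2003, 47(9): 94-98.

Author:	Chen Dingquan (陈定权)

Affiliation:	The Library of Chinese Academy of Sciences (中国科学院文献情报中心), Beijing 100080

Abstract:	Traditional automatic discovery of Web resources is based on the content of Web pages. This paper explores automatic Web resource discovery from the perspective of hyperlink analysis. Hyperlink analysis originates in social network analysis and scientific citation analysis; it analyzes only the relationships between pages and ignores the properties of the pages themselves. Experiments show that, using hyperlinks alone and starting from example pages supplied by the user, Web sites relevant to a subject's resources can be discovered automatically. The technique effectively reduces needless crawling by Web crawlers, improves collection efficiency, lightens the network load, and plays an important role in building subject resource collections.
Keywords:	Web resource automatic discovery; hyperlink analysis; HITS; focused crawling
Received:	2003-03-05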
Web Resource Automatic Discovery Based on Hyperlink Analysis

Chen Dingquan. Web Resource Automatic Discovery Based on Hyperlink Analysis [J]. Library and Information Service, 2003, 47(9): 94-98.

Authors:	Chen Dingquan

Institution:	The Library of Chinese Academy of Sciences, Beijing

Abstract:	Traditional automatic discovery of Web resources is based on page content. This paper instead discusses automatic Web resource discovery from the viewpoint of hyperlink analysis. Hyperlink analysis originates from social network analysis and scientific citation analysis; it analyzes only the relations among Web pages, not their content. Our experiments show that, given subject examples (URLs), many subject-related Web resources can be discovered using hyperlink analysis alone. The technique can supply high-quality URLs to Web spiders, improve crawling efficiency, and lighten the network load.

Keywords:	Web resource automatic discovery; hyperlink analysis; HITS; focused crawling
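
The abstract names HITS as a keyword but does not describe the procedure the author actually used. As a rough illustration of link-only scoring, the following is a minimal sketch of the generic HITS iteration (hub and authority scores computed purely from the link graph, with no page content). The function name `hits`, the toy graph, and the page labels are hypothetical and are not taken from the paper.

```python
from math import sqrt


def hits(links, iterations=50):
    """Generic HITS iteration over a link graph.

    links: dict mapping a page to the pages it links to (hypothetical input format).
    Returns (hub, authority) dicts, each normalized to unit Euclidean length.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub score: sum of authority scores of the pages it links to.
        new_hub = {p: sum(auth[d] for d in links.get(p, ())) for p in pages}
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}

    return hub, auth


# Toy graph (assumed for illustration): two user-supplied seed pages
# and one unrelated page, each linking to candidate sites.
example_links = {
    "seed1": ["a", "b"],
    "seed2": ["a", "b", "c"],
    "other": ["c"],
}
hub_scores, auth_scores = hits(example_links)
```

In this toy graph, sites `a` and `b` are linked from both seeds and therefore receive higher authority scores than `c`. A crawler could, under these assumptions, prioritize high-authority sites as candidates, which is one way to read the abstract's claim about supplying high-quality URLs to spiders.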
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.