Information Extraction from Web Pages Based on Active Learning



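Cite this article:	Zhang Qingjun, Zhu Cailian. Information Extraction from Web Pages Based on Active Learning [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2004, 23(6): 667-671.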

Authors:	Zhang Qingjun, Zhu Cailian

Affiliation:	Institute of Geodesy and Geophysics, Chinese Academy of Sciences, Wuhan 430077

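Abstract:	This paper proposes a method for extracting information from Web pages based on active learning. By labeling only a small number of representative sample pages, the user can effectively improve the coverage of the information extraction rules, which gives the wrapper a degree of adaptivity.
Keywords:	active learning; Web information extraction; wrapper
Revised:	15 December 2003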
Information Extraction from Web Pages Based on Active Learning

Zhang Qingjun and Zhu Cailian. Information Extraction from Web Pages Based on Active Learning [J]. Journal of the China Society for Scientific and Technical Information, 2004, 23(6): 667-671.

Authors:	Zhang Qingjun and Zhu Cailian

Abstract:	This paper presents an approach to information extraction from web pages based on active learning. By labeling only a few representative web pages, the user can effectively improve the coverage of the information extraction rules. As a result, the wrapper can adapt to changes in the sites from which the data is extracted.

Keywords:	active learning; information extraction from web pages; wrapper
This article is indexed in CNKI, Wanfang Data, and other databases.