
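Automatic Expansion of the Chinese Classified Thesaurus (《中国分类主题词表》): Extracting Keywords from Metadata and Locating Them
Cite this article: Lü Meixiang (吕美香). Automatic Expansion of the Chinese Classified Thesaurus: Extracting Keywords from Metadata and Locating Them [J]. Information Science (情报科学), 2012(8): 1160-1166.
Author: Lü Meixiang (吕美香)
Affiliation: School of Information Management, Nanjing University
Abstract: Thesauri are the most important knowledge organization tools in libraries and information retrieval. The Chinese Classified Thesaurus is a traditional thesaurus whose updating and maintenance have always been done by hand, which limits its use in digital libraries and networked information environments. This paper presents a statistical method for extracting keywords from the titles of metadata records and locating them in the thesaurus. It consists of roughly three steps: extracting keywords from titles; determining the specificity of the extracted keywords; and locating highly specific technical terms in the thesaurus. Experiments on the Chinese Classified Thesaurus and on computer science and technology metadata supplied by Shanghai Library indicate that the method is feasible. The method can be applied to automatic indexing or cataloguing, and has some practical value and broad application prospects.

Keywords: Chinese Classified Thesaurus; metadata; keyword extraction; Chinese information processing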

Research on Autogenous Enlargement of Chinese Classified Thesaurus——Keyword Extracting from Metadata
LV Mei-xiang.Research on Autogenous Enlargement of Chinese Classified Thesaurus——Keyword Extracting from Metadata[J].Information Science,2012(8):1160-1166.
Authors:LV Mei-xiang
Institution: College of Information Management, Nanjing University, Nanjing 210093, China
Abstract: The Chinese Classified Thesaurus, a traditional thesaurus, is a very important knowledge organization tool in the library and information retrieval field, but its application in digital libraries and the networked information environment is seriously constrained by the manual nature of its current maintenance. This paper proposes a statistical method for extracting new terms from the titles of metadata records and placing them in the thesaurus. The method has roughly three steps: extracting new terms from titles; calculating the specialization degree of the terms; and placing the terms with a high specialization degree into the thesaurus. An experiment was conducted on the Chinese Classified Thesaurus and a corpus of computing-domain metadata supplied by Shanghai Library. The results indicate that the proposed techniques are feasible. The method can be applied to automatic indexing or cataloging, and has practical value and broad application prospects.
Keywords:Chinese classified thesaurus  metadata  keyword extraction  Chinese information processing
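
The abstract names three steps but gives no formulas, so the following Python sketch only illustrates the general shape of such a pipeline. Everything in it is an assumption made for illustration and not the author's method: the function names, the use of character n-grams with a document-frequency threshold for extraction, the containment-based specificity proxy, and the suffix-matching rule for placing a term under an existing thesaurus entry.

```python
from collections import Counter


def extract_candidates(titles, min_len=2, max_len=6, min_freq=2):
    """Step 1 (assumed): keep character n-grams that occur in enough titles."""
    counts = Counter()
    for title in titles:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(title) - n + 1):
                seen.add(title[i:i + n])
        counts.update(seen)  # document frequency: at most one count per title
    return {term: df for term, df in counts.items() if df >= min_freq}


def specificity(term, candidates):
    """Step 2 (assumed proxy): a term that appears inside fewer longer
    candidates is treated as more specific. Returns a value in (0, 1]."""
    containing = sum(1 for other in candidates if other != term and term in other)
    return 1.0 / (1 + containing)


def locate(term, thesaurus):
    """Step 3 (assumed heuristic): attach the term under the longest existing
    thesaurus entry that is a proper suffix of it, or return None."""
    heads = [t for t in thesaurus if t != term and term.endswith(t)]
    return max(heads, key=len) if heads else None
```

For example, given a few titles that share the string 分布式数据库 and a thesaurus that already contains 数据库, `locate("分布式数据库", thesaurus)` would place the new term under 数据库. Raw n-gram extraction also yields fragments that are not words (such as 式数据), so a real system would need a filtering step that this sketch omits.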
This article is indexed in CNKI and other databases.