
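Keyword Indexing of Chinese Bibliographic Records Based on Character Role Labeling
Cite this article: Deng Sanhong, Wang Hao, Qin Jiahang, Su Xinning. Research on Keywords Indexing for Chinese Bibliography Based on Word Roles Annotation[J]. Journal of Library Science in China, 2012, 38(2): 38-49.
Authors: Deng Sanhong  Wang Hao  Qin Jiahang  Su Xinning
Affiliations: 1. Department of Information Management, Nanjing University, 210093
2. Nanjing University of Finance and Economics, 210046
Funding: This paper is one of the research outputs of the National Social Science Fund of China project "Research on Knowledge Management Oriented to Semantic Web Ontology" (No. 09CTQ010).
Abstract: Automatic machine indexing of Chinese bibliographic records is one of the key problems urgently awaiting a solution in digital library construction. This paper introduces the conditional random fields (CRFs) sequence labeling algorithm, a machine learning method, into keyword extraction and builds a keyword indexing model for Chinese bibliographic records that is oriented to book content and based on character role labeling. Book content is converted into character sequences, and on that basis the paper proposes a design that constructs a keyword role space model and makes comprehensive use of the contextual features of the character sequence. Experiments that automatically extract keywords from titles and from content summaries demonstrate that the model is sound and practical.

Keywords: Chinese bibliography  Keyword indexing  Character roles  Sequence labeling  Automatic indexing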

Research on Keywords Indexing for Chinese Bibliography Based on Word Roles Annotation
Deng Sanhong, Wang Hao, Qin Jiahang and Su Xinning. Research on Keywords Indexing for Chinese Bibliography Based on Word Roles Annotation[J]. Journal of Library Science in China, 2012, 38(2): 38-49.
Authors: Deng Sanhong  Wang Hao  Qin Jiahang  Su Xinning
Institution: 1. Department of Information Management, Nanjing University, 210093; 2. Nanjing University of Finance and Economics, 210046
Abstract: Automatic machine indexing of Chinese bibliography is one of the critical problems awaiting a solution in digital library construction. This paper introduces the Conditional Random Fields (CRFs) sequence annotation algorithm into keyword extraction for Chinese bibliography and builds a keyword indexing model that is oriented to book contents and based on word roles annotation. The model turns book contents into sequences of words (individual Chinese characters). On that basis, the paper proposes a design that builds a word roles space model and makes comprehensive use of the context features of the word sequence. Experiments that automatically extract keywords from titles and abstracts verify the rationality and practicality of the model.
Keywords: Chinese bibliography  Keywords indexing  Word roles  Sequence annotation  Automatic indexing
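
The abstract describes turning book text into a character sequence, assigning each character a role relative to the keywords, and using the surrounding characters as features. The following is a minimal sketch of that data preparation step only, not the authors' implementation. The role set (B, I, E, S, O), the function names, the padding token and the window size are all assumptions made for illustration; the abstract does not specify the paper's actual role space or feature templates, and CRF training itself is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical role set (an assumption, not taken from the paper):
# B = keyword begin, I = keyword inside, E = keyword end,
# S = single-character keyword, O = outside any keyword.


def label_roles(text: str, keywords: List[str]) -> List[Tuple[str, str]]:
    """Pair every character of `text` with a role derived from known keywords."""
    roles = ["O"] * len(text)
    # Longer keywords are placed first so shorter ones cannot overwrite them.
    for kw in sorted(keywords, key=len, reverse=True):
        if not kw:
            continue
        start = text.find(kw)
        while start != -1:
            end = start + len(kw)
            # Only label spans that are still unlabeled.
            if all(r == "O" for r in roles[start:end]):
                if len(kw) == 1:
                    roles[start] = "S"
                else:
                    roles[start] = "B"
                    for i in range(start + 1, end - 1):
                        roles[i] = "I"
                    roles[end - 1] = "E"
            start = text.find(kw, start + 1)
    return list(zip(text, roles))


def context_features(chars: List[str], i: int, window: int = 2) -> Dict[str, str]:
    """Characters in a symmetric window around position i, padded at the edges."""
    feats = {}
    for off in range(-window, window + 1):
        j = i + off
        feats[f"c[{off}]"] = chars[j] if 0 <= j < len(chars) else "<PAD>"
    return feats


def decode_keywords(labeled: List[Tuple[str, str]]) -> List[str]:
    """Rebuild keyword strings from a (character, role) sequence."""
    found, buf = [], ""
    for ch, role in labeled:
        if role == "S":
            found.append(ch)
            buf = ""
        elif role == "B":
            buf = ch
        elif role == "I" and buf:
            buf += ch
        elif role == "E" and buf:
            found.append(buf + ch)
            buf = ""
        else:
            buf = ""  # O, or an ill-formed role sequence: discard partial span
    return found


if __name__ == "__main__":
    title = "中文书目关键词标引研究"
    labeled = label_roles(title, ["中文书目", "关键词标引"])
    print(labeled)
    print(context_features(list(title), 0, window=1))
    print(decode_keywords(labeled))
```

In a full pipeline, the per-character feature dictionaries and role sequences produced this way would serve as training data for a CRF sequence labeler, and `decode_keywords` would be applied to the predicted roles to recover candidate keywords.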
This article is indexed in CNKI, Wanfang Data and other databases.