
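Research of Correlation Strength of Second Order Co-Occurrence Concepts Based on Mutual Information
Cite this article:Liu Juhong, Miao Yougang, Yu Jianrong. Research of Correlation Strength of Second Order Co-Occurrence Concepts Based on Mutual Information[J]. Library and Information Service, 2009, 53(18): 123-127.
Authors:Liu Juhong  Miao Yougang  Yu Jianrong
Institution:1. National Science Library, Chinese Academy of Sciences; 2. Shanghai Information Center for Life Sciences, Chinese Academy of Sciences
Abstract:The expansion of the intermediate set and the target set leads to low accuracy in disjoint literature-based discovery. Ranking-based methods have shortcomings, and an excessive focus on ranking the B set deviates from the goal of discovering interesting A and C. Directly computing the correlation strength of second-order co-occurrence concepts is a weak link in disjoint literature-based discovery. Building on the mutual information method and regression analysis, this paper constructs an algorithm that computes the correlation strength between second-order co-occurrence concepts. Literature on type 2 diabetes indexed in PubMed is used as a sample for an empirical study of the algorithm's feasibility. The model achieves good results and offers a new method for extracting and evaluating relations between second-order co-occurrence concepts.

Keywords:mutual information  second-order co-occurrence  correlation strength  type 2 diabetes  disjoint literature-based discovery
Received:2009-02-10
Revised:2009-04-16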

Research of Correlation Strength of Second Order Co-Occurrence Concepts Based on Mutual Information
Liu Juhong,Miao Yougang,Yu Jianrong.Research of Correlation Strength of Second Order Co-Occurrence Concepts Based on Mutual Information[J].Library and Information Service,2009,53(18):123-127.
Authors:Liu Juhong  Miao Yougang  Yu Jianrong
Institution:1. Shanghai Information Center for Life Sciences, Chinese Academy of Sciences; 2. National Science Library, Chinese Academy of Sciences
Abstract:The explosion of intermediate concepts (B terms) and target concepts (C terms) results in low accuracy in disjoint literature-based discovery. Ranking-based methods have drawbacks, and an excessive focus on ranking B terms departs from the goal of discovering interesting relationships between A terms and C terms. Directly calculating the correlation strength of second-order co-occurrence concepts is a weak link in disjoint literature-based discovery, so the paper designs a model for it based on the mutual information measure and regression analysis. The feasibility of the model is tested empirically on type 2 diabetes mellitus literature indexed in PubMed, with good results. The model provides a new method for extracting and evaluating relations between second-order co-occurrence concepts.
Keywords:mutual information  second-order co-occurrence  correlation strength  diabetes mellitus, type 2  disjoint literature-based discovery
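
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea it names: estimating mutual information from document-level co-occurrence, then scoring an A-C pair that never co-occurs directly through the B terms that link them. The function names (`pmi_table`, `pmi`, `second_order_score`), the use of pointwise mutual information, and the rule of summing the weaker of the two links over all B terms are assumptions made for illustration; they are not the authors' model, and the regression step mentioned in the abstract is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Pointwise mutual information for every concept pair that shares a document."""
    n = len(docs)
    term_df = Counter()   # number of documents containing each concept
    pair_df = Counter()   # number of documents containing each concept pair
    for concepts in docs:
        unique = sorted(set(concepts))
        term_df.update(unique)
        pair_df.update(combinations(unique, 2))
    table = {}
    for (x, y), n_xy in pair_df.items():
        # PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), estimated from counts
        table[(x, y)] = math.log2(n_xy * n / (term_df[x] * term_df[y]))
    return table


def pmi(table, x, y):
    """Look up PMI regardless of argument order; None if the pair never co-occurs."""
    return table.get((x, y) if x < y else (y, x))


def second_order_score(table, a, c):
    """Hypothetical A-C score: sum over linking B terms of min(PMI(A,B), PMI(B,C))."""
    if pmi(table, a, c) is not None:
        return None  # A and C co-occur directly, so the relation is first order
    terms = {t for pair in table for t in pair}
    score = 0.0
    for b in terms - {a, c}:
        ab, bc = pmi(table, a, b), pmi(table, b, c)
        # Only B terms positively associated with both A and C contribute
        if ab is not None and bc is not None and ab > 0 and bc > 0:
            score += min(ab, bc)
    return score


# Toy data (made up): each inner list is the set of concepts indexed for one document.
docs = [["A", "B"], ["B", "C"], ["A"], ["C"], ["D", "E"], ["D"], ["E"], ["F"]]
table = pmi_table(docs)
print(second_order_score(table, "A", "C"))
```

Taking the minimum of the two links means an A-C score is never stronger than its weakest A-B or B-C association; other aggregation rules would be equally defensible under these assumptions.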
This article is indexed in Wanfang Data and other databases.