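Application of Multi-Instance Multi-Label Learning to the Automatic Classification of Chinese Patents
Cite this article: Bao Xiang, Liu Guifeng, Cui Jinghua. Application of Multi-Instance Multi-Label Learning to the Automatic Classification of Chinese Patents[J]. Library and Information Service, 2021, 65(8): 107-113.
Authors: Bao Xiang  Liu Guifeng  Cui Jinghua
Affiliations: 1. Institute of Science and Technology Information, Jiangsu University, Zhenjiang 212013; 2. School of Information Management, Nanjing University, Nanjing 210093
Funding: This paper is one of the research outcomes of the General Project of Philosophy and Social Science Research in Jiangsu Universities, "Research and Practice of Topic Models in Intellectual Property Information Services of University Libraries" (Grant No. 2019SJA1870), and the General Program of Natural Science Research in Jiangsu Universities, "Research on Patent Topic Classification Based on Multi-Instance Multi-Label Learning and Deep Neural Networks" (Grant No. 19KJB520005).
Abstract: [Purpose/Significance] This study aims to classify large volumes of Chinese patents quickly, meeting the needs of patent examination and intelligence analysis. [Method/Process] Taking into account the fixed format of patent texts and the fact that a patent can carry several IPC classification numbers, the study applies multi-instance multi-label learning to automatic patent classification. After introducing the basic principles of several classical multi-instance multi-label models, it applies these models to determining the IPC classification numbers of Chinese patents. [Result/Conclusion] The experiments show that multi-instance multi-label models are well suited to automatic patent classification. Analysis of Average precision, Hamming Loss, Ranking Loss, One Error, Coverage, and Training time shows that the MIMLRBF model determines the IPC classification numbers of Chinese patents quickly and accurately, offering a reference for the automatic classification of large-scale patent collections.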

Keywords: patent  classification  IPC classification number  multi-instance multi-label
Received: 2020-10-28
Revised: 2021-01-13

Application of Multi Instance Multi Label Learning in Chinese Patent Automatic Classification
Bao Xiang,Liu Guifeng,Cui Jinghua.Application of Multi Instance Multi Label Learning in Chinese Patent Automatic Classification[J].Library and Information Service,2021,65(8):107-113.
Authors:Bao Xiang  Liu Guifeng  Cui Jinghua
Institution:1.Institute of Science and Technology Information, Jiangsu University, Zhenjiang 212013;2.School of Information Management, Nanjing University, Nanjing 210093
Abstract: [Purpose/significance] This study aims to classify large numbers of Chinese patents rapidly, so as to meet the requirements of patent examination and intelligence analysis. [Method/process] Considering the inherent format of patent text and the fact that a patent may have multiple classification numbers, this paper applies multi-instance multi-label learning to automatic patent classification. Several classical multi-instance multi-label learning methods are first introduced and then applied to determine the IPC numbers of Chinese patents. [Result/conclusion] The experiments demonstrate that multi-instance multi-label learning methods are suitable for automatic patent classification. Measured by average precision, Hamming loss, ranking loss, one-error, coverage, and training time, MIMLRBF determines the IPC numbers of Chinese patents quickly and accurately, which provides a new perspective for classifying large-scale patent collections.
Keywords:patent  classification  IPC  multi-instance multi-label  
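
The evaluation measures named in the abstract can be illustrated with a small sketch. It follows the definitions commonly used in the multi-label literature, which may differ in detail from the formulas used in the paper. The function names, the toy label and score matrices, and the 0.5 decision threshold are all assumptions made for illustration, and each example is assumed to have at least one relevant and one irrelevant label.

```python
# Illustrative multi-label metrics (pure Python). Y holds 0/1 ground-truth
# labels, S holds real-valued label scores, P holds 0/1 predicted labels.
# All data and names below are hypothetical, not taken from the paper.

def ranks(scores):
    """Rank labels by descending score (rank 1 = best); ties broken by index."""
    order = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    r = [0] * len(scores)
    for pos, j in enumerate(order, start=1):
        r[j] = pos
    return r

def hamming_loss(Y, P):
    """Fraction of (example, label) cells where prediction and truth differ."""
    wrong = sum(y != p for yr, pr in zip(Y, P) for y, p in zip(yr, pr))
    return wrong / (len(Y) * len(Y[0]))

def one_error(Y, S):
    """Fraction of examples whose top-ranked label is not a true label."""
    misses = 0
    for y, s in zip(Y, S):
        top = max(range(len(s)), key=lambda j: s[j])
        misses += int(y[top] == 0)
    return misses / len(Y)

def coverage(Y, S):
    """Average depth in the ranking needed to cover all true labels, minus 1."""
    total = 0
    for y, s in zip(Y, S):
        r = ranks(s)
        total += max(r[j] for j in range(len(y)) if y[j]) - 1
    return total / len(Y)

def ranking_loss(Y, S):
    """Average fraction of (relevant, irrelevant) pairs that are mis-ordered."""
    total = 0.0
    for y, s in zip(Y, S):
        pos = [j for j in range(len(y)) if y[j]]
        neg = [j for j in range(len(y)) if not y[j]]
        bad = sum(s[a] <= s[b] for a in pos for b in neg)
        total += bad / (len(pos) * len(neg))
    return total / len(Y)

def average_precision(Y, S):
    """Average, over true labels, of the precision at that label's rank."""
    total = 0.0
    for y, s in zip(Y, S):
        r = ranks(s)
        pos = [j for j in range(len(y)) if y[j]]
        total += sum(sum(r[k] <= r[j] for k in pos) / r[j] for j in pos) / len(pos)
    return total / len(Y)

# Toy data: 2 patents, 4 candidate classification labels (hypothetical).
Y = [[1, 0, 1, 0], [0, 1, 0, 0]]
S = [[0.9, 0.2, 0.4, 0.6], [0.1, 0.3, 0.8, 0.2]]
P = [[int(v >= 0.5) for v in row] for row in S]  # assumed 0.5 threshold

if __name__ == "__main__":
    print("Hamming loss:     ", hamming_loss(Y, P))
    print("One-error:        ", one_error(Y, S))
    print("Coverage:         ", coverage(Y, S))
    print("Ranking loss:     ", ranking_loss(Y, S))
    print("Average precision:", average_precision(Y, S))
```

For Hamming loss, ranking loss, one-error, and coverage, lower values are better; for average precision, higher is better. Training time is simply the wall-clock cost of fitting a model and is not sketched here.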