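A Novel Machine-Learning-Based Strategy for Constructing Patent Datasets for Artificial Intelligence Technology
Cite this article: Chen Yue, Song Kai, Liu Anrong, Cao Xiaoyang. A Novel Machine-Learning-Based Strategy for Constructing Patent Datasets for Artificial Intelligence Technology [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2021(3): 286-296.
Authors: Chen Yue  Song Kai  Liu Anrong  Cao Xiaoyang
Affiliations: Institution of Science of Science and S&T Management & WISE Lab, Dalian University of Technology; Chinese Academy of Engineering Innovation Strategy
Funding: National Key R&D Program of China project "Theory, Methods, and Expert Prediction System for Identifying Disruptive Technologies" (2019YFA0707201); Fundamental Research Funds for the Central Universities project "Research on the Disruptive Trajectory and Development Patterns of Artificial Intelligence Technology from the Technology Supply Side" (DUT20RW202).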
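Abstract (translated from the Chinese): Disruptive technology is a technology cluster with a complex internal structure. From a spatial perspective, it comprises leading, auxiliary, and supporting technologies and spans multiple disciplines and fields. Against this background, using scientometric methods to evaluate disruptive technologies and to explore how science and technology evolve is challenging, and the challenge is essentially one of data retrieval. This paper explores a new machine-learning-based strategy for constructing patent datasets. It treats patent retrieval as a binary classification task in machine learning, similar to the idea of active-learning-based query classification in information retrieval, and proposes an improved text classification method that combines F-measure feature maximization with a CNN (convolutional neural network) model. Using the artificial intelligence (AI) technology domain as an example for training experiments, the method achieves an accuracy of 98.01%, a recall of 97.04%, and an F1 score of 97.89%. These results indicate that the proposed strategy identifies AI patents precisely and improves both the accuracy and the recall of patent retrieval, which helps in building a precise, accurate, and comprehensive patent dataset for the AI technology domain.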

Keywords (translated from the Chinese): disruptive technology  patent retrieval  machine learning  artificial intelligence

Artificial Intelligence Technology: Novel Strategy for Patent Dataset Creation Based on Machine Learning
Chen Yue, Song Kai, Liu Anrong, Cao Xiaoyang. Artificial Intelligence Technology: Novel Strategy for Patent Dataset Creation Based on Machine Learning [J]. Journal of the China Society for Scientific and Technical Information, 2021(3): 286-296.
Authors: Chen Yue  Song Kai  Liu Anrong  Cao Xiaoyang
Institution: Institution of Science of Science and S&T Management & WISE Lab, Dalian University of Technology, Dalian 116024; Chinese Academy of Engineering Innovation Strategy, Beijing 100089
Abstract: Disruptive technology is a technology group with a complex internal structure that spans multiple disciplines and fields. From a spatial perspective, it includes leading, auxiliary, and supporting technologies. Using scientometrics to evaluate disruptive technologies and to explore the evolution of science and technology therefore faces challenges, and these manifest chiefly in data retrieval. This paper explores a novel machine-learning-based strategy for constructing patent datasets for complex technologies. The strategy treats patent retrieval as a binary classification task, similar to query classification based on active learning in information retrieval. We also propose an improved text classification method that combines F-measure feature maximization with a CNN (convolutional neural network) model. The technical domain of artificial intelligence (AI) is used as an example. The results show an accuracy of 98.01%, a recall of 97.04%, and an F1 value of 97.89%. This demonstrates that the proposed strategy accurately identifies AI patents, improves the accuracy and recall of patent searches, and facilitates the creation of accurate and comprehensive patent datasets for the technical domain of AI.
Keywords: disruptive technology  patent retrieval  machine learning  artificial intelligence (AI)
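
The abstract names F-measure feature maximization as the feature selection step paired with the CNN classifier but gives no formulas. As a rough illustration only, the following sketch assumes a commonly described formulation: for each class c and term f, feature recall is the share of f's total weight that falls in c, feature predominance is the share of c's total weight that comes from f, and the feature F-measure is their harmonic mean. A term is kept for a class when its feature F-measure exceeds both the term's average over all classes and the global average. The function names, the averaging over all classes (counting 0 where a term is absent), and the toy data are assumptions of this sketch, not details taken from the paper.

```python
from collections import defaultdict


def feature_f_measure(docs, labels):
    """Return {label: {term: FF}} for weighted bag-of-words documents.

    docs   -- list of {term: weight} dicts (weights must be positive)
    labels -- class label of each document, e.g. "ai" / "other"
    """
    class_term = defaultdict(lambda: defaultdict(float))  # weight of term in class
    term_total = defaultdict(float)   # weight of term over all documents
    class_total = defaultdict(float)  # weight of all terms within a class
    for doc, label in zip(docs, labels):
        for term, weight in doc.items():
            class_term[label][term] += weight
            term_total[term] += weight
            class_total[label] += weight

    ff = {}
    for label, terms in class_term.items():
        ff[label] = {}
        for term, weight in terms.items():
            recall = weight / term_total[term]          # feature recall
            predominance = weight / class_total[label]  # feature predominance
            ff[label][term] = 2 * recall * predominance / (recall + predominance)
    return ff


def select_features(ff):
    """Keep, per class, the terms whose FF beats both averages.

    Assumption of this sketch: a term's average is taken over all classes,
    counting 0 for classes in which the term never occurs.
    """
    n_classes = len(ff)
    term_mean = defaultdict(float)
    for terms in ff.values():
        for term, value in terms.items():
            term_mean[term] += value / n_classes
    global_mean = sum(term_mean.values()) / len(term_mean)

    return {
        label: {t for t, v in terms.items() if v > term_mean[t] and v > global_mean}
        for label, terms in ff.items()
    }


# Toy term-weight vectors standing in for patent abstracts (hypothetical data).
docs = [
    {"neural": 2, "network": 1},
    {"neural": 1, "learning": 2},
    {"engine": 3, "network": 1},
]
labels = ["ai", "other"][:1] * 2 + ["other"]

scores = feature_f_measure(docs, labels)
selected = select_features(scores)
print(selected["ai"])
```

In a pipeline like the one the abstract describes, the selected terms would presumably be used to filter or weight the text before it reaches the CNN; how the paper actually couples the two steps is not stated in the abstract.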
This article is indexed in 维普 (VIP) and other databases.