
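An Automatic Generation Study of Academic Abstract Viewpoints Based on the UniLM Model
Cite this article: Zeng Jiangfeng, Liu Yuanyuan, Cheng Zheng, Duan Yaoqing. An Automatic Generation Study of Academic Abstract Viewpoints Based on the UniLM Model[J]. Library and Information Service, 2023, 67(2): 131-139.
Authors: Zeng Jiangfeng  Liu Yuanyuan  Cheng Zheng  Duan Yaoqing
Affiliation: School of Information Management, Central China Normal University, Wuhan 430079
Funding: This paper is one of the research outcomes of the Ministry of Education Humanities and Social Sciences Youth Project "Research on Identification Models and Governance Strategies for Social Media Disinformation Driven by Contextual Big Data" (Grant No. 21YJC870002) and the Fundamental Research Funds for the Central Universities project "Research on Information Interaction Behavior and Privacy Protection" (Grant No. CCNU22QN017).
Abstract: [Purpose/Significance] To shift the extraction of viewpoints from massive volumes of academic text from manual work to machines, improving efficiency while preserving the accuracy and objectivity of the extracted viewpoints. [Method/Process] The study uses the UniLM unified pre-trained language model and fine-tunes it on a manually annotated dataset. An academic abstract is treated as a text sequence of length a; the trained model generates a sentence sequence of length b (a ≥ b), which is output as the viewpoint sentence of the paper. [Result/Conclusion] The results show that the UniLM model reaches a viewpoint generation precision of 94.36%, 77.27%, and 57.43% on normative, semi-normative, and non-normative abstracts respectively, performing best on normative abstracts. Applying a machine learning model to viewpoint generation from long texts offers a new method for generating the viewpoints of academic papers. A limitation is that the model depends on the structure of the abstract and performs less well on non-normative abstracts.

Keywords: academic abstracts  automatic viewpoint generation  UniLM model  machine learning
Received: 2022-07-25
Revised: 2022-10-06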

An Automatic Generation Study of Academic Abstract Viewpoints Based on the UniLM Model
Zeng Jiangfeng, Liu Yuanyuan, Cheng Zheng, Duan Yaoqing. An Automatic Generation Study of Academic Abstract Viewpoints Based on the UniLM Model[J]. Library and Information Service, 2023, 67(2): 131-139.
Authors: Zeng Jiangfeng  Liu Yuanyuan  Cheng Zheng  Duan Yaoqing
Institution: School of Information Management, Central China Normal University, Wuhan 430079
Abstract: [Purpose/Significance] This study aims to shift viewpoint extraction from massive academic texts from manual work to machines, improving efficiency while preserving the accuracy and objectivity of the extracted viewpoints. [Method/Process] The UniLM unified pre-trained language model is fine-tuned on a manually labeled dataset. An academic abstract is treated as a text sequence of length a, from which the trained model generates a sentence sequence of length b (a ≥ b) that is output as the viewpoint sentence of the paper. [Result/Conclusion] The results show that the UniLM model generates viewpoints with a precision of 94.36% for normative abstracts, 77.27% for semi-normative abstracts, and 57.43% for non-normative abstracts, performing best on normative abstracts. Applying a machine learning model to viewpoint generation from long texts provides a new approach to generating the viewpoints of academic papers. A limitation is that the model relies on the structure of the abstract, so its performance on non-normative abstracts is weaker.
Keywords: academic abstracts  automatic viewpoint generation  UniLM model  machine learning
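
The abstract describes mapping an input sequence of length a to a generated sequence of length b with UniLM. In its general sequence-to-sequence mode, UniLM handles this with one Transformer and a self-attention mask: source positions attend to each other in both directions, while target positions attend to the whole source and only to themselves and earlier target positions. The following is a minimal sketch of that mask, not the authors' code. The function name `seq2seq_attention_mask` is hypothetical, and special tokens and padding are left out for brevity.

```python
def seq2seq_attention_mask(a, b):
    """Build an (a+b) x (a+b) 0/1 mask for a source of length a and a target of length b.

    mask[i][j] == 1 means position i may attend to position j.
    Hypothetical helper for illustration only.
    """
    if a < 0 or b < 0:
        raise ValueError("sequence lengths must be non-negative")
    n = a + b
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < a:
                # Every position, source or target, may see the whole source.
                mask[i][j] = 1
            elif i >= a and j <= i:
                # A target position sees itself and earlier target positions.
                mask[i][j] = 1
    return mask


if __name__ == "__main__":
    # Source of 3 tokens, target of 2 tokens (a >= b, as in the abstract).
    for row in seq2seq_attention_mask(3, 2):
        print(row)
```

Source rows never see target columns, so the abstract is encoded without access to the viewpoint sentence; target rows are lower-triangular over the target block, so each generated token depends only on the source and on what has already been generated.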