
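Public Security Field-oriented Dictionary Construction and the Recognition of Public Opinion Events
Cite this article: Wang Lianxi. Public Security Field-oriented Dictionary Construction and the Recognition of Public Opinion Events[J]. Information Research, 2020(2): 13-20.
Author: Wang Lianxi
Affiliation: Key Laboratory of Intelligent Processing for Non-Common Languages, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006
Funding: An output of the National Social Science Fund of China Youth Project "Opinion Mining and Information Aggregation of China-related Public Opinion in ASEAN" (Grant No. 17CTQ045)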
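Abstract: [Purpose/significance] This paper proposes a domain-dictionary-based method for automatically identifying public opinion events about public security emergencies, with the aim of effectively identifying hot public opinion events in the public security field, preventing crisis public opinion events, and enhancing government credibility. [Method/process] First, using the public security event corpus from the China Emergency Services website (中国应急服务网) as the data source, high-frequency words in the public security field are extracted and filtered. Next, with manual intervention, a subset of high-frequency words that are highly relevant to the domain is selected as seed words. Then the mutual information method is used to compute the co-occurrence probability (pointwise mutual information) between the seed words and the other words in the corpus; words with high pointwise mutual information with the seed words are taken as domain candidate words, and the candidates are adjusted through manual review. Finally, after a text representation of the corpus to be recognized is built, the text is matched against the domain vocabulary in the dictionary, and the number and weight of the public security domain words appearing in the text are used to judge whether it is a public security emergency public opinion event. [Result/conclusion] Experimental results on an annotated corpus show that, compared with the classical Naive Bayes method, the proposed method effectively improves the recognition accuracy of hot public opinion events in the public security field.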

Keywords:domain dictionary  hot public opinion  public security emergencies  event recognition

Public Security Field-oriented Dictionary Construction and the Recognition of Public Opinion Events
Wang Lianxi. Public Security Field-oriented Dictionary Construction and the Recognition of Public Opinion Events[J]. Information Research, 2020(2): 13-20.
Authors:Wang Lianxi
Institution:(Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies, Guangzhou Guangdong 510006; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou Guangdong 510006)
Abstract:[Purpose/significance] The paper proposes a domain-dictionary-based method for automatically identifying public opinion emergencies in the public security field, so as to effectively identify hot public opinion events in that field, prevent crisis public opinion events, and enhance the credibility of government. [Method/process] Firstly, the paper takes the corpus of public security events from the China Emergency Services Platform as its data source, and extracts and filters the high-frequency words in the public security field. It then selects, with manual intervention, some high-frequency seed words that are highly related to the domain. Next, it uses the mutual information method to calculate the co-occurrence probability (pointwise mutual information) between the seed words and the other words in the corpus, takes the words that have high pointwise mutual information with the seed words as domain candidates, and adjusts the candidates by manual review. Finally, based on a text representation of the corpus to be recognized, it matches the corpus against the domain vocabulary in the dictionary, and uses the number and weight of the public security domain words that occur to judge whether the corpus to be recognized is a public security emergency public opinion event. [Result/conclusion] Experimental results on a tagged corpus show that, compared with the classical Naive Bayes method, the proposed method can effectively improve the recognition accuracy of hot public opinion events in the public security field.
Keywords:domain dictionary  public opinion  public security events  event recognition
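
The pipeline outlined in the abstract (seed words, pointwise mutual information, dictionary matching) can be illustrated with a minimal Python sketch. It is not the author's implementation. Several points are assumptions made only for illustration: co-occurrence is counted at the document level, a word's dictionary weight is its highest PMI with any seed, seed words get a hand-assigned weight, and the decision rule requires both a minimum number of matched words and a minimum total weight. The function names, thresholds, and toy corpus are hypothetical, texts are assumed to be tokenized already, and the manual review steps are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Document-level PMI for every pair of words that co-occur in a document.

    docs is a list of token lists. PMI(a, b) = log2(P(a, b) / (P(a) * P(b))),
    with probabilities estimated from document frequencies.
    """
    n = len(docs)
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))  # sorted so each pair has one canonical order
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return {
        (a, b): math.log2(count * n / (word_df[a] * word_df[b]))
        for (a, b), count in pair_df.items()
    }


def expand_seeds(docs, seeds, threshold):
    """Return {candidate word: best PMI with any seed} for PMI >= threshold."""
    candidates = {}
    for (a, b), score in pmi_table(docs).items():
        if score < threshold:
            continue
        for seed, other in ((a, b), (b, a)):
            if seed in seeds and other not in seeds:
                candidates[other] = max(score, candidates.get(other, score))
    return candidates


def is_public_security_event(tokens, lexicon, min_hits, min_weight):
    """Assumed decision rule: enough distinct dictionary words and enough weight."""
    hits = [t for t in set(tokens) if t in lexicon]
    total_weight = sum(lexicon[t] for t in hits)
    return len(hits) >= min_hits and total_weight >= min_weight


# Toy corpus, for illustration only.
docs = [
    ["fire", "explosion", "casualties"],
    ["fire", "explosion", "rescue"],
    ["fire", "rescue", "evacuation"],
    ["concert", "tickets", "music"],
    ["music", "festival", "tickets"],
]
candidates = expand_seeds(docs, seeds={"fire"}, threshold=0.5)
lexicon = {"fire": 1.0, **candidates}  # seed weight of 1.0 is an assumption
print(sorted(candidates))
print(is_public_security_event(["fire", "explosion", "downtown"], lexicon, 2, 1.5))
```

In practice the candidate list returned by `expand_seeds` would still go through the manual review the abstract describes, and the two thresholds would need to be tuned on annotated data.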
This article is indexed in VIP, Wanfang Data, and other databases.