Similar Documents
A total of 20 similar documents were retrieved.
1.
张国标  李洁  胡潇戈 《情报科学》2021,39(10):126-132
[Purpose/Significance] While social media has changed how news spreads and how people obtain information, it has also become the main channel for spreading fake news. Quickly identifying fake news on social media and curbing the spread of false information is therefore essential for cleaning up cyberspace and safeguarding public security. [Method/Process] To effectively identify fake news posted on social media, this paper builds on an in-depth analysis of the content characteristics of fake news and designs representations for text word vectors, text sentiment, low-level image features, and image semantic features, which are used to extract the image and text feature information of fake news on social networks. A multimodal feature-fusion fake news detection model is then constructed, and its performance is validated on the MediaEval2015 dataset. [Result/Conclusion] Comparative experiments across different feature combinations and classification methods show that a multimodal model fusing text and image features can effectively improve fake news detection. [Innovation/Limitation] The study designs a fake news detection model from a multimodal perspective, fusing multiple text and image features. However, feature fusion is implemented by vector concatenation, which neither allows the different features to complement each other fully nor avoids the curse of dimensionality.
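As a rough illustration of the concatenation-style fusion the abstract criticizes, the sketch below stacks hypothetical text, sentiment, and image feature vectors into one joint vector for a simple classifier; the feature dimensions and the logistic-regression classifier are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative feature dimensions (assumptions, not from the paper).
TEXT_DIM, SENTI_DIM, IMG_LOW_DIM, IMG_SEM_DIM = 300, 8, 512, 128

def fuse_by_concatenation(text_vec, senti_vec, img_low_vec, img_sem_vec):
    """Early fusion by simple vector concatenation, as criticized in the abstract:
    the joint vector grows to the sum of all feature dimensions."""
    return np.concatenate([text_vec, senti_vec, img_low_vec, img_sem_vec])

# Toy example: 100 random "posts" with binary fake/real labels.
rng = np.random.default_rng(0)
X = np.stack([
    fuse_by_concatenation(
        rng.normal(size=TEXT_DIM), rng.normal(size=SENTI_DIM),
        rng.normal(size=IMG_LOW_DIM), rng.normal(size=IMG_SEM_DIM))
    for _ in range(100)
])
y = rng.integers(0, 2, size=100)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(X.shape)  # (100, 948) -- the joint dimensionality adds up quickly
```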

2.
To improve multimodal negative sentiment recognition of online public opinion on public health emergencies, we constructed a novel multimodal fine-grained negative sentiment recognition model based on graph convolutional networks (GCN) and ensemble learning. The model comprises BERT- and ViT-based multimodal feature representation, GCN-based feature fusion, multiple classifiers, and ensemble-learning-based decision fusion. First, image-text data about COVID-19 are collected from Sina Weibo, and text and image features are extracted with BERT and ViT, respectively. Second, image-text fused features are generated through a GCN over the constructed microblog graph. Finally, AdaBoost is trained to make the final decision over the sentiments recognized by the best classifiers on the image, text, and image-text fused features. The results show that the model achieves an F1-score of 84.13% in sentiment polarity recognition and 82.06% in fine-grained negative sentiment recognition, improvements of 4.13% and 7.55%, respectively, over the best image-text feature-fusion baseline.
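The decision-fusion step can be pictured roughly as follows: an AdaBoost meta-classifier is trained on the outputs of per-modality base classifiers. The sketch uses synthetic features in place of BERT, ViT, and GCN outputs and is not the authors' implementation.

```python
# Hedged sketch of decision fusion: AdaBoost over the predicted probabilities
# of three base classifiers (text, image, fused). All data are synthetic.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 200
y = rng.integers(0, 2, size=n)                 # sentiment labels
text_feats = rng.normal(size=(n, 768))         # stand-in for BERT output
image_feats = rng.normal(size=(n, 768))        # stand-in for ViT output
fused_feats = rng.normal(size=(n, 256))        # stand-in for GCN fusion output

# Base classifiers, one per feature view.
base_probs = []
for feats in (text_feats, image_feats, fused_feats):
    clf = LogisticRegression(max_iter=1000).fit(feats, y)
    base_probs.append(clf.predict_proba(feats)[:, 1])

# Decision fusion: AdaBoost over the stacked base-classifier scores.
meta_X = np.column_stack(base_probs)
meta = AdaBoostClassifier(n_estimators=50, random_state=0).fit(meta_X, y)
print(meta.score(meta_X, y))
```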

3.
4.
范昊  何灏 《情报科学》2022,40(6):90-97
[Purpose/Significance] With the growth of social media, the volume of news of all kinds has surged, and public opinion monitoring has become increasingly important. Efficient and accurate identification of public opinion news helps the relevant authorities collect and track information on emergencies in a timely manner, reducing the impact of public opinion on society. This paper proposes a news headline classification model that combines BERT, TextCNN, and BiLSTM, jointly considering word-embedding information, text features, and contextual information to improve the accuracy of news headline category recognition. [Method/Process] Headline vectors generated by BERT are fed into TextCNN to extract features; the TextCNN output is passed to BiLSTM to capture the headline's contextual information; and softmax produces the classification. [Result/Conclusion] Experiments show that the proposed model, which combines the language-model-based BERT, the word-vector-based TextCNN, and the context-based BiLSTM, achieves accuracy, precision, recall, and F1 all above 0.92, generalizes well, and outperforms traditional text classification models. [Innovation/Limitation] BERT is used for word embedding while features and contextual semantics are captured jointly, and the model identifies news categories well; however, the large number of parameters and high vector dimensionality place high demands on training hardware, and the data cover only 10 categories, so datasets with more or finer-grained categories were not tested.
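A minimal sketch of the BERT → TextCNN → BiLSTM → softmax pipeline described above, with a randomly initialized embedding standing in for BERT so the snippet runs without pretrained weights; layer sizes and kernel choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HeadlineClassifier(nn.Module):
    def __init__(self, vocab_size=30522, hidden=768, n_filters=128,
                 kernel_sizes=(2, 3, 4), lstm_hidden=128, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)      # stand-in for BERT
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, n_filters, k, padding=k // 2) for k in kernel_sizes)
        self.bilstm = nn.LSTM(n_filters * len(kernel_sizes), lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * lstm_hidden, n_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)                 # (B, T, H)
        x = x.transpose(1, 2)                     # (B, H, T) for Conv1d
        feats = [torch.relu(conv(x)) for conv in self.convs]
        # Align lengths (padding differs by kernel) and stack along channels.
        min_len = min(f.size(2) for f in feats)
        x = torch.cat([f[:, :, :min_len] for f in feats], dim=1).transpose(1, 2)
        out, _ = self.bilstm(x)                   # (B, T', 2*lstm_hidden)
        logits = self.fc(out[:, -1])              # last step -> class logits
        return torch.log_softmax(logits, dim=-1)

model = HeadlineClassifier()
print(model(torch.randint(0, 30522, (4, 20))).shape)  # torch.Size([4, 10])
```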

5.
Digital information exchange enables quick creation and sharing of information and thus changes existing habits. Social media is becoming the main source of news for end-users, replacing traditional media. This also enables the proliferation of fake news, which misinforms readers and is used to serve the interests of its creators. As a result, automated fake news detection systems are attracting attention. However, automatic fake news detection presents a major challenge, and content evaluation is increasingly becoming the responsibility of the end-user. Thus, in the present study we used information quality (IQ) as an instrument to investigate how users can detect fake news. Specifically, we examined how users perceive fake news in the form of shorter paragraphs on individual IQ dimensions. We also investigated which user characteristics might affect fake news detection. We performed an empirical study with 1123 users, who evaluated randomly generated stories containing statements of various levels of correctness on individual IQ dimensions. The results reveal that IQ can be used as a tool for fake news detection. Our findings show that (1) domain knowledge has a positive impact on fake news detection; (2) education in combination with domain knowledge improves fake news detection; and (3) the personality trait conscientiousness contributes significantly to fake news detection in all dimensions.

6.
The proliferation of false information is a growing problem in today's dynamic online environment, requiring automated fake news detection to reduce its harmful effect on society. Although various detection methods exist, most consider only data-oriented text features and ignore dual emotion features (publisher emotions and social emotions), limiting their accuracy. This study addresses the issue by utilizing dual emotion features to detect fake news. It proposes a Deep Normalized Attention-based mechanism for enriched extraction of dual emotion features and an Adaptive Genetic Weight Update-Random Forest (AGWu-RF) for classification. First, the deep normalized attention-based mechanism incorporates BiGRU, which improves feature quality by extracting long-range context information while mitigating gradient explosion. The genetic weight update then tunes the random forest toward optimized hyperparameter values that support the classifier's detection accuracy. The proposed model outperforms baseline methods on standard benchmark metrics across three real-world datasets, exceeding state-of-the-art approaches by 5%, 11%, and 14% in accuracy and highlighting the value of dual emotion features and the optimization steps for fake news detection.
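A loose sketch of the two-stage idea, assuming a BiGRU feature extractor and a small genetic-style hyperparameter search for the random forest; the paper's deep normalized attention and AGWu-RF details are not reproduced, and all data and parameter ranges are synthetic.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# --- Stage 1: BiGRU feature extractor (untrained stand-in) ---
class BiGRUEncoder(nn.Module):
    def __init__(self, vocab=5000, emb=64, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True, bidirectional=True)

    def forward(self, ids):
        out, _ = self.gru(self.emb(ids))
        return out.mean(dim=1)                    # (B, 2*hidden) pooled features

encoder = BiGRUEncoder()
token_ids = torch.randint(0, 5000, (300, 40))     # 300 synthetic "posts"
with torch.no_grad():
    X = encoder(token_ids).numpy()
y = rng.integers(0, 2, size=300)

# --- Stage 2: genetic-style hyperparameter search for the random forest ---
def random_params():
    return {"n_estimators": int(rng.integers(50, 300)),
            "max_depth": int(rng.integers(3, 15))}

population = [random_params() for _ in range(6)]
for _ in range(3):                                # a few "generations"
    scored = sorted(population,
                    key=lambda p: cross_val_score(
                        RandomForestClassifier(**p, random_state=0),
                        X, y, cv=3).mean(),
                    reverse=True)
    parents = scored[:2]
    # Crossover + mutation: mix the two best and add fresh random candidates.
    child = {"n_estimators": parents[0]["n_estimators"],
             "max_depth": parents[1]["max_depth"]}
    population = parents + [child] + [random_params() for _ in range(3)]

print("best params:", scored[0])
```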

7.
An idiom is a common phrase that means something other than its literal meaning. Detecting idioms automatically is a serious challenge in natural language processing (NLP) applications such as information retrieval (IR), machine translation, and chatbots, and automatic detection of idioms plays an important role in all of them. A fundamental NLP task is text classification, which categorizes text into structured categories, also known as text labeling or categorization. This paper treats idiom identification as a text classification task. Pre-trained deep learning models have been used for several text classification tasks, though models like BERT and RoBERTa have not been used exclusively for idiom versus literal classification. We propose a predictive ensemble model to classify idioms and literals using BERT and RoBERTa fine-tuned on the TroFi dataset. The model is tested on a newly created in-house dataset of 1470 idiomatic and literal expressions annotated by domain experts. Our model outperforms the baseline models on the metrics considered, such as F-score and accuracy, with a 2% improvement in accuracy.
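A hedged sketch of a prediction-level BERT/RoBERTa ensemble for idiom vs. literal classification; the checkpoints, averaging scheme, and example sentence are illustrative assumptions, and the classification heads remain near-random until fine-tuned on data such as TroFi.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODELS = ["bert-base-uncased", "roberta-base"]   # assumed base checkpoints
sentence = "He finally kicked the bucket after a long illness."

probs = []
for name in MODELS:
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits          # (1, 2): [literal, idiom]
    probs.append(torch.softmax(logits, dim=-1))

# Ensemble by averaging class probabilities across the two models.
avg = torch.stack(probs).mean(dim=0)
label = "idiom" if avg[0, 1] > avg[0, 0] else "literal"
print(avg, label)  # untrained heads -> indicative only until fine-tuned
```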

8.
Sarcasm is a pervasive literary technique in which people intentionally express the opposite of what is implied. Accurate detection of sarcasm in text can facilitate the understanding of speakers' true intentions and support other natural language processing tasks, especially sentiment analysis. Since sarcasm is a kind of implicit sentiment expression in which speakers deliberately confuse the audience, it is challenging to detect sarcasm from text alone. Existing approaches based on machine learning and deep learning perform poorly on sarcastic text with complex expressions or text that requires specific background knowledge to understand. Because of the characteristics of the Chinese language itself, sarcasm detection in Chinese is even more difficult. To alleviate this dilemma, we propose a sememe- and auxiliary-enhanced attention neural model, SAAG. At the word level, we introduce sememe knowledge to enhance the representation learning of Chinese words; a sememe is the minimum unit of meaning and provides a fine-grained portrayal of a word. At the sentence level, we leverage auxiliary information, such as the news title, to learn the representation of the context and background of sarcastic expression. We then construct the representation of the text progressively and dynamically. Evaluation on a sarcasm dataset consisting of comments on news text shows that our proposed approach is effective and outperforms state-of-the-art models.

9.
Health misinformation has become an unfortunate truism of social media platforms, where lies can spread faster than truth. Despite considerable work devoted to suppressing fake news, health misinformation, including low-quality health news, has persisted and even increased in recent years. One promising approach to fighting bad information is studying the temporal and sentiment effects of health news stories and how they are discussed and disseminated on social media platforms such as Twitter. As part of the effort to find innovative ways to fight health misinformation, this study analyzes a dataset of more than 1600 objectively and independently reviewed health news stories published over a 10-year span and nearly 50,000 Twitter posts responding to them. Specifically, it examines the source credibility of health news circulated on Twitter and the temporal and sentiment features of the tweets containing or responding to the health news reports. The results show that health news stories rated low by experts are discussed more, persist longer, and produce stronger sentiments than highly rated ones in the tweetosphere. However, the highly rated stories retained fresh interest, in the form of new tweets, for a longer period. An in-depth understanding of how health news is distributed and discussed is a first step toward mitigating the surge of health misinformation. The findings provide insights into the mechanism of health information dissemination on social media and practical implications for fighting and mitigating health misinformation on digital media platforms.

10.
In recent years, fake news detection has been a significant task attracting much attention. However, most current approaches utilize features from a single modality, such as text or image, while comprehensive fusion between features of different modalities has been ignored. To deal with this problem, we propose a novel model named Bidirectional Cross-Modal Fusion (BCMF), which comprehensively integrates textual and visual representations in a bidirectional manner. Specifically, the proposed model is decomposed into four submodules: the input embedding, the image2text fusion, the text2image fusion, and the prediction module. We conduct intensive experiments on four real-world datasets: Weibo, Twitter, Politi, and Gossip. The results show improvements in classification accuracy of 2.2, 2.5, 4.9, and 3.1 percentage points over the state-of-the-art methods on Weibo, Twitter, Politi, and Gossip, respectively. The experimental results suggest that the proposed model better captures the integrated information of different modalities and generalizes well across datasets. Further experiments suggest that the bidirectional fusions, the number of multi-attention heads, and the aggregating function all affect the performance of cross-modal fake news detection. The research sheds light on the role of bidirectional cross-modal fusion in leveraging multi-modal information to improve fake news detection.
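One way to picture bidirectional image-text fusion is with two cross-attention blocks, one attending from text to image patches and one from image patches to text tokens; the sketch below is an assumption-laden stand-in (dimensions, pooling, classifier), not the BCMF architecture itself.

```python
import torch
import torch.nn as nn

class BidirectionalFusion(nn.Module):
    def __init__(self, dim=256, heads=4, n_classes=2):
        super().__init__()
        self.text_to_image = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def forward(self, text_tokens, image_patches):
        # Text queries attend over image patches, and vice versa.
        t2i, _ = self.text_to_image(text_tokens, image_patches, image_patches)
        i2t, _ = self.image_to_text(image_patches, text_tokens, text_tokens)
        fused = torch.cat([t2i.mean(dim=1), i2t.mean(dim=1)], dim=-1)
        return self.classifier(fused)             # fake/real logits

model = BidirectionalFusion()
text = torch.randn(8, 32, 256)     # (batch, text tokens, dim)
image = torch.randn(8, 49, 256)    # (batch, image patches, dim)
print(model(text, image).shape)    # torch.Size([8, 2])
```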

11.
马达  卢嘉蓉  朱侯 《情报科学》2023,41(2):60-68
[Purpose/Significance] This study explores effective deep-learning-based emotion classification methods for Weibo text and examines the relationship between the emotion types of users' reposted comments in trending Weibo events and the spread of private information. [Method/Process] Four deep learning classifiers, BERT, BERT+CNN, BERT+RNN, and ERNIE, were compared, and the best-performing model was validated on a newly constructed seven-class emotion corpus. Four trending Weibo cases were then selected for an empirical analysis covering emotion distribution, emotion-word frequency, reposting time, and reposting counts. [Result/Conclusion] The empirical study finds that users spread private information rapidly but briefly, that negative emotions such as anger and disgust dominate during the spread, and that emotion types and forms of expression differ depending on the subject of the private information. [Innovation/Limitation] The study characterizes users' emotions when spreading private information and the link between the two, providing theoretical and practical grounds for protecting the privacy of social network users. However, the corpus is not large enough to train a highly accurate deep learning model, and the models perform poorly on irony and sarcasm.

12.
Social media users increasingly use both images and text to express their opinions and share their experiences, instead of text alone as in conventional social media. Consequently, conventional text-based sentiment analysis has evolved into the more complex study of multimodal sentiment analysis. To tackle the challenge of effectively exploiting the information in both the visual and textual content of image-text posts, this paper proposes a new image-text-consistency-driven multimodal sentiment analysis approach. The proposed approach explores the correlation between the image and the text, followed by a multimodal adaptive sentiment analysis method. More specifically, mid-level visual features extracted by the conventional SentiBank approach are used to represent visual concepts and are integrated with other features, including textual, visual, and social features, to develop a machine learning sentiment analysis approach. Extensive experiments demonstrate the superior performance of the proposed approach.

13.
Detecting sentiment in natural language is tricky even for humans, making its automated detection more complicated still. This research proffers a hybrid deep learning model for fine-grained sentiment prediction in real-time multimodal data. It combines the strengths of deep learning networks with machine learning to handle two specific semiotic systems, the textual (written text) and the visual (still images), and their combination within online content, using decision-level multimodal fusion. The proposed contextual ConvNet-SVMBoVW model has four modules: discretization, text analytics, image analytics, and decision. The input to the model is multimodal content m ∈ {text, image, infographic}. The discretization module uses Google Lens to separate the text from the image; the two are then processed as discrete entities and sent to the respective text analytics and image analytics modules. The text analytics module determines sentiment using a convolutional neural network (ConvNet) enriched with the contextual semantics of SentiCircle, and an aggregation scheme is introduced to compute the hybrid polarity. A support vector machine (SVM) classifier trained on bag-of-visual-words (BoVW) features predicts the sentiment of the visual content. A Boolean decision module with a logical OR operation validates and categorizes the output into five fine-grained sentiment categories (truth values): 'highly positive,' 'positive,' 'neutral,' 'negative,' and 'highly negative.' The accuracy achieved by the proposed model is nearly 91%, an improvement over the accuracy obtained by the text and image modules individually.
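The decision-level step can be illustrated with a toy rule that maps the text and image polarity scores onto the five fine-grained categories; the averaging and thresholds below are assumptions for illustration, not the paper's Boolean OR module.

```python
LABELS = ["highly negative", "negative", "neutral", "positive", "highly positive"]

def fuse_decisions(text_score: float, image_score: float) -> str:
    """text_score / image_score are polarity scores in [-1, 1], e.g. from a
    ConvNet-based text model and an SVM-BoVW image model (illustrative)."""
    s = (text_score + image_score) / 2.0
    if s <= -0.6:
        return LABELS[0]
    if s <= -0.2:
        return LABELS[1]
    if s < 0.2:
        return LABELS[2]
    if s < 0.6:
        return LABELS[3]
    return LABELS[4]

print(fuse_decisions(0.9, 0.5))    # -> "highly positive"
print(fuse_decisions(-0.8, -0.7))  # -> "highly negative"
```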

14.
There is an ongoing debate about what matters more in the modern online media newsroom: the news content and its newsworthiness, or audience clicks. Using a dataset of over one million articles from five countries (Belarus, Kazakhstan, Poland, Russia, and Ukraine) and a novel machine learning methodology, I demonstrate that the content of news articles has a significant impact on their lifespan. My findings show that articles with positive sentiment tend to be displayed longer, and that high fear-emotion scores can extend the lifespan of news articles in autocratic regimes, with an impact that is substantial in magnitude. This paper proposes four new methods for improving information management methodology: a flexible version of Latent Dirichlet Allocation (LDA), a technique for performing relative sentiment analysis, a method for determining semantic similarity between a news article and a newspaper's dominant narrative, and a novel approach to unsupervised model validation based on inter-feature consistency.

15.
Stock prediction via market data analysis is an attractive research topic. Both stock prices and news articles have been employed in the prediction process. However, how to combine technical indicators from stock prices with news sentiment from textual news articles, and how to make the prediction model learn sequential information within the time series in an intelligent way, remain unsolved problems. In this paper, we build a stock prediction system and propose an approach that (1) represents numerical price data by technical indicators via technical analysis and represents textual news articles by sentiment vectors via sentiment analysis, (2) sets up a layered deep learning model to learn the sequential information within the market snapshot series constructed from the technical indicators and news sentiments, and (3) sets up a fully connected neural network to make stock predictions. Experiments were conducted on more than five years of Hong Kong Stock Exchange data using four different sentiment dictionaries, and the results show that (1) the proposed approach outperforms the baselines on both validation and test sets under two different evaluation metrics, (2) models incorporating both prices and news sentiments outperform models that use only technical indicators or only news sentiments, at both the individual stock level and the sector level, and (3) among the four sentiment dictionaries, the finance-domain-specific sentiment dictionary (the Loughran–McDonald Financial Dictionary) models the news sentiments best and brings larger prediction performance improvements than the other three dictionaries.
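A rough sketch of the market-snapshot construction and sequential model, assuming simple daily returns and a 5-day moving average as technical indicators and a single LSTM as the layered model; the data, indicator choices, and window length are illustrative, not the paper's setup.

```python
import numpy as np
import torch
import torch.nn as nn

def build_snapshots(prices, sentiments, window=10):
    """prices, sentiments: 1-D arrays of daily closes and daily news sentiment."""
    returns = np.diff(prices) / prices[:-1]                      # simple daily return
    ma5 = np.convolve(prices, np.ones(5) / 5, mode="valid")      # 5-day moving average
    n = min(len(returns), len(ma5), len(sentiments))             # align indicator lengths
    feats = np.column_stack([returns[-n:], ma5[-n:], sentiments[-n:]])
    X = np.stack([feats[i:i + window] for i in range(len(feats) - window)])
    y = (feats[window:, 0] > 0).astype(np.float32)               # next-day up/down label
    return torch.tensor(X, dtype=torch.float32), torch.tensor(y)

class SnapshotLSTM(nn.Module):
    def __init__(self, n_feats=3, hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out[:, -1])).squeeze(-1)  # P(next day up)

prices = np.cumsum(np.random.randn(300)) + 100.0                 # synthetic price path
sentiments = np.random.uniform(-1, 1, size=300)                  # synthetic daily sentiment
X, y = build_snapshots(prices, sentiments)
print(SnapshotLSTM()(X).shape, y.shape)
```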

16.
Effective learning schemes such as fine-tuning, zero-shot, and few-shot learning have been widely used to obtain considerable performance with only a handful of annotated training data. In this paper, we present a unified benchmark for zero-shot text classification in Turkish. For this purpose, we evaluated three methods, namely Natural Language Inference, Next Sentence Prediction, and our proposed model based on Masked Language Modeling and pre-trained word embeddings, on nine Turkish datasets covering three main categories: topic, sentiment, and emotion. We used pre-trained Turkish monolingual and multilingual transformer models, namely BERT, ConvBERT, DistilBERT, and mBERT. The results show that ConvBERT with the NLI method yields the best results at 79% and outperforms the previously used multilingual XLM-RoBERTa model by 19.6%. The study contributes to the literature by applying different, previously unattempted transformer models to Turkish and by showing that monolingual models improve zero-shot text classification performance over multilingual models.
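For readers unfamiliar with the NLI formulation of zero-shot classification, the sketch below shows standard Hugging Face pipeline usage with an English NLI checkpoint; the text, labels, and checkpoint are illustrative, and the paper instead evaluates Turkish monolingual models such as ConvBERT.

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
text = "The central bank raised interest rates again this quarter."
labels = ["economy", "sports", "politics", "technology"]

# Each label is turned into a hypothesis ("This example is about {label}.")
# and the NLI model scores entailment against the input text.
result = classifier(text, candidate_labels=labels)
print(list(zip(result["labels"], [round(s, 3) for s in result["scores"]])))
```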

17.
Multimodal sentiment analysis aims to judge the sentiment of multimodal data uploaded by Internet users on various social media platforms. On the one hand, existing studies focus on the fusion mechanisms of multimodal data such as text, audio, and visual content, but ignore the similarity of text and audio and of text and visual content, as well as the heterogeneity of audio and visual content, resulting in biased sentiment analysis. On the other hand, multimodal data bring noise irrelevant to sentiment analysis, which affects the effectiveness of fusion. In this paper, we propose a Polar-Vector and Strength-Vector mixer model called PS-Mixer, based on MLP-Mixer, to achieve better communication between different modalities for multimodal sentiment analysis. Specifically, we design a Polar-Vector (PV) and a Strength-Vector (SV) to judge the polarity and strength of sentiment separately. PV is obtained from the communication between text and visual features and decides whether the sentiment is positive, negative, or neutral. SV is obtained from the communication between text and audio features and measures sentiment strength in the range 0 to 3. Furthermore, we devise an MLP-Communication module (MLP-C) composed of several fully connected layers and activation functions to make the different modal features interact fully in both the horizontal and vertical directions, which is a novel attempt to use MLPs for multimodal information communication. Finally, we mix PV and SV to obtain a fusion vector that judges the sentiment state. The proposed PS-Mixer is tested on two publicly available datasets, CMU-MOSEI and CMU-MOSI, and achieves state-of-the-art (SOTA) performance on CMU-MOSEI compared with baseline methods. The code is available at: https://github.com/metaphysicser/PS-Mixer.
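A toy sketch of the polar-vector / strength-vector split: one branch mixes text and visual features into PV, another mixes text and audio features into SV, and the two are combined for the final score. The layer shapes and head are assumptions, and the MLP-Mixer-style communication module of the actual PS-Mixer is not reproduced.

```python
import torch
import torch.nn as nn

class TinyPSMixer(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.pv_branch = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(),
                                       nn.Linear(dim, dim))
        self.sv_branch = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(),
                                       nn.Linear(dim, dim))
        self.head = nn.Linear(2 * dim, 1)   # regression-style sentiment score

    def forward(self, text, visual, audio):
        pv = self.pv_branch(torch.cat([text, visual], dim=-1))  # polarity view
        sv = self.sv_branch(torch.cat([text, audio], dim=-1))   # strength view
        return self.head(torch.cat([pv, sv], dim=-1)).squeeze(-1)

model = TinyPSMixer()
t, v, a = (torch.randn(4, 128) for _ in range(3))
print(model(t, v, a).shape)  # torch.Size([4])
```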

18.
19.
The spread of fake news has become a significant social problem, drawing great attention to fake news detection (FND). Pretrained language models (PLMs) such as BERT and RoBERTa benefit this task considerably, leading to state-of-the-art performance. The common paradigm for utilizing these PLMs is fine-tuning, in which a linear classification layer is built on top of the well-initialized PLM network, yielding an FND model that is then tuned in full on a training corpus. Although this paradigm has achieved great success, it still leaves a significant gap between the language model pretraining and target task fine-tuning processes. Fortunately, prompt learning, a new alternative for exploiting PLMs, handles this issue naturally and shows potential for further performance improvements. To this end, we propose knowledgeable prompt learning (KPL) for this task. First, we apply prompt learning to FND by carefully designing a sophisticated prompt template and the corresponding verbalizer words for the task. Second, we incorporate external knowledge into the prompt representation, making it more expressive for predicting the verbalizer words. Experimental results on two benchmark datasets demonstrate that prompt learning outperforms the baseline fine-tuning paradigm for FND and all previous representative methods. Our final knowledgeable model (i.e., KPL) provides further improvements; in particular, it achieves an average increase of 3.28% in F1 score under low-resource conditions compared with fine-tuning.
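The prompt-learning formulation can be illustrated by wrapping a news item in a template with a [MASK] slot and comparing masked-LM scores for two verbalizer words. The template, verbalizers, and checkpoint below are illustrative, and the knowledge injection of KPL is omitted.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
news = "Scientists confirm that drinking bleach cures all known diseases."
prompt = f"{news} This news is {fill.tokenizer.mask_token}."

# Verbalizer: label -> the word whose mask-fill probability stands for it.
verbalizers = {"real": "real", "fake": "fake"}
scores = {lab: next(r["score"] for r in fill(prompt, targets=[word]))
          for lab, word in verbalizers.items()}
print(scores)   # without task-specific prompt tuning the scores are only indicative
```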

20.
The phenomenal spread of fake news online necessitates further research into fake news perception. We stress human factors in misinformation management. This study extends prior research on fake news and media consumption to examine how people perceive fake news, with the objective of understanding how news categories and sources influence individuals' perceptions of fake news. Participants (N = 1008) were randomly allocated to six groups in which they evaluated the believability of news from three categories (misinformation, conspiracy, and correction news) coupled with six online news sources whose background (official media, commercial media, and social media) and level of expertise (the presence or absence of a professional editorial team) varied. Our findings indicate that people can distinguish media sources, which have a significant effect on fake news perception. People believed conspiracy news most, followed by the misinformation included in correction news, demonstrating the backfire effect of correction news. The significant interaction effects indicate that people are more sensitive to misinformation news and show more skepticism toward misinformation on social media. The findings support news literacy in that users are able to leverage credible sources when navigating online news. Meanwhile, the challenge of processing correction news calls for design measures that promote truth-telling news.

