Similar Documents
Retrieved 20 similar documents (search time: 375 ms)
1.
2.
张国标, 李洁, 胡潇戈. 《情报科学》 (Information Science), 2021, 39(10): 126-132
【Purpose/Significance】While social media has transformed news dissemination and the way people obtain information, it has also become a major channel for spreading fake news. Rapidly identifying fake news on social media and curbing the spread of false information is therefore essential for purifying cyberspace and safeguarding public security. 【Method/Process】To effectively identify fake news posted on social media, this paper builds on an in-depth analysis of the content characteristics of fake news and designs representations for text word vectors, text sentiment, low-level image features, and image semantic features. These are used to extract the image and text feature information of fake news in social networks and to construct a fake news detection model based on multimodal feature fusion, whose performance is validated on the MediaEval2015 dataset. 【Results/Conclusions】A comparative analysis of different feature combinations and classification methods shows that a multimodal model fusing text and image features can effectively improve fake news detection. 【Innovation/Limitations】The study designs a fake news detection model from a multimodal perspective, fusing multiple text and image features. However, feature fusion is implemented by vector concatenation, which not only fails to make the various features fully complementary but is also prone to the curse of dimensionality.
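The vector-concatenation fusion this abstract names as a limitation can be sketched in a few lines. The feature values and dimensions below are invented for illustration, not taken from the paper:

```python
# Early fusion by vector concatenation: the fused dimension is the sum of
# the per-modality dimensions, which is why many high-dimensional features
# can lead to the curse of dimensionality noted above.
def concat_fuse(text_vec, image_vec):
    """Fuse two modality feature vectors by simple concatenation."""
    return list(text_vec) + list(image_vec)

text_features = [0.2, 0.7, 0.1]   # e.g. word-vector + sentiment features (toy values)
image_features = [0.9, 0.3]       # e.g. low-level + semantic image features (toy values)
fused = concat_fuse(text_features, image_features)
print(len(fused))  # 5 = 3 + 2
```

A learned fusion (e.g. cross-modal attention) would instead let the modalities interact, which is the direction the stated limitation points toward.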

3.
4.
Social media users are increasingly using both images and text to express their opinions and share their experiences, instead of using only text as in conventional social media. Consequently, conventional text-based sentiment analysis has evolved into the more complicated study of multimodal sentiment analysis. To tackle the challenge of effectively exploiting the information in both the visual and textual content of image-text posts, this paper proposes a new image-text consistency driven multimodal sentiment analysis approach. The proposed approach explores the correlation between the image and the text, followed by a multimodal adaptive sentiment analysis method. More specifically, the mid-level visual features extracted by the conventional SentiBank approach are used to represent visual concepts and are integrated with other features, including textual, visual, and social features, to develop a machine-learning sentiment analysis approach. Extensive experiments demonstrate the superior performance of the proposed approach.
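One simple way to operationalize image-text consistency, assuming both modalities have already been embedded in a shared vector space (the embedding step is not shown and the formulation is illustrative, not the paper's exact method), is cosine similarity between the two embeddings:

```python
def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# A post whose image and text embeddings point the same way is "consistent";
# a downstream sentiment model could weight the modalities accordingly.
text_emb = [1.0, 0.0]
image_emb = [1.0, 0.0]
print(cosine_similarity(text_emb, image_emb))  # 1.0
```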

5.
6.
Research on the credibility of social media information in emergencies   Cited by: 1 (self-citations: 0, other citations: 1)
To investigate the factors that influence the credibility of social media information during emergencies, this paper takes the Yong-Wen railway accident as its background and constructs a trust model of social media information credibility in emergencies. Data collected through a questionnaire survey were analyzed with structural equation modeling. The results show that source credibility, channel credibility, content credibility, the communicator's professional authority, the communicator's trustworthiness, network dependence, and information objectivity positively influence the public's assessment of the credibility of social media information in emergencies; skeptical comments negatively influence that assessment; and internet usage and information completeness have no significant effect.

7.
Multimodal fake news detection methods based on semantic information have achieved great success. However, these methods exploit only the deep features of multimodal information, losing much of the valid information at the shallow level. To address this problem, we propose a progressive fusion network (MPFN) for multimodal disinformation detection, which captures the representational information of each modality at different levels and fuses modalities both at the same level and across levels by means of a mixer, establishing strong connections between the modalities. Specifically, we use a transformer structure, which is effective in computer vision tasks, as a visual feature extractor to gradually sample features at different levels, and we combine features from a text feature extractor with image frequency-domain information at different levels for fine-grained modeling. In addition, we design a feature fusion approach that better establishes connections between modalities, further improving performance and surpassing other network structures in the literature. We conducted extensive experiments on two real datasets, Weibo and Twitter; our method achieved 83.3% accuracy on the Twitter dataset, an improvement of at least 4.3% over other state-of-the-art methods. This demonstrates the effectiveness of MPFN for identifying fake news: by combining information from each modality at different levels with a powerful modality fusion method, the approach reaches a relatively advanced level.

8.
Social emotion refers to the emotion evoked in the reader by a textual document. In contrast to the emotion cause extraction task, which analyzes the cause of the author's sentiments based on the expressions in the text, identifying the causes of the social emotion a text evokes in its readers has not been explored previously. Social emotion mining and its cause analysis is not only an important research topic in Web-based social media analytics and text mining but also has applications in multiple domains. Because social emotion cause identification analyzes the causes of reader emotions that are not explicitly or implicitly expressed in the text, it is a challenging task fundamentally different from previous research. Tackling it also requires a deeper understanding of the cognitive process underlying the inference of social emotion and its causes. In this paper, we propose the new task of social emotion cause identification (SECI). Inspired by the cognitive structure of emotions (OCC) theory, we present a Cognitive Emotion model Enhanced Sequential (CogEES) method for SECI. Specifically, based on the implications of the OCC model, our method first establishes the correspondence between words/phrases in text and the emotional dimensions identified in OCC, building emotional dimension lexicons with 1,676 distinct words/phrases. Then, our method utilizes the lexicon information and discourse coherence for semantic segmentation of the document and for enhancing clause representation learning. Finally, our method combines text segmentation and clause representation into a sequential model for cause clause prediction. We construct the SECI dataset for this new task and conduct experiments to evaluate CogEES. Our method outperforms the baselines, achieving over 10% F1 improvement on average with better interpretability of the prediction results.
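The lexicon-matching step can be pictured as a lookup from clause tokens to OCC emotional dimensions. The entries below are invented stand-ins (the paper's lexicons contain 1,676 distinct words/phrases):

```python
# Toy OCC-style lexicon: maps words to emotional dimensions (invented entries).
OCC_LEXICON = {
    "tragic": "distress",
    "rescued": "relief",
    "cheated": "anger",
}

def clause_dimensions(clause):
    """Return the OCC dimensions triggered by words in a clause."""
    return [OCC_LEXICON[w] for w in clause.lower().split() if w in OCC_LEXICON]

print(clause_dimensions("Victims were rescued after the tragic collapse"))
# ['relief', 'distress']
```

In the full method these lexicon hits feed document segmentation and clause representation learning rather than being used directly as predictions.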

9.
Due to the particularities of Chinese word formation, the Chinese Named Entity Recognition (NER) task has attracted extensive attention in recent years. Recently, some researchers have tried to solve this problem with a multimodal method combining acoustic and text features. However, the text-speech data pairs these methods require are lacking in real-world scenarios, making them difficult to apply widely. To address this, we propose a multimodal Chinese NER method called USAF, which uses synthesized acoustic features instead of actual human speech. USAF aligns text and acoustic features through unique position embeddings and uses a multi-head attention mechanism to fuse the features of the two modalities, which stably improves the performance of Chinese named entity recognition. To evaluate USAF, we implemented it on three Chinese NER datasets. Experimental results show that USAF achieves a stable improvement over text-only methods on each dataset and outperforms the SOTA external-vocabulary-based method on two datasets. Specifically, compared to the SOTA external-vocabulary-based method, the F1 score of USAF is improved by 1.84 and 1.24 on CNERTA and Aishell3-NER, respectively.

10.
We propose a CNN-BiLSTM-Attention classifier to classify short online messages in Chinese posted by users on government web portals, so that each message can be directed to one or more government offices. Our model leverages all available information to carry out multi-label classification, making use of hierarchical text features and label information. In particular, our method extracts label meaning, the CNN layer extracts local semantic features of the texts, the BiLSTM layer fuses the contextual features of the texts with the local semantic features, and the attention layer selects the most relevant features for each label. We evaluate our model on two large public corpora and on our high-quality handcrafted e-government multi-label dataset, which was constructed with the text annotation tool doccano and consists of 29,920 data points. Experimental results show that our proposed method is effective under common multi-label evaluation metrics, achieving micro-F1 scores of 77.22%, 84.42%, and 87.52% and macro-F1 scores of 77.68%, 73.37%, and 83.57% on these three datasets respectively, confirming that our classifier is robust. We conduct an ablation study to evaluate our label embedding method and attention mechanism. Moreover, a case study on our handcrafted e-government multi-label dataset verifies that our model integrates all types of semantic information in short messages across different labels to achieve text classification.
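Micro- and macro-averaged F1, the metrics reported above, pool per-label decisions differently; a minimal micro-F1 over binary label vectors looks like this (toy data, not the paper's):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives/negatives across all labels."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        for t, p in zip(truth, pred):
            tp += t and p          # predicted 1 and truly 1
            fp += (not t) and p    # predicted 1 but truly 0
            fn += t and (not p)    # predicted 0 but truly 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Two messages, three candidate offices each (1 = routed to that office).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(round(micro_f1(y_true, y_pred), 3))  # 0.8
```

Macro-F1, by contrast, computes F1 per label and averages, so rare labels weigh as much as common ones.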

11.
Multimodal relation extraction is a critical task in information extraction, aiming to predict the class of relation between head and tail entities from linguistic sequences and related images. However, current works are vulnerable to less relevant visual objects detected in images and cannot sufficiently fuse visual information into text pre-trained models. To overcome these problems, we propose a Two-Stage Visual Fusion Network (TSVFN) that employs a multimodal fusion approach for vision-enhanced entity relation extraction. In the first stage, we design multimodal graphs, whose novelty lies mainly in transforming sequence learning into graph learning. In the second stage, we merge the transformer-based visual representation into the text pre-trained model via a multi-scale cross-modal projector, with two multimodal fusion operations implemented inside the pre-trained model. We thereby accomplish deep interaction of multimodal, multi-structured data across the two fusion stages. Extensive experiments are conducted on the MNRE dataset; our model outperforms the current state-of-the-art method by 1.76%, 1.52%, 1.29%, and 1.17% in accuracy, precision, recall, and F1 score, respectively. Moreover, our model also achieves excellent results with fewer samples.

12.
Multimodal sentiment analysis aims to judge the sentiment of multimodal data uploaded by Internet users to various social media platforms. On the one hand, existing studies focus on the fusion mechanism for multimodal data such as text, audio, and visual content, but ignore the similarity between text and audio and between text and visual content, as well as the heterogeneity of audio and visual content, resulting in deviations in sentiment analysis. On the other hand, multimodal data brings in noise irrelevant to sentiment analysis, which affects the effectiveness of fusion. In this paper, we propose a Polar-Vector and Strength-Vector mixer model called PS-Mixer, based on MLP-Mixer, to achieve better communication between different modalities for multimodal sentiment analysis. Specifically, we design a Polar-Vector (PV) and a Strength-Vector (SV) to judge the polarity and strength of sentiment separately. PV is obtained from the communication between text and visual features and decides whether the sentiment is positive, negative, or neutral. SV is obtained from the communication between text and audio features and measures the sentiment strength in the range 0 to 3. Furthermore, we devise an MLP-Communication module (MLP-C), composed of several fully connected layers and activation functions, that lets the different modal features fully interact in both the horizontal and vertical directions, a novel attempt to use MLPs for multimodal information communication. Finally, we mix PV and SV to obtain a fusion vector that judges the sentiment state. The proposed PS-Mixer is tested on two publicly available datasets, CMU-MOSEI and CMU-MOSI, and achieves state-of-the-art (SOTA) performance on CMU-MOSEI compared with baseline methods. The code is available at: https://github.com/metaphysicser/PS-Mixer.
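The final mixing of polarity and strength can be caricatured as a signed magnitude. This hypothetical reduction of PV and SV to scalars is far simpler than the paper's vector mixer, but it shows why the two signals are complementary:

```python
def mixed_sentiment(polarity, strength):
    """Combine a polarity decision (-1 / 0 / +1) with a strength in [0, 3].

    Toy stand-in for mixing PV (polarity) and SV (strength) outputs:
    polarity alone loses intensity, strength alone loses direction.
    """
    assert polarity in (-1, 0, 1) and 0 <= strength <= 3
    return polarity * strength

print(mixed_sentiment(-1, 2.5))  # -2.5 (strongly negative)
print(mixed_sentiment(1, 0.5))   # 0.5  (weakly positive)
```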

13.
Viewer gifting is an important business mode in the live streaming industry, closely related to the income of platforms and streamers. Previous studies on gifting prediction are often limited to cross-sectional data and consider the problem from the macro perspective of an entire live stream, ignoring the multimodal information in live streaming content and its cumulative effect over time on viewer gifting behavior. In this paper, we put forward a multimodal time-series method (MTM) for predicting real-time gifting. The core module of the method is the multimodal time-series analysis (MTA), which targets effective fusion of multimodal information. Specifically, the proposed orthogonal projection (OP) model promotes cross-modal information interaction without introducing additional parameters. To achieve interaction of multimodal information at the same level, we also design a stackable joint representation layer, so that each target modality's representation (visual, acoustic, and textual) can benefit from all the other modalities. Residual connections are introduced as well to ensure the integration of low-level and high-level information. On our dataset, our model improves on other advanced models by at least 8% in F1. Meanwhile, MTA meets the real-time requirements of the live streaming setting and has demonstrated robustness and transferability in other tasks. Our research may offer insights into how to efficiently fuse multimodal information and contributes to research on viewer gifting behavior prediction in the live streaming context.

14.
Multimodal retrieval performs information retrieval across different modalities of multimedia information and is becoming increasingly important in the information science field. However, bridging the meanings of different multimedia modalities is difficult, and this gap degrades the performance of multimodal retrieval. In this paper, we propose a new mechanism to build the relationship between the visual and textual modalities and to verify multimodal retrieval results. Specifically, this mechanism relies on multimodal binary classifiers, based on the Extreme Learning Machine (ELM), to verify whether retrieved answers are related to the query examples. First, we propose a multimodal probabilistic semantic model that ranks the answers by their generative probabilities. We then build the multimodal binary classifiers, called word classifiers, to filter out unrelated answers and thereby improve the performance of the probabilistic semantic model. The experimental results show that the multimodal probabilistic semantic model and the word classifiers are effective and efficient. They also demonstrate that the ELM-based word classifiers not only improve the performance of the probabilistic semantic model but can also be easily applied to other probabilistic semantic models.
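The rank-then-verify pipeline described above can be sketched generically. Here `score` stands in for the generative probability of the probabilistic semantic model and `is_related` for the ELM word classifier; both are hypothetical toy callables, not the paper's implementations:

```python
def retrieve(query, candidates, score, is_related):
    """Rank candidates by a relevance score, then keep only those the
    binary verifier accepts (mirroring the ranker + word-classifier design)."""
    ranked = sorted(candidates, key=lambda c: score(query, c), reverse=True)
    return [c for c in ranked if is_related(query, c)]

# Toy stand-ins: score by shared words; verify that at least one word overlaps.
def score(q, c):
    return len(set(q.split()) & set(c.split()))

def is_related(q, c):
    return score(q, c) > 0

docs = ["red car photo", "blue sky", "fast red car"]
print(retrieve("red car", docs, score, is_related))  # ['red car photo', 'fast red car']
```

The design choice worth noting is that the verifier prunes false positives the ranker alone would pass through, at the cost of a second model evaluation per candidate.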

15.
Medical question answering is a crucial aspect of medical artificial intelligence, as it aims to enhance the efficiency of clinical diagnosis and improve treatment outcomes. Despite the numerous methods available for medical question answering, they tend to overlook the imbalance in the data generation mechanism and the pseudo-correlation caused by the task's textual characteristics. This pseudo-correlation arises because many words in the question answering task are irrelevant to the answer but carry significant weight; such words can affect the feature representation and establish a false correlation with the final answer. Furthermore, data imbalance can cause the model to blindly follow the majority classes, biasing the final answer. Confounding factors, including the data imbalance mechanism, bias due to textual characteristics, and other unknown factors, may also mislead the model and limit its performance. In this study, we propose a new counterfactual-based approach consisting of a feature encoder and a counterfactual decoder. The feature encoder utilizes ChatGPT and label-resetting techniques to create counterfactual data, compensating for distributional differences in the dataset and alleviating data imbalance. Sampling prior to label resetting also helps alleviate the imbalance, so that label resetting can yield better and more balanced counterfactual data. In addition, the constructed counterfactual data helps the subsequent counterfactual classifier learn causal features. The counterfactual decoder compares counterfactual data with real data to optimize the model and help it acquire the causal characteristics that genuinely influence the label, generating the final answer. The proposed method was tested on PubMedQA, a medical dataset, using machine learning and deep learning models. The comprehensive experiments demonstrate that this method achieves state-of-the-art results and effectively reduces the false correlations caused by confounders.

16.
In this era, the proliferating role of social media in our lives has popularized the posting of short texts. Short texts contain limited context and have unique characteristics that make them difficult to handle. Every day, billions of short texts are produced in the form of tags, keywords, tweets, phone messages, messenger conversations, social network posts, etc. The analysis of these short texts is imperative in the fields of text mining and content analysis. Extracting precise topics from large-scale short-text collections is a critical and challenging task, because conventional approaches fail to capture word co-occurrence patterns in topics due to the sparsity of short texts such as web text, social media posts (e.g., on Twitter), and news headlines. In this paper, the sparsity problem is ameliorated by a novel fuzzy topic modeling (FTM) approach for short text. Local and global term frequencies are computed through a bag-of-words (BOW) model. To remove the negative impact of high dimensionality on global term weighting, principal component analysis is adopted; thereafter, the fuzzy c-means algorithm is employed to retrieve semantically relevant topics from the documents. Experiments are conducted on three real-world short-text datasets: the snippets dataset is small, whereas the other two, Twitter and questions, are larger. Experimental results show that the proposed approach discovers topics more precisely and performs better than state-of-the-art baseline topic models such as GLTM, CSTM, LTM, LDA, Mix-gram, BTM, SATM, and DREx+LDA. The performance of FTM is also demonstrated in classification, clustering, topic coherence, and execution time. FTM classification accuracy is 0.95, 0.94, 0.91, 0.89, and 0.87 on the snippets dataset with 50, 75, 100, 125, and 200 topics, and 0.73, 0.74, 0.70, 0.68, and 0.78 on the questions dataset with the same numbers of topics; both exceed the state-of-the-art baseline topic models.
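The first stage of the FTM pipeline, computing local and global term frequencies from a bag-of-words view, can be sketched as follows (toy documents; the PCA and fuzzy c-means stages are omitted):

```python
from collections import Counter

def term_frequencies(docs):
    """Local (per-document) and global term frequencies for a BOW model."""
    local = [Counter(doc.lower().split()) for doc in docs]
    global_tf = Counter()
    for counts in local:
        global_tf.update(counts)  # global frequency = sum of local frequencies
    return local, global_tf

docs = ["apple banana apple", "banana cherry"]
local, global_tf = term_frequencies(docs)
print(local[0]["apple"], global_tf["banana"])  # 2 2
```

In the paper, the global weights are then projected with PCA before fuzzy c-means clusters the documents into overlapping (fuzzy) topics.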

17.
18.
In the current internet era, information related to crime is scattered across many sources, namely news media, social networks, blogs, video repositories, etc. Crime reports published in online newspapers are often considered reliable compared with crowdsourced data like social media, and they contain crime information not only as unstructured text but also as images. Given the volume and availability of crime-related information in online newspapers, gathering and integrating crime entities from multiple modalities and representing them as a machine-readable knowledge base would be useful for law enforcement agencies in analyzing and preventing criminal activities. Extant research on generating crime knowledge bases does not address extracting all non-redundant entities from the text and image data of multiple newspapers. Hence, this work proposes Crime Base, an entity-relationship-based system to extract and integrate crime-related text and image data from online newspapers, with a focus on reducing duplication and loss of information in the knowledge base. The proposed system uses a rule-based approach to extract entities from text and image captions. Entities extracted from text data are correlated using contextual as well as semantic similarity measures, and image entities are correlated using low-level and high-level image features. The system also presents an integrated view of these entities and their relations as a knowledge base expressed in OWL. The system is tested on a collection of crime-related articles from popular Indian online newspapers.

19.
The use of social media and Web 2.0 platforms is proliferating and affecting formal, highly structured organisations, including public safety agencies. Much of the research in this area has focused on public use of social media during an emergency and on how emergency agencies benefit from the data and information this use generates. However, there is little understanding of the operational implications of this public use for emergency management agencies, or of how social media positively or negatively impacts their operations. To progress research into this topic, we chose an engaged scholarship framework to shape a research agenda with the active participation of stakeholders. We conducted a series of workshops involving over 100 public safety practitioners working on disasters and emergency management in public safety agencies, humanitarian organisations, volunteering online platforms, and volunteer groups, in addition to 20 academics working in this area of enquiry. The findings highlight six challenges that emergency response organisations currently face in relation to social media use. We conceptualise these challenges as creating six operational tension zones for organisations, and we discuss these tensions and their implications for future research and practice.

20.
Developing a tourism forecasting function in decision support systems has become critical for businesses and governments. Existing forecasting models that consider spatial relations contain insufficient information, and the spatial aggregation of simple tourist-volume series limits forecasting accuracy. Human-generated search engine and social media data have the potential to address this issue. In this paper, a spatial-aggregation-based multimodal deep learning method for hourly attraction tourist volume forecasting is developed. The model first extracts the daily features of attractions from search engine data, then mines the spatial aggregation relationships in social media data and multi-attraction tourist volume data. Finally, the model fuses hourly features with daily features to make forecasts. The model is tested on a dataset covering several attractions in Beijing with real-time tourist volume at 15-minute intervals from November 27, 2018, to March 18, 2019. The empirical and Diebold-Mariano test results demonstrate that the proposed framework outperforms state-of-the-art baseline models with statistically significant improvements at the 1% level. Compared with the best baseline model, the MAPE values are reduced by 50.0% and 27.3% for 4A and 5A attractions, respectively, and the RMSE values are reduced by 48.3% and 26.1%. The method can be embedded as a function in a decision support system to support multi-department collaboration.
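The reported error reductions are in terms of MAPE and RMSE, both straightforward to compute; the tourist-volume values below are invented for illustration, not taken from the paper:

```python
def mape(actual, forecast):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n) ** 0.5

hourly_visits = [100, 200, 400]   # toy actual tourist volumes
predicted = [90, 220, 400]        # toy model forecasts
print(round(mape(hourly_visits, predicted), 2))  # 6.67
print(round(rmse(hourly_visits, predicted), 2))  # 12.91
```

MAPE is scale-free (useful across attractions of different sizes), while RMSE penalizes large absolute misses, which is why the paper reports both.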


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号