Similar literature
 20 similar documents found (search time: 15 ms)
1.
Extracting opinions and emotions from text is becoming increasingly important, especially since the advent of micro-blogging and social networking. Opinion mining is particularly popular and is now supported by many public services, datasets and lexical resources. Unfortunately, there are few available lexical and semantic resources for emotion recognition that could foster the development of new emotion-aware services and applications. The diversity of theories of emotion and the absence of a common vocabulary are two of the main barriers to the development of such resources. This situation motivated the creation of Onyx, a semantic vocabulary of emotions with a focus on lexical resources and emotion analysis services. It follows a linguistic Linked Data approach, it is aligned with the Provenance Ontology, and it has been integrated with the Lexicon Model for Ontologies (lemon), a popular RDF model for representing lexical entries. This approach also offers a new way to work with different theories of emotion. As part of this work, Onyx has been aligned with EmotionML and WordNet-Affect.

2.
Quickly and accurately summarizing representative opinions is a key step for assessing microblog sentiments. The Ortony-Clore-Collins (OCC) model of emotion can offer a rule-based emotion export mechanism. In this paper, we propose an OCC model and a Convolutional Neural Network (CNN) based opinion summarization method for Chinese microblogging systems. We test the proposed method using real-world microblog data. We then compare the accuracy of manual sentiment annotation with the accuracy obtained using our OCC-based sentiment classification rule library. Experimental results from analyzing three real-world microblog datasets demonstrate the efficacy of our proposed method. Our study highlights the potential of combining emotion cognition with deep learning in sentiment analysis of social media data.

3.
Since previous studies in cognitive psychology show that individuals’ affective states can help analyze and predict their future behaviors, researchers have explored emotion mining for predicting online activities, firm profitability, and so on. Existing emotion mining methods are divided into two categories: feature-based approaches that rely on handcrafted annotations and deep learning-based methods that thrive on computational resources and big data. However, neither category can effectively detect emotional expressions captured in text (e.g., social media postings). In addition, the utilization of these methods in downstream explanatory and predictive applications is also rare. To fill the aforementioned research gaps, we develop a novel deep learning-based emotion detector named DeepEmotionNet that can simultaneously leverage contextual, syntactic, semantic, and document-level features and lexicon-based linguistic knowledge to bootstrap the overall emotion detection performance. Based on three emotion detection benchmark corpora, our experimental results confirm that DeepEmotionNet outperforms state-of-the-art baseline methods by 4.9% to 29.8% in macro-averaged F-score. For the downstream application of DeepEmotionNet to a real-world financial application, our econometric analysis highlights that top executives’ emotions of fear and anger embedded in their social media postings are significantly associated with corporate financial performance. Furthermore, these two emotions can significantly improve the predictive power of corporate financial performance when compared to sentiments. To the best of our knowledge, this is the first study to develop a deep learning-based emotion detection method and successfully apply it to enhance corporate performance prediction.

4.
5.
Researchers have been aware that emotion is not one-hot encoded in emotion-relevant classification tasks, and multiple emotions can coexist in a given sentence. Recently, several works have focused on leveraging a distribution label or a grayscale label of emotions in the classification model, which can enhance the one-hot label with additional information, such as the intensity of other emotions and the correlation between emotions. Such an approach has been proven effective in alleviating the overfitting problem and improving the model robustness by introducing a distribution learning component in the objective function. However, the effect of distribution learning cannot be fully unfolded as it can reduce the model’s discriminative ability within similar emotion categories. For example, “Sad” and “Fear” are both negative emotions. To address such a problem, we proposed a novel emotion extension scheme in the prior work (Li, Chen, Xie, Li, and Tao, 2021). The prior work incorporated fine-grained emotion concepts to build an extended label space, where a mapping function between coarse-grained emotion categories and fine-grained emotion concepts was identified. For example, sentences labeled “Joy” can convey various emotions such as enjoy, free, and leisure. The model can further benefit from the extended space by extracting dependency within fine-grained emotions when yielding predictions in the original label space. The prior work has shown that it is more apt to apply distribution learning in the extended label space than in the original space. A novel sparse connection method, i.e., Leaky Dropout, is proposed in this paper to refine the dependency-extraction step, which further improves the classification performance. In addition to the multiclass emotion classification task, we extensively experimented on sentiment analysis and multilabel emotion prediction tasks to investigate the effectiveness and generality of the label extension schema.

6.
The proliferation of false information is a growing problem in today's dynamic online environment. This phenomenon requires automated detection of fake news to reduce its harmful effect on society. Even though various methods are used to detect fake news, most consider only data-oriented text features, ignoring dual emotion features (publisher emotions and social emotions), and thus fall short of higher levels of accuracy. This study addresses this issue by utilizing dual emotion features to detect fake news. The study proposes a Deep Normalized Attention-based mechanism for enriched extraction of dual emotion features and an Adaptive Genetic Weight Update-Random Forest (AGWu-RF) for classification. First, the deep normalized attention-based mechanism incorporates BiGRU, which improves feature value by extracting long-range context information to eliminate gradient explosion issues. The genetic weight for the model is adjusted to RF and updated to achieve optimized hyperparameter values that support the classifiers' detection accuracy. The proposed model outperforms baseline methods on standard benchmark metrics in three real-world datasets. It outperforms state-of-the-art approaches by 5%, 11%, and 14% in terms of accuracy, highlighting the significance of dual emotion capabilities and optimizations in improving fake news detection.

7.
A new method is described to extract significant phrases in the title and the abstract of scientific or technical documents. The method is based upon a text structure analysis and uses a relatively small dictionary. The dictionary has been constructed based on knowledge about concepts in the field of science or technology and some lexical knowledge, because significant phrases and their component items may be used with different meanings in different fields. A text analysis approach has been applied to select significant phrases as substantial and semantic information carriers of the contents of the abstract. The results of the experiment for five sets of documents have shown that the significant phrases are effectively extracted in all cases, and that the number of phrases per document and the processing time are fairly satisfactory. The information representation of the document, partly using the method, is discussed in relation to the construction of the document information retrieval system.

8.
Detecting suicidal tendencies and preventing suicides is an important social goal. The rise and continuance of emotion, the emotion category, and the intensity of the emotion are important clues about suicidal tendencies. The three determinants of emotion, viz. Valence, Arousal, and Dominance (VAD) can help determine a person’s exact emotion(s) and its intensity. This paper introduces an end-to-end VAD-assisted transformer-based multi-task network for detecting emotion (primary task) and its intensity (auxiliary task) in suicide notes. As part of this research, we expand the utility of the emotion-annotated benchmark dataset of suicide notes, CEASE-v2.0, by annotating all its sentences with emotion intensity labels. Empirical results show that our multi-task method performs better than the corresponding single-task systems, with the best attained overall Mean Recall (MR) of 65.25% on the emotion task. On a similar task, we improved MR by 8.78% over the existing state-of-the-art system. We evaluated our approach on three benchmark datasets for three different tasks. We observed that the introduced method consistently outperformed existing state-of-the-art approaches on the studied datasets, demonstrating its capacity to generalize to other downstream correlated tasks. We qualitatively examined our model’s output by comparing it to the labeling of a psychiatrist.

9.
Sentiment lexicons are essential tools for polarity classification and opinion mining. In contrast to machine learning methods that only leverage text features or raw text for sentiment analysis, methods that use sentiment lexicons offer higher interpretability. Although a number of domain-specific sentiment lexicons are available, it is impractical to build an ex ante lexicon that fully reflects the characteristics of the language usage in endless domains. In this article, we propose a novel approach to simultaneously train a vanilla sentiment classifier and adapt word polarities to the target domain. Specifically, we sequentially track the wrongly predicted sentences and use them as the supervision instead of addressing the gold standard as a whole to emulate the life-long cognitive process of lexicon learning. An exploration-exploitation mechanism is designed to trade off between searching for new sentiment words and updating the polarity score of one word. Experimental results on several popular datasets show that our approach significantly improves the sentiment classification performance for a variety of domains by means of improving the quality of sentiment lexicons. Case studies also illustrate how polarity scores of the same words are discovered for different domains.

10.
Automated keyphrase extraction is a fundamental textual information processing task concerned with the selection of representative phrases from a document that summarize its content. This work presents a novel unsupervised method for keyphrase extraction, whose main innovation is the use of local word embeddings (in particular GloVe vectors), i.e., embeddings trained from the single document under consideration. We argue that such local representations of words and keyphrases are able to accurately capture their semantics in the context of the document they are part of, and therefore can help in improving keyphrase extraction quality. Empirical results offer evidence that local representations indeed lead to better keyphrase extraction results compared both to embeddings trained on very large third corpora or larger corpora consisting of several documents of the same scientific field and to other state-of-the-art unsupervised keyphrase extraction methods.

11.
This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the container document. The weight is based on how widely the word is spread across the document and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
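
The weighting idea in this abstract (spread across the document, not only rate of occurrence) can be illustrated with a minimal Python sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function names `spread_weights` and `rank_index_words`, the split into equal-sized segments, and the product of relative frequency and segment coverage. The input is assumed to be a list of already-stemmed words.

```python
from collections import Counter


def spread_weights(words, num_segments=4):
    """Hypothetical weight: relative frequency times the share of segments containing the word."""
    if not words:
        return {}
    num_segments = max(1, min(num_segments, len(words)))
    seg_len = -(-len(words) // num_segments)  # ceiling division
    segments = [words[i:i + seg_len] for i in range(0, len(words), seg_len)]
    freq = Counter(words)
    weights = {}
    for word, count in freq.items():
        # Spread: fraction of document segments in which the word occurs at least once.
        spread = sum(1 for seg in segments if word in seg) / len(segments)
        weights[word] = (count / len(words)) * spread
    return weights


def rank_index_words(words, num_segments=4):
    """Sort candidate index words by descending weight (ties broken alphabetically)."""
    weights = spread_weights(words, num_segments)
    return sorted(weights, key=lambda w: (-weights[w], w))
```

Under this assumed formula, two words with the same number of occurrences are ranked differently when one of them is concentrated in a single part of the document and the other recurs throughout it.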

12.
The breeding and spread of negative emotion during public emergencies pose severe challenges to social governance. Traditional government information release strategies ignore the mechanism by which negative emotion evolves. Focusing on government information release policies during public emergencies, our research uses cognitive big data analytics and applies deep learning to the construction of a news framing framework, in order to explore how government information release strategies influence the contagion and evolution of negative emotion. In particular, this paper first uses Word2Vec, cosine similarity between word vectors, and the SO-PMI algorithm to build an emotional lexicon oriented to public emergencies. It then proposes an emotion computing method based on dependency parsing, and designs an emotion binary tree and dependency-based emotion calculation rules. Finally, an experiment shows that the proposed emotional lexicon has wider coverage and higher accuracy than existing ones, and an emotion evolution analysis of an actual public event is performed using the lexicon and the proposed emotion computing method. The empirical results show that the algorithm is feasible and effective: the model can conduct fine-grained emotion computing and improves the accuracy and computational efficiency of sentiment classification. The final empirical analysis found that, owing to defects such as slow speed, non-transparent content, poor pertinence, and weak coordination among departments, existing government information release strategies had a significant negative impact on the contagion and evolution of anxiety and disgust, and could not regulate negative emotions effectively. These results provide theoretical implications and technical support for social governance. They could also help establish a mode of negative emotion management and construct a new pattern of public opinion guidance.
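
For readers unfamiliar with SO-PMI, the following is a minimal Python sketch of the general technique: a word's semantic orientation is the sum of its pointwise mutual information with positive seed words minus the sum with negative seed words. It is not the paper's implementation. The function name `so_pmi`, the document-level co-occurrence counting, the additive `smoothing` constant, and the toy seed lists are all assumptions chosen for illustration, and the Word2Vec and cosine-similarity steps mentioned in the abstract are not shown.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds, smoothing=0.5):
    """SO-PMI(word) = sum of PMI(word, positive seed) - sum of PMI(word, negative seed)."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def df(*terms):
        # Number of documents that contain every term in `terms`.
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    def pmi(a, b):
        # Additive smoothing (an assumption) keeps the logarithm finite for unseen pairs.
        joint = (df(a, b) + smoothing) / n
        p_a = (df(a) + smoothing) / n
        p_b = (df(b) + smoothing) / n
        return math.log2(joint / (p_a * p_b))

    positive = sum(pmi(word, seed) for seed in pos_seeds)
    negative = sum(pmi(word, seed) for seed in neg_seeds)
    return positive - negative
```

A score above zero suggests the word co-occurs more with the positive seeds, and a score below zero suggests the opposite; a lexicon builder would typically apply a threshold before adding the word.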

13.
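[Purpose/Significance] Compared with traditional professional reviews, social online reviews spread faster and carry greater influence. The emotional factors in text reviews cannot be fully captured by numerical ratings alone. Measuring and analyzing the emotional factors in text reviews helps identify and reveal online reviews from all angles and reflects their value more objectively and accurately. [Process/Method] Online text review data posted by users were extracted, and a supervised machine learning algorithm was used to compute, for each text review, a sentiment classification score, a sentiment orientation score, and a composite sentiment score. The sentiment scores were then cross-compared with the overall ratings along several dimensions: type, region, and number of reviewers. [Results/Conclusions] The results show that the emotional factors contained in text reviews partially influence the overall rating. Users' cognitive preferences, sociocultural background, and the proportion of reviewers affect the usefulness of the emotional factors.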

14.
We have developed methods for storing and retrieving large dictionaries of word pairs and other multi-word phrases based on hashed indexing. From analysis of text samples we have derived Zipfian laws for the frequency distributions of word pairs and longer phrases. We show where these Zipfian curves cross and deduce that the number of multi-word phrases which occur frequently in text is surprisingly small, of the same order of magnitude as the number of individual word-types in a text. Dictionaries of phrases are therefore amenable to fast processing with modest computer equipment. Finally, we suggest that in stylistic analysis word phrases might better discriminate between authors than do single words.
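
As an illustration of the idea, a hash-indexed phrase dictionary and its rank-frequency distribution can be sketched in a few lines of Python. This is only a sketch under stated assumptions: Python's `Counter` (a hash table) stands in for the hashed indexing scheme, which the abstract does not specify, and the names `phrase_counts`, `frequent_phrases`, and `rank_frequency` are hypothetical.

```python
from collections import Counter


def phrase_counts(tokens, n=2):
    """Count n-word phrases; Counter is a hash table keyed by the phrase tuple."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def frequent_phrases(tokens, n=2, min_count=2):
    """Keep only phrases that occur at least `min_count` times."""
    counts = phrase_counts(tokens, n)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


def rank_frequency(counts):
    """Return (rank, frequency) pairs, the data one would plot to inspect a Zipfian curve."""
    return [(rank, c) for rank, (_, c) in enumerate(counts.most_common(), start=1)]
```

Plotting the output of `rank_frequency` for single words (`n=1`), pairs (`n=2`), and longer phrases on log-log axes is one way to compare the curves the abstract describes.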

15.
Acoustic feature selection for automatic emotion recognition from speech   Total citations: 1 (self-citations: 0; citations by others: 1)
Emotional expression and understanding are normal instincts of human beings, but automatic emotion recognition from speech without referring to any language or linguistic information remains an open problem. The limited size of existing emotional data samples and their relatively high dimensionality have outstripped the capabilities of many dimensionality reduction and feature selection algorithms. This paper focuses on data preprocessing techniques that aim to extract the most effective acoustic features to improve the performance of emotion recognition. A novel algorithm is presented that can be applied to a small data set with a large number of features. The presented algorithm integrates the advantages of a decision tree method and the random forest ensemble. Experimental results on a series of Chinese emotional speech data sets indicate that the presented algorithm achieves improved results on emotion recognition and outperforms the commonly used Principal Component Analysis (PCA) and Multi-Dimensional Scaling (MDS) methods, as well as the more recently developed ISOMap dimensionality reduction method.

16.
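Zhang Yaming, Zhao Yang, Wang Lin. Soft Science (《软科学》), 2016, (7): 118-123
Based on implementation intention theory, this study uses text mining and social network analysis to make a preliminary exploration of behavioral response patterns in online shopping reviews. The findings are as follows. The factors driving online shopping review behavior are, in descending order of influence: shopping experience, sensation and perception, product details, packaging quality, time perception, color preference, spatial transfer, value transfer, interpersonal networks, word-of-mouth communication, environmental perception, emotional response, and decision-making behavior. Goal orientation in online shopping appears as behavioral goals oriented toward utility, hedonism, speed, socializing, discounts, showing off, emotion, loyalty, sharing, recommendation, and identification; stimulated by these goals, shoppers show behavioral responses such as complaint, anxiety, tension, pleasure, fear, mood swings, behavioral loyalty, and word-of-mouth communication. The study analyzes in depth the six main processes of online shopping review behavioral responses and their behavioral outcomes, offering a method that interdisciplinary research can draw on.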

17.
Identifying the emotional causes of mental illnesses is key to effective intervention. Existing emotion-cause analysis approaches can effectively detect simple emotion-cause expressions where only one cause and one emotion exist. However, emotions may often result from multiple causes, implicitly or explicitly, with complex interactions among these causes. Moreover, the same causes may result in multiple emotions. How to model the complex interactions between multiple emotion spans and cause spans remains under-explored. To tackle this problem, a contrastive learning-based framework is presented to detect the complex emotion-cause pairs with the introduction of negative samples and positive samples. Additionally, we developed a large-scale emotion-cause dataset with complex emotion-cause instances based on subreddits associated with mental health. Our proposed approach was compared to prevailing CNN-based, LSTM-based, Transformer-based and GNN-based methods. Extensive experiments have been conducted and the quantifiable outcomes indicate that our proposed solution achieves competitive performance on simple emotion-cause pairs and significantly outperformed baseline methods in extracting complex emotion-cause pairs. Empirical studies further demonstrated that our proposed approach can be used to reveal the emotional causes of mental disorders for effective intervention.

18.
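Hao Yanhui, Wang Xi, Chen Duo. Information Science (《情报科学》), 2021, 39(8): 78-85
[Purpose/Significance] Education admission examinations attract close attention from all sectors of society and can easily trigger public opinion incidents. Monitoring the spread of related online information in a timely manner, judging its development accurately, and detecting and responding to potential public opinion issues are important for safeguarding examination security and protecting a university's reputation. [Method/Process] Data were collected from mainstream social media platforms during the postgraduate re-examination period. The BERT pre-trained language model was combined with BiLSTM to build a deep neural network that analyzes the sentiment polarity of text. The TextRank algorithm was then used to extract hot topic words from the texts of each sentiment polarity class, to monitor potential public opinion and propose management recommendations. [Results/Conclusions] Empirical results show that the model can effectively mine hot-topic information under different sentiment polarities, thereby revealing hidden risks and possible focal points of public opinion, and providing a methodological reference and practical basis for managing online public opinion in universities. [Innovation/Limitations] Compared with traditional methods, the BERT-based pre-trained language model effectively addresses the limitation that, with little data, a model cannot accurately represent the complex relationships between sentences. BERT also models text bidirectionally and captures the relationships between sentences, which improves the accuracy of mining sentiment topics from text.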
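
The TextRank step named in this abstract can be sketched as a PageRank-style iteration over a word co-occurrence graph. The sketch below is a generic, unweighted version written for illustration, not the authors' pipeline: the function name `textrank_keywords`, the window size, the damping factor of 0.85, and the fixed iteration count are assumptions, and the BERT and BiLSTM sentiment classifier is not shown. The input is assumed to be a pre-tokenized list of words from texts of one sentiment class.

```python
def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by a PageRank-style score on an undirected co-occurrence graph."""
    neighbors = {}
    for i, tok in enumerate(tokens):
        # Link each word to the words that follow it within the window.
        for j in range(i + 1, min(len(tokens), i + window + 1)):
            other = tokens[j]
            if other != tok:
                neighbors.setdefault(tok, set()).add(other)
                neighbors.setdefault(other, set()).add(tok)
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of the score of every neighbor.
        scores = {
            w: (1 - damping)
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            for w in neighbors
        }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

In practice one would filter stop words and restrict candidates by part of speech before building the graph; those steps are omitted here to keep the sketch short.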

19.
Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms’ occurrences in a document is related to its relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates, that is, words that co-occur with them in the same windows of text. Experiments suggest that significant differences exist between the lexical cohesion in relevant and non-relevant document sets. A document ranking method based on lexical cohesion shows some performance improvements.
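
A simplified Python sketch of collocate-based cohesion follows. It covers only the repetition relation (the same collocate appearing near both query terms); synonymy, hyponymy, and sibling relations would need a lexical resource and are left out. The function names `collocates` and `cohesion`, the window size, and the use of Jaccard overlap as the score are assumptions made for illustration, since the abstract does not state the scoring formula.

```python
def collocates(tokens, term, window=3):
    """Collect the words that occur within `window` positions of any occurrence of `term`."""
    found = set()
    for i, tok in enumerate(tokens):
        if tok == term:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            found.update(t for t in tokens[lo:hi] if t != term)
    return found


def cohesion(tokens, term_a, term_b, window=3):
    """Jaccard overlap of the two terms' collocate sets (repetition links only)."""
    a = collocates(tokens, term_a, window) - {term_b}
    b = collocates(tokens, term_b, window) - {term_a}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A ranking method along the lines described could combine such a score with a conventional retrieval score, for example by re-ranking the top documents returned for a query.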

20.
Methods for document clustering and topic modelling in online social networks (OSNs) offer a means of categorising, annotating and making sense of large volumes of user generated content. Many techniques have been developed over the years, ranging from text mining and clustering methods to latent topic models and neural embedding approaches. However, many of these methods deliver poor results when applied to OSN data as such text is notoriously short and noisy, and often results are not comparable across studies. In this study we evaluate several techniques for document clustering and topic modelling on three datasets from Twitter and Reddit. We benchmark four different feature representations derived from term-frequency inverse-document-frequency (tf-idf) matrices and word embedding models combined with four clustering methods, and we include a Latent Dirichlet Allocation topic model for comparison. Several different evaluation measures are used in the literature, so we provide a discussion and recommendation for the most appropriate extrinsic measures for this task. We also demonstrate the performance of the methods over data sets with different document lengths. Our results show that clustering techniques applied to neural embedding feature representations delivered the best performance over all data sets using appropriate extrinsic evaluation measures. We also demonstrate a method for interpreting the clusters with a top-words based approach using tf-idf weights combined with embedding distance measures.
