首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Sentiment analysis is a text classification branch, which is defined as the process of extracting sentiment terms (i.e. feature/aspect, or opinion) and determining their opinion semantic orientation. At aspect level, aspect extraction is the core task for sentiment analysis which can either be implicit or explicit aspects. The growth of sentiment analysis has resulted in the emergence of various techniques for both explicit and implicit aspect extraction. However, majority of the research attempts targeted explicit aspect extraction, which indicates that there is a lack of research on implicit aspect extraction. This research provides a review of implicit aspect/features extraction techniques from different perspectives. The first perspective is making a comparison analysis for the techniques available for implicit term extraction with a brief summary of each technique. The second perspective is classifying and comparing the performance, datasets, language used, and shortcomings of the available techniques. In this study, over 50 articles have been reviewed, however, only 45 articles on implicit aspect extraction that span from 2005 to 2016 were analyzed and discussed. Majority of the researchers on implicit aspects extraction rely heavily on unsupervised methods in their research, which makes about 64% of the 45 articles, followed by supervised methods of about 27%, and lastly semi-supervised of 9%. In addition, 25 articles conducted the research work solely on product reviews, and 5 articles conducted their research work using product reviews jointly with other types of data, which makes product review datasets the most frequently used data type compared to other types. Furthermore, research on implicit aspect features extraction has focused on English and Chinese languages compared to other languages. Finally, this review also provides recommendations for future research directions and open problems.  相似文献   

2.
3.
Although deep learning breakthroughs in NLP are based on learning distributed word representations by neural language models, these methods suffer from a classic drawback of unsupervised learning techniques. Furthermore, the performance of general-word embedding has been shown to be heavily task-dependent. To tackle this issue, recent researches have been proposed to learn the sentiment-enhanced word vectors for sentiment analysis. However, the common limitation of these approaches is that they require external sentiment lexicon sources and the construction and maintenance of these resources involve a set of complexing, time-consuming, and error-prone tasks. In this regard, this paper proposes a method of sentiment lexicon embedding that better represents sentiment word's semantic relationships than existing word embedding techniques without manually-annotated sentiment corpus. The major distinguishing factor of the proposed framework was that joint encoding morphemes and their POS tags, and training only important lexical morphemes in the embedding space. To verify the effectiveness of the proposed method, we conducted experiments comparing with two baseline models. As a result, the revised embedding approach mitigated the problem of conventional context-based word embedding method and, in turn, improved the performance of sentiment classification.  相似文献   

4.
Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which employs a combination of sentiment analysis and summarization approaches. It is a lexicon-based method to query-based multi-documents summarization of opinion expressed in reviews.QMOS combines multiple sentiment dictionaries to improve word coverage limit of the individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word presented by a lexicon and the word polarity in a specific context. This is due to the fact that, the polarity of a word depends on the context in which it is being used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. Therefore, to tackle the aforementioned challenges, QMOS integrates multiple strategies to adjust word prior sentiment orientation while also considers the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word if it is not included in a sentiment lexicon.On the other hand, the most of the existing methods fail to distinguish the meaning of a review sentence and user's query when both of them share the similar bag-of-words; hence there is often a conflict between the extracted opinionated sentences and users’ needs. However, the summarization phase of QMOS is able to avoid extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs the greedy algorithm and query expansion approach to reduce redundancy and bridge the lexical gaps for similar contexts that are expressed using different wording, respectively. Our experiment shows that the QMOS method can significantly improve the performance and make QMOS comparable to other existing methods.  相似文献   

5.
Aspect level sentiment analysis is important for numerous opinion mining and market analysis applications. In this paper, we study the problem of identifying and rating review aspects, which is the fundamental task in aspect level sentiment analysis. Previous review aspect analysis methods seldom consider entity or rating but only 2-tuples, i.e., head and modifier pair, e.g., in the phrase “nice room”, “room” is the head and “nice” is the modifier. To solve this problem, we novelly present a Quad-tuple Probability Latent Semantic Analysis (QPLSA), which incorporates entity and its rating together with the 2-tuples into the PLSA model. Specifically, QPLSA not only generates fine-granularity aspects, but also captures the correlations between words and ratings. We also develop two novel prediction approaches, the Quad-tuple Prediction (from the global perspective) and the Expectation Prediction (from the local perspective). For evaluation, systematic experiments show that: Quad-tuple PLSA outperforms 2-tuple PLSA significantly on both aspect identification and aspect rating prediction for publication datasets. Moreover, for aspect rating prediction, QPLSA shows significant superiority over state-of-the-art baseline methods. Besides, the Quad-tuple Prediction and the Expectation Prediction also show their strong ability in aspect rating on different datasets.  相似文献   

6.
Opinion mining in a multilingual and multi-domain environment as YouTube requires models to be robust across domains as well as languages, and not to rely on linguistic resources (e.g. syntactic parsers, POS-taggers, pre-defined dictionaries) which are not always available in many languages. In this work, we i) proposed a convolutional N-gram BiLSTM (CoNBiLSTM) word embedding which represents a word with semantic and contextual information in short and long distance periods; ii) applied CoNBiLSTM word embedding for predicting the type of a comment, its polarity sentiment (positive, neutral or negative) and whether the sentiment is directed toward the product or video; iii) evaluated the efficiency of our model on the SenTube dataset, which contains comments from two domains (i.e. automobile, tablet) and two languages (i.e. English, Italian). According to the experimental results, CoNBiLSTM generally outperforms the approach using SVM with shallow syntactic structures (STRUCT) – the current state-of-the-art sentiment analysis on the SenTube dataset. In addition, our model achieves more robustness across domains than the STRUCT (e.g. 7.47% of the difference in performance between the two domains for our model vs. 18.8% for the STRUCT)  相似文献   

7.
Multimodal sentiment analysis aims to judge the sentiment of multimodal data uploaded by the Internet users on various social media platforms. On one hand, existing studies focus on the fusion mechanism of multimodal data such as text, audio and visual, but ignore the similarity of text and audio, text and visual, and the heterogeneity of audio and visual, resulting in deviation of sentiment analysis. On the other hand, multimodal data brings noise irrelevant to sentiment analysis, which affects the effectness of fusion. In this paper, we propose a Polar-Vector and Strength-Vector mixer model called PS-Mixer, which is based on MLP-Mixer, to achieve better communication between different modal data for multimodal sentiment analysis. Specifically, we design a Polar-Vector (PV) and a Strength-Vector (SV) for judging the polar and strength of sentiment separately. PV is obtained from the communication of text and visual features to decide the sentiment that is positive, negative, or neutral sentiment. SV is gained from the communication between the text and audio features to analyze the sentiment strength in the range of 0 to 3. Furthermore, we devise an MLP-Communication module (MLP-C) composed of several fully connected layers and activation functions to make the different modal features fully interact in both the horizontal and the vertical directions, which is a novel attempt to use MLP for multimodal information communication. Finally, we mix PV and SV to obtain a fusion vector to judge the sentiment state. The proposed PS-Mixer is tested on two publicly available datasets, CMU-MOSEI and CMU-MOSI, which achieves the state-of-the-art (SOTA) performance on CMU-MOSEI compared with baseline methods. The codes are available at: https://github.com/metaphysicser/PS-Mixer.  相似文献   

8.
As a hot spot these years, cross-domain sentiment classification aims to learn a reliable classifier using labeled data from a source domain and evaluate the classifier on a target domain. In this vein, most approaches utilized domain adaptation that maps data from different domains into a common feature space. To further improve the model performance, several methods targeted to mine domain-specific information were proposed. However, most of them only utilized a limited part of domain-specific information. In this study, we first develop a method of extracting domain-specific words based on the topic information derived from topic models. Then, we propose a Topic Driven Adaptive Network (TDAN) for cross-domain sentiment classification. The network consists of two sub-networks: a semantics attention network and a domain-specific word attention network, the structures of which are based on transformers. These sub-networks take different forms of input and their outputs are fused as the feature vector. Experiments validate the effectiveness of our TDAN on sentiment classification across domains. Case studies also indicate that topic models have the potential to add value to cross-domain sentiment classification by discovering interpretable and low-dimensional subspaces.  相似文献   

9.
In recent years, there has been a rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevailing of Web 2.0 communities. To effectively assist users to find their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems. This is mainly because that personalized search and recommendations can be facilitated by measuring relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about the resources by tags. Therefore, it is necessary to take sentiment relevance into account into measurements. In this paper, we present a novel generic framework SenticRank to incorporate various sentiment information to various sentiment-based information for personalized search by user profiles and resource profiles. In this framework, content-based sentiment ranking and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized ranking. To the best of our knowledge, this is the first work of integrating sentiment information to address the problem of the personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in the experiments, the results of which have verified the effectiveness of the proposed framework. In addition, we study the influences by popular sentiment dictionaries, and SenticNet is the most prominent knowledge base to boost the performance of personalized search in folksonomy.  相似文献   

10.
Sentiment analysis on Twitter has attracted much attention recently due to its wide applications in both, commercial and public sectors. In this paper we present SentiCircles, a lexicon-based approach for sentiment analysis on Twitter. Different from typical lexicon-based approaches, which offer a fixed and static prior sentiment polarities of words regardless of their context, SentiCircles takes into account the co-occurrence patterns of words in different contexts in tweets to capture their semantics and update their pre-assigned strength and polarity in sentiment lexicons accordingly. Our approach allows for the detection of sentiment at both entity-level and tweet-level. We evaluate our proposed approach on three Twitter datasets using three different sentiment lexicons to derive word prior sentiments. Results show that our approach significantly outperforms the baselines in accuracy and F-measure for entity-level subjectivity (neutral vs. polar) and polarity (positive vs. negative) detections. For tweet-level sentiment detection, our approach performs better than the state-of-the-art SentiStrength by 4–5% in accuracy in two datasets, but falls marginally behind by 1% in F-measure in the third dataset.  相似文献   

11.
Aspect-based sentiment analysis allows one to compute the sentiment for an aspect in a certain context. One problem in this analysis is that words possibly carry different sentiments for different aspects. Moreover, an aspect’s sentiment might be highly influenced by the domain-specific knowledge. In order to tackle these issues, in this paper, we propose a hybrid solution for sentence-level aspect-based sentiment analysis using A Lexicalized Domain Ontology and a Regularized Neural Attention model (ALDONAr). The bidirectional context attention mechanism is introduced to measure the influence of each word in a given sentence on an aspect’s sentiment value. The classification module is designed to handle the complex structure of a sentence. The manually created lexicalized domain ontology is integrated to utilize the field-specific knowledge. Compared to the existing ALDONA model, ALDONAr uses BERT word embeddings, regularization, the Adam optimizer, and different model initialization. Moreover, its classification module is enhanced with two 1D CNN layers providing superior results on standard datasets.  相似文献   

12.
Aspect-based sentiment analysis aims to determine sentiment polarities toward specific aspect terms within the same sentence or document. Most recent studies adopted attention-based neural network models to implicitly connect aspect terms with context words. However, these studies were limited by insufficient interaction between aspect terms and opinion words, leading to poor performance on robustness test sets. In addition, we have found that robustness test sets create new sentences that interfere with the original information of a sentence, which often makes the text too long and leads to the problem of long-distance dependence. Simultaneously, these new sentences produce more non-target aspect terms, misleading the model because of the lack of relevant knowledge guidance. This study proposes a knowledge guided multi-granularity graph convolutional neural network (KMGCN) to solve these problems. The multi-granularity attention mechanism is designed to enhance the interaction between aspect terms and opinion words. To address the long-distance dependence, KMGCN uses a graph convolutional network that relies on a semantic map based on fine-tuning pre-trained models. In particular, KMGCN uses a mask mechanism guided by conceptual knowledge to encounter more aspect terms (including target and non-target aspect terms). Experiments are conducted on 12 SemEval-2014 variant benchmarking datasets, and the results demonstrated the effectiveness of the proposed framework.  相似文献   

13.
Electronic word of mouth (eWOM) is prominent and abundant in consumer domains. Both consumers and product/service providers need help in understanding and navigating the resulting information spaces, which are vast and dynamic. The general tone or polarity of reviews, blogs or tweets provides such help. In this paper, we explore the viability of automatic sentiment analysis (SA) for assessing the polarity of a product or a service review. To do so, we examine the potential of the major approaches to sentiment analysis, along with star ratings, in capturing the true sentiment of a review. We further model contextual factors (specifically, product type and review length) as two moderators affecting SA accuracy. The results of our analysis of 900 reviews suggest that different tools representing the main approaches to SA display differing levels of accuracy, yet overall, SA is very effective in detecting the underlying tone of the analyzed content, and can be used as a complement or an alternative to star ratings. The results further reveal that contextual factors such as product type and review length, play a role in affecting the ability of a technique to reflect the true sentiment of a review.  相似文献   

14.
In recent years, fake news detection has been a significant task attracting much attention. However, most current approaches utilize the features from a single modality, such as text or image, while the comprehensive fusion between features of different modalities has been ignored. To deal with the above problem, we propose a novel model named Bidirectional Cross-Modal Fusion (BCMF), which comprehensively integrates the textual and visual representations in a bidirectional manner. Specifically, the proposed model is decomposed into four submodules, i.e., the input embedding, the image2text fusion, the text2image fusion, and the prediction module. We conduct intensive experiments on four real-world datasets, i.e., Weibo, Twitter, Politi, and Gossip. The results show 2.2, 2.5, 4.9, and 3.1 percentage points of improvements in classification accuracy compared to the state-of-the-art methods on Weibo, Twitter, Politi, and Gossip, respectively. The experimental results suggest that the proposed model could better capture integrated information of different modalities and has high generalizability among different datasets. Further experiments suggest that the bidirectional fusions, the number of multi-attention heads, and the aggregating function could impact the performance of the cross-modal fake news detection. The research sheds light on the role of bidirectional cross-modal fusion in leveraging multi-modal information to improve the effect of fake news detection.  相似文献   

15.
Aspect-based sentiment analysis technologies may be a very practical methodology for securities trading, commodity sales, movie rating websites, etc. Most recent studies adopt the recurrent neural network or attention-based neural network methods to infer aspect sentiment using opinion context terms and sentence dependency trees. However, due to a sentence often having multiple aspects sentiment representation, these models are hard to achieve satisfactory classification results. In this paper, we discuss these problems by encoding sentence syntax tree, words relations and opinion dictionary information in a unified framework. We called this method heterogeneous graph neural networks (Hete_GNNs). Firstly, we adopt the interactive aspect words and contexts to encode the sentence sequence information for parameter sharing. Then, we utilized a novel heterogeneous graph neural network for encoding these sentences’ syntax dependency tree, prior sentiment dictionary, and some part-of-speech tagging information for sentiment prediction. We perform the Hete_GNNs sentiment judgment and report the experiments on five domain datasets, and the results confirm that the heterogeneous context information can be better captured with heterogeneous graph neural networks. The improvement of the proposed method is demonstrated by aspect sentiment classification task comparison.  相似文献   

16.
Existing methods for text generation usually fed the overall sentiment polarity of a product as an input into the seq2seq model to generate a relatively fluent review. However, these methods cannot express more fine-grained sentiment polarity. Although some studies attempt to generate aspect-level sentiment controllable reviews, the personalized attribute of reviews would be ignored. In this paper, a hierarchical template-transformer model is proposed for personalized fine-grained sentiment controllable generation, which aims to generate aspect-level sentiment controllable reviews with personalized information. The hierarchical structure can effectively learn sentiment information and lexical information separately. The template transformer uses a part of speech (POS) template to guide the generation process and generate a smoother review. To verify our model, we used the existing model to obtain a corpus named FSCG-80 from Yelp, which contains 800K samples and conducted a series of experiments on this corpus. Experimental results show that our model can achieve up to 89.93% aspect-sentiment control accuracy and generate more fluent reviews.  相似文献   

17.
Sentiment lexicons are essential tools for polarity classification and opinion mining. In contrast to machine learning methods that only leverage text features or raw text for sentiment analysis, methods that use sentiment lexicons embrace higher interpretability. Although a number of domain-specific sentiment lexicons are made available, it is impractical to build an ex ante lexicon that fully reflects the characteristics of the language usage in endless domains. In this article, we propose a novel approach to simultaneously train a vanilla sentiment classifier and adapt word polarities to the target domain. Specifically, we sequentially track the wrongly predicted sentences and use them as the supervision instead of addressing the gold standard as a whole to emulate the life-long cognitive process of lexicon learning. An exploration-exploitation mechanism is designed to trade off between searching for new sentiment words and updating the polarity score of one word. Experimental results on several popular datasets show that our approach significantly improves the sentiment classification performance for a variety of domains by means of improving the quality of sentiment lexicons. Case-studies also illustrate how polarity scores of the same words are discovered for different domains.  相似文献   

18.
Recently, sentiment classification has received considerable attention within the natural language processing research community. However, since most recent works regarding sentiment classification have been done in the English language, there are accordingly not enough sentiment resources in other languages. Manual construction of reliable sentiment resources is a very difficult and time-consuming task. Cross-lingual sentiment classification aims to utilize annotated sentiment resources in one language (typically English) for sentiment classification of text documents in another language. Most existing research works rely on automatic machine translation services to directly project information from one language to another. However, different term distribution between original and translated text documents and translation errors are two main problems faced in the case of using only machine translation. To overcome these problems, we propose a novel learning model based on active learning and semi-supervised co-training to incorporate unlabelled data from the target language into the learning process in a bi-view framework. This model attempts to enrich training data by adding the most confident automatically-labelled examples, as well as a few of the most informative manually-labelled examples from unlabelled data in an iterative process. Further, in this model, we consider the density of unlabelled data so as to select more representative unlabelled examples in order to avoid outlier selection in active learning. The proposed model was applied to book review datasets in three different languages. Experiments showed that our model can effectively improve the cross-lingual sentiment classification performance and reduce labelling efforts in comparison with some baseline methods.  相似文献   

19.
Analyzing and extracting insights from user-generated data has become a topic of interest among businesses and research groups because such data contains valuable information, e.g., consumers’ opinions, ratings, and recommendations of products and services. However, the true value of social media data is rarely discovered due to overloaded information. Existing literature in analyzing online hotel reviews mainly focuses on a single data resource, lexicon, and analysis method and rarely provides marketing insights and decision-making information to improve business’ service and quality of products. We propose an integrated framework which includes a data crawler, data preprocessing, sentiment-sensitive tree construction, convolution tree kernel classification, aspect extraction and category detection, and visual analytics to gain insights into hotel ratings and reviews. The empirical findings show that our proposed approach outperforms baseline algorithms as well as well-known sentiment classification methods, and achieves high precision (0.95) and recall (0.96). The visual analytics results reveal that Business travelers tend to give lower ratings, while Couples tend to give higher ratings. In general, users tend to rate lowest in July and highest in December. The Business travelers more frequently use negative keywords, such as “rude,” “terrible,” “horrible,” “broken,” and “dirty,” to express their dissatisfied emotions toward their hotel stays in July.  相似文献   

20.
The polarity shift problem is a major factor that affects classification performance of machine-learning-based sentiment analysis systems. In this paper, we propose a three-stage cascade model to address the polarity shift problem in the context of document-level sentiment classification. We first split each document into a set of subsentences and build a hybrid model that employs rules and statistical methods to detect explicit and implicit polarity shifts, respectively. Secondly, we propose a polarity shift elimination method, to remove polarity shift in negations. Finally, we train base classifiers on training subsets divided by different types of polarity shifts, and use a weighted combination of the component classifiers for sentiment classification. The results on a range of experiments illustrate that our approach significantly outperforms several alternative methods for polarity shift detection and elimination.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号