Deriving the sentiment polarity of term senses using dual-step context-aware in-gloss matching

Institution:	1. Key Laboratory of Computer Vision and System (Ministry of Education), Tianjin University of Technology, Tianjin, China; 2. Institute of AI, Shandong Computer Science Center (National Supercomputer Center in Jinan), QILU University of Technology, China; 1. The Hong Kong Polytechnic University, Hong Kong, China; 2. Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; 3. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; 4. Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen, China; 5. Shenzhen Key Laboratory of Visual Object Detection and Recognition, Shenzhen, China; 1. Nanyang Technological University, Singapore; 2. Singapore University of Technology and Design, Singapore; 3. Massachusetts Institute of Technology, USA; 1. Beijing University of Posts and Telecommunications, Beijing, China; 2. Singapore Management University, Singapore; 3. Worcester Polytechnic Institute, USA; 4. Alibaba Group, Hangzhou, China; 1. Xianyang Vocational Technical College, Xianyang, P. R. China; 2. China Electric Power Research Institute, Beijing, P. R. China; 3. GuiZhou University, Guizhou Provincial Key Laboratory of Public Big Data, Guiyang, P. R. China; 4. State Key Laboratory of Integrated Service Networks, School of Telecommunications Engineering, Xidian University, Xi’an, P. R. China; 5. Pedagogical University of Krakow, Podchorazych 2 St., 30-084 Kraków, Poland

Abstract:	A sentiment lexicon is vital to Sentiment Analysis (SA), the task of automatically mining sentiment expressions from text. This fundamental lexical resource comprises words, the smallest sentiment-carrying units of text, annotated for their sentiment properties, and it supports SA tasks on larger pieces of text. Unfortunately, digital dictionaries do not readily include information on the sentiment properties of their entries, and manually compiling sentiment lexicons is costly in annotator time and effort. This has prompted a large body of research on automated sentiment lexicon generation. The dictionary-based approach leverages digital dictionaries, while the corpus-based approach exploits co-occurrence statistics in text corpora. Although the former approach has been investigated extensively, the majority of works focus on terms. The few state-of-the-art models that work at the finer-grained term sense level still exhibit several prominent limitations; e.g., the semantic relations algorithm proposed in those works retrieves only senses in close proximity to the seed senses in the semantic network, which prevents the retrieval of remote sentiment-carrying senses beyond the ‘radius’ defined by the number of iterations of semantic relations expansion. The proposed model aims to overcome the issues inherent in dictionary-based sense-level sentiment lexicon generation models using: (1) null seed sets, and a morphological approach inspired by Marking Theory in linguistics to populate them automatically; (2) a dual-step context-aware gloss expansion algorithm that ‘mines’ human-defined gloss information from a digital dictionary, ensuring that senses overlooked by the semantic relations expansion algorithm are identified; and (3) a fully unsupervised sentiment categorization algorithm based on Network Theory.
The results demonstrate that context-aware in-gloss matching successfully retrieves senses beyond the reach of the semantic relations expansion algorithm used by prominent models. Evaluation of the proposed model's ability to accurately assign polarity to senses demonstrates that it is on par with state-of-the-art models against the same gold standard benchmarks. The model has theoretical implications for future work on effectively exploiting the readily available human-defined gloss information in a digital dictionary when assigning polarity to term senses. Extrinsic evaluation in a real-world sentiment classification task on multiple publicly available datasets from varying domains demonstrates its practical implications and applications in sentiment analysis, as well as in related fields such as information science, opinion retrieval and computational linguistics.
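
The ‘radius’ limitation described in the abstract, and the way gloss matching can reach past it, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the sense identifiers, relation edges and glosses are invented, the function names are hypothetical, and the gloss step is plain whole-word matching rather than the dual-step context-aware algorithm of the paper, whose details the abstract does not give.

```python
import re
from collections import deque

# Hypothetical toy data: edges stand in for dictionary semantic relations.
RELATIONS = {
    "good.a.01": ["fine.a.01"],
    "fine.a.01": ["good.a.01", "decent.a.01"],
    "decent.a.01": ["fine.a.01"],
    "commendable.a.01": [],  # no relation path to the seed at all
}

# Hypothetical glosses (definitions) for the same senses.
GLOSSES = {
    "good.a.01": "having desirable qualities",
    "fine.a.01": "being satisfactory or in good condition",
    "decent.a.01": "socially or conventionally correct",
    "commendable.a.01": "worthy of high praise; good and admirable",
}


def expand_by_relations(seeds, relations, max_iterations):
    """Breadth-first expansion that stops after max_iterations hops."""
    reached = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        sense, depth = frontier.popleft()
        if depth == max_iterations:
            continue  # the 'radius' has been exhausted for this path
        for neighbour in relations.get(sense, []):
            if neighbour not in reached:
                reached.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return reached


def expand_by_gloss(seed_terms, glosses, already_found):
    """Return senses not yet found whose gloss contains a seed term."""
    found = set()
    for sense, gloss in glosses.items():
        if sense in already_found:
            continue
        tokens = set(re.findall(r"[a-z]+", gloss.lower()))
        if tokens & seed_terms:
            found.add(sense)
    return found


by_relations = expand_by_relations({"good.a.01"}, RELATIONS, max_iterations=1)
by_gloss = expand_by_gloss({"good"}, GLOSSES, by_relations)
print(sorted(by_relations))
print(sorted(by_gloss))
```

In this toy graph, one iteration of relation expansion stops short of `decent.a.01` and can never reach `commendable.a.01`, which has no relation edges, while the gloss pass picks up `commendable.a.01` because its definition mentions a seed term.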

Keywords:
This article is indexed in ScienceDirect and other databases.
