Multilingual opinion mining on YouTube – A convolutional N-gram BiLSTM word embedding
Authors:Huy Tien Nguyen  Minh Le Nguyen
Institution:Japan Advanced Institute of Science and Technology (JAIST), Japan
Abstract:Opinion mining in a multilingual, multi-domain environment such as YouTube requires models that are robust across domains and languages and that do not rely on linguistic resources (e.g. syntactic parsers, POS taggers, pre-defined dictionaries), which are unavailable for many languages. In this work, we (i) propose a convolutional N-gram BiLSTM (CoNBiLSTM) word embedding, which represents a word with semantic and contextual information over both short and long distances; (ii) apply the CoNBiLSTM word embedding to predict the type of a comment, its sentiment polarity (positive, neutral or negative), and whether the sentiment is directed toward the product or the video; and (iii) evaluate the efficiency of our model on the SenTube dataset, which contains comments from two domains (automobile and tablet) and two languages (English and Italian). According to the experimental results, CoNBiLSTM generally outperforms the approach using an SVM with shallow syntactic structures (STRUCT), the current state of the art for sentiment analysis on the SenTube dataset. In addition, our model is more robust across domains than STRUCT (e.g. a 7.47% difference in performance between the two domains for our model vs. 18.8% for STRUCT).
Keywords:Sentiment analysis  Multilingual opinion mining  Convolutional  N-gram word embedding  BiLSTM
This article is indexed in ScienceDirect and other databases.