Opinion spam detection: Using multi-iterative graph-based model

Institution:	1. School of Computing, Faculty of Engineering, Universiti Teknologi Malaysia, UTM, 81300, Johor, Malaysia; 2. Department of Computer Science and Engineering, College of Engineering, Komar University of Science and Technology, KUST, Sulaimani, Iraq; 1. College of Hotel and Tourism Management, Kyung Hee University, Kyung Hee Dearo 26, Dongdeamun-Gu, Seoul 130-701, South Korea; 2. Faculty of Communication Sciences, Università della Svizzera italiana, Lugano, Switzerland; 1. School of Economics and Management, Beijing University of Technology, Beijing 100124, PR China; 2. Research Center on Big Data Sciences, Beijing University of Chemical Technology, Beijing 100029, PR China; 3. School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Ashahidai, Nomi City, Ishikawa 923-1292, Japan; 4. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, PR China

Abstract:	Demand is rising across many business sectors for opinion mining applications that detect opinion spam and prevent its damaging effects on e-commerce reputations. Existing spam detection techniques consider only one or two types of spam entity, such as the review, the reviewer, the group of reviewers, or the product. They also use a limited number of features related to behaviour, content and the relations between entities, which reduces detection accuracy. Moreover, these techniques mostly rely on synthetic datasets to evaluate their models and cannot be applied in real-world environments. To address these limitations, a novel graph-based model called “Multi-iterative Graph-based opinion Spam Detection” (MGSD) is proposed, in which all types of entities are considered simultaneously within a unified structure. With this approach, the model reveals both implicit relationships (i.e., between entities of the same type) and explicit relationships (i.e., between entities of different types). The MGSD model evaluates the ‘spamicity’ effects of entities more efficiently because it applies a novel multi-iterative algorithm that considers different sets of factors when updating the spamicity score of each entity. To enhance the accuracy of the MGSD model, a larger number of existing weighted features, along with novel proposed features from different categories, were selected using a combination of feature fusion techniques and machine learning (ML) algorithms. Because it employs domain-independent features, the MGSD model can also be generalised and applied to various kinds of opinionated documents. The results showed that the feature selection and feature fusion techniques yielded a remarkable improvement in spam detection. MGSD improved on the accuracy of state-of-the-art ML and graph-based techniques by around 5.6% and 4.8%, respectively, and achieved an accuracy of 93% for spam detection on our synthetic crowdsourced dataset and 95.3% on Ott's crowdsourced dataset.
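
The abstract does not give the MGSD update rules, so the following is only a minimal sketch of the general idea of iteratively updating spamicity scores over a graph of reviews, reviewers and products. It is not the authors' algorithm. The function name `propagate_spamicity`, the mixing weight `alpha`, the averaging rule and the feature-based `priors` are all assumptions introduced for illustration.

```python
from collections import defaultdict


def _group_means(groups, scores):
    """Average the review scores attached to each reviewer or product."""
    return {key: sum(scores[r] for r in members) / len(members)
            for key, members in groups.items()}


def propagate_spamicity(reviews, priors, alpha=0.5, max_iters=100, tol=1e-9):
    """Illustrative score propagation (hypothetical, not the MGSD rules).

    reviews: non-empty list of (review_id, reviewer_id, product_id) tuples.
    priors:  dict review_id -> initial spamicity in [0, 1], assumed to come
             from some feature-based classifier.
    alpha:   assumed weight given to the graph neighbourhood (0 <= alpha < 1).
    """
    by_reviewer, by_product = defaultdict(list), defaultdict(list)
    for review_id, reviewer_id, product_id in reviews:
        by_reviewer[reviewer_id].append(review_id)
        by_product[product_id].append(review_id)

    review_score = dict(priors)
    for _ in range(max_iters):
        # Reviewer and product scores are derived from their reviews.
        reviewer_score = _group_means(by_reviewer, review_score)
        product_score = _group_means(by_product, review_score)

        # Each review blends its own prior with its two neighbours.
        updated = {}
        for review_id, reviewer_id, product_id in reviews:
            neighbourhood = 0.5 * (reviewer_score[reviewer_id]
                                   + product_score[product_id])
            updated[review_id] = ((1 - alpha) * priors[review_id]
                                  + alpha * neighbourhood)

        change = max(abs(updated[r] - review_score[r]) for r in updated)
        review_score = updated
        if change < tol:  # stop once the scores have stabilised
            break

    return (review_score,
            _group_means(by_reviewer, review_score),
            _group_means(by_product, review_score))


# Toy data: reviewer u1 wrote two suspicious reviews, u2 one benign review.
toy_reviews = [("r1", "u1", "p1"), ("r2", "u1", "p2"), ("r3", "u2", "p1")]
toy_priors = {"r1": 0.9, "r2": 0.8, "r3": 0.1}
review_s, reviewer_s, product_s = propagate_spamicity(toy_reviews, toy_priors)
```

In this sketch a review on a product that also attracts suspicious reviews is pulled upwards, which loosely mirrors how scores of different entity types can influence one another in a unified graph. With `alpha` below 1 the update is a weighted average anchored to the priors, so the scores stay in [0, 1] and the iteration settles.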

This article is indexed in ScienceDirect and other databases.