Detecting collusive spamming activities in community question answering

Liu, Yuli and Liu, Yiqun and Zhou, Ke and Zhang, Min and Ma, Shaoping (2017) Detecting collusive spamming activities in community question answering. In: 26th International Conference on World Wide Web, 3-7 April 2017, Perth, Australia.

Available under Licence Creative Commons Attribution.

Abstract

Community Question Answering (CQA) portals provide rich sources of information on a variety of topics. However, the authenticity and quality of questions and answers (Q&As) have proven hard to control. In a troubling direction, the widespread growth of crowdsourcing websites has created a large-scale, potentially difficult-to-detect workforce for manipulating malicious content in CQA. Crowd workers who join the same crowdsourcing task for a promotion campaign in CQA collusively produce deceptive Q&As to promote a target (a product or service). Such a collusive spamming group can fully control the sentiment expressed about the target. How can structure and attributes be used to detect manipulated Q&As? How can the collusive group be detected, and how can group information be leveraged for the detection task?

To shed light on these research questions, we propose a unified framework to tackle the challenge of detecting collusive spamming activities in CQA. First, we interpret the questions and answers in CQA as two independent networks. Second, we detect collusive question groups and answer groups from these two networks respectively by measuring the similarity of the contents posted within a short duration. Third, using attributes (individual-level and group-level) and correlations (user-based and content-based), we propose a combined factor graph model that detects deceptive Q&As simultaneously by combining two independent factor graphs. Using a large-scale practical data set, we find that the proposed framework can detect deceptive contents at an early stage and outperforms a number of competitive baselines.
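
The second step above, grouping contents that are similar and posted within a short duration, can be illustrated with a minimal sketch. This is not the paper's method: the Jaccard token overlap, the union-find grouping, the function name `collusive_groups`, and the parameters `max_gap` and `min_sim` with their default values are all assumptions chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def collusive_groups(posts, max_gap=3600, min_sim=0.5):
    """Group posts that are textually similar and close together in time.

    posts: list of (post_id, timestamp_seconds, text) tuples.
    max_gap, min_sim: hypothetical thresholds, not taken from the paper.
    Returns a list of sets of post ids, each with at least two members.
    """
    tokens = {pid: set(text.lower().split()) for pid, _, text in posts}
    parent = {pid: pid for pid, _, _ in posts}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every pair that is both close in time and similar in content.
    for (p, tp, _), (q, tq, _) in combinations(posts, 2):
        if abs(tp - tq) <= max_gap and jaccard(tokens[p], tokens[q]) >= min_sim:
            parent[find(p)] = find(q)

    # Collect connected components and drop singletons.
    groups = {}
    for pid in parent:
        groups.setdefault(find(pid), set()).add(pid)
    return [g for g in groups.values() if len(g) > 1]
```

Called on a list of answers, the function returns only those sets of posts that are linked by near-duplicate wording within the time window. A post with the same wording made much later, or an unrelated post made at the same time, stays outside any group.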

Item Type: Conference or Workshop Item (Paper)
Keywords: Community Question Answering; Crowdsourcing Manipulation; Spam Detection; Factor Graph
Schools/Departments: University of Nottingham, UK > Faculty of Science > School of Computer Science
Identification Number: 10.1145/3038912.3052594
Depositing User: Eprints, Support
Date Deposited: 22 Aug 2017 08:17
Last Modified: 23 Aug 2017 00:30
URI: http://eprints.nottingham.ac.uk/id/eprint/45045
