PUED: A Social Spammer Detection Method Based on PU Learning and Ensemble Learning

Song, Y; Gao, M; Yu, J; Li, W; Yu, L; Xiao, X

Publication Type: Conference Proceeding
Citation: Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 2018, vol. 252, pp. 143-152
Issue Date: 2018-01-01

Closed Access

Filename	Description	Size	Format
PUED - A Social Spammer Detection Method Based on PU Learning and Ensemble Learning.pdf	Published version	544.07 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Song, Y	en_US
dc.contributor.author	Gao, M	en_US
dc.contributor.author	Yu, J	en_US
dc.contributor.author	Li, W https://orcid.org/0000-0003-4941-8814	en_US
dc.contributor.author	Yu, L	en_US
dc.contributor.author	Xiao, X	en_US
dc.date.issued	2018-01-01	en_US
dc.identifier.citation	Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 2018, 252 pp. 143 - 152	en_US
dc.identifier.isbn	9783030009151	en_US
dc.identifier.issn	1867-8211	en_US
dc.identifier.uri	http://hdl.handle.net/10453/134088
dc.description.abstract	© 2018, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. In social networks, people tend to share information with others, so those who use a social network frequently are more likely to be influenced by the interests and opinions of other people. Spammers exploit this characteristic, spreading spam through the network and seriously disturbing normal users for their own gain. Numerous studies have addressed social spammer detection, and their methods fall into three types: unsupervised, supervised and semi-supervised. While supervised and semi-supervised methods achieve superior detection accuracy, they usually suffer from imbalanced data, since unlabeled normal users far outnumber spammers in real situations. To address this problem, we propose a novel method that relies only on normal users to detect spammers accurately. The method has two steps: the first picks out reliable spammers from the unlabeled samples using a voting classifier; the second trains a random forest detector on the normal users and the reliable spammers. We conduct experiments on two real-world social datasets and show that our method outperforms other supervised methods.	en_US
dc.relation.ispartof	Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST	en_US
dc.relation.isbasedon	10.1007/978-3-030-00916-8_14	en_US
dc.title	PUED: A Social Spammer Detection Method Based on PU Learning and Ensemble Learning	en_US
dc.type	Conference Proceeding
utslib.citation.volume	252	en_US
utslib.for	0805 Distributed Computing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	252	en_US

Abstract:

© 2018, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. In social networks, people tend to share information with others, so those who use a social network frequently are more likely to be influenced by the interests and opinions of other people. Spammers exploit this characteristic, spreading spam through the network and seriously disturbing normal users for their own gain. Numerous studies have addressed social spammer detection, and their methods fall into three types: unsupervised, supervised and semi-supervised. While supervised and semi-supervised methods achieve superior detection accuracy, they usually suffer from imbalanced data, since unlabeled normal users far outnumber spammers in real situations. To address this problem, we propose a novel method that relies only on normal users to detect spammers accurately. The method has two steps: the first picks out reliable spammers from the unlabeled samples using a voting classifier; the second trains a random forest detector on the normal users and the reliable spammers. We conduct experiments on two real-world social datasets and show that our method outperforms other supervised methods.
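
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so the choice of base classifiers, the soft-voting scheme, the `keep_fraction` cutoff, the function name `pu_spammer_detector`, and the synthetic two-feature data are all assumptions made for illustration. The sketch uses scikit-learn and NumPy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def pu_spammer_detector(X_normal, X_unlabeled, keep_fraction=0.2, seed=0):
    """Hypothetical two-step PU sketch: mine reliable spammers, then fit a forest."""
    # Step 1: train a voting classifier to separate labeled normal users (1)
    # from unlabeled users (0), provisionally treating all unlabeled as non-normal.
    X = np.vstack([X_normal, X_unlabeled])
    y = np.concatenate([np.ones(len(X_normal), dtype=int),
                        np.zeros(len(X_unlabeled), dtype=int)])
    voter = VotingClassifier(
        estimators=[  # base learners are an assumption, not taken from the paper
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", GaussianNB()),
            ("dt", DecisionTreeClassifier(max_depth=3, random_state=seed)),
        ],
        voting="soft",
    )
    voter.fit(X, y)

    # Unlabeled users with the lowest averaged probability of being normal
    # are kept as "reliable spammers" (the cutoff rule is an assumption).
    p_normal = voter.predict_proba(X_unlabeled)[:, 1]
    n_keep = max(1, int(keep_fraction * len(X_unlabeled)))
    reliable_idx = np.argsort(p_normal)[:n_keep]
    X_spam = X_unlabeled[reliable_idx]

    # Step 2: train a random forest on normal users (0) vs. reliable spammers (1).
    X_train = np.vstack([X_normal, X_spam])
    y_train = np.concatenate([np.zeros(len(X_normal), dtype=int),
                              np.ones(len(X_spam), dtype=int)])
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    return forest, reliable_idx


# Synthetic data: normal users cluster near the origin, hidden spammers near (4, 4).
rng = np.random.default_rng(0)
normal_labeled = rng.normal(0.0, 1.0, size=(200, 2))
hidden_normal = rng.normal(0.0, 1.0, size=(150, 2))
hidden_spam = rng.normal(4.0, 1.0, size=(50, 2))
unlabeled = np.vstack([hidden_normal, hidden_spam])  # rows 150..199 are spammers

forest, reliable_idx = pu_spammer_detector(normal_labeled, unlabeled)
predictions = forest.predict(np.array([[4.0, 4.0], [0.0, 0.0]]))
print("reliable spammers selected:", len(reliable_idx))
print("predictions for a spam-like and a normal-like user:", predictions)
```

In this sketch only normal users carry labels, which matches the setting the abstract describes. Keeping just the lowest-scoring fraction of unlabeled users trades recall in the first step for cleaner training labels in the second.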

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/134088