Feature driven and point process approaches for popularity prediction

Mishra, S; Rizoiu, MA; Xie, L

Publication Type: Conference Proceeding
Citation: International Conference on Information and Knowledge Management, Proceedings, 2016, 24-28-October-2016, pp. 1069-1078
Issue Date: 2016-10-24

Closed Access

	Filename	Description	Size	Format
	1608.04862v2.pdf	Accepted Manuscript version	1.34 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Mishra, S	en_US
dc.contributor.author	Rizoiu, MA https://orcid.org/0000-0003-0381-669X	en_US
dc.contributor.author	Xie, L	en_US
dc.date.issued	2016-10-24	en_US
dc.identifier.citation	International Conference on Information and Knowledge Management, Proceedings, 2016, 24-28-October-2016 pp. 1069 - 1078	en_US
dc.identifier.isbn	9781450340731	en_US
dc.identifier.uri	http://hdl.handle.net/10453/135852
dc.description.abstract	© 2016 ACM. Predicting popularity, or the total volume of information outbreaks, is an important subproblem for understanding collective behavior in networks. Each of the two main types of recent approaches to the problem, feature-driven and generative models, has desirable qualities and clear limitations. This paper bridges the gap between these solutions with a new hybrid approach and a new performance benchmark. We model each social cascade with a marked Hawkes self-exciting point process, and estimate the content virality, memory decay, and user influence. We then learn a predictive layer for popularity prediction using a collection of cascade histories. To our surprise, the Hawkes process with a predictive overlay outperforms recent feature-driven and generative approaches on existing tweet data [44] and a new public benchmark on news tweets. We also found that a basic set of user features and event time summary statistics performs competitively in both classification and regression tasks, and that adding point process information to the feature set further improves predictions. From these observations, we argue that future work on popularity prediction should compare across feature-driven and generative modeling approaches in both classification and regression tasks.	en_US
dc.relation.ispartof	International Conference on Information and Knowledge Management, Proceedings	en_US
dc.relation.isbasedon	10.1145/2983323.2983812	en_US
dc.title	Feature driven and point process approaches for popularity prediction	en_US
dc.type	Conference Proceeding
utslib.citation.volume	24-28-October-2016	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	24-28-October-2016	en_US

Abstract:

© 2016 ACM. Predicting popularity, or the total volume of information outbreaks, is an important subproblem for understanding collective behavior in networks. Each of the two main types of recent approaches to the problem, feature-driven and generative models, has desirable qualities and clear limitations. This paper bridges the gap between these solutions with a new hybrid approach and a new performance benchmark. We model each social cascade with a marked Hawkes self-exciting point process, and estimate the content virality, memory decay, and user influence. We then learn a predictive layer for popularity prediction using a collection of cascade histories. To our surprise, the Hawkes process with a predictive overlay outperforms recent feature-driven and generative approaches on existing tweet data [44] and a new public benchmark on news tweets. We also found that a basic set of user features and event time summary statistics performs competitively in both classification and regression tasks, and that adding point process information to the feature set further improves predictions. From these observations, we argue that future work on popularity prediction should compare across feature-driven and generative modeling approaches in both classification and regression tasks.
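
For readers unfamiliar with the term, the sketch below illustrates what a marked self-exciting intensity can look like. The abstract does not state the functional form, so everything here is an assumption made for illustration: the power-law kernel, the function name `marked_hawkes_intensity`, and the parameters `kappa` (standing in for content virality), `beta` (how strongly a user's mark, such as a follower count, scales influence), and `c` and `theta` (controlling memory decay). It is not the authors' implementation.

```python
from typing import Sequence, Tuple


def marked_hawkes_intensity(
    t: float,
    events: Sequence[Tuple[float, float]],
    kappa: float = 0.8,   # assumed: overall virality scale
    beta: float = 0.6,    # assumed: exponent applied to each event's mark
    c: float = 10.0,      # assumed: offset keeping the kernel finite at zero lag
    theta: float = 0.8,   # assumed: power-law decay exponent (memory)
) -> float:
    """Event rate at time t induced by all earlier (time, mark) events.

    Each past event (t_i, m_i) contributes
    kappa * m_i**beta * (t - t_i + c) ** -(1 + theta),
    so larger marks excite more and older events count for less.
    """
    total = 0.0
    for t_i, m_i in events:
        if t_i < t:  # only events strictly before t can excite the process
            total += kappa * (m_i ** beta) * (t - t_i + c) ** (-(1.0 + theta))
    return total


# Hypothetical cascade: (seconds since the first post, follower count).
cascade = [(0.0, 1500.0), (30.0, 20.0), (95.0, 300.0)]
rate_now = marked_hawkes_intensity(120.0, cascade)
```

In a sketch like this, fitting would mean choosing the parameters that maximize the likelihood of the observed event times; the fitted values could then be passed, alongside other features, to a separate predictive model.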

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/135852