Probabilistic non-negative matrix factorization and its robust extensions for topic modeling

Luo, M; Nie, F; Chang, X; Yang, Y; Hauptmann, A; Zheng, Q

Probabilistic non-negative matrix factorization and its robust extensions for topic modeling

Luo, M Nie, F Chang, X

Yang, Y

Hauptmann, A Zheng, Q

Permalink

Publication Type:: Conference Proceeding
Citation:: 31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 2308 - 2314
Issue Date:: 2017-01-01

Closed Access

	Filename	Description	Size
	14469-66897-1-PB.pdf	Published version	811.62 kB	Adobe PDF	View/Open

Copyright Clearance Process

Recently Added
In Progress
Closed Access

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Luo, M	en_US
dc.contributor.author	Nie, F	en_US
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Hauptmann, A	en_US
dc.contributor.author	Zheng, Q	en_US
dc.date.issued	2017-01-01	en_US
dc.identifier.citation	31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 2308 - 2314	en_US
dc.identifier.uri	http://hdl.handle.net/10453/127243
dc.description.abstract	Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Traditional topic model with maximum likelihood estimate inevitably suffers from the conditional independence of words given the documents topic distribution. In this paper, we follow the generative procedure of topic model and learn the topic-word distribution and topics distribution via directly approximating the word-document co-occurrence matrix with matrix decomposition technique. These methods include: (1) Approximating the normalized document-word conditional distribution with the documents probability matrix and words probability matrix based on probabilistic non-negative matrix factorization (NMF); (2) Since the standard NMF is well known to be non-robust to noises and outliers, we extended the probabilistic NMF of the topic model to its robust versions using ℓ2, 1 -norm and capped ℓ2, 1 -norm based loss functions, respectively. The proposed framework inherits the explicit probabilistic meaning of factors in topic models and simultaneously makes the conditional independence assumption on words unnecessary. Straightforward and efficient algorithms are exploited to solve the corresponding non-smooth and non-convex problems. Experimental results over several benchmark datasets illustrate the effectiveness and superiority of the proposed methods.	en_US
dc.relation.ispartof	31st AAAI Conference on Artificial Intelligence, AAAI 2017	en_US
dc.title	Probabilistic non-negative matrix factorization and its robust extensions for topic modeling	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Traditional topic model with maximum likelihood estimate inevitably suffers from the conditional independence of words given the documents topic distribution. In this paper, we follow the generative procedure of topic model and learn the topic-word distribution and topics distribution via directly approximating the word-document co-occurrence matrix with matrix decomposition technique. These methods include: (1) Approximating the normalized document-word conditional distribution with the documents probability matrix and words probability matrix based on probabilistic non-negative matrix factorization (NMF); (2) Since the standard NMF is well known to be non-robust to noises and outliers, we extended the probabilistic NMF of the topic model to its robust versions using ℓ2, 1 -norm and capped ℓ2, 1 -norm based loss functions, respectively. The proposed framework inherits the explicit probabilistic meaning of factors in topic models and simultaneously makes the conditional independence assumption on words unnecessary. Straightforward and efficient algorithms are exploited to solve the corresponding non-smooth and non-convex problems. Experimental results over several benchmark datasets illustrate the effectiveness and superiority of the proposed methods.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/127243