Bag-of-Discriminative-Words (BoDW) Representation via Topic Modeling

Zhuang, Y; Wang, H; Xiao, J; Wu, F; Yang, Y; Lu, W; Zhang, Z

Publication Type: Journal Article
Citation: IEEE Transactions on Knowledge and Data Engineering, 2017, 29 (5), pp. 977-990
Issue Date: 2017-05-01

Closed Access

Filename	Description	Size
b.pdf	Published Version	2.09 MB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhuang, Y	en_US
dc.contributor.author	Wang, H	en_US
dc.contributor.author	Xiao, J	en_US
dc.contributor.author	Wu, F	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Lu, W	en_US
dc.contributor.author	Zhang, Z	en_US
dc.date.issued	2017-05-01	en_US
dc.identifier.citation	IEEE Transactions on Knowledge and Data Engineering, 2017, 29 (5), pp. 977 - 990	en_US
dc.identifier.issn	1041-4347	en_US
dc.identifier.uri	http://hdl.handle.net/10453/123632
dc.description.abstract	© 2017 IEEE. Many of the words in a given document either deliver facts (objective) or express opinions (subjective), respectively, depending on the topics they are involved in. For example, given a bunch of documents, the word "bug" assigned to the topic "order Hemiptera" apparently remarks one object (i.e., one kind of insects), while the same word assigned to the topic "software" probably conveys a negative opinion. Motivated by the intuitive assumption that different words have varying degrees of discriminative power in delivering the objective sense or the subjective sense with respect to their assigned topics, a model named as discriminatively objective-subjective LDA (dosLDA) is proposed in this paper. The essential idea underlying the proposed dosLDA is that a pair of objective and subjective selection variables are explicitly employed to encode the interplay between topics and discriminative power for the words in documents in a supervised manner. As a result, each document is appropriately represented as "bag-of-discriminative-words" (BoDW). The experiments reported on documents and images demonstrate that dosLDA not only performs competitively over traditional approaches in terms of topic modeling and document classification, but also has the ability to discern the discriminative power of each word in terms of its objective or subjective sense with respect to its assigned topic.	en_US
dc.relation.ispartof	IEEE Transactions on Knowledge and Data Engineering	en_US
dc.relation.isbasedon	10.1109/TKDE.2017.2658571	en_US
dc.subject.classification	Information Systems	en_US
dc.title	Bag-of-Discriminative-Words (BoDW) Representation via Topic Modeling	en_US
dc.type	Journal Article
utslib.citation.volume	29	en_US
utslib.for	0899 Other Information and Computing Sciences	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	5	en_US
pubs.publication-status	Published	en_US
pubs.volume	29	en_US

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/123632