Identifying Objective and Subjective Words via Topic Modeling

Publication Type:
Journal Article
Citation:
IEEE Transactions on Neural Networks and Learning Systems, 2018, 29 (3), pp. 718 - 730
Issue Date:
2018-03-01
© 2016 IEEE. It is observed that distinct words in a given document have either a strong or a weak ability to deliver facts (i.e., the objective sense) or to express opinions (i.e., the subjective sense), depending on the topics they are associated with. Motivated by the intuitive assumption that different words have varying degrees of discriminative power in delivering the objective sense or the subjective sense with respect to their assigned topics, a model named identified objective-subjective latent Dirichlet allocation (LDA) (iosLDA) is proposed in this paper. In the iosLDA model, the simple Pólya urn model adopted in traditional topic models is modified by incorporating it with a probabilistic generative process, in which the novel 'Bag-of-Discriminative-Words' (BoDW) representation for the documents is obtained. Each document has two different BoDW representations, one for the objective sense and one for the subjective sense, which are employed in the joint objective and subjective classification instead of the traditional Bag-of-Topics representation. The experiments reported on documents and images demonstrate that: 1) the BoDW representation is more predictive than the traditional ones; 2) iosLDA boosts the performance of topic modeling via the joint discovery of latent topics and the different objective and subjective power hidden in every word; and 3) iosLDA has lower computational complexity than supervised LDA, especially under an increasing number of topics.
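For readers unfamiliar with the simple Pólya urn scheme that the abstract says traditional topic models adopt, the following is a minimal Python sketch of the standard urn only: a ball is drawn with probability proportional to its color's count and is returned together with one extra ball of the same color. It does not reproduce the modified urn or the iosLDA model described in the paper. The function name `simple_polya_urn` and its parameters are illustrative assumptions.

```python
import random
from collections import Counter


def simple_polya_urn(initial_counts, num_draws, seed=0):
    """Simulate a simple Polya urn (illustrative sketch, not the paper's model).

    initial_counts: mapping from color (e.g. a word) to its starting ball count.
    num_draws: number of draws to simulate.
    Returns the final urn counts and the sequence of drawn colors.
    """
    rng = random.Random(seed)
    urn = Counter(initial_counts)
    draws = []
    for _ in range(num_draws):
        colors = list(urn.keys())
        weights = [urn[c] for c in colors]
        # Draw one color with probability proportional to its current count.
        color = rng.choices(colors, weights=weights, k=1)[0]
        # Return the ball plus one more of the same color ("rich get richer").
        urn[color] += 1
        draws.append(color)
    return urn, draws


if __name__ == "__main__":
    final_urn, drawn = simple_polya_urn({"fact": 1, "opinion": 1}, num_draws=20)
    print(dict(final_urn))
```

Each draw reinforces the color that was drawn, which is the self-reinforcing behavior that makes this urn a common way to describe word sampling in topic models.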