Concept-based topic model improvement
- Publisher: Springer Berlin Heidelberg
- Publication Type: Conference Proceeding
- Citation: Studies in Computational Intelligence, 2011, 369, pp. 133-142
- Issue Date: 2011-10-24
This item is open access.
We propose a system that uses conceptual knowledge to improve topic models by removing unrelated words from the simplified topic description. We use WordNet to detect which topical words are not conceptually similar to the others, and then test our assumptions against human judgment. Results obtained on two different corpora under different test conditions show that the words detected as unrelated were much more likely than the others to be chosen by human evaluators as not being part of the topic at all. We show that there is a strong correlation between this probability and an automatically calculated topical fitness, and we discuss how the correlation varies with the method and data used. © 2011 Springer-Verlag Berlin Heidelberg.
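The idea of flagging topical words that are not conceptually similar to the others can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how topical fitness is computed, so the sketch assumes it is the mean pairwise similarity of a word to the other words in the topic. The function names, the threshold, and the toy similarity table are all hypothetical and stand in for a WordNet-based measure.

```python
from typing import Callable, Dict, List

Similarity = Callable[[str, str], float]


def topical_fitness(words: List[str], similarity: Similarity) -> Dict[str, float]:
    """Assumed definition: mean similarity of each word to the other topic words."""
    if len(words) < 2:
        raise ValueError("need at least two topic words")
    fitness = {}
    for word in words:
        others = [other for other in words if other != word]
        fitness[word] = sum(similarity(word, other) for other in others) / len(others)
    return fitness


def unrelated_words(words: List[str], similarity: Similarity, threshold: float) -> List[str]:
    """Return the words whose fitness falls below a (hypothetical) threshold."""
    fitness = topical_fitness(words, similarity)
    return [word for word in words if fitness[word] < threshold]


# Hypothetical toy similarity standing in for a WordNet-based measure:
# words in the same hand-labelled group are similar, all others are not.
TOY_GROUPS = {"dog": "animal", "cat": "animal", "horse": "animal", "invoice": "finance"}


def toy_similarity(a: str, b: str) -> float:
    return 1.0 if TOY_GROUPS[a] == TOY_GROUPS[b] else 0.1


topic = ["dog", "cat", "horse", "invoice"]
print(unrelated_words(topic, toy_similarity, threshold=0.5))  # prints ['invoice']
```

Because the similarity measure is passed in as a function, the toy table could be replaced by a WordNet-backed measure, for example one built on NLTK's `wordnet.synsets` and `Synset.path_similarity`, without changing the rest of the sketch. Which WordNet measure the paper actually uses is not stated in the abstract.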