Coupled term-term relation analysis for document clustering

Publication Type:
Conference Proceeding
Proceedings of the International Joint Conference on Neural Networks, 2013
Issue Date:
Filename: 2013001883OK.pdf (Adobe PDF, 464.8 kB)
Traditional document clustering approaches are usually based on the Bag of Words model, which is limited by its assumption that terms are independent. Recent strategies capture the relation between terms through statistical analysis, but they estimate it purely from the terms' co-occurrence across documents. The implicit interactions through other link terms are therefore overlooked, which leads to the discovery of incomplete information. This paper proposes a coupled term-term relation model for document representation that considers both the intra-relation (i.e. co-occurrence of terms) and the inter-relation (i.e. dependency of terms via link terms) between a pair of terms. The coupled relation for each pair of terms is then used to map a document onto a new feature space that carries more semantic information. Extensive experiments verify that document clustering incorporating the proposed relation achieves a significant performance improvement over state-of-the-art techniques. © 2013 IEEE.
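The abstract does not give the model's formulas, so the following Python sketch only illustrates the general idea under stated assumptions and should not be read as the authors' method. It assumes the intra-relation is a Jaccard co-occurrence score, the inter-relation is the mean over link terms of the weaker of the two intra-relations to that link term, and the coupled relation is a weighted sum controlled by a hypothetical parameter `alpha`. All function names are illustrative.

```python
def intra_relation(X):
    """Assumed intra-relation: Jaccard overlap of the document sets of two terms."""
    n_terms = len(X[0])
    # For each term, the set of documents in which it occurs.
    doc_sets = [{d for d, row in enumerate(X) if row[t] > 0} for t in range(n_terms)]
    R = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(n_terms):
            union = doc_sets[i] | doc_sets[j]
            if i != j and union:
                R[i][j] = len(doc_sets[i] & doc_sets[j]) / len(union)
    return R


def inter_relation(intra):
    """Assumed inter-relation: mean over link terms k of min(intra(i,k), intra(k,j))."""
    n = len(intra)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            links = [min(intra[i][k], intra[k][j]) for k in range(n) if k not in (i, j)]
            if links:
                R[i][j] = sum(links) / len(links)
    return R


def coupled_relation(X, alpha=0.5):
    """Assumed coupling: alpha * intra + (1 - alpha) * inter, with 1.0 on the diagonal."""
    intra = intra_relation(X)
    inter = inter_relation(intra)
    n = len(intra)
    return [
        [1.0 if i == j else alpha * intra[i][j] + (1 - alpha) * inter[i][j] for j in range(n)]
        for i in range(n)
    ]


def map_documents(X, R):
    """Project each document vector onto the relation-weighted feature space (X times R)."""
    n = len(R)
    return [[sum(row[i] * R[i][j] for i in range(n)) for j in range(n)] for row in X]


# Toy corpus: rows are documents, columns are terms 0, 1, 2.
# Terms 0 and 2 never co-occur, but both co-occur with the link term 1.
X = [
    [1, 1, 0],
    [0, 1, 1],
]
R = coupled_relation(X, alpha=0.5)
mapped = map_documents(X, R)
print(R[0][2])    # relation between terms 0 and 2 despite no co-occurrence
print(mapped[0])  # first document in the new feature space
```

In this toy corpus the co-occurrence score between terms 0 and 2 is zero, yet their coupled relation is positive because both are tied to term 1. That is the kind of information a purely co-occurrence-based estimate would miss. The mapped vectors could then be passed to any standard clustering algorithm.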