ASTM: An Attentional Segmentation Based Topic Model for Short Texts

Publication Type:
Conference Proceeding
Citation:
Proceedings - IEEE International Conference on Data Mining, ICDM, 2018, 2018-November pp. 577 - 586
Issue Date:
2018-12-27
Filename Description Size
08594882.pdfPublished version1.03 MB
Adobe PDF
Full metadata record
© 2018 IEEE. To address the data sparsity problem in short text understanding, various alternative topic models leveraging word embeddings as background knowledge have been developed recently. However, existing models combine auxiliary information and topic modeling in a straightforward way without considering human reading habits. In contrast, extensive studies have proven that it is full of potential in textual analysis by taking into account human attention. Therefore, we propose a novel model, Attentional Segmentation based Topic Model (ASTM), to integrate both word embeddings as supplementary information and an attention mechanism that segments short text documents into fragments of adjacent words receiving similar attention. Each segment is assigned to a topic and each document can have multiple topics. We evaluate the performance of our model on three real-world short text datasets. The experimental results demonstrate that our model outperforms the state-of-the-art in terms of both topic coherence and text classification.
Please use this identifier to cite or link to this item: