ASTM: An Attentional Segmentation Based Topic Model for Short Texts
- Publication Type:
- Conference Proceeding
- Proceedings - IEEE International Conference on Data Mining, ICDM, 2018, 2018-November pp. 577 - 586
- Issue Date:
Copyright Clearance Process
- Recently Added
- In Progress
- Closed Access
This item is closed access and not available.
© 2018 IEEE. To address the data sparsity problem in short text understanding, various alternative topic models leveraging word embeddings as background knowledge have been developed recently. However, existing models combine auxiliary information and topic modeling in a straightforward way without considering human reading habits. In contrast, extensive studies have proven that it is full of potential in textual analysis by taking into account human attention. Therefore, we propose a novel model, Attentional Segmentation based Topic Model (ASTM), to integrate both word embeddings as supplementary information and an attention mechanism that segments short text documents into fragments of adjacent words receiving similar attention. Each segment is assigned to a topic and each document can have multiple topics. We evaluate the performance of our model on three real-world short text datasets. The experimental results demonstrate that our model outperforms the state-of-the-art in terms of both topic coherence and text classification.
Please use this identifier to cite or link to this item: