A Temperature-Modified Dynamic Embedded Topic Model

Publisher:
Springer Nature
Publication Type:
Conference Proceeding
Citation:
Communications in Computer and Information Science, 2022, 1741 CCIS, pp. 15-27
Issue Date:
2022-01-01
Topic models are natural language processing models that can parse large collections of documents and automatically discover their main topics. However, conventional topic models fail to capture how such topics change as the collections evolve. To address this, various researchers have proposed dynamic versions that can extract sequences of topics from timestamped document collections. Moreover, a recently proposed model, the dynamic embedded topic model (DETM), combines such a dynamic analysis with the representational power of word and topic embeddings. In this paper, we propose modifying its word probabilities with a temperature parameter that controls the trade-off between smoothness and sharpness of the distributions, in an attempt to increase the coherence of the extracted topics. Experimental results over a selection of the COVID-19 Open Research Dataset (CORD-19), the United Nations General Debate Corpus, and the ACL Title and Abstract dataset show that the proposed model, nicknamed DETM-tau after the temperature parameter, improves the model's perplexity and topic coherence on all datasets.
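As a rough illustration of how a temperature parameter can trade smoothness for sharpness in a word distribution, the sketch below applies a temperature-scaled softmax to the inner products between word embeddings and a topic embedding. The abstract does not state the exact formula used in DETM-tau, so dividing the logits by tau is an assumption here (it is the usual form of a temperature softmax), and the names `softmax_with_temperature`, `topic_word_probs`, `rho`, and `alpha` are hypothetical, chosen only for this example.

```python
import math


def softmax_with_temperature(logits, tau=1.0):
    """Return softmax(logits / tau) as a list of probabilities.

    Assumption: the temperature divides the logits, the common convention.
    Small tau sharpens the distribution; large tau smooths it.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [x / tau for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def topic_word_probs(word_embeddings, topic_embedding, tau=1.0):
    """Word probabilities for one topic at one time step (illustrative only).

    Each logit is the inner product of a word embedding with the topic
    embedding; the temperature softmax turns the logits into probabilities.
    """
    logits = [
        sum(w * a for w, a in zip(word, topic_embedding))
        for word in word_embeddings
    ]
    return softmax_with_temperature(logits, tau)


# Toy data (made up for illustration): three words in a 2-d embedding space.
rho = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
alpha = [2.0, 0.0]

sharp = topic_word_probs(rho, alpha, tau=0.5)   # more peaked distribution
smooth = topic_word_probs(rho, alpha, tau=2.0)  # flatter distribution
print(sharp)
print(smooth)
```

With these toy values, the lower temperature concentrates more probability mass on the highest-scoring word, while the higher temperature spreads it more evenly. The ranking of words is the same in both cases, since dividing by a positive tau preserves the order of the logits.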