Vol.64, No.1, 2020, pp.455-469, doi:10.32604/cmc.2020.09780
OPEN ACCESS
ARTICLE
A Phrase Topic Model Based on Distributed Representation
Jialin Ma1,*, Jieyi Cheng1, Lin Zhang1, Lei Zhou1, Bolun Chen1,2
1 Jiangsu Internet of Things and Mobile Internet Technology Engineering Laboratory, Huaiyin Institute of Technology, Huai’an, 223003, China.
2 University of Fribourg, Fribourg, 1700, Switzerland.
* Corresponding Author: Jialin Ma. Email: majl@hyit.edu.cn.
Received 18 January 2020; Accepted 30 March 2020; Issue published 20 May 2020
Abstract
Traditional topic models have been widely used to extract semantic topics from electronic documents. However, the topic words they produce are often poor in readability and consistency, so that only domain experts can guess their meaning. In fact, phrases are the main unit by which people express semantics. This paper presents Distributed Representation-Phrase Latent Dirichlet Allocation (DRPhrase LDA), a phrase topic model that enhances the semantic information of phrases via distributed representation. The experimental results show that the topics acquired by our model are more readable and consistent than those of other similar topic models.
Keywords
Phrase, topic model, LDA, distributed representation, Gibbs sampling.
Cite This Article
Ma, J., Cheng, J., Zhang, L., Zhou, L., Chen, B. (2020). A Phrase Topic Model Based on Distributed Representation. CMC-Computers, Materials & Continua, 64(1), 455–469.