Vol.124, No.2, 2020, pp.747-764, doi:10.32604/cmes.2020.010579
Enhancing Embedding-Based Chinese Word Similarity Evaluation with Concepts and Synonyms Knowledge
Fulian Yin, Yanyan Wang, Jianbo Liu*, Meiqi Ji
Communication University of China, Beijing, 100024, China
* Corresponding Author: Jianbo Liu. Email: ljbcuc@163.com
(This article belongs to the Special Issue: Information Hiding and Multimedia Security)
Received 13 March 2020; Accepted 08 May 2020; Issue published 20 July 2020
Word similarity (WS) is a fundamental and critical task in natural language processing. Existing approaches to WS mainly calculate the similarity or relatedness of word pairs from word embeddings trained on massive, high-quality corpora. However, these approaches may perform poorly when the corpus is insufficient, as in some specific fields, and they cannot capture rich semantic and sentiment information. To address these problems, we propose an enhanced embedding-based word similarity evaluation with character-word concepts and synonyms knowledge, namely the EWS-CS model, which provides extra semantic information to enhance word similarity evaluation. The core of our approach consists of a knowledge encoder and a word encoder. The knowledge encoder incorporates semantic knowledge extracted from knowledge resources, including character-word concepts, synonyms and sentiment lexicons, to obtain a knowledge representation. The word encoder learns an enhanced embedding-based word representation from a pre-trained model and the knowledge representation, based on the similarity task. Finally, experiments on four similarity evaluation datasets, in comparison with baseline models, validate the effectiveness of our EWS-CS model on the WS task.
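To make the general idea concrete, the following is a minimal sketch of embedding-based word similarity in which a pre-trained vector is blended with knowledge vectors before cosine similarity is computed. It is not the EWS-CS model: the linear interpolation, the `alpha` weight, the function names and the two-dimensional toy vectors are all assumptions made for illustration, and the actual knowledge encoder and word encoder are learned models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def enhance(word_vec, knowledge_vecs, alpha=0.5):
    """Blend a word vector with the mean of its knowledge vectors.

    Hypothetical stand-in for a learned encoder: simple interpolation
    with weight `alpha` (an assumed parameter, not from the paper).
    """
    if not knowledge_vecs:
        return list(word_vec)
    count = len(knowledge_vecs)
    mean = [sum(k[i] for k in knowledge_vecs) / count
            for i in range(len(word_vec))]
    return [(1 - alpha) * w + alpha * m for w, m in zip(word_vec, mean)]


# Toy data (invented): two words whose pre-trained vectors are orthogonal
# but which share one synonym-derived knowledge vector.
embeddings = {"高兴": [1.0, 0.0], "快乐": [0.0, 1.0]}
knowledge = {"高兴": [[1.0, 1.0]], "快乐": [[1.0, 1.0]]}

base_sim = cosine(embeddings["高兴"], embeddings["快乐"])
enhanced_sim = cosine(
    enhance(embeddings["高兴"], knowledge["高兴"]),
    enhance(embeddings["快乐"], knowledge["快乐"]),
)
print(base_sim, enhanced_sim)
```

In this toy setting the shared knowledge vector pulls the two representations toward each other, so the enhanced similarity is higher than the similarity of the raw vectors. This mirrors the motivation stated in the abstract, under the simplifying assumptions noted above.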
Word representation; concepts and synonyms knowledge; word similarity; information security
Cite This Article
Yin, F., Wang, Y., Liu, J., Ji, M. (2020). Enhancing Embedding-Based Chinese Word Similarity Evaluation with Concepts and Synonyms Knowledge. CMES-Computer Modeling in Engineering & Sciences, 124(2), 747–764.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.