ARTICLE

The Method for Extracting New Login Sentiment Words from Chinese Micro-Blog Based on Improved Mutual Information

by Guangli Zhu, Wenting Liu, Shunxiang Zhang, Xiang Chen, Chang Yin

State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines (Anhui University of Science and Technology), Anhui 232001, China

* Corresponding Author: email

Computer Systems Science and Engineering 2020, 35(3), 223-232. https://doi.org/10.32604/csse.2020.35.223

Abstract

Current methods for extracting new login sentiment words ignore the diversity of patterns in which new multi-character words (words of more than two characters) are formed, and they disregard the influence of other new words that co-occur with a new word connoting sentiment. To address both problems, this paper proposes a method for extracting new login sentiment words from Chinese micro-blogs based on improved mutual information. First, the micro-blog data are preprocessed to remove noise such as web links and punctuation, and candidate strings are obtained from the preprocessed data by N-gram segmentation. Second, an extraction algorithm for new login words is proposed that combines multi-character mutual information (MMI) with left and right adjacent entropy. In this algorithm, MMI describes the internal cohesion of a multi-character candidate string across its various constituent patterns; the candidate strings are then extended and filtered according to frequency, MMI, and left and right adjacent entropy to extract new login words. Finally, an extraction algorithm for new login sentiment words is proposed. In this algorithm, the Sentiment Similarity between Words (SW) measures how similar a new login word is in sentiment to other sentiment words and to other new login sentiment words. The sentiment tendency value of each new login word is then obtained from the SW values and used to extract the new login sentiment words. Experimental results show that the method is effective for extracting new login sentiment words.
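
The candidate-scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the MMI formula, so the cohesion score below substitutes a common stand-in, the minimum pointwise mutual information over all binary splits of a candidate string. The function names (`preprocess`, `count_ngrams`, `cohesion`, `adjacency_entropy`) and the toy posts are hypothetical.

```python
import math
import re
from collections import Counter


def preprocess(post):
    """Remove web links, then split on punctuation and whitespace."""
    post = re.sub(r"https?://\S+", " ", post)
    return re.sub(r"\W+", " ", post).split()


def count_ngrams(segments, max_n):
    """Count every character n-gram (1 <= n <= max_n) inside each segment."""
    counts = Counter()
    for seg in segments:
        for n in range(1, max_n + 1):
            for i in range(len(seg) - n + 1):
                counts[seg[i:i + n]] += 1
    return counts


def cohesion(s, counts, total):
    """Assumed stand-in for MMI: minimum PMI over all binary splits of s.

    Requires len(s) >= 2 and that s was counted by count_ngrams.
    """
    p = counts[s] / total
    return min(
        math.log(p / ((counts[s[:k]] / total) * (counts[s[k:]] / total)))
        for k in range(1, len(s))
    )


def _entropy(neighbours):
    n = sum(neighbours.values())
    if n == 0:
        return 0.0
    return -sum(v / n * math.log(v / n) for v in neighbours.values())


def adjacency_entropy(s, segments):
    """Entropy of the characters seen immediately left and right of s."""
    left, right = Counter(), Counter()
    for seg in segments:
        start = seg.find(s)
        while start != -1:
            if start > 0:
                left[seg[start - 1]] += 1
            end = start + len(s)
            if end < len(seg):
                right[seg[end]] += 1
            start = seg.find(s, start + 1)
    return _entropy(left), _entropy(right)


# Toy usage: three made-up posts, one containing a web link.
posts = ["给力啊 http://t.cn/abc", "真给力！", "太给力了"]
segments = [seg for post in posts for seg in preprocess(post)]
counts = count_ngrams(segments, max_n=3)
total = sum(len(seg) for seg in segments)  # total number of characters
print(cohesion("给力", counts, total), adjacency_entropy("给力", segments))
```

A candidate with high cohesion and high entropy on both sides behaves like a free-standing word: its characters occur together more often than chance predicts, and it appears in varied contexts. In this sketch the thresholds for frequency, cohesion, and entropy would be tuned on real data and are deliberately left out.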

Cite This Article

APA Style
Zhu, G., Liu, W., Zhang, S., Chen, X., Yin, C. (2020). The method for extracting new login sentiment words from Chinese micro-blog based on improved mutual information. Computer Systems Science and Engineering, 35(3), 223-232. https://doi.org/10.32604/csse.2020.35.223
Vancouver Style
Zhu G, Liu W, Zhang S, Chen X, Yin C. The method for extracting new login sentiment words from Chinese micro-blog based on improved mutual information. Comput Syst Sci Eng. 2020;35(3):223-232. https://doi.org/10.32604/csse.2020.35.223
IEEE Style
G. Zhu, W. Liu, S. Zhang, X. Chen, and C. Yin, “The Method for Extracting New Login Sentiment Words from Chinese Micro-Blog Based on Improved Mutual Information,” Comput. Syst. Sci. Eng., vol. 35, no. 3, pp. 223-232, 2020. https://doi.org/10.32604/csse.2020.35.223

Copyright © 2020 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.