Zeyu Xiong1,*, Qiangqiang Shen1, Yijie Wang1, Chenyang Zhu2
CMC-Computers, Materials & Continua, Vol.55, No.2, pp. 213-227, 2018, DOI:10.3970/cmc.2018.01762
Abstract Natural language document processing includes retrieval, sentiment analysis, theme extraction, etc. Classical methods for handling these tasks are based on probability models, semantic models, and network models for machine learning. The probability model inherently loses semantic information, which reduces processing accuracy. Machine learning approaches are supervised, unsupervised, or semi-supervised; labeled corpora are necessary for the semantics model and for supervised learning. A reliably labeled corpus is produced manually, which is costly and time-consuming because people have to read and annotate each document.…