Search Results (4)
  • Open Access

    ARTICLE

    Analyzing COVID-19 Discourse on Twitter: Text Clustering and Classification Models for Public Health Surveillance

    Pakorn Santakij1, Samai Srisuay2,*, Pongporn Punpeng1

    Computer Systems Science and Engineering, Vol.48, No.3, pp. 665-689, 2024, DOI:10.32604/csse.2024.045066 - 20 May 2024

    Abstract Social media has revolutionized the dissemination of real-life information, serving as a robust platform for sharing life events. Twitter, characterized by its brevity and continuous flow of posts, has emerged as a crucial source for public health surveillance, offering valuable insights into public reactions during the COVID-19 pandemic. This study aims to leverage a range of machine learning techniques to extract pivotal themes and facilitate text classification on a dataset of COVID-19 outbreak-related tweets. Diverse topic modeling approaches have been employed to extract pertinent themes and subsequently form a dataset for training text classification models.…

  • Open Access

    ARTICLE

    Unsupervised Graph-Based Tibetan Multi-Document Summarization

    Xiaodong Yan1,2, Yiqin Wang1,2, Wei Song1,2,*, Xiaobing Zhao1,2, A. Run3, Yang Yanxing4

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1769-1781, 2022, DOI:10.32604/cmc.2022.027301 - 18 May 2022

    Abstract Text summarization creates a subset that represents the most important or relevant information in the original content, which effectively reduces information redundancy. Recently, neural network methods have achieved good results in text summarization in both Chinese and English, but research on text summarization in low-resource languages is still in the exploratory stage, especially in Tibetan. What’s more, there is no large-scale annotated corpus for text summarization. The lack of datasets severely limits the development of low-resource text summarization. In this case, unsupervised learning approaches are more appealing in low-resource languages as they…

  • Open Access

    ARTICLE

    Analysis of Semi-Supervised Text Clustering Algorithm on Marine Data

    Yu Jiang1, 2, Dengwen Yu1, Mingzhao Zhao1, 2, Hongtao Bai1, 2, Chong Wang1, 2, 3, Lili He1, 2, *

    CMC-Computers, Materials & Continua, Vol.64, No.1, pp. 207-216, 2020, DOI:10.32604/cmc.2020.09861 - 20 May 2020

    Abstract Semi-supervised clustering improves learning performance by using a small number of labeled samples to assist the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data. Unsupervised K-Means and Affinity Propagation (AP) are two classical clustering algorithms. The Election-AP algorithm is proposed to handle the final cluster number in AP clustering, which has proved difficult to keep within a suitable range. The semi-supervised approach samples thermocline data in the BOA-Argo dataset according to the standard definition of the thermocline, and uses this data for semi-supervised…

  • Open Access

    ARTICLE

    The Analysis of China’s Integrity Situation Based on Big Data

    Wangdong Jiang1, Taian Yang1, *, Guang Sun1, 3, Yucai Li1, Yixuan Tang2, Hongzhang Lv1, Wenqian Xiang1

    Journal on Big Data, Vol.1, No.3, pp. 117-134, 2019, DOI:10.32604/jbd.2019.08454

    Abstract To study in depth the prominent problems faced by China’s clean-government work and to put forward effective coping strategies, this article uses big data technology to analyze online information about anti-corruption news events. In this study, we take news reports from the website of the Communist Party of China (CPC) Central Commission for Discipline Inspection (CCDI) as the data source. First, the obtained text data is preprocessed by word segmentation and stop-word handling; then the preprocessed data is processed by vectorization and text clustering; finally,…
