Search Results (3)
  • Open Access

    ARTICLE

    LLM-Based Enhanced Clustering for Low-Resource Language: An Empirical Study

    Talha Farooq Khan1, Majid Hussain1, Muhammad Arslan2, Muhammad Saeed1, Lal Khan3,*, Hsien-Tsung Chang4,5,6,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.145, No.3, pp. 3883-3911, 2025, DOI:10.32604/cmes.2025.073021 - 23 December 2025

    Abstract Text clustering is an important task because of its vital role in NLP-related tasks. However, existing research on clustering is mainly based on the English language, with limited work on low-resource languages such as Urdu. Text clustering for low-resource languages faces challenges such as limited annotated collections and strong linguistic diversity. The aim of this paper is twofold: (1) to introduce a clustering dataset, UNC-2025, comprising 100k Urdu news documents, and (2) to provide a detailed empirical study of Large Language Model (LLM)-enhanced clustering methods for Urdu text. We explicitly evaluate the…

  • Open Access

    ARTICLE

    Sentiment Analysis of Low-Resource Language Literature Using Data Processing and Deep Learning

    Aizaz Ali1, Maqbool Khan1,2, Khalil Khan3, Rehan Ullah Khan4, Abdulrahman Aloraini4,*

    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 713-733, 2024, DOI:10.32604/cmc.2024.048712 - 25 April 2024

    Abstract Sentiment analysis, a crucial task in discerning emotional tone within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, sentiment analysis of literature in resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. As Urdu literature, characterized by distinct character sets and linguistic features,…

  • Open Access

    ARTICLE

    Cross-Language Transfer Learning-based Lhasa-Tibetan Speech Recognition

    Zhijie Wang1, Yue Zhao1,*, Licheng Wu1, Xiaojun Bi1, Zhuoma Dawa2, Qiang Ji3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 629-639, 2022, DOI:10.32604/cmc.2022.027092 - 18 May 2022

    Abstract Tibetan is one of China's minority languages, and until recently Tibetan speech recognition was not researched as extensively as Chinese and English speech recognition. This, along with the relatively small Tibetan corpus, has resulted in unsatisfactory performance of Tibetan speech recognition based on an end-to-end model. This paper aims to achieve accurate Tibetan speech recognition using a small amount of Tibetan training data. We demonstrate effective methods of Tibetan end-to-end speech recognition via cross-language transfer learning from three aspects: modeling unit selection, transfer learning method, and source language selection. Experimental results show that the…
