Tech Science Press - Publisher of Open Access Journals

Open Access

ARTICLE

Leveraging Unlabeled Corpus for Arabic Dialect Identification

Mohammed Abdelmajeed¹,*, Jiangbin Zheng¹, Ahmed Murtadha¹, Youcef Nafa¹, Mohammed Abaker², Muhammad Pervez Akhter³

CMC-Computers, Materials & Continua, Vol.83, No.2, pp. 3471-3491, 2025, DOI:10.32604/cmc.2025.059870 - 16 April 2025

Abstract Arabic Dialect Identification (DID) is a Natural Language Processing (NLP) task that involves determining the dialect of a given piece of Arabic text. State-of-the-art solutions for DID are built on various deep neural networks that commonly learn sentence representations with respect to a given dialect. Despite the effectiveness of these solutions, their performance relies heavily on the number of labeled examples, which are labor-intensive to obtain and may not be readily available in real-world scenarios. To alleviate the burden of labeling data, this paper introduces a novel solution that leverages…

Open Access

ARTICLE

Arabic Dialect Identification in Social Media: A Comparative Study of Deep Learning and Transformer Approaches

Enas Yahya Alqulaity¹, Wael M.S. Yafooz¹,*, Abdullah Alourani², Ayman Jaradat³

Intelligent Automation & Soft Computing, Vol.39, No.5, pp. 907-928, 2024, DOI:10.32604/iasc.2024.055470 - 31 October 2024

Abstract Arabic dialect identification is essential in Natural Language Processing (NLP) and forms a critical component of applications such as machine translation, sentiment analysis, and cross-language text generation. The difficulty of differentiating between Arabic dialects has garnered more attention over the last 10 years, particularly on social media. This difficulty results from the overlapping vocabulary of the dialects, the fluidity of online language use, and the challenge of telling apart closely related dialects. Managing dialects with limited resources and adjusting to the ever-changing linguistic trends on social media platforms present additional challenges. A strong…

Open Access

ARTICLE

Developing Lexicons for Enhanced Sentiment Analysis in Software Engineering: An Innovative Multilingual Approach for Social Media Reviews

Zohaib Ahmad Khan¹, Yuanqing Xia¹,*, Ahmed Khan², Muhammad Sadiq², Mahmood Alam³, Fuad A. Awwad⁴, Emad A. A. Ismail⁴

CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2771-2793, 2024, DOI:10.32604/cmc.2024.046897 - 15 May 2024

Abstract Sentiment analysis is becoming increasingly important in today’s digital age, with social media being a significant source of user-generated content. Developing sentiment lexicons that support languages other than English is a challenging task, especially for analyzing sentiment in social media reviews. Most existing sentiment analysis systems focus on English, leaving a significant research gap in other languages due to limited resources and tools. This research aims to address this gap by building a sentiment lexicon for local languages, which is then used with a machine learning algorithm for efficient sentiment analysis.…

Open Access

ARTICLE

Analyzing Arabic Twitter-Based Patient Experience Sentiments Using Multi-Dialect Arabic Bidirectional Encoder Representations from Transformers

Sarab AlMuhaideb*, Yasmeen AlNegheimish, Taif AlOmar, Reem AlSabti, Maha AlKathery, Ghala AlOlyyan

CMC-Computers, Materials & Continua, Vol.76, No.1, pp. 195-220, 2023, DOI:10.32604/cmc.2023.038368 - 08 June 2023

Abstract Healthcare organizations rely on patients’ feedback and experiences to evaluate their performance and services, thereby allowing such organizations to improve inadequate services and address any shortcomings. According to the literature, social networks and particularly Twitter are effective platforms for gathering public opinions. Moreover, recent studies have used natural language processing to measure sentiments in text segments collected from Twitter to capture public opinions about various sectors, including healthcare. The present study aimed to analyze Arabic Twitter-based patient experience sentiments and to introduce an Arabic patient experience corpus. The authors collected 12,400 tweets from Arabic patients…

Open Access

ARTICLE

Recognition of Handwritten Words from Digital Writing Pad Using MMU-SNet

V. Jayanthi*, S. Thenmalar

Intelligent Automation & Soft Computing, Vol.36, No.3, pp. 3551-3564, 2023, DOI:10.32604/iasc.2023.036599 - 15 March 2023

Abstract In this paper, a Modified Multi-scale Segmentation Network (MMU-SNet) method is proposed for Tamil text recognition. Handwritten texts from digital writing pad notes are used for text recognition. Recognizing handwritten words written on a digital writing pad and converted to text files is challenging due to stylus pressure, writing on frictionless glass surfaces, limited skill in short writing, and variations in alphabet size, style, carved symbols, and orientation angle. Stylus pressure on the pad changes the words in the Tamil language alphabet because the Tamil alphabets have a smaller number of lines, angles, curves, and bends.…

Open Access

ARTICLE

Research on Tibetan Speech Recognition Based on the Am-do Dialect

Kuntharrgyal Khysru¹,*, Jianguo Wei¹,², Jianwu Dang³

CMC-Computers, Materials & Continua, Vol.73, No.3, pp. 4897-4907, 2022, DOI:10.32604/cmc.2022.027591 - 28 July 2022

Abstract In China, Tibetan is usually divided into three major dialects: the Am-do, Khams and Lhasa dialects. The Am-do dialect evolved from ancient Tibetan and is a local variant of modern Tibetan. This dialect has its own specific historical and social conditions and development, and there have been different degrees of communication with other ethnic groups, but all the abovementioned dialects developed from the same language: Tibetan. This paper uses the particularity of Tibetan suffixes in pronunciation and proposes a lexicon for the Am-do language, which addresses the problems existing in previous research. Audio data of…

Open Access

ARTICLE

Speak-Correct: A Computerized Interface for the Analysis of Mispronounced Errors

Kamal Jambi¹,*, Hassanin Al-Barhamtoshy¹, Wajdi Al-Jedaibi¹, Mohsen Rashwan², Sherif Abdou³

Computer Systems Science and Engineering, Vol.43, No.3, pp. 1155-1173, 2022, DOI:10.32604/csse.2022.024967 - 09 May 2022

Abstract Any natural language may have dozens of accents. Even when a word has the same phonemic form, speakers with different accents produce audio signals that are distinct from one another. Among the most common issues in speech processing are discrepancies in pronunciation, accent, and enunciation. This study examines the detection, correction, and summarization of accent defects in the English speech of average Arabic speakers. The article then discusses the key approaches and structure that will be used to address both accent flaws and pronunciation issues. The proposed…

Open Access

ARTICLE

Emotional Analysis of Arabic Saudi Dialect Tweets Using a Supervised Learning Approach

Abeer A. AlFutamani, Heyam H. Al-Baity*

Intelligent Automation & Soft Computing, Vol.29, No.1, pp. 89-109, 2021, DOI:10.32604/iasc.2021.016555 - 12 May 2021

Abstract Social media sites produce a large amount of data and offer companies a strong competitive advantage when they can make use of that data, as it provides a deeper understanding of clients and their needs. This understanding helps the company make correct decisions based on data obtained from social media websites. Thus, sentiment analysis has become a key tool for understanding that data. Sentiment analysis is a research area that focuses on analyzing people’s emotions and opinions to identify the polarity (e.g., positive or negative) of a given…

Open Access

ARTICLE

Classification of Dispersion Areas Using a Geographic Factor

Application to Dialectology

Clément Chagnaud¹,³, Philippe Garat², Paule-Annick Davoine¹,³, Guylaine Brun-Trigaud⁴

Revue Internationale de Géomatique, Vol.30, No.1, pp. 67-83, 2020, DOI:10.3166/rig.2020.00107

Abstract We propose a multidimensional statistical analysis procedure that couples projection and classification methods to identify coherent groups within a corpus of areal geographic entities that we call dispersion areas. The methodology integrates a geographic factor into the construction of the representation space used to project the data. By applying these methods to geolinguistic data, we can identify and explain new spatial structures within a corpus of dispersion areas of linguistic features.

Open Access

ARTICLE

Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode

Yue Zhao¹, Jianjian Yue¹, Wei Song¹,*, Xiaona Xu¹, Xiali Li¹, Licheng Wu¹, Qiang Ji²

Journal on Internet of Things, Vol.1, No.1, pp. 17-23, 2019, DOI:10.32604/jiot.2019.05866

Abstract We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as input to an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a Convolutional Neural Network, it has a simpler, more understandable structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition and demonstrate that the LRBN helps differentiate among multiple language speech sets.
