Tech Science Press - Publisher of Open Access Journals

Search Results (7)

Open Access

ARTICLE

Deep Learning-Based Lip-Reading for Vocal Impaired Patient Rehabilitation

Chiara Innocente¹,*, Matteo Boemio², Gianmarco Lorenzetti², Ilaria Pulito², Diego Romagnoli², Valeria Saponaro², Giorgia Marullo¹, Luca Ulrich¹, Enrico Vezzetti¹

CMES-Computer Modeling in Engineering & Sciences, Vol.143, No.2, pp. 1355-1379, 2025, DOI:10.32604/cmes.2025.063186 - 30 May 2025

Abstract Lip-reading technology, based on visual speech decoding and automatic speech recognition, offers a promising solution to overcoming communication barriers, particularly for individuals with temporary or permanent speech impairments. However, most Visual Speech Recognition (VSR) research has primarily focused on the English language and general-purpose applications, limiting its practical applicability in medical and rehabilitative settings. This study introduces the first Deep Learning (DL) based lip-reading system for the Italian language designed to assist individuals with vocal cord pathologies in daily interactions, facilitating communication for patients recovering from vocal cord surgeries, whether temporarily or permanently impaired. To…

View
2154

Download
901
Open Access

ARTICLE

Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications

Shuting Ge¹,², Jin Ren²,³,*, Yihua Shi⁴, Yujun Zhang¹, Shunzhi Yang², Jinfeng Yang²

CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3215-3245, 2024, DOI:10.32604/cmc.2023.046746 - 26 March 2024

Abstract In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are longer than text sequences, especially for the extended ATCC data. To address these issues,…

View
1847

Download
846

Like
1
Open Access

ARTICLE

Joint On-Demand Pruning and Online Distillation in Automatic Speech Recognition Language Model Optimization

Soonshin Seo¹,², Ji-Hwan Kim²,*

CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 2833-2856, 2023, DOI:10.32604/cmc.2023.042816 - 26 December 2023

Abstract Automatic speech recognition (ASR) systems have emerged as indispensable tools across a wide spectrum of applications, ranging from transcription services to voice-activated assistants. To enhance the performance of these systems, it is important to deploy efficient models capable of adapting to diverse deployment conditions. In recent years, on-demand pruning methods have attracted significant attention within the ASR domain due to their adaptability in various deployment scenarios. However, these methods often confront substantial trade-offs, particularly in terms of unstable accuracy when reducing the model size. To address these challenges, this study introduces two crucial empirical findings. Firstly,…

View
1568

Download
829
Open Access

ARTICLE

A Robust Conformer-Based Speech Recognition Model for Mandarin Air Traffic Control

Peiyuan Jiang¹, Weijun Pan¹,*, Jian Zhang¹, Teng Wang¹, Junxiang Huang²

CMC-Computers, Materials & Continua, Vol.77, No.1, pp. 911-940, 2023, DOI:10.32604/cmc.2023.041772 - 31 October 2023

Abstract This study aims to address the deviation in downstream tasks caused by inaccurate recognition results when applying Automatic Speech Recognition (ASR) technology in the Air Traffic Control (ATC) field. This paper presents a novel cascaded model architecture, namely Conformer-CTC/Attention-T5 (CCAT), to build a highly accurate and robust ATC speech recognition model. To tackle the challenges posed by noise and fast speech rate in ATC, the Conformer model is employed to extract robust and discriminative speech representations from raw waveforms. On the decoding side, the Attention mechanism is integrated to facilitate precise alignment between input features and…

View
1367

Download
779

Like
1
Open Access

ARTICLE

Speech Recognition via CTC-CNN Model

Wen-Tsai Sung¹, Hao-Wei Kang¹, Sung-Jung Hsiao²,*

CMC-Computers, Materials & Continua, Vol.76, No.3, pp. 3833-3858, 2023, DOI:10.32604/cmc.2023.040024 - 08 October 2023

Abstract In a speech recognition system, the acoustic model is an important underlying model, and its accuracy directly affects the performance of the entire system. This paper introduces the construction and training process of the acoustic model in detail and studies the Connectionist Temporal Classification (CTC) algorithm, which plays an important role in the end-to-end framework. A convolutional neural network (CNN) combined with a CTC-based acoustic model is established to improve the accuracy of speech recognition. This study uses a sound sensor, ReSpeaker Mic Array v2.0.1, to convert the collected speech signals into text…

View
1437

Download
986
Open Access

ARTICLE

An End-to-End Transformer-Based Automatic Speech Recognition for Qur’an Reciters

Mohammed Hadwan¹,²,*, Hamzah A. Alsayadi³,⁴, Salah AL-Hagree⁵

CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 3471-3487, 2023, DOI:10.32604/cmc.2023.033457 - 31 October 2022

Abstract The attention-based encoder-decoder technique, known as the transformer, is used to enhance the performance of end-to-end automatic speech recognition (ASR). This research focuses on applying end-to-end transformer-based ASR models to the Arabic language, as the research community pays little attention to it. The Muslim holy book, the Qur’an, is written using Arabic diacritized text. In this paper, an end-to-end transformer model for building a robust Qur’an vs. recognition is proposed. The acoustic model was built using the transformer-based model as deep learning by the PyTorch framework. A multi-head attention mechanism is utilized to represent the encoder and…

View
2728

Download
1311
Open Access

REVIEW

Challenges and Limitations in Speech Recognition Technology: A Critical Review of Speech Signal Processing Algorithms, Tools and Systems

Sneha Basak¹, Himanshi Agrawal¹, Shreya Jena¹, Shilpa Gite²,*, Mrinal Bachute², Biswajeet Pradhan³,⁴,⁵,*, Mazen Assiri⁴

CMES-Computer Modeling in Engineering & Sciences, Vol.135, No.2, pp. 1053-1089, 2023, DOI:10.32604/cmes.2022.021755 - 27 October 2022

Abstract Speech recognition systems have become a unique human-computer interaction (HCI) family. Speech is one of the most naturally developed human abilities; speech signal processing opens up a transparent and hands-free computation experience. This paper aims to present a retrospective yet modern approach to the world of speech recognition systems. The development journey of ASR (Automatic Speech Recognition) has seen quite a few milestones and breakthrough technologies that have been highlighted in this paper. A step-by-step rundown of the fundamental stages in developing speech recognition systems has been presented, along with a brief discussion of various…

View
7436

Download
2935
