Open Access
ARTICLE
Comprehensive Analysis of Gender Classification Accuracy across Varied Geographic Regions through the Application of Deep Learning Algorithms to Speech Signals
Department of Electronics and Communication Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Delhi–NCR Campus, Ghaziabad, 201204, India
* Corresponding Author: Abhishek Singhal. Email:
Computer Systems Science and Engineering 2024, 48(3), 609-625. https://doi.org/10.32604/csse.2023.046730
Received 13 October 2023; Accepted 12 December 2023; Issue published 20 May 2024
Abstract
This article presents a comparative study of gender identification accuracy across geographical regions, using a deep learning classifier applied to speech signals. Speech samples for training and testing are grouped by geographical origin: Category 1 comprises speech samples from speakers outside India, whereas Category 2 comprises live-recorded speech samples from Indian speakers. The testing samples are further divided into four sets according to both the geographical origin and the language spoken by the speakers. The results show a noticeable difference in gender identification accuracy among speakers from different geographical areas. When the system is trained on speech samples from Indian speakers, Indian speakers, who use 52 Hindi and 26 English phonemes in their speech, achieve a gender identification accuracy of 85.75%, notably higher than that of speakers who predominantly use the 26 English phonemes. When the system is trained on speech samples from speakers outside India, the gender identification accuracy of the proposed model reaches 83.20%. Mel Frequency Cepstral Coefficients (MFCCs) serve as the features extracted from the speech data, and the classifier is a Recurrent Neural Network (RNN) built on a Bidirectional Long Short-Term Memory (BiLSTM) architecture.
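As a rough illustration of the MFCC feature extraction mentioned in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not state the analysis settings, so the 16 kHz sampling rate, 25 ms frames, 10 ms hop, 26 mel filters, and 13 coefficients are all assumptions, as are the function names. The output is one feature vector per frame, which is the kind of sequence a BiLSTM classifier could take as input.

```python
import numpy as np

def hz_to_mel(hz):
    # Common mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default parameter values here are illustrative assumptions.
    # 1. Split the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Log energy in each mel band (small constant avoids log(0)).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # 4. DCT-II across the mel bands; keep the first n_coeffs coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T  # shape: (n_frames, n_coeffs)

# Usage on a synthetic one-second tone standing in for a speech sample.
tone = np.sin(2 * np.pi * 220.0 * np.arange(16000) / 16000.0)
features = mfcc(tone)
```

In practice, an established audio library would normally be used for this step; the sketch only makes the framing, mel filtering, and cepstral transform explicit.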
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.