Arfat Ahmad Khan1, Rashid Jahangir2,*, Roobaea Alroobaea3, Saleh Yahya Alyahyan4, Ahmed H. Almulhi3, Majed Alsafyani3, Chitapong Wechtaisong5
CMC-Computers, Materials & Continua, Vol.75, No.2, pp. 4085-4100, 2023, DOI:10.32604/cmc.2023.036797
- 31 March 2023
Abstract Automatic Speaker Identification (ASI) distinguishes among the utterances of multiple speakers within an audio stream. Several common factors, such as framework differences, overlapping sound events, and the presence of multiple sound sources during recording, make the ASI task considerably more complex. This research proposes a deep learning model to improve the accuracy of the ASI system and reduce model training time under limited computational resources. Specifically, the performance of the transformer model is investigated. Seven audio features, chromagram, Mel-spectrogram, tonnetz, Mel-Frequency Cepstral Coefficients (MFCCs), delta …
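
Two of the features named in the abstract, the Mel-spectrogram and MFCCs, are built on the mel frequency scale. As a minimal sketch of that scale only (not the authors' feature pipeline), the snippet below uses the widely used HTK-style formula mel = 2595 · log10(1 + f/700). The function names, the filter count, and the 0 to 8000 Hz range are illustrative assumptions and do not come from the paper.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> list[float]:
    """Return n_mels filter centre frequencies (Hz), equally spaced in mels.

    The band edges f_min and f_max are excluded, which mirrors how a
    triangular mel filterbank places its filter peaks.
    """
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Hypothetical settings chosen for illustration: 8 filters over 0-8000 Hz.
    for i, f in enumerate(mel_center_frequencies(8, 0.0, 8000.0), start=1):
        print(f"filter {i}: {f:.1f} Hz")
```

Because the centres are equally spaced in mels rather than in Hz, they sit close together at low frequencies and spread apart at high frequencies, which approximates the ear's reduced frequency resolution at higher pitches.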