Open Access
ARTICLE
Cross-Language Transfer Learning-based Lhasa-Tibetan Speech Recognition
1 School of Information Engineering, Minzu University of China, Beijing, 100081, China
2 School of Chinese Ethnic Minority Languages and Literatures, Minzu University of China, Beijing, 100081, China
3 Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA
* Corresponding Author: Yue Zhao. Email:
Computers, Materials & Continua 2022, 73(1), 629-639. https://doi.org/10.32604/cmc.2022.027092
Received 10 January 2022; Accepted 30 March 2022; Issue published 18 May 2022
Abstract
As one of China's minority languages, Tibetan has not been studied as extensively in speech recognition as Chinese and English until recently. This, together with the relatively small size of available Tibetan corpora, has led to unsatisfactory performance of end-to-end Tibetan speech recognition models. This paper aims to achieve accurate Tibetan speech recognition using a small amount of Tibetan training data. We demonstrate effective methods for end-to-end Tibetan speech recognition via cross-language transfer learning from three aspects: modeling unit selection, transfer learning method, and source language selection. Experimental results show that the Chinese-Tibetan multi-language learning method using a multi-language character set as the modeling unit yields the best performance, with a Tibetan Character Error Rate (CER) of 27.3%, a reduction of 26.1% compared with the language-specific model. Our method also achieves 2.2% higher accuracy while using less data than the method based on Tibetan multi-dialect transfer learning under the same model structure and data set.
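For readers unfamiliar with the metric, CER is conventionally computed as the edit distance between the reference and the recognized unit sequences, divided by the reference length. The following is a minimal sketch of that conventional definition, not the scoring script used in this work; the function names `edit_distance` and `character_error_rate` are hypothetical, and how a transcript is split into Tibetan character units is left to the caller as an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of units."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference unit
                curr[j - 1] + 1,     # insertion of a hypothesis unit
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER = edit distance / number of reference units (assumed definition)."""
    if len(ref) == 0:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Toy example with placeholder units; real input would be lists of
# Tibetan character units produced by the caller's own segmentation.
reference = ["a", "b", "c", "d"]
hypothesis = ["a", "b", "e", "d"]
cer = character_error_rate(reference, hypothesis)
```

Passing sequences rather than raw strings keeps the sketch independent of any particular segmentation, since a Tibetan character unit may span several Unicode code points.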
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.