Open Access
ARTICLE
Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode
School of Information and Engineering, Minzu University of China, Beijing, 100081, China.
Rensselaer Polytechnic Institute, 110 Eighth Street, Troy NY 12180-3590, USA.
*Corresponding Author: Wei Song. Email: .
Journal on Internet of Things 2019, 1(1), 17-23. https://doi.org/10.32604/jiot.2019.05866
Abstract
We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input to an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a convolutional neural network, it has a simpler, more easily understood structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition, and demonstrate that the LRBN helps to differentiate among multiple language speech sets.
Cite This Article
Y. Zhao, J. Yue, W. Song, X. Xu, X. Li et al., "Tibetan multi-dialect speech recognition using latent regression Bayesian network and end-to-end mode," Journal on Internet of Things, vol. 1, no. 1, pp. 17–23, 2019.