Yue Zhao1, Jianjian Yue1, Wei Song1,*, Xiaona Xu1, Xiali Li1, Licheng Wu1, Qiang Ji2
CMC-Computers, Materials & Continua, Vol.60, No.3, pp. 1223-1235, 2019, DOI:10.32604/cmc.2019.05636
Abstract Tibetan language has very limited resource for conventional automatic speech recognition so far. It lacks of enough data, sub-word unit, lexicons and word inventories for some dialects. And speech content recognition and dialect classification have been treated as two independent tasks and modeled respectively in most prior works. But the two tasks are highly correlated. In this paper, we present a multi-task WaveNet model to perform simultaneous Tibetan multi-dialect speech recognition and dialect identification. It avoids processing the pronunciation dictionary and word segmentation for new dialects, while, in the meantime, allows training speech recognition and More >