Search Results (13)
  • Open Access

    ARTICLE

    Research on Tibetan Speech Recognition Based on the Am-do Dialect

    Kuntharrgyal Khysru1,*, Jianguo Wei1,2, Jianwu Dang3

    CMC-Computers, Materials & Continua, Vol.73, No.3, pp. 4897-4907, 2022, DOI:10.32604/cmc.2022.027591 - 28 July 2022

    Abstract In China, Tibetan is usually divided into three major dialects: the Am-do, Khams and Lhasa dialects. The Am-do dialect evolved from ancient Tibetan and is a local variant of modern Tibetan. Although this dialect has its own historical and social conditions and development, and has had varying degrees of contact with other ethnic groups, all the abovementioned dialects developed from the same language: Tibetan. This paper exploits the particular pronunciation of Tibetan suffixes and proposes a lexicon for the Am-do language that addresses problems in previous research. Audio data of…

  • Open Access

    ARTICLE

    Tibetan Sorting Method Based on Hash Function

    AnJian-CaiRang1,2, Dawei Song3,4,*

    Journal on Artificial Intelligence, Vol.4, No.2, pp. 85-98, 2022, DOI:10.32604/jai.2022.029141 - 18 July 2022

    Abstract Sorting the Tibetan language quickly and accurately requires first identifying the component elements that make up Tibetan syllables and then sorting by the priority of the components. Based on the study of Tibetan text structure, grammatical rules and syllable structure, we present a structure-based Tibetan syllable recognition method that uses syllable structure instead of grammar. This method avoids complicated Tibetan grammar and recognizes the components of Tibetan syllables simply and quickly. On the basis of identifying the components of Tibetan syllables, a Tibetan syllable sorting algorithm that conforms to the language sorting rules is proposed.…

  • Open Access

    ARTICLE

    Unsupervised Graph-Based Tibetan Multi-Document Summarization

    Xiaodong Yan1,2, Yiqin Wang1,2, Wei Song1,2,*, Xiaobing Zhao1,2, A. Run3, Yang Yanxing4

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1769-1781, 2022, DOI:10.32604/cmc.2022.027301 - 18 May 2022

    Abstract Text summarization creates a subset that represents the most important or relevant information in the original content, which effectively reduces information redundancy. Recently, neural network methods have achieved good results in text summarization for both Chinese and English, but research on text summarization in low-resource languages, especially Tibetan, is still in the exploratory stage. Moreover, there is no large-scale annotated corpus for text summarization. The lack of datasets severely limits the development of low-resource text summarization. In this case, unsupervised learning approaches are more appealing in low-resource languages as they…

  • Open Access

    ARTICLE

    Cross-Language Transfer Learning-based Lhasa-Tibetan Speech Recognition

    Zhijie Wang1, Yue Zhao1,*, Licheng Wu1, Xiaojun Bi1, Zhuoma Dawa2, Qiang Ji3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 629-639, 2022, DOI:10.32604/cmc.2022.027092 - 18 May 2022

    Abstract As one of China's minority languages, Tibetan has not, until recently, been studied for speech recognition as extensively as Chinese and English. This, along with the relatively small Tibetan corpus, has resulted in unsatisfactory performance of Tibetan speech recognition based on end-to-end models. This paper aims to achieve accurate Tibetan speech recognition using a small amount of Tibetan training data. We demonstrate effective methods of Tibetan end-to-end speech recognition via cross-language transfer learning from three aspects: modeling unit selection, transfer learning method, and source language selection. Experimental results show that the…

  • Open Access

    ARTICLE

    Tibetan Question Generation Based on Sequence to Sequence Model

    Yuan Sun1,2,*, Chaofan Chen1,2, Andong Chen3, Xiaobing Zhao1,2

    CMC-Computers, Materials & Continua, Vol.68, No.3, pp. 3203-3213, 2021, DOI:10.32604/cmc.2021.016517 - 06 May 2021

    Abstract As the dual task of question answering, question generation (QG) is a significant and challenging task that aims to generate valid and fluent questions from a given paragraph. The QG task is of great significance to question answering systems, conversational systems, and machine reading comprehension systems. Recent sequence-to-sequence neural models have achieved outstanding performance in English and Chinese QG tasks. However, the task of Tibetan QG is rarely mentioned. The key factor impeding its development is the lack of a public Tibetan QG dataset. Faced with this challenge, the present paper first collects…

  • Open Access

    ARTICLE

    Chemical Constituents of Pedicularis longiflora var. tubiformis (Orobanchaceae), a Common Hemiparasitic Medicinal Herb from the Qinghai Lake Basin, China

    Feng Liu1,2,3, Zilan Ma1,2,3, Marcos A. Caraballo-Ortiz4, Hui Zhang5, Xu Su1,2,3, Yuping Liu1,2,3,*

    Phyton-International Journal of Experimental Botany, Vol.89, No.4, pp. 1083-1090, 2020, DOI:10.32604/phyton.2020.011239 - 09 November 2020

    Abstract Pedicularis longiflora var. tubiformis (Orobanchaceae) is an abundant parasitic herb mainly found in the Xiaopohu wetland of the Qinghai Lake Basin in Northwestern China. The species has important local medicinal value, and in this study, we evaluated the chemical profile of its stems, leaves and seeds using mass spectrometry. Dried samples of stems, leaves and seeds were ground, weighed, and used for a series of extractions with an ultrasonic device at room temperature. The chemical profiles for each tissue were determined using Gas Chromatography-Mass Spectrometry (GC-MS) and Liquid Chromatography-Mass Spectrometry (LC-MS). Twenty-seven amino acids and organic…

  • Open Access

    ARTICLE

    Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode

    Yue Zhao1, Jianjian Yue1, Wei Song1,*, Xiaona Xu1, Xiali Li1, Licheng Wu1, Qiang Ji2

    Journal on Internet of Things, Vol.1, No.1, pp. 17-23, 2019, DOI:10.32604/jiot.2019.05866

    Abstract We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input to an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a Convolutional Neural Network, it has a simpler, more understandable structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition, and demonstrate that the LRBN helps to differentiate among multiple language speech sets.

  • Open Access

    ARTICLE

    Readability Assessment of Textbooks in Low Resource Languages

    Zhijuan Wang1,2, Xiaobin Zhao1,2, Wei Song1,*, Antai Wang3

    CMC-Computers, Materials & Continua, Vol.61, No.1, pp. 213-225, 2019, DOI:10.32604/cmc.2019.05690

    Abstract Readability is a fundamental problem in textbook assessment. For low-resource languages (LRL), however, little investigation has been done on the readability of textbooks. In this paper, we propose a readability assessment method for textbooks in Tibetan (a low-resource language). We extract features based on the information obtained from Tibetan segmentation and named entity recognition. Then, we calculate the correlation of different features using the Pearson correlation coefficient and select some feature sets to design the readability formula. Fit detection, F test and T test are applied on these selected features to generate a…

  • Open Access

    ARTICLE

    Tibetan Multi-Dialect Speech and Dialect Identity Recognition

    Yue Zhao1, Jianjian Yue1, Wei Song1,*, Xiaona Xu1, Xiali Li1, Licheng Wu1, Qiang Ji2

    CMC-Computers, Materials & Continua, Vol.60, No.3, pp. 1223-1235, 2019, DOI:10.32604/cmc.2019.05636

    Abstract The Tibetan language has so far had very limited resources for conventional automatic speech recognition. It lacks sufficient data, sub-word units, lexicons and word inventories for some dialects. In addition, speech content recognition and dialect classification have been treated as two independent tasks and modeled separately in most prior work, even though the two tasks are highly correlated. In this paper, we present a multi-task WaveNet model to perform simultaneous Tibetan multi-dialect speech recognition and dialect identification. It avoids processing the pronunciation dictionary and word segmentation for new dialects while allowing training speech recognition and…

  • Open Access

    ARTICLE

    Tibetan Sentiment Classification Method Based on Semi-Supervised Recursive Autoencoders

    Xiaodong Yan1,2, Wei Song1,2,*, Xiaobing Zhao1,2, Anti Wang3

    CMC-Computers, Materials & Continua, Vol.60, No.2, pp. 707-719, 2019, DOI:10.32604/cmc.2019.05157

    Abstract We apply the semi-supervised recursive autoencoders (RAE) model to the sentiment classification of Tibetan short text and obtain a better classification effect. The input of the semi-supervised RAE model is the word vector. We crawled a large amount of Tibetan text from the Internet, obtained Tibetan word vectors using Word2vec, and verified their validity through simple experiments. The values of the parameter α and the word vector dimension are important to the model's performance. The experimental results indicate that when α is 0.3 and the word vector dimension is 60, the model works best.

Displaying 1-10 of 13 (page 1).