Open Access
ARTICLE
Tibetan Sorting Method Based on Hash Function
1 College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
2 The Computer College, Qinghai Minzu University, Qinghai, 810007, China
3 School of Computer Science, Beijing Institute of Technology, Beijing, 10008, China
4 School of Computing and Communications, The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom
* Corresponding Author: Dawei Song. Email:
Journal on Artificial Intelligence 2022, 4(2), 85-98. https://doi.org/10.32604/jai.2022.029141
Received 26 February 2022; Accepted 21 April 2022; Issue published 18 July 2022
Abstract
Sorting Tibetan text quickly and accurately requires first identifying the components that make up each Tibetan syllable and then ordering syllables by the priority of those components. Based on a study of Tibetan text structure, grammatical rules, and syllable structure, we present a structure-based Tibetan syllable recognition method that relies on syllable structure instead of grammar. This method avoids the complexity of Tibetan grammar and recognizes the components of Tibetan syllables simply and quickly. Building on this component identification, we propose a Tibetan syllable sorting algorithm that conforms to the sorting rules of the language. The core of the algorithm is a hash function. We found that sorting all legal Tibetan syllables requires eight components of information. The hash function is built on this finding, and its components can be assigned weights according to different sorting requirements. To verify the effectiveness of the Tibetan sorting algorithm, we established an experimental corpus from the New Tibetan Orthographic Dictionary, the Tibetan sorting standard recognized by the majority of Tibetan users. Experiments show that the method produces results completely consistent with standard reference works, with an accuracy of 100% and minimal computational time.
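The weighted hash described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the slot order, the `BASE` value, the function names, and the rank vectors below are all hypothetical. It assumes only that each syllable has already been decomposed into eight integer ranks, listed from highest to lowest sorting priority, and it folds them into a single integer so that comparing integers gives the same order as comparing the components one by one.

```python
from typing import List, Sequence, Tuple

NUM_SLOTS = 8  # the abstract states that eight components are sufficient
BASE = 64      # hypothetical upper bound on the rank within any one slot


def syllable_key(ranks: Sequence[int], base: int = BASE) -> int:
    """Fold eight per-component ranks into one integer sort key.

    ranks[0] is assumed to be the highest-priority component; each later
    slot is weighted by a smaller power of `base`.
    """
    if len(ranks) != NUM_SLOTS:
        raise ValueError(f"expected {NUM_SLOTS} ranks, got {len(ranks)}")
    key = 0
    for rank in ranks:
        if not 0 <= rank < base:
            raise ValueError(f"rank {rank} is outside [0, {base})")
        # Multiplying by base shifts earlier slots to a higher weight.
        key = key * base + rank
    return key


def sort_syllables(
    items: Sequence[Tuple[str, Sequence[int]]]
) -> List[str]:
    """Sort (label, ranks) pairs by their hashed key and return the labels."""
    ordered = sorted(items, key=lambda item: syllable_key(item[1]))
    return [label for label, _ in ordered]


# Placeholder labels and made-up rank vectors, for illustration only.
example = [
    ("s3", (2, 0, 0, 0, 0, 0, 0, 0)),
    ("s1", (1, 0, 0, 0, 0, 0, 0, 0)),
    ("s2", (1, 0, 0, 0, 0, 0, 3, 0)),
]
print(sort_syllables(example))
```

Because every rank is smaller than `BASE`, a difference in an earlier slot always outweighs any combination of later slots, which is what lets a single integer comparison stand in for a component-by-component comparison. Changing the slot order or the base is one way to express a different sorting requirement.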
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.