Open Access
ARTICLE
Tibetan Question Generation Based on Sequence to Sequence Model
1 School of Information Engineering, Minzu University of China, Beijing, 100081, China
2 Minority Languages Branch, National Language Resource and Monitoring Research Center
3 Queen Mary University of London, London, E1 4NS, UK
* Corresponding Author: Yuan Sun. Email:
Computers, Materials & Continua 2021, 68(3), 3203-3213. https://doi.org/10.32604/cmc.2021.016517
Received 04 January 2021; Accepted 06 March 2021; Issue published 06 May 2021
Abstract
As the dual task of question answering, question generation (QG) is a significant and challenging task that aims to generate valid and fluent questions from a given paragraph. QG is of great significance to question answering systems, conversational systems, and machine reading comprehension systems. Recent sequence-to-sequence neural models have achieved outstanding performance on English and Chinese QG tasks. However, Tibetan QG has rarely been studied, and the key factor impeding its development is the lack of a public Tibetan QG dataset. To address this challenge, we first collect 425 articles from the Tibetan Wikipedia website and construct 7,234 question–answer pairs through crowdsourcing. Second, we propose a Tibetan QG model based on the sequence-to-sequence framework to generate Tibetan questions from given paragraphs. Third, to generate answer-aware questions, we introduce an attention mechanism that captures the key semantic information related to the answer. We also adopt a copy mechanism that copies words from the paragraph, which avoids generating unknown or rare words in the question. Finally, experiments show that our model achieves higher performance than the baseline models. We further analyze the attention and copy mechanisms and demonstrate their effectiveness through experiments.
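To make the role of the attention and copy mechanisms concrete, the following is a minimal sketch of one decoding step, not the model evaluated in the paper. It assumes dot-product attention and a pointer-generator style mixture controlled by a scalar `p_gen`; these choices, the function names, and the toy English tokens and vectors are all illustrative assumptions, since the abstract does not specify the exact formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(decoder_state, encoder_states):
    """Dot-product attention (an assumption): one weight per source token."""
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    return softmax(scores)

def copy_augmented_distribution(vocab_probs, attn, source_tokens, p_gen):
    """Mix the generation distribution with the copy (attention) distribution.

    p_gen is a hypothetical scalar in [0, 1]: the probability of generating
    from the vocabulary rather than copying from the paragraph.
    """
    final = {word: p_gen * p for word, p in vocab_probs.items()}
    for token, weight in zip(source_tokens, attn):
        # Source tokens receive probability mass even if out of vocabulary.
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final

# Toy inputs (illustrative only): four source tokens with 2-d encoder states.
source_tokens = ["Lhasa", "is", "the", "capital"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.5], [0.5, 0.5]]
decoder_state = [2.0, 0.0]

attn = attention_weights(decoder_state, encoder_states)
vocab_probs = {"what": 0.6, "is": 0.3, "<unk>": 0.1}
dist = copy_augmented_distribution(vocab_probs, attn, source_tokens, p_gen=0.5)
```

In this toy setting the out-of-vocabulary token "Lhasa" receives nonzero probability through the copy path, which is the behavior the copy mechanism is meant to provide for rare words.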
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.