Open Access
ARTICLE
Information Extraction Based on Multi-turn Question Answering for Analyzing Korean Research Trends
1 School of Computer Science and Engineering, KOREATECH, Cheonan, 31253, Korea
2 Korea Institute of Science and Technology Information, KISTI, Daejeon, 34141, Korea
* Corresponding Author: Heung-Seon Oh. Email:
Computers, Materials & Continua 2023, 74(2), 2967-2980. https://doi.org/10.32604/cmc.2023.031983
Received 02 May 2022; Accepted 01 August 2022; Issue published 31 October 2022
Abstract
Analyzing Research and Development (R&D) trends is important because it can influence future decisions regarding R&D direction. In typical trend analysis, topic or technology taxonomies are employed to compute the popularity of topics or codes over time. Although this approach is simple and effective, the taxonomies are difficult to maintain because new technologies are introduced rapidly. Therefore, recent studies exploit deep learning to extract pre-defined targets such as problems and solutions. Based on recent advances in question answering (QA) using deep learning, we adopt a multi-turn QA model to extract problems and solutions from Korean R&D reports. In light of previous research, we use the reports directly and analyze the difficulties of handling them with QA-style Information Extraction (IE) for sentence-level benchmark datasets. After investigating the characteristics of Korean R&D reports, we propose a model that deals with multiple and repeated appearances of targets in the reports. The model includes an algorithm with two novel modules and a prompt. The proposed methodology focuses on reformulating a question without a static template or pre-defined knowledge. We show the effectiveness of the proposed model using a Korean R&D report dataset that we constructed, and we present an in-depth analysis of the benefits of the multi-turn QA model.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.