Open Access
ARTICLE
Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models
School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, China
* Corresponding Author: Jianpeng Hu. Email:
Computers, Materials & Continua 2024, 79(2), 2481-2503. https://doi.org/10.32604/cmc.2024.050005
Received 24 January 2024; Accepted 27 March 2024; Issue published 15 May 2024
Abstract
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of the semantic interaction information between entities and relations, difficulty in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to improve the model's performance on domain-specific relational triple extraction tasks. First, we propose an attention interaction module that enhances the semantic interaction between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, generating domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models, with F1 scores exceeding those of state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.