MMCSD: Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation

Huansha Wang; Ruiyang Huang; Qinrang Liu; Shaomei Li; Jianpeng Zhang

doi:10.32604/cmc.2025.060395

Open Access

ARTICLE


Huansha Wang^*, Ruiyang Huang^*, Qinrang Liu, Shaomei Li, Jianpeng Zhang

National Digital Switching System Engineering & Technological R&D Center, Information Engineering University, Zhengzhou, 450001, China

* Corresponding Authors: Huansha Wang; Ruiyang Huang

Computers, Materials & Continua 2025, 83(1), 761-783. https://doi.org/10.32604/cmc.2025.060395

Received 31 October 2024; Accepted 20 January 2025; Issue published 26 March 2025

Abstract

Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering previously unknown triples. Because data and knowledge grow continuously and data sources are limited, the visual knowledge in knowledge graphs is generally of low quality, and some entities lack the visual modality altogether. Nevertheless, previous MMKGC studies have focused primarily on facilitating modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. As a result, mainstream MMKGC models use only pre-trained visual encoders to extract features and transfer the semantic information to the joint embeddings through modal fusion, which inevitably leads to problems such as error propagation and increased uncertainty. To address these problems, we propose a Multi-modal knowledge graph Completion model based on Super-resolution and Detailed Description Generation (MMCSD). Specifically, we leverage a pre-trained residual network to enhance the resolution and improve the quality of the visual modality. Moreover, we design multi-level visual semantic extraction and entity description generation to further extract entity semantics from structural triples and visual images. Meanwhile, we train a variational multi-modal auto-encoder and utilize a pre-trained multi-modal language model to complement the missing visual features. We conducted experiments on FB15K-237 and DB13K, and the results show that MMCSD can effectively perform MMKGC and achieves state-of-the-art performance.
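
To make the completion task in the abstract concrete, the following is a minimal sketch of ranking candidate tail entities for an incomplete triple (head, relation, ?). It is not the MMCSD model: the averaging fusion, the TransE-style score, the structural-only fallback for an entity with no image, and all entity names and vector values are assumptions chosen for illustration. MMCSD instead generates the missing visual features with a variational multi-modal auto-encoder and a pre-trained multi-modal language model.

```python
import math

# Toy 2-d embeddings; every value here is made up for illustration.
structural = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [0.0, 0.0],
    "Germany": [0.0, 1.0],
}
# Visual features; "Germany" has no image, mimicking a missing visual modality.
visual = {
    "Paris": [0.8, 0.2],
    "France": [1.2, 0.8],
    "Berlin": [0.2, -0.2],
}
relations = {"capital_of": [0.0, 1.0]}


def fuse(entity):
    """Average structural and visual vectors (a stand-in for modal fusion).

    Falls back to the structural vector alone when the visual one is missing.
    """
    s = structural[entity]
    v = visual.get(entity)
    if v is None:
        return list(s)
    return [(a + b) / 2 for a, b in zip(s, v)]


def score(head, relation, tail):
    """TransE-style score: negative Euclidean distance between h + r and t."""
    h, r, t = fuse(head), relations[relation], fuse(tail)
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail, best score first."""
    candidates = [e for e in structural if e != head]
    return sorted(candidates, key=lambda e: score(head, relation, e), reverse=True)


print(rank_tails("Paris", "capital_of"))
```

In this toy setup, the entity without an image still takes part in ranking, but only through its structural vector, which is the kind of gap the abstract says MMCSD fills with generated visual features.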

Keywords

Multi-modal knowledge graph; knowledge graph completion; multi-modal fusion

Cite This Article

APA Style

Wang, H., Huang, R., Liu, Q., Li, S., Zhang, J. (2025). MMCSD: multi-modal knowledge graph completion based on super-resolution and detailed description generation. Computers, Materials & Continua, 83(1), 761–783. https://doi.org/10.32604/cmc.2025.060395

Vancouver Style

Wang H, Huang R, Liu Q, Li S, Zhang J. MMCSD: multi-modal knowledge graph completion based on super-resolution and detailed description generation. Comput Mater Contin. 2025;83(1):761–783. https://doi.org/10.32604/cmc.2025.060395

IEEE Style

H. Wang, R. Huang, Q. Liu, S. Li, and J. Zhang, “MMCSD: Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation,” Comput. Mater. Contin., vol. 83, no. 1, pp. 761–783, 2025. https://doi.org/10.32604/cmc.2025.060395


Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
