Open Access
ARTICLE
Smart Contract Vulnerability Detection Using Large Language Models and Graph Structural Analysis
1 Department of Artificial Intelligence Application, Kwangwoon University, Seoul, 01897, Republic of Korea
2 Department of Artificial Intelligence Convergence, Kwangwoon University, Seoul, 01897, Republic of Korea
3 Department of Big Data Analytics, KyungHee University, Seoul, 02447, Republic of Korea
4 Department of Management Information Systems, Jeju National University, Jeju, 63243, Republic of Korea
5 School of Information Convergence, Kwangwoon University, Seoul, 01897, Republic of Korea
* Corresponding Authors: Jinhyun Ahn. Email: ; Dong-Hyuk Im. Email:
(This article belongs to the Special Issue: Advances in AI Techniques in Convergence ICT)
Computers, Materials & Continua 2025, 83(1), 785-801. https://doi.org/10.32604/cmc.2025.061185
Received 19 November 2024; Accepted 12 February 2025; Issue published 26 March 2025
Abstract
Smart contracts are self-executing programs on blockchains that manage complex business logic with transparency and integrity. However, their immutability after deployment makes programming errors particularly critical, as such errors can be exploited to compromise blockchain security. Existing vulnerability detection methods often rely on fixed rules or target specific vulnerabilities, limiting their scalability and adaptability to diverse smart contract scenarios. Furthermore, natural language processing approaches for source code analysis frequently fail to capture program flow, which is essential for identifying structural vulnerabilities. To address these limitations, we propose a novel model that integrates textual and structural information for smart contract vulnerability detection. Our approach employs the CodeBERT NLP model for textual analysis, augmented with structural insights derived from control flow graphs created using the abstract syntax tree and opcode of smart contracts. Each graph node is embedded using Sent2Vec, and centrality analysis is applied to highlight critical paths and nodes within the code. The extracted features are normalized and combined into a prompt for a large language model to detect vulnerabilities effectively. Experimental results demonstrate the superiority of our model, achieving an accuracy of 86.70%, a recall of 84.87%, a precision of 85.24%, and an F1-score of 84.46%. These outcomes surpass existing methods, including CodeBERT alone (accuracy: 81.26%, F1-score: 79.84%) and CodeBERT combined with abstract syntax tree analysis (accuracy: 83.48%, F1-score: 79.65%). The findings underscore the effectiveness of incorporating graph structural information alongside text-based analysis, offering improved scalability and performance in detecting diverse vulnerabilities.
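The centrality step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the control flow graph, its node names, the choice of degree centrality, and the prompt wording are all hypothetical, shown only to clarify how structurally central nodes might be surfaced to a large language model.

```python
# Illustrative sketch: rank control-flow-graph nodes by degree centrality
# and fold the most central basic blocks into an LLM prompt.
# The toy CFG and node names below are hypothetical.

def degree_centrality(cfg):
    """Return each node's degree divided by (n - 1), counting in- and out-edges."""
    nodes = set(cfg) | {v for succs in cfg.values() for v in succs}
    n = len(nodes)
    deg = {u: 0 for u in nodes}
    for u, succs in cfg.items():
        for v in succs:
            deg[u] += 1
            deg[v] += 1
    return {u: deg[u] / (n - 1) for u in nodes}

# Toy CFG of basic blocks for a hypothetical withdraw() function.
cfg = {
    "entry":         ["check_balance"],
    "check_balance": ["external_call", "revert"],
    "external_call": ["update_state"],
    "update_state":  ["exit"],
}

centrality = degree_centrality(cfg)
# Most central blocks first; these would be highlighted in the prompt.
critical = sorted(centrality, key=centrality.get, reverse=True)[:3]

prompt = (
    "Audit this smart contract for vulnerabilities. "
    "Structurally central basic blocks: " + ", ".join(critical)
)
print(prompt)
```

In the paper's pipeline these centrality scores would be normalized and combined with CodeBERT and Sent2Vec features before prompting; the sketch isolates only the graph-analysis idea.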
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.