Open Access
ARTICLE
Binary Program Vulnerability Mining Based on Neural Network
1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
2 School of Electrical and Computer Engineering, University of Waterloo, Waterloo, N2L 3G1, Canada
* Corresponding Author: Zhiguo Wang. Email:
Computers, Materials & Continua 2024, 78(2), 1861-1879. https://doi.org/10.32604/cmc.2023.046595
Received 08 October 2023; Accepted 15 December 2023; Issue published 27 February 2024
Abstract
Software security analysts typically have access only to the executable program, not its source code, which poses significant challenges for security analysis. Identifying vulnerabilities in such source-free programs is crucial, yet few general-purpose tools exist because current vulnerability mining methods have low versatility, and the tools that do exist have shortcomings. In targeted fuzzing, the path search toward target points is not streamlined enough, and completely random testing leads to an excessively large search space. In code similarity analysis, code feature extraction is incomplete, which may result in information loss. In this paper, we propose a cross-platform and cross-architecture approach to exploit vulnerabilities using neural network obfuscation techniques. Building on the Angr framework, we introduce a deobfuscation technique and adopt a VEX-IR-based intermediate language conversion method. This combination allows binary programs to be handled uniformly across architectures, compilers, and compilation options. Binary programs are then processed to extract multi-level spatial features using a skip-gram model with a self-attention mechanism combined with a bidirectional Long Short-Term Memory (LSTM) network. Finally, a graph embedding network is used to evaluate the similarity of program functions. Based on these similarity scores, a target function is determined, and symbolic execution is applied to solve the target function. The solved content serves as the initial seed for targeted fuzzing.
The binary program is processed using the deobfuscation technique and the intermediate language transformation method. The similarity of program functions is then evaluated using a graph embedding network, and symbolic execution is performed based on these similarity scores. This approach facilitates cross-architecture analysis of executable programs without their source code and, at the same time, reduces the risk of path explosion in symbolic execution.
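The target-selection step described in the abstract (scoring functions by similarity and then choosing a target) can be illustrated with a minimal sketch. It assumes each function has already been reduced to a fixed-length embedding vector and that similarity is measured by cosine similarity; the abstract does not state the metric, so that choice, the threshold value, and all names below (`select_targets`, `sub_401000`, and so on) are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_targets(reference, candidates, threshold=0.8):
    """Rank candidate functions by similarity to a known-vulnerable reference.

    reference  -- embedding of a known vulnerable function (assumed given)
    candidates -- dict mapping function name to its embedding
    threshold  -- hypothetical cut-off; only scores at or above it are kept
    Returns a list of (name, score) pairs, highest score first.
    """
    scored = [
        (name, cosine_similarity(reference, emb))
        for name, emb in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, score) for name, score in scored if score >= threshold]


# Made-up embeddings for illustration only.
reference = [0.9, 0.1, 0.4]
candidates = {
    "sub_401000": [0.88, 0.12, 0.41],
    "sub_401200": [-0.3, 0.9, 0.1],
    "sub_401400": [0.5, 0.5, 0.5],
}

targets = select_targets(reference, candidates)
for name, score in targets:
    print(f"{name}: {score:.3f}")
```

In a pipeline like the one outlined above, the highest-ranked functions would be the ones handed to symbolic execution, so that path exploration is limited to a few likely targets instead of the whole binary.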
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.