Open Access
ARTICLE
Graph-Based Feature Learning for Cross-Project Software Defect Prediction
1 School of Software, Northwestern Polytechnical University, Xi’an, China
2 School of Computer Science, Northwestern Polytechnical University, Xi’an, China
3 Department of Computer Science, Hodeidah University, PO Box 3114, Al-Hudaydah, Yemen
4 Research Institute of Engineering and Technology, Hanyang University, Ansan, Korea
5 Department of Robotics, Hanyang University, Ansan, Korea
* Corresponding Authors: Redhwan Algabri; Sungon Lee
Computers, Materials & Continua 2023, 77(1), 161-180. https://doi.org/10.32604/cmc.2023.043680
Received 09 July 2023; Accepted 06 September 2023; Issue published 31 October 2023
Abstract
Cross-project software defect prediction (CPDP) aims to improve defect prediction in target projects that have limited or no historical data by leveraging information from related source projects. Existing CPDP approaches rely on static metrics or dynamic syntactic features, which have shown limited effectiveness because they cannot capture higher-level system properties that matter for CPDP, such as complex design patterns, relationships between multiple functions, and dependencies across software projects. This paper introduces a graph-based feature learning model for CPDP (GB-CPDP) that uses NetworkX to extract features and learn representations of program entities from control flow graphs (CFGs) and data dependency graphs (DDGs). These graphs capture the structural and data dependencies within the source code. The approach employs Node2Vec to transform CFGs and DDGs into numerical vectors and uses Long Short-Term Memory (LSTM) networks to learn predictive models. The process has three stages: graph construction, feature learning through graph embedding and LSTM, and defect prediction. An experimental evaluation on nine open-source Java projects from the PROMISE dataset shows that GB-CPDP outperforms state-of-the-art CPDP methods in terms of F1-measure and Area Under the Curve (AUC).
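The graph-embedding stage described above can be illustrated with a minimal sketch of the Node2Vec-style biased random walks that turn a graph into node sequences. This is an illustration under stated assumptions, not the authors' implementation: the toy CFG, the function names `biased_walk` and `generate_walks`, and the values of `p`, `q`, and the walk length are all hypothetical, and a plain adjacency dictionary stands in for a NetworkX graph so that the sketch has no dependencies. In a full pipeline the resulting walks would be passed to a skip-gram model to obtain node vectors, which would then feed the LSTM; those steps are not shown.

```python
import random

# Hypothetical toy control flow graph: nodes are basic blocks of one method.
# A plain adjacency dict stands in for a NetworkX DiGraph (assumption, to
# keep the sketch free of third-party dependencies).
CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}


def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Return one Node2Vec-style second-order walk of at most `length` nodes.

    p controls the tendency to return to the previous node, and q controls
    the tendency to move further away from it. The defaults are placeholders.
    """
    if rng is None:
        rng = random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = graph[cur]
        if not neighbors:  # dead end, e.g. the exit block
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in graph[prev] or prev in graph[nxt]:
                weights.append(1.0)      # stays adjacent to the previous node
            else:
                weights.append(1.0 / q)  # moves away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node=2, length=4, seed=0):
    """Start `walks_per_node` walks from every node of the graph."""
    rng = random.Random(seed)
    return [
        biased_walk(graph, node, length, rng=rng)
        for _ in range(walks_per_node)
        for node in graph
    ]


if __name__ == "__main__":
    for walk in generate_walks(CFG):
        print(" -> ".join(walk))
```

Each walk plays the role of a "sentence" of basic blocks. The same procedure would apply to a DDG by swapping in its adjacency structure.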
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.