REVIEW
Software Reliability Prediction Using Ensemble Learning on Selected Features in Imbalanced and Balanced Datasets: A Review
1 Department of Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, 752054, India
2 Department of Computer Science and Engineering, Birla Global University, Bhubaneswar, 751029, India
3 School of Computer Engineering, KIIT (Deemed to be) University, Bhubaneswar, 751024, India
4 School of Electrical Engineering, KIIT (Deemed to be) University, Bhubaneswar, 751024, India
5 Department of Electrical and Computer Engineering, Hawassa University, Hawassa, P. O. Box 05, Ethiopia
6 Center for Renewable Energy and Microgrids, Huanjiang Laboratory, Zhejiang University, Shaoxing, 311816, China
7 Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Johannesburg, 2028, South Africa
* Corresponding Author: Baseem Khan. Email:
Computer Systems Science and Engineering 2024, 48(6), 1513-1536. https://doi.org/10.32604/csse.2024.057067
Received 07 August 2024; Accepted 31 August 2024; Issue published 22 November 2024
Abstract
Redundancy, correlation, feature irrelevance, and missing samples are just a few of the problems that make software defect data difficult to analyze. It is also challenging to maintain an even distribution of data between the defective and non-defective software classes; in most experimental settings, data for the non-defective class dominate the dataset. The objective of this review is to demonstrate that combining ensemble learning with feature selection improves defect classification performance. Alongside an effective feature selection approach, a novel variant of ensemble learning is analyzed to address feature redundancy and data imbalance, making the classification process more robust. To overcome these problems and lessen their impact on fault classification performance, the authors carefully integrate effective feature selection with ensemble learning models. Forward selection demonstrates that a large area under the receiver operating characteristic (ROC) curve can be attributed to only a small subset of features. The greedy forward selection (GFS) technique outperformed Pearson's correlation method when feature selection techniques were evaluated on the datasets. Ensemble learners, such as random forests (RF) and the proposed average probability ensemble (APE), are more resistant to the influence of weak features than weighted support vector machines (W-SVMs) and extreme learning machines (ELM). Furthermore, on the NASA and Java datasets, the enhanced average probability ensemble model, which incorporates the greedy forward selection technique into the average probability ensemble, achieved an area under the ROC curve approaching 1.0, indicating exceptional performance. This review emphasizes the importance of meticulously selecting attributes in a software dataset to accurately classify defective components. In addition, the suggested ensemble learning model successfully addressed the aforementioned problems with software data and produced outstanding classification performance.
Keywords
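To make the combination summarized above concrete, the sketch below pairs a greedy forward search over features with a simple average-probability ensemble scored by ROC AUC. It is a minimal illustration, not the authors' implementation: the choice of base learners (random forest, gradient boosting, logistic regression), the synthetic imbalanced data, and the single validation split are all assumptions standing in for the NASA/Java datasets and the paper's APE configuration.

```python
# Minimal sketch of greedy forward selection (GFS) driving an average-probability
# ensemble (APE). Base learners, data, and split are illustrative assumptions,
# not the configuration reported in the reviewed study.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def ape_auc(X_tr, y_tr, X_val, y_val, cols):
    """Average-probability ensemble: mean of the base learners' P(defective), scored by AUC."""
    models = [RandomForestClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(random_state=0),
              LogisticRegression(max_iter=1000)]
    probs = [m.fit(X_tr[:, cols], y_tr).predict_proba(X_val[:, cols])[:, 1] for m in models]
    return roc_auc_score(y_val, np.mean(probs, axis=0))

# Synthetic, imbalanced stand-in for a defect dataset (defective class is the minority).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

# Greedy forward selection: repeatedly add the single feature that most improves ensemble AUC.
selected, remaining, best_auc = [], list(range(X.shape[1])), 0.0
improved = True
while improved and remaining:
    improved = False
    scores = {f: ape_auc(X_tr, y_tr, X_val, y_val, selected + [f]) for f in remaining}
    f_best, auc_best = max(scores.items(), key=lambda kv: kv[1])
    if auc_best > best_auc:
        selected.append(f_best)
        remaining.remove(f_best)
        best_auc, improved = auc_best, True

print(f"Selected features: {selected}, ensemble AUC: {best_auc:.3f}")
```

The search stops as soon as no remaining feature raises the validation AUC, which is one common stopping rule for forward selection; the reviewed work may use a different criterion.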
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.