Open Access
ARTICLE
A New Hybrid Feature Selection Sequence for Predicting Breast Cancer Survivability Using Clinical Datasets
Centre for Information Technology and Engineering, Manonmaniam Sundaranar University, Tirunelveli, India
* Corresponding Author: E. Jenifer Sweetlin. Email:
Intelligent Automation & Soft Computing 2023, 37(1), 343-367. https://doi.org/10.32604/iasc.2023.036742
Received 11 October 2022; Accepted 06 January 2023; Issue published 29 April 2023
Abstract
This paper proposes a hybrid feature selection sequence that combines filter and wrapper concepts to improve the accuracy of Machine Learning (ML) based supervised classifiers in classifying the survivability of breast cancer patients into two classes, living and deceased, using the METABRIC and Surveillance, Epidemiology and End Results (SEER) datasets. The ML-based classifiers used in the analysis are Multiple Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, Support Vector Machine and Multilayer Perceptron. The workflow of the proposed ML algorithm sequence comprises the following stages: data cleaning, data balancing, feature selection via a filter and wrapper sequence, cross validation-based training, testing and performance evaluation. The results are compared in terms of the following classification metrics: Accuracy, Precision, F1 score, True Positive Rate, True Negative Rate, False Positive Rate, False Negative Rate, Area under the Receiver Operating Characteristic curve, Area under the Precision-Recall curve and Matthews Correlation Coefficient. The comparison shows that the proposed feature selection sequence yields better results for every supervised classifier than the other feature selection sequences considered in the analysis.
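As an illustration of the threshold-based metrics listed in the abstract, the following minimal Python sketch derives them from the four confusion-matrix counts of a binary classifier. It is not the authors' code. The function name `classification_metrics`, the zero-denominator convention, and the example counts are assumptions made for illustration, and the choice of which class (living or deceased) is treated as positive is left open because the abstract does not state it. The two area-under-curve metrics need ranked prediction scores rather than counts, so they are not covered here.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute count-based binary classification metrics.

    tp, fp, tn, fn are the true positive, false positive,
    true negative and false negative counts.
    """
    precision = _ratio(tp, tp + fp)
    tpr = _ratio(tp, tp + fn)  # True Positive Rate (recall, sensitivity)
    tnr = _ratio(tn, tn + fp)  # True Negative Rate (specificity)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "tpr": tpr,
        "tnr": tnr,
        "fpr": _ratio(fp, fp + tn),  # False Positive Rate = 1 - TNR
        "fnr": _ratio(fn, fn + tp),  # False Negative Rate = 1 - TPR
        "f1": _ratio(2 * precision * tpr, precision + tpr),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, not taken from the paper's results.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the Matthews Correlation Coefficient uses all four counts in both its numerator and denominator, which is one reason it is often reported alongside accuracy when the classes are imbalanced.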
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.