Open Access
ARTICLE
A Study on Outlier Detection and Feature Engineering Strategies in Machine Learning for Heart Disease Prediction
1 Department of Computer Science & Engineering (AIML), MLR Institute of Technology, Hyderabad, 500043, India
2 Department of Computer Science and Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada, 520007, India
3 Department of Computer Science & Engineering, Sir C. R. Reddy College of Engineering, Eluru, 534001, India
4 Amrita School of Computing, Amrita Vishwa Vidyapeetham, Amaravati, 522503, India
5 Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, 60455-970, Brazil
* Corresponding Author: Parvathaneni Naga Srinivasu. Email:
Computer Systems Science and Engineering 2024, 48(5), 1085-1112. https://doi.org/10.32604/csse.2024.053603
Received 06 May 2024; Accepted 25 July 2024; Issue published 13 September 2024
Abstract
This paper investigates the application of machine learning to the prediction of cardiovascular disease, using AdaBoost together with two outlier detection and feature selection pipelines: Z-Score combined with Grey Wolf Optimization (GWO), and Interquartile Range (IQR) combined with Ant Colony Optimization (ACO). On the reported performance measures, Z-Score and GWO with AdaBoost is more accurate than IQR and ACO with AdaBoost (89.0% vs. 86.0%) and more discriminative (Area Under the Curve (AUC) score of 93.0% vs. 91.0%). The Z-Score and GWO pipeline also achieved the higher precision, at 89.0%, and a satisfactory recall of 90.0%. The paper thus identifies specific benefits and drawbacks of different outlier detection and feature selection techniques, which are important to consider when improving cardiovascular diagnostics. Taken together, these findings add to the understanding of heart disease prediction and patient treatment using improved machine learning (ML) techniques. This work lays the groundwork for more precise diagnostic models by highlighting the benefits of combining multiple optimization methodologies. Future studies should investigate these combinations further, with the aim of maximizing patient outcomes and model efficacy.
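As a minimal illustration of the two outlier rules named in the abstract, the sketch below flags outliers in a single numeric feature using a Z-Score rule and an IQR rule. It is not the authors' implementation: the function names, the cutoffs (`threshold=3.0`, `k=1.5`), and the sample values are assumptions chosen for illustration, and the GWO, ACO, and AdaBoost stages are omitted.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-Score exceeds `threshold` (assumed cutoff)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # A constant feature has no spread, so nothing is flagged.
        return [False for _ in values]
    return [abs((v - mean) / std) > threshold for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k is an assumed multiplier)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical feature values, not taken from the paper's dataset.
feature = [120, 125, 130, 128, 122, 126, 124, 127, 123, 400]
z_flags = zscore_outliers(feature, threshold=2.0)
iqr_flags = iqr_outliers(feature)
cleaned = [v for v, flagged in zip(feature, iqr_flags) if not flagged]
```

In a pipeline like the one described, rows flagged by either rule would typically be removed or capped before feature selection and classifier training; which treatment the paper applies is not stated in the abstract.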
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.