Open Access
ARTICLE
Chimp Optimization Algorithm Based Feature Selection with Machine Learning for Medical Data Classification
1 Department of Mathematics, College of Education, Al-Zahraa University for Women, Karbala, Iraq
2 Biomedical Engineering Department, College of Engineering, University of Warith Al-Anbiyaa, Karbala, Iraq
3 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
4 Computer Science Department, Security Engineering Lab, Prince Sultan University, Riyadh, 11586, Saudi Arabia
5 Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf, 32952, Egypt
6 College of Information Technology, Imam Jaafar Al-Sadiq University, Al-Muthanna, 66002, Iraq
7 Department of Medical Instrumentation Techniques Engineering, Al-Mustaqbal University College, Hillah, 51001, Iraq
8 Computer Engineering Department, Mazaya University College, Dhi Qar, Iraq
9 College of Technical Engineering, the Islamic University, Najaf, Iraq
* Corresponding Author: Naglaa F. Soliman. Email:
Computer Systems Science and Engineering 2023, 47(3), 2791-2814. https://doi.org/10.32604/csse.2023.038762
Received 28 December 2022; Accepted 11 July 2023; Issue published 09 November 2023
Abstract
Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and rapidly extracting useful information. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, COA mimics their search strategy for food. Subsequently, the classification process utilizes stochastic gradient descent with a multilayer perceptron (SGD-MLP) model, while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set. We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers. Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision rates in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby enhancing the diagnosis and treatment of various diseases.
To provide further validation, we conduct detailed experiments on a benchmark medical dataset, revealing the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. Therefore, this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
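As an illustration of the association rule mining step mentioned in the abstract, the following is a minimal Python sketch of the textbook Apriori procedure (level-wise candidate generation with support-based pruning, followed by rule extraction by confidence). It is not the authors' implementation. The patient records, item names, and the thresholds 0.4 and 0.9 are hypothetical values chosen only for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level-1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), r):
                antecedent = frozenset(subset)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return found


# Hypothetical patient records, each a set of observed attributes.
records = [
    {"high_glucose", "obese", "diabetic"},
    {"high_glucose", "diabetic"},
    {"obese", "hypertension"},
    {"high_glucose", "obese", "diabetic"},
    {"hypertension"},
]

frequent = apriori(records, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.9)
```

With these toy records, the pair `{high_glucose, diabetic}` appears in three of five records, and every record containing `high_glucose` also contains `diabetic`, so the rule `high_glucose -> diabetic` passes the confidence threshold. In a pipeline like the one the abstract describes, the transactions would presumably be built from the attributes retained by feature selection, though the source does not spell out that encoding.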
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.