Imbalanced Classification in Diabetics Using Ensembled Machine Learning

M. Kumar; Mohammad Khan; Sukumar Rajendran; Ayman Noor; A. Dass; J. Prabhu

doi:10.32604/cmc.2022.025865

Open Access

ARTICLE

M. Sandeep Kumar¹, Mohammad Zubair Khan²,*, Sukumar Rajendran¹, Ayman Noor³, A. Stephen Dass¹, J. Prabhu¹

1 School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
2 Department of Computer Science and Information, Taibah University, Medina, Saudi Arabia
3 College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia

* Corresponding Author: Mohammad Zubair Khan. Email: email

Computers, Materials & Continua 2022, 72(3), 4397-4409. https://doi.org/10.32604/cmc.2022.025865

Received 07 December 2021; Accepted 16 February 2022; Issue published 21 April 2022

Abstract

Diabetes is one of the world’s most common diseases and is caused by persistently high blood sugar levels. The risk of diabetes can be lowered if it is detected at an early stage. Recently, several machine learning models have been developed to predict the presence of diabetes at an early stage. In this paper, we propose an ensemble-based machine learning model that combines the split-vote method and instance duplication to handle an imbalanced dataset, PIMA Indian, and improve the prediction of diabetes. The proposed method uses both over-sampling and under-sampling, along with model weighting, to increase classification performance. Accuracy, Precision, Recall, and F1-Score are used to evaluate the model. The results obtained using K-Nearest Neighbor (kNN), Naïve Bayes (NB), Support Vector Machines (SVM), Random Forest (RF), Logistic Regression (LR), and Decision Trees (DT) were 89.32%, 91.44%, 95.78%, 89.3%, 81.76%, and 80.38%, respectively. The SVM model performs best, exceeding existing machine learning-based works by 21.38%.
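The abstract does not spell out the algorithm, so the following is only a minimal sketch of how under-sampling by splitting, over-sampling by instance duplication, and weighted voting could fit together, followed by the four evaluation measures. Everything named here is an assumption made for illustration: the function names, the nearest-centroid base learner (standing in for kNN, SVM, and the other models), the choice of training accuracy as the model weight, and the label convention (0 for the majority class, 1 for the minority class). It is not the authors' implementation.

```python
import random
from collections import Counter


def split_majority(majority, n_splits, rng):
    """Under-sampling step: shuffle the majority class and cut it into disjoint chunks."""
    shuffled = list(majority)
    rng.shuffle(shuffled)
    return [shuffled[i::n_splits] for i in range(n_splits)]


def duplicate_instances(minority, target_size):
    """Over-sampling step: repeat minority rows until there are target_size of them."""
    reps = -(-target_size // len(minority))  # ceiling division
    return (list(minority) * reps)[:target_size]


def fit_centroids(rows, labels):
    """Toy base learner (assumption): store one mean vector per class."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def split_vote_fit(majority, minority, n_splits=3, seed=0):
    """Train one member per majority chunk; assumes n_splits <= len(majority)."""
    rng = random.Random(seed)
    members = []
    for chunk in split_majority(majority, n_splits, rng):
        positives = duplicate_instances(minority, len(chunk))
        rows = chunk + positives
        labels = [0] * len(chunk) + [1] * len(positives)
        model = fit_centroids(rows, labels)
        # Model weight (assumption): the member's accuracy on its own balanced subset.
        hits = sum(predict_centroid(model, x) == y for x, y in zip(rows, labels))
        members.append((model, hits / len(rows)))
    return members


def split_vote_predict(members, x):
    """Weighted vote across all members."""
    votes = Counter()
    for model, weight in members:
        votes[predict_centroid(model, x)] += weight
    return votes.most_common(1)[0][0]


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}
```

To try it, call `split_vote_fit(majority_rows, minority_rows)` with two lists of numeric feature vectors, classify a new row with `split_vote_predict(members, row)`, and pass the true and predicted labels to `evaluate`. Each member sees a balanced subset, so no single model is trained on the skewed class ratio of the full dataset.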

Keywords

Diabetics classification; imbalanced data; split-vote; instance duplication

Cite This Article

APA Style

Sandeep Kumar, M., Khan, M.Z., Rajendran, S., Noor, A., Stephen Dass, A. et al. (2022). Imbalanced Classification in Diabetics Using Ensembled Machine Learning. Computers, Materials & Continua, 72(3), 4397–4409. https://doi.org/10.32604/cmc.2022.025865

Vancouver Style

Sandeep Kumar M, Khan MZ, Rajendran S, Noor A, Stephen Dass A, Prabhu J. Imbalanced Classification in Diabetics Using Ensembled Machine Learning. Comput Mater Contin. 2022;72(3):4397–4409. https://doi.org/10.32604/cmc.2022.025865

IEEE Style

M. Sandeep Kumar, M. Z. Khan, S. Rajendran, A. Noor, A. Stephen Dass, and J. Prabhu, “Imbalanced Classification in Diabetics Using Ensembled Machine Learning,” Comput. Mater. Contin., vol. 72, no. 3, pp. 4397–4409, 2022. https://doi.org/10.32604/cmc.2022.025865

Copyright © 2022 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
