Open Access

ARTICLE

Feature Engineering Methods for Analyzing Blood Samples for Early Diagnosis of Hepatitis Using Machine Learning Approaches

Mohamed A.G. Hazber1,*, Ebrahim Mohammed Senan2,3, Hezam Saud Alrashidi1

1 Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha’il, Hail, 81481, Saudi Arabia
2 Department of Computer Science, College of Applied Sciences, Hajjah University, Hajjah, 9677, Yemen
3 Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Al-Razi University, Sana’a, 9671, Yemen

* Corresponding Author: Mohamed A.G. Hazber.

(This article belongs to the Special Issue: Exploring the Impact of Artificial Intelligence on Healthcare: Insights into Data Management, Integration, and Ethical Considerations)

Computer Modeling in Engineering & Sciences 2025, 142(3), 3229-3254. https://doi.org/10.32604/cmes.2025.062302

Abstract

Hepatitis is a liver infection transmitted through contaminated food or blood transfusions, and it has many types, ranging in severity from mild to serious. Hepatitis is diagnosed through many blood tests and factors; Artificial Intelligence (AI) techniques have played an important role in early diagnosis and in helping physicians make decisions. This study evaluated the performance of Machine Learning (ML) algorithms on the hepatitis dataset. Missing values in the dataset were processed and outliers were removed. The dataset was then balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The features of the dataset were processed in two ways. First, the Recursive Feature Elimination (RFE) algorithm was applied to rank the features by their percentage contribution to the diagnosis of hepatitis, followed by selection of the important features using the t-distributed Stochastic Neighbor Embedding (t-SNE) and Principal Component Analysis (PCA) algorithms. Second, the SelectKBest function was applied to assign a score to each attribute, followed by the t-SNE and PCA algorithms. Finally, the classification algorithms K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Artificial Neural Network (ANN), Decision Tree (DT), and Random Forest (RF) were fed the dataset after its features had been processed by the different methods (RFE with t-SNE and PCA, and SelectKBest with t-SNE and PCA). All algorithms yielded promising results for diagnosing the hepatitis dataset. The RF with the RFE and PCA methods achieved an accuracy, Precision, Recall, and AUC of 97.18%, 96.72%, 97.29%, and 94.2%, respectively, during the training phase. During the testing phase, it reached an accuracy, Precision, Recall, and AUC of 96.31%, 95.23%, 97.11%, and 92.67%, respectively.
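The two feature-processing routes described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the data are synthetic (generated with `make_classification`), and the feature counts, number of principal components, and forest sizes are arbitrary assumptions chosen only for illustration. SMOTE is omitted because it requires the separate imbalanced-learn package, and t-SNE is omitted because scikit-learn's `TSNE` has no `transform` method for unseen samples, so it cannot be placed inside a `Pipeline`.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a tabular blood-test dataset (NOT the study's data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def build_pipeline(selector):
    """Scale -> select features -> reduce with PCA -> classify with RF."""
    return Pipeline([
        ("scale", StandardScaler()),
        ("select", selector),
        ("pca", PCA(n_components=5)),  # assumed number of components
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ])


# Route 1: RFE ranks features by repeatedly dropping the least important one.
rfe = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,  # assumed value
)
# Route 2: SelectKBest scores each feature independently (ANOVA F-test here).
kbest = SelectKBest(score_func=f_classif, k=10)  # assumed value

results = {}
for name, selector in {"RFE+PCA": rfe, "SelectKBest+PCA": kbest}.items():
    model = build_pipeline(selector).fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)  # accuracy on held-out split
    print(name, round(results[name], 3))
```

Wrapping the selector and PCA in a single pipeline means both are fitted on the training split only, which keeps information from the test split out of the feature-selection step.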

Keywords

Hepatitis; machine learning; PCA; RFE; SelectKBest; t-SNE

Cite This Article

APA Style
Hazber, M.A., Senan, E.M., Alrashidi, H.S. (2025). Feature Engineering Methods for Analyzing Blood Samples for Early Diagnosis of Hepatitis Using Machine Learning Approaches. Computer Modeling in Engineering & Sciences, 142(3), 3229–3254. https://doi.org/10.32604/cmes.2025.062302
Vancouver Style
Hazber MA, Senan EM, Alrashidi HS. Feature Engineering Methods for Analyzing Blood Samples for Early Diagnosis of Hepatitis Using Machine Learning Approaches. Comput Model Eng Sci. 2025;142(3):3229–3254. https://doi.org/10.32604/cmes.2025.062302
IEEE Style
M. A. Hazber, E. M. Senan, and H. S. Alrashidi, “Feature Engineering Methods for Analyzing Blood Samples for Early Diagnosis of Hepatitis Using Machine Learning Approaches,” Comput. Model. Eng. Sci., vol. 142, no. 3, pp. 3229–3254, 2025. https://doi.org/10.32604/cmes.2025.062302



cc Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.