Vol. 66, No. 2, 2021, pp. 1921-1936, doi:10.32604/cmc.2020.012151
OPEN ACCESS
ARTICLE
Performance Estimation of Machine Learning Algorithms in the Factor Analysis of COVID-19 Dataset
  • Ashutosh Kumar Dubey1,*, Sushil Narang1, Abhishek Kumar1, Satya Murthy Sasubilli2, Vicente García-Díaz3
1 Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
2 Workday Integration Architect Huntington, Columbus, OH, USA
3 Department of Computer Science, University of Oviedo, Oviedo, Spain
* Corresponding Author: Ashutosh Kumar Dubey. Email:
(This article belongs to this Special Issue: Machine Learning and Computational Methods for COVID-19 Disease Detection and Prediction)
Received 16 June 2020; Accepted 25 July 2020; Issue published 26 November 2020
Abstract
Novel Coronavirus Disease (COVID-19) is a communicable disease that emerged in December 2019, when China officially informed the World Health Organization (WHO) of a cluster of cases in the city of Wuhan. The disease subsequently spread to the rest of the world. At the time of writing, no specific vaccine or medicine is available for its prevention or cure. Research in the medicinal and pharmaceutical sciences, aided by data analytics and machine learning, is under way toward the treatment and early detection of this viral disease. This report describes the use of machine learning algorithms [Linear and Logistic Regression, Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and SVM with Grid Search] for prediction and classification in relation to COVID-19. The experiments used the COVID-19 dataset acquired from the Center for Systems Science and Engineering (CSSE), Johns Hopkins University (JHU). The results indicated that the risk period for patients is 12–14 days, beyond which the probability of survival may increase. They also indicated that the probability of death in COVID-19 cases increases with age and is higher in males than in females. SVM with grid search achieved the highest accuracy, approximately 95%, followed by the decision tree algorithm at approximately 94%. The study and analysis pave the way for attribute correlation, estimation of survival days, and prediction of death probability. The findings indicate that machine learning algorithms have strong prediction and classification capabilities in relation to COVID-19.
Keywords
COVID-19; linear and logistic regression; DT; KNN; SVM; SVM with grid search
Cite This Article
A. K. Dubey, S. Narang, A. Kumar, S. M. Sasubilli and V. García-Díaz, "Performance estimation of machine learning algorithms in the factor analysis of COVID-19 dataset," Computers, Materials & Continua, vol. 66, no. 2, pp. 1921–1936, 2021.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.