Machine Learning Approach for COVID-19 Detection on Twitter

Samina Amin1,*, M. Irfan Uddin1, Heyam H. Al-Baity2, M. Ali Zeb1, M. Abrar Khan1

1 Institute of Computing, Kohat University of Science and Technology, Kohat, 26000, Pakistan
2 Department of Information Technology, College of Computer and Information Sciences, King Saud University, Riyadh, 11543, Saudi Arabia

* Corresponding Author: Samina Amin.

(This article belongs to the Special Issue: Deep Learning and Parallel Computing for Intelligent and Efficient IoT)

Computers, Materials & Continua 2021, 68(2), 2231-2247. https://doi.org/10.32604/cmc.2021.016896

Abstract

Social networking services (SNSs) provide massive amounts of data that can be an influential source of information during pandemic outbreaks. This study shows that social media analysis can be used as a crisis detector (e.g., by understanding the sentiment of social media users regarding pandemic outbreaks). The novel Coronavirus Disease-19 (COVID-19), commonly known as coronavirus, affected people worldwide in 2020. Streaming Twitter data have revealed the status of the COVID-19 outbreak in the most affected regions. This study focuses on identifying COVID-19 cases from Twitter messages (tweets) without requiring medical records. For this purpose, we propose an intelligent model based on traditional machine learning approaches, namely support vector machine (SVM), logistic regression (LR), naïve Bayes (NB), random forest (RF), and decision tree (DT), combined with term frequency-inverse document frequency (TF-IDF) features. The proposed model classifies tweets into four categories: confirmed, deaths, recovered, and suspected. For the experimental analysis, tweet data on the COVID-19 pandemic are used to evaluate the traditional machine learning approaches. A benchmark dataset of COVID-19 Twitter messages is developed and can be used in future research. The experiments show that the proposed approach is promising for detecting COVID-19 in Twitter messages: with the TF-IDF feature extraction technique, the machine learning approaches (i.e., SVM, NB, LR, RF, and DT) achieve overall accuracy, precision, recall, and F1 scores between 70% and 80%, and are further assessed with the confusion matrix.
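
The TF-IDF step described in the abstract can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration: the toy tweets are invented and are not drawn from the benchmark dataset, the smoothed IDF formula is one common variant, and a simple nearest-centroid rule stands in for the SVM, LR, NB, RF, and DT classifiers evaluated in the article. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

# Invented toy tweets (NOT from the article's dataset), two per category.
TRAIN = [
    ("i tested positive for covid today", "confirmed"),
    ("my test result came back positive", "confirmed"),
    ("my uncle died of covid last night", "deaths"),
    ("another death from the virus in our town", "deaths"),
    ("finally recovered from covid and feeling well", "recovered"),
    ("she recovered fully after two weeks", "recovered"),
    ("i have fever and cough maybe covid", "suspected"),
    ("feeling symptoms and waiting for a test", "suspected"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def l2_normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def tfidf(tokens, idf):
    """TF-IDF vector of one document; terms unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return l2_normalize({t: c * idf[t] for t, c in counts.items()})


def fit_centroids(train, idf):
    """Stand-in classifier: one normalized mean TF-IDF vector per category."""
    sums = defaultdict(lambda: defaultdict(float))
    for text, label in train:
        for term, weight in tfidf(tokenize(text), idf).items():
            sums[label][term] += weight
    return {label: l2_normalize(vec) for label, vec in sums.items()}


def predict(text, idf, centroids):
    """Return the category whose centroid has the highest cosine similarity."""
    vec = tfidf(tokenize(text), idf)

    def score(label):
        return sum(w * centroids[label].get(t, 0.0) for t, w in vec.items())

    return max(centroids, key=score)


IDF = fit_idf([tokenize(text) for text, _ in TRAIN])
CENTROIDS = fit_centroids(TRAIN, IDF)

if __name__ == "__main__":
    print(predict("he recovered from covid", IDF, CENTROIDS))
```

In practice, the TF-IDF vectors would be passed to each of the five classifiers, and accuracy, precision, recall, F1 score, and the confusion matrix would be computed on held-out tweets.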

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.