Open Access
ARTICLE
Machine Learning-based USD/PKR Exchange Rate Forecasting Using Sentiment Analysis of Twitter Data
1 Department of Computer Science & IT, Glim Institute of Modern Studies, Bahawalpur, 63100, Pakistan
2 Institute of Numerical Sciences, Kohat University of Science & Technology, Kohat, 26000, Pakistan
3 Department of Computer Science, Concordia College Bahawalpur, Bahawalpur, 63100, Pakistan
4 Institute of Computing, Kohat University of Science and Technology, Kohat, 26000, Pakistan
5 Faculty of Applied Studies, King Abdulaziz University, Jeddah, 21577, Saudi Arabia
6 Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur, 63100, Pakistan
7 Department of Mathematics, Université de Caen, LMNO, Campus II, Science 3, Caen, 14032, France
* Corresponding Author: Wali Khan Mashwani. Email:
(This article belongs to the Special Issue: Deep Learning and Parallel Computing for Intelligent and Efficient IoT)
Computers, Materials & Continua 2021, 67(3), 3451-3461. https://doi.org/10.32604/cmc.2021.015872
Received 11 December 2020; Accepted 14 January 2021; Issue published 01 March 2021
Abstract
This study proposes a machine learning approach to forecasting currency exchange rates by applying sentiment analysis to messages on Twitter (called tweets). A dataset of the exchange rates between the United States Dollar (USD) and the Pakistani Rupee (PKR) was formed by collecting exchange rate information from a forex website together with tweets from the business community in Pakistan that contained finance-related words. The dataset was collected in raw form and preprocessed using natural language processing techniques. Response variable labeling was then applied to the standardized dataset, with the response variable divided into two classes: “1” indicated an increase in the exchange rate and “−1” indicated a decrease. To better represent the dataset, we used linear discriminant analysis and principal component analysis to visualize the data in three-dimensional vector space. Clusters obtained using a sampling approach were then used for data optimization. Five machine learning classifiers (the simple logistic classifier, random forest, bagging, naïve Bayes, and the support vector machine) were applied to the optimized dataset. The results show that the simple logistic classifier yielded the highest accuracy, 82.14%, in forecasting the USD/PKR exchange rate.
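The response variable labeling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_direction`, the sample rates, and the choice to skip unchanged rates are all assumptions made here for illustration, since the abstract does not say how ties are handled.

```python
def label_direction(rates):
    """Label each step in a rate series: 1 for an increase, -1 for a decrease.

    Each rate is compared with the previous observation. Unchanged rates are
    skipped, which is an assumption: the abstract defines only two classes
    and does not describe how ties are treated.
    """
    labels = []
    for prev, curr in zip(rates, rates[1:]):
        if curr > prev:
            labels.append(1)
        elif curr < prev:
            labels.append(-1)
    return labels


# Hypothetical USD/PKR closing rates, for illustration only (not from the paper).
sample_rates = [160.10, 160.45, 160.45, 159.90, 160.30]
labels = label_direction(sample_rates)
```

Labels of this kind would serve as the target column that the classifiers named in the abstract are trained to predict from the tweet-derived features.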
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.