Open Access
ARTICLE
Emotional Analysis of Arabic Saudi Dialect Tweets Using a Supervised Learning Approach
Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh, 11543, Saudi Arabia
* Corresponding Author: Heyam H. Al-Baity. Email:
Intelligent Automation & Soft Computing 2021, 29(1), 89-109. https://doi.org/10.32604/iasc.2021.016555
Received 05 January 2021; Accepted 18 March 2021; Issue published 12 May 2021
Abstract
Social media sites produce large amounts of data, and companies that can make use of this data gain a competitive advantage, because it provides a deeper understanding of clients and their needs. This understanding helps a company make sound decisions based on data obtained from social media websites. Sentiment analysis has therefore become a key tool for understanding that data. Sentiment analysis is a research area that focuses on analyzing people’s emotions and opinions to identify the polarity (e.g., positive or negative) of a given text. Emotional analysis (EA) emerged from the need to analyze emotions and opinions in more depth. It categorizes words into emotional categories, such as anger, disgust, fear, joy, sadness and surprise, rather than simply positive or negative. Studies in the EA field for the Arabic language are limited, and our research is a contribution to this area. We built a system that classifies Arabic tweets (mainly Saudi-based tweets) into the appropriate emotional categories using a supervised machine learning (ML) approach. The six basic emotion categories are anger, disgust, fear, joy, sadness and surprise. Multinomial naïve Bayes (MNB), support vector machine (SVM) and logistic regression classifiers were used as the classification methods. These classifiers were compared comprehensively in terms of accuracy, precision, recall and F-measure. Saudi tweets were collected and used as the dataset, and a corpus of Saudi dialect tweets was created from this dataset as part of this study. The experimental results indicate that SVM and logistic regression achieved the best results, with an overall accuracy of 73.39%.
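To make the evaluation setup in the abstract concrete, the following is a minimal sketch of one of the three classifier types (multinomial naïve Bayes) together with the four reported metrics. It is not the authors' implementation. The function names (`train_mnb`, `predict_mnb`, `evaluate`), the whitespace tokenization, the add-one smoothing, the macro averaging of precision, recall and F-measure, and the English toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model (hypothetical helper).

    Assumes whitespace tokenization and additive (Laplace) smoothing.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    log_prior, log_like = {}, {}
    for c, n_c in class_counts.items():
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_prior[c] = math.log(n_c / len(docs))
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return {"log_prior": log_prior, "log_like": log_like}


def predict_mnb(model, text):
    """Return the class with the highest log posterior; unseen words are skipped."""
    scores = {}
    for c, prior in model["log_prior"].items():
        like = model["log_like"][c]
        scores[c] = prior + sum(like[w] for w in text.split() if w in like)
    return max(scores, key=scores.get)


def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F-measure (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_scores = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f_scores.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_scores) / n,
    }


# Invented toy data covering three of the six emotion categories.
train_docs = [
    "angry furious rage", "furious angry mad",
    "happy joy glad", "glad happy smile",
    "sad cry tears", "tears sad lonely",
]
train_labels = ["anger", "anger", "joy", "joy", "sadness", "sadness"]

model = train_mnb(train_docs, train_labels)
test_docs = ["happy glad", "furious mad", "cry lonely"]
test_labels = ["joy", "anger", "sadness"]
predictions = [predict_mnb(model, d) for d in test_docs]
print(evaluate(test_labels, predictions))
```

The same `evaluate` function could be applied to the predictions of an SVM or logistic regression classifier, which is how a like-for-like comparison across the three methods would typically be set up.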
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.