Leveraging Transformers for Detection of Arabic Cyberbullying on Social Media: Hybrid Arabic Transformers
Amjad A. Alsuwaylimi1,*, Zaid S. Alenezi2
1 Department of Computer Science, College of Science, Northern Border University, Arar, 91431, Saudi Arabia
2 Information Technology Management, Northern Border University, Arar, 91431, Saudi Arabia
* Corresponding Author: Amjad A. Alsuwaylimi. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.061674
Received 30 November 2024; Accepted 20 February 2025; Published online 25 March 2025
Abstract
Cyberbullying is a serious problem in the Arabic-speaking world, affecting children, organizations, and businesses. Various efforts have been made to combat it, including models based on machine learning (ML) and deep learning (DL) that apply natural language processing (NLP) methods, as well as purpose-built datasets. However, most of these efforts have focused predominantly on English, leaving a substantial gap in Arabic cyberbullying detection. Given the complexities of the Arabic language, transfer learning and transformers offer a promising way to improve the detection and classification of abusive content by leveraging large models pretrained on extensive corpora. This study therefore proposes a hybrid model built from transformers trained on extensive Arabic datasets, which is then fine-tuned on a newly curated Arabic cyberbullying dataset collected from social media platforms, specifically Twitter. Two hybrid transformer models are introduced: the first combines the CAmelid Morphologically-aware pre-trained Bidirectional Encoder Representations from Transformers (CAMeLBERT) with the Arabic Generative Pre-trained Transformer 2 (AraGPT2), and the second combines Arabic BERT (AraBERT) with the Cross-lingual Language Model-RoBERTa (XLM-R). Two strategies, namely feature fusion and ensemble voting, are employed to improve model accuracy. Experimental results, measured by precision, recall, F1-score, accuracy, and the Area Under the Receiver Operating Characteristic curve (AUC-ROC), demonstrate that the combined CAMeLBERT and AraGPT2 model using feature fusion outperforms traditional DL models, such as Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM), as well as standalone Arabic transformer models.
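The two combination strategies named in the abstract can be illustrated with a minimal, hedged sketch: feature fusion concatenates the pooled sentence embeddings of two encoders before a shared classification head, while ensemble voting combines the discrete predictions of independent models. The vectors, weights, and helper names below are illustrative stand-ins, not the paper's actual implementation or dimensions (real CAMeLBERT/AraGPT2 embeddings would be hundreds of dimensions wide).

```python
# Hedged sketch of the two hybrid strategies described in the abstract.
# emb_camelbert / emb_aragpt2 are toy stand-ins for the pooled sentence
# embeddings that the two encoders would actually produce.

def fuse(emb_a, emb_b):
    """Feature fusion: concatenate two pooled embeddings into one vector."""
    return emb_a + emb_b  # list concatenation -> dim = len(a) + len(b)

def linear_score(features, weights, bias=0.0):
    """A single linear classification head over the fused features."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def majority_vote(labels):
    """Ensemble voting: return the label predicted by most models."""
    return max(set(labels), key=labels.count)

emb_camelbert = [0.2, -0.1, 0.4]   # stand-in CAMeLBERT embedding (3-d here)
emb_aragpt2 = [0.3, 0.0]           # stand-in AraGPT2 embedding (2-d here)

fused = fuse(emb_camelbert, emb_aragpt2)            # 5-d fused vector
score = linear_score(fused, [0.1, 0.1, 0.1, 0.1, 0.1])  # toy weights
label_fusion = "bullying" if score > 0 else "not bullying"

# Voting over three illustrative per-model predictions:
label_vote = majority_vote(["bullying", "not bullying", "bullying"])
```

The fused-feature path lets the classification head learn interactions between the two encoders' representations, whereas voting keeps the models fully independent and only merges their final decisions.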
Keywords
Cyberbullying; transformers; pre-trained models; Arabic cyberbullying detection; deep learning