Cyberbullying Sexism Harassment Identification by Metaheurustics-Tuned eXtreme Gradient Boosting

Milos Dobrojevic; Luka Jovanovic; Lepa Babic; Miroslav Cajic; Tamara Zivkovic; Miodrag Zivkovic; Suresh Muthusamy; Milos Antonijevic; Nebojsa Bacanin

doi:10.32604/cmc.2024.054459

Open Access

ARTICLE

Milos Dobrojevic¹,⁴, Luka Jovanovic¹, Lepa Babic³, Miroslav Cajic⁵, Tamara Zivkovic⁶, Miodrag Zivkovic², Suresh Muthusamy⁷, Milos Antonijevic², Nebojsa Bacanin²,⁴,⁸,⁹,*

1 Technical Faculty, Singidunum University, Belgrade, 11000, Serbia
2 Informatics and Computing, Singidunum University, Belgrade, 11000, Serbia
3 Business Economics, Singidunum University, Belgrade, 11000, Serbia
4 Computing and Informatics, Sinergija University, Bijeljina, 76300, Bosnia and Herzegovina
5 Department for Information Systems and Technologies, University “Union Nikola Tesla”, Cara Dusana, Belgrade, 11080, Serbia
6 Department for Computer Science and Informatics, School of Electrical Engineering, University of Belgrade, Belgrade, 11000, Serbia
7 Department of Electrical and Electronics Engineering, Kongu Engineering College (Autonomous), Perundurai, Erode, 638060, India
8 Department of Mathematics, Saveetha School of Engineering (Deemed to be University), SIMATS Thandalam, Chennai, 602105, India
9 MEU Research Unit, Middle East University, Amman, 11831, Jordan

* Corresponding Author: Nebojsa Bacanin.

Computers, Materials & Continua 2024, 80(3), 4997-5027. https://doi.org/10.32604/cmc.2024.054459

Received 28 May 2024; Accepted 29 August 2024; Issue published 12 September 2024

Abstract

Cyberbullying is a form of harassment or bullying that takes place online or through digital devices such as smartphones, computers, or tablets. It can occur through various channels, including social media, text messages, online forums, and gaming platforms. Cyberbullying involves using technology to intentionally harm, harass, or intimidate others and may take different forms, including exclusion, doxing, impersonation, harassment, and cyberstalking. Because the number of malicious internet users is growing rapidly, this social phenomenon is becoming more frequent, and there is a pressing need to address it. The main goal of the research presented in this manuscript is therefore to tackle this emerging challenge. For this purpose, a natural language processing (NLP) dataset of sexist harassment on Twitter, containing tweets about the harassment of people on a sexual basis, is used. Two algorithms transform the text into a numerical representation suitable as machine learning (ML) input: term frequency-inverse document frequency (TF-IDF) and bidirectional encoder representations from transformers (BERT). The well-known eXtreme gradient boosting (XGBoost) ML model is employed to classify whether a given tweet falls into the category of sexually based harassment. To improve performance, several XGBoost models were devised with hyperparameters tuned by metaheuristics. For this purpose, the recently introduced Coyote optimization algorithm (COA) was modified and adapted to optimize the XGBoost model. Other cutting-edge metaheuristic approaches were also implemented for this challenge, and a rigorous comparative analysis of the captured classification metrics (accuracy, Cohen's kappa score, precision, recall, and F1-score) was performed. Finally, the best-generated model was interpreted using Shapley additive explanations (SHAP), yielding useful insights into the behavioral patterns of people who perform social harassment.
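As a rough illustration of two ingredients named in the abstract, the sketch below computes TF-IDF weights for a handful of texts and derives the five reported classification metrics from binary labels. It is not the authors' implementation: the function names `tfidf` and `binary_metrics` are hypothetical, the whitespace tokenizer and the unsmoothed IDF term log(N / df) are assumptions, and the BERT encoding, XGBoost training, COA tuning, and SHAP analysis are not reproduced here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df).

    Assumption: lowercase whitespace tokenization and unsmoothed IDF;
    library implementations often use smoothed variants instead.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa corrects observed agreement (accuracy) for the
    # agreement p_e expected by chance from the label marginals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }
```

In a full pipeline the TF-IDF vectors (or BERT embeddings) would serve as classifier input, and a metaheuristic would search the hyperparameter space using one of these metrics as its objective; which metric the authors optimized is not stated in the abstract.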

Keywords

Coyote optimization algorithm; NLP; TF-IDF; BERT; XGBoost; online harassment and cyberbullying; metaheuristics

Cite This Article

APA Style

Dobrojevic, M., Jovanovic, L., Babic, L., Cajic, M., Zivkovic, T. et al. (2024). Cyberbullying Sexism Harassment Identification by Metaheurustics-Tuned eXtreme Gradient Boosting. Computers, Materials & Continua, 80(3), 4997–5027. https://doi.org/10.32604/cmc.2024.054459

Vancouver Style

Dobrojevic M, Jovanovic L, Babic L, Cajic M, Zivkovic T, Zivkovic M, et al. Cyberbullying Sexism Harassment Identification by Metaheurustics-Tuned eXtreme Gradient Boosting. Comput Mater Contin. 2024;80(3):4997–5027. https://doi.org/10.32604/cmc.2024.054459

IEEE Style

M. Dobrojevic et al., “Cyberbullying Sexism Harassment Identification by Metaheurustics-Tuned eXtreme Gradient Boosting,” Comput. Mater. Contin., vol. 80, no. 3, pp. 4997–5027, 2024. https://doi.org/10.32604/cmc.2024.054459

Copyright © 2024 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
