Spotted Hyena Optimizer with Deep Learning Driven Cybersecurity for Social Networks
1 Department of Electrical and Computer Engineering, International Islamic University Malaysia, Kuala Lumpur, 53100, Malaysia
2 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, Al-Kharj, 16278, Saudi Arabia
3 Department of Electrical Engineering, College of Engineering, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh, 11671, Saudi Arabia
4 Department of Computer Science, College of Computers and Information Technology, Tabuk University, Tabuk, 47512, Saudi Arabia
5 Department of Computer Sciences, College of Computing and Information System, Umm Al-Qura University, Mecca, 24382, Saudi Arabia
6 Research Centre, Future University in Egypt, New Cairo, 11845, Egypt
7 Department of Information Systems, College of Computer and Information Sciences, Prince Sultan University, Riyadh, 12435, Saudi Arabia
* Corresponding Author: Anwer Mustafa Hilal.
Computer Systems Science and Engineering 2023, 45(2), 2033-2047. https://doi.org/10.32604/csse.2023.031181
Received 12 April 2022; Accepted 22 June 2022; Issue published 03 November 2022
Abstract
Recent developments in the Internet and social networking have led to a growth in aggressive language and hate speech. Online provocation, abuse, and attacks are widely termed cyberbullying (CB). The massive quantity of user-generated content makes CB difficult to recognize. Recent advances in machine learning (ML), deep learning (DL), and natural language processing (NLP) tools make it possible to detect and classify CB in social networks. In this view, this study introduces a spotted hyena optimizer with deep learning driven cybersecurity (SHODLCS) model for online social networks (OSN). The presented SHODLCS model aims to achieve cybersecurity by identifying CB in the OSN. To this end, the SHODLCS model involves data pre-processing and TF-IDF based feature extraction. In addition, the cascaded recurrent neural network (CRNN) model is applied for the identification and classification of CB. Finally, the SHO algorithm is used to optimally tune the hyperparameters of the CRNN model, which results in enhanced classifier performance. The experimental validation of the SHODLCS model on the benchmark dataset showed better outcomes for the SHODLCS model than for recent approaches.
Due to the expansion of the Internet, security is considered a significant factor. Though Web 2.0 offers online communities interactive, simple, anywhere, anytime access, it also offers a platform for cybercrimes such as cyberbullying (CB) [1]. Aggravated CB encounters among adolescents have been reported globally, drawing attention to its negative influences. In the United States, the footprint of CB is rising sharply, and it has been formally recognized as a social risk [2]. CB has the same adverse effect on victims as conventional bullying, if not a greater one, since the perpetrators generally attack a person over factors the individual cannot change (e.g., physical appearance, religion, skin color, and ethnic background), leaving deep and long-lasting effects on the victim [3,4]. In a few cases, the resulting humiliation is severe enough to drive the victim to self-harm or suicidal behavior. Suicidal intention tends to rise in youth because of exposure to various types of CB [5]. Even where preventive steps are taken, the rehabilitation of victims of CB is considered challenging for societies and families. Hypersensitivity, self-hate, and isolation during socialization result in depressed adults. In addition, the psychological disparity may create future bullies [6]. Among the various difficulties that make the identification of CB in online social networks (OSN) very complicated, existing solutions to CB identification do not account for the range of bullying forms in their identification methods. Given the various kinds of CB that can arise online, it cannot be assumed that a single identification method will be effective in identifying every kind of bullying.
The computational identification of CB can be performed with several classes of methods from the domain of machine learning (ML). Natural language processing (NLP) is another tool for examining social textual interaction [7]. The criteria of social interaction analysis and sociolinguistics emphasize uniqueness, the presence of the effect, specificity, the personality of individuals, and their ascription to a society and its language use, whereas statistical supervised and unsupervised techniques emphasize abstraction, generalization, and the exploitation of patterns in the data [8]. The domains of social interaction interpretation and sociolinguistics therefore show a significant gap and dissonance with the domains of ML and NLP. Many conventional ML methods need explicit feature extraction from the input data [9]. NLP has extensive applications in this field, as authors have used various feature extraction methods for textual content. Fundamental attempts involve supervised classification using a bag-of-words representation at the character level with numerous conventional ML methods [10]. Deep learning (DL) methods have been used to overcome the restrictions of conventional ML, reducing the manual feature extraction stage and obtaining superior outcomes on large-scale datasets.
Lu et al. [11] present a Character-level Convolutional Neural Network with Shortcuts (Char-CNNS) technique for identifying whether text from social media contains CB. It uses the character as the smallest unit of learning, allowing the method to overcome spelling errors and intentional obfuscation in real-world corpora. Shortcuts are employed to stitch together distinct levels of features for learning more granular bullying signals, and the focal loss function is used to overcome the class imbalance problem. The authors in [12] aim to address the computational challenges of detecting harassment on social media by establishing an ML framework with three distinguishing features. Chen and Li [13] presented a new deep method, the HEterogeneous Neural Interaction Network (HENIN), for explainable CB recognition. HENIN comprises the following modules: a comment encoder, a post-comment co-attention sub-network, and session-session and post-post interaction extractors. The authors in [14] investigated the problem of CB prediction and present MIIL-DNN, a multi-input integrative learning method based on deep neural networks (DNNs). MIIL-DNN integrates data in three sub-networks for detecting and classifying bullying content in real-time code-mix data. The authors in [15–18] explored this problem by compiling a global dataset of 37,373 unique tweets from Twitter and using seven ML models.
This study introduces a spotted hyena optimizer with deep learning driven cybersecurity (SHODLCS) model for OSN. The presented SHODLCS model aims to achieve cybersecurity by identifying CB in the OSN. To this end, the SHODLCS model involves data pre-processing and TF-IDF based feature extraction. In addition, the cascaded recurrent neural network (CRNN) model is applied for the identification and classification of CB. Finally, the SHO algorithm is used to optimally adjust the hyperparameters of the CRNN model, which results in enhanced classifier performance. The experimental validation of the SHODLCS model on the benchmark dataset showed better outcomes for the SHODLCS model than for recent approaches.
In this study, a novel SHODLCS model has been developed to achieve cybersecurity by identifying CB in the OSN. The SHODLCS model involves data pre-processing and TF-IDF based feature extraction. The SHO-CRNN model is then applied for the identification and classification of CB. Fig. 1 shows the overall process of the SHODLCS technique.
In this study, data pre-processing takes place in several steps:
• Discard empty rows,
• Convert characters into lowercase,
• Remove punctuation marks,
• Remove special characters,
• Remove numerals,
• Remove stopwords,
• Tokenization, and
• Stemming
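The steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stopword set is a tiny hypothetical sample, `naive_stem` is a crude suffix stripper standing in for a proper stemmer, and tokenization is plain whitespace splitting; the source does not say which tools the authors used.

```python
import re

# Tiny illustrative stopword list (assumption; a real pipeline would use a fuller list).
STOPWORDS = {"a", "an", "the", "is", "are", "you", "and", "or", "to", "of"}

def naive_stem(token):
    """Crude suffix stripping, a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(rows):
    """Apply the listed steps to an iterable of raw text rows."""
    cleaned = []
    for row in rows:
        if row is None or not row.strip():
            continue                                # discard empty rows
        text = row.lower()                          # convert to lowercase
        text = re.sub(r"[^a-z\s]", " ", text)       # drop punctuation, special characters, numerals
        tokens = text.split()                       # whitespace tokenization
        tokens = [t for t in tokens if t not in STOPWORDS]
        cleaned.append([naive_stem(t) for t in tokens])
    return cleaned

docs = preprocess(["You are SO annoying!!! 123", "   ", "Thanks for the helpful edits."])
```

Here the blank second row is dropped, and each surviving row becomes a list of stemmed tokens ready for feature extraction.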
Once the data is pre-processed, the term frequency-inverse document frequency (TF-IDF) model is applied [19]. It is a statistical method that uses the occurrence of words as a measure to extract textual features. For a term
At this point,
At this time,
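As a point of reference, one common textbook form of the weighting is tf(t, d) · log(N / df(t)), where tf(t, d) is the relative frequency of term t in document d, N is the number of documents, and df(t) is the number of documents containing t. The sketch below computes that form. It is an assumption that this matches the variant used in the study, since several smoothed variants exist, and the function and variable names are illustrative.

```python
import math
from collections import Counter

def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    weights = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        total = len(doc) or 1                       # avoid division by zero on empty docs
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

corpus = [["you", "idiot"], ["thank", "you"], ["nice", "edit", "thank"]]
vectors = tf_idf(corpus)
```

In this toy corpus, "idiot" appears in only one document and so receives a larger weight in the first document than "you", which appears in two.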
2.3 Data Classification Module
Following feature extraction, the CRNN model is applied for the identification and classification of CB [20]. For a provided dataset
whereas
In which
Here,
Especially, we split the spectral sequence
Each sub-sequence is then fed into the first-layer RNN. These first-layer RNNs share parameters and have the same architecture, which reduces the number of variables to train. In sub-sequence
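The cascading idea described here (split the input sequence into sub-sequences, encode each with one shared first-layer RNN, then pass the encodings to a second-layer RNN) can be sketched as follows. This is a minimal, untrained illustration: the plain tanh recurrent cell, the random weights, the layer sizes, and all function names are assumptions and do not reproduce the recurrent unit or configuration used in the study.

```python
import math
import random

def make_cell(input_size, hidden_size, rng):
    """Random weights for one tanh recurrent cell (untrained, illustration only)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W": mat(hidden_size, input_size), "U": mat(hidden_size, hidden_size),
            "b": [0.0] * hidden_size}

def run_rnn(cell, inputs):
    """Run the cell over a list of input vectors and return the last hidden state."""
    h = [0.0] * len(cell["b"])
    for x in inputs:
        h = [math.tanh(sum(w * xi for w, xi in zip(cell["W"][j], x))
                       + sum(u * hi for u, hi in zip(cell["U"][j], h))
                       + cell["b"][j])
             for j in range(len(h))]
    return h

def cascaded_rnn(sequence, n_sub, first, second):
    """Split `sequence` into n_sub chunks, encode each with the shared first-layer
    cell, then run the second-layer cell over the chunk encodings."""
    size = math.ceil(len(sequence) / n_sub)
    chunks = [sequence[i:i + size] for i in range(0, len(sequence), size)]
    # The same `first` cell is reused for every chunk (shared parameters).
    encodings = [run_rnn(first, [[v] for v in chunk]) for chunk in chunks]
    return run_rnn(second, encodings)

rng = random.Random(0)
first_layer = make_cell(input_size=1, hidden_size=4, rng=rng)
second_layer = make_cell(input_size=4, hidden_size=3, rng=rng)
features = [0.0, 0.2, 0.0, 0.5, 0.1, 0.0, 0.3, 0.0, 0.4]   # e.g. a short feature vector
summary = cascaded_rnn(features, n_sub=3, first=first_layer, second=second_layer)
```

Because a single `first_layer` cell serves all sub-sequences, the number of first-layer parameters does not grow with the number of sub-sequences. In a full classifier, `summary` would feed an output layer trained on the CB and non-CB labels.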
2.4 SHO Based Hyperparameter Optimization
At the final stage, the SHO algorithm is used to tune the hyperparameters of the CRNN model, which results in enhanced classifier performance [21–25]. SHO is inspired by the social behavior of spotted hyenas, and its main phases are derived from their hunting behavior. The mathematical model of the SHO process is discussed in the following.
1) Encircling prey: The mathematical model of this behavior is given as follows:
whereas
Now,
Here,
In which
2) Hunting: The following equation is proposed for the hunting mechanism:
Here
Now
3) Attacking prey (exploitation): The prey attack is modeled mathematically by reducing the value of vector
In the equation,
4) Searching for prey (exploration): purposefully requires vector
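The four phases can be illustrated with a simplified SHO-style search, based on the update rules as they are commonly stated in the SHO literature: D_h = |B · P_h − P_k|, P_k = P_h − E · D_h, B = 2 · rd_1, E = 2h · rd_2 − h, with h decreasing from 5 toward 0 and the new position taken as the cluster mean C_h / N. The sketch rests on several assumptions: the cluster size `n_leaders` is fixed rather than counted dynamically, positions are clipped to the search bounds, and `toy_validation_loss` is a hypothetical stand-in for the CRNN validation loss, not the authors' objective.

```python
import random

def sho_minimize(objective, bounds, n_agents=20, n_leaders=3, max_iter=50, seed=0):
    """Simplified spotted hyena optimizer; returns (best_position, best_value)."""
    rng = random.Random(seed)
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best = min(agents, key=objective)
    best_pos, best_val = list(best), objective(best)

    for it in range(max_iter):
        h = 5.0 - it * (5.0 / max_iter)          # control factor, decreases from 5 toward 0
        leaders = sorted(agents, key=objective)[:n_leaders]
        new_agents = []
        for agent in agents:
            position = []
            for d, (lo, hi) in enumerate(bounds):
                cluster_sum = 0.0
                for leader in leaders:
                    b = 2.0 * rng.random()               # B = 2 * rd1
                    e = 2.0 * h * rng.random() - h       # E = 2h * rd2 - h
                    dist = abs(b * leader[d] - agent[d]) # D_h = |B * P_h - P_k|
                    cluster_sum += leader[d] - e * dist  # P_k = P_h - E * D_h
                value = cluster_sum / n_leaders          # P(x+1) = C_h / N
                position.append(min(max(value, lo), hi)) # keep inside the search bounds
            new_agents.append(position)
        agents = new_agents
        candidate = min(agents, key=objective)
        if objective(candidate) < best_val:
            best_pos, best_val = list(candidate), objective(candidate)
    return best_pos, best_val

def toy_validation_loss(params):
    """Hypothetical stand-in for CRNN validation loss (not the authors' objective)."""
    log_lr, hidden_units = params
    return (log_lr + 3.0) ** 2 + ((hidden_units - 64.0) / 32.0) ** 2

search_space = [(-5.0, -1.0), (8.0, 128.0)]      # log10(learning rate), hidden units
best_params, best_loss = sho_minimize(toy_validation_loss, search_space)
```

Large values of |E| early in the run push agents away from the leaders (exploration), while the shrinking h later pulls them toward the leaders (exploitation). In an actual tuning loop, the objective would train and validate the CRNN for each candidate hyperparameter vector, which is far more expensive than this toy function.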
The performance of the SHODLCS model is validated using the Wikipedia Attack Dataset [26], which contains 115,864 samples: 13,590 CB and 102,274 non-CB (NCB).
Fig. 3 shows the confusion matrices produced by the SHODLCS model on the test dataset. In run-1, the SHODLCS model recognized 12,634 samples as CB and 101,623 samples as NCB. In run-3, it recognized 12,786 samples as CB and 101,604 samples as NCB. In run-5, it recognized 12,805 samples as CB and 101,562 samples as NCB. In run-6, it recognized 12,858 samples as CB and 101,487 samples as NCB.
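To show how such counts translate into the usual classification measures, the sketch below derives accuracy, precision, recall, and F1 for the CB class from the run-1 figures. It assumes that the reported counts are correctly classified samples (the diagonal of the confusion matrix) and that the matrix covers all 115,864 samples; if either assumption does not hold, the derived values will differ from the reported ones.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, precision, recall, and F1 for the positive (CB) class."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

TOTAL_CB, TOTAL_NCB = 13_590, 102_274            # class sizes stated for the dataset
# Run-1 counts, read here as correctly classified samples (an assumption).
tp, tn = 12_634, 101_623
run1 = binary_metrics(tp=tp, fn=TOTAL_CB - tp, tn=tn, fp=TOTAL_NCB - tn)
```

Under these assumptions, run-1 yields an accuracy of roughly 0.986 and a CB recall of roughly 0.930. The gap between the two reflects the class imbalance, since NCB samples outnumber CB samples by more than seven to one.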
Tab. 1 and Fig. 4 report the overall classification results of the SHODLCS model under distinct runs. On run-1, the SHODLCS model has offered average
The training accuracy (TA) and validation accuracy (VA) attained by the SHODLCS model on the test dataset are shown in Fig. 5. The figure indicates that the SHODLCS model attained high values of TA and VA. In particular, the VA appeared to be higher than the TA.
The training loss (TL) and validation loss (VL) achieved by the SHODLCS model on the test dataset are shown in Fig. 6. The results indicate that the SHODLCS model attained low values of TL and VL. In particular, the VL appeared to be lower than the TL.
A brief precision-recall analysis of the SHODLCS model on the test dataset is shown in Fig. 7. The figure shows that the SHODLCS model achieved high precision-recall performance for all classes.
A detailed receiver operating characteristic (ROC) curve analysis of the SHODLCS technique on the test dataset is shown in Fig. 8. The results indicate that the SHODLCS model is able to distinguish the two classes, cyberbullying and non-cyberbullying, on the test dataset.
Tab. 2 reports a detailed comparative examination of the SHODLCS model with recent models. Fig. 9 illustrates a brief
Fig. 10 depicts a brief
In this study, a novel SHODLCS model has been developed to achieve cybersecurity by identifying CB in the OSN. To this end, the SHODLCS model involves data pre-processing and TF-IDF based feature extraction. In addition, the CRNN model is applied for the identification and classification of CB. Finally, the SHO algorithm is used to effectively tune the hyperparameters of the CRNN model, which results in enhanced classifier performance. The experimental validation of the SHODLCS model on the benchmark dataset showed better outcomes for the SHODLCS model than for recent approaches. Thus, the SHODLCS model can be used as an effective tool for CB detection and classification. In the future, hybrid DL models can be explored to improve the overall classification performance.
Funding Statement: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R140), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4310373DSR15.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
1. Y. Fang, S. Yang, B. Zhao and C. Huang, “Cyberbullying detection in social networks using bi-gru with self-attention mechanism,” Information, vol. 12, no. 4, pp. 171, 2021.
2. A. Bozyiğit, S. Utku and E. Nasibov, “Cyberbullying detection: Utilizing social media features,” Expert Systems with Applications, vol. 179, pp. 115001, 2021.
3. M. Dadvar and K. Eckert, “Cyberbullying detection in social networks using deep learning based models,” in Int. Conf. on Big Data Analytics and Knowledge Discovery, DaWaK 2020: Big Data Analytics and Knowledge Discovery, Lecture Notes in Computer Science book series, Cham, Springer, vol. 12393, pp. 245–255, 2020.
4. A. Abdulrahman Albraikan, S. Ben Haj Hassine, S. Mohamed Fati, F. N. Al-Wesabi, A. Mustafa Hilal et al., “Optimal deep learning-based cyberattack detection and classification technique on social networks,” Computers, Materials & Continua, vol. 72, no. 1, pp. 907–923, 2022.
5. H. Rosa, J. P. Carvalho, P. Calado, B. Martins, R. Ribeiro et al., “Using fuzzy fingerprints for cyberbullying detection in social networks,” in 2018 IEEE Int. Conf. on Fuzzy Systems (FUZZ-IEEE), Rio de Janeiro, pp. 1–7, 2018.
6. H. Rosa, N. Pereira, R. Ribeiro, P. C. Ferreira, J. P. Carvalho et al., “Automatic cyberbullying detection: A systematic review,” Computers in Human Behavior, vol. 93, pp. 333–345, 2019.
7. A. Kumar and N. Sachdeva, “Multimodal cyberbullying detection using capsule network with dynamic routing and deep convolutional neural network,” Multimedia Systems, vol. 15, no. 1, pp. 1–14, 2021.
8. C. Iwendi, G. Srivastava, S. Khan and P. K. R. Maddikunta, “Cyberbullying detection solutions based on deep learning architectures,” Multimedia Systems, vol. 25, no. 1, pp. 1–1, 2020.
9. M. Alotaibi, B. Alotaibi and A. Razaque, “A multichannel deep learning framework for cyberbullying detection on social media,” Electronics, vol. 10, no. 21, pp. 2664, 2021.
10. A. Kumar and N. Sachdeva, “A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media,” World Wide Web, vol. 15, no. 1, pp. 1–14, 2021.
11. N. Lu, G. Wu, Z. Zhang, Y. Zheng, Y. Ren et al., “Cyberbullying detection in social media text based on character-level convolutional neural network with shortcuts,” Concurrency and Computation: Practice and Experience, vol. 32, no. 23, pp. 2999, 2020.
12. E. Raisi and B. Huang, “Weakly supervised cyberbullying detection using co-trained ensembles of embedding models,” in 2018 IEEE/ACM Int. Conf. on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, pp. 479–486, 2018.
13. H. Y. Chen and C. T. Li, “HENIN: Learning heterogeneous neural interaction networks for explainable cyberbullying detection on social media,” in Proc. of the 2020 Conf. on Empirical Methods in Natural Language Processing (EMNLP), Dominican Republic, pp. 2543–2552, 2020.
14. A. Kumar and N. Sachdeva, “Multi-input integrative learning using deep neural networks and transfer learning for cyberbullying detection in real-time code-mix data,” Multimedia Systems, vol. 32, no. 1, pp. 1–15, 2020.
15. A. Muneer and S. M. Fati, “A comparative analysis of machine learning techniques for cyberbullying detection on twitter,” Future Internet, vol. 12, no. 11, pp. 187, 2020.
16. F. Alrowais, A. S. Almasoud, R. Marzouk, F. N. Al-Wesabi, A. M. Hilal et al., “Artificial intelligence based data offloading technique for secure mec systems,” Computers, Materials & Continua, vol. 72, no. 2, pp. 2783–2795, 2022.
17. A. A. Albraikan, S. B. Haj Hassine, S. M. Fati, F. N. Al-Wesabi, A. M. Hilal et al., “Optimal deep learning-based cyberattack detection and classification technique on social networks,” Computers, Materials & Continua, vol. 72, no. 1, pp. 907–923, 2022.
18. M. A. Hamza, S. B. Haj Hassine, I. Abunadi, F. N. Al-Wesabi, H. Alsolai et al., “Feature selection with optimal stacked sparse autoencoder for data mining,” Computers, Materials & Continua, vol. 72, no. 2, pp. 2581–2596, 2022.
19. C. Raj, A. Agarwal, G. Bharathy, B. Narayan and M. Prasad, “Cyberbullying detection: Hybrid models based on machine learning and natural language processing techniques,” Electronics, vol. 10, no. 22, pp. 2810, 2021.
20. R. Hang, Q. Liu, D. Hong and P. Ghamisi, “Cascaded recurrent neural networks for hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 8, pp. 5384–5394, 2019.
21. G. Dhiman and A. Kaur, “Spotted hyena optimizer for solving engineering design problems,” in 2017 Int. Conf. on Machine Learning and Data Science (MLDS), Noida, India, pp. 114–119, 2017.
22. M. N. A. Mhiqani, R. Ahmad, Z. Z. Abidin, K. H. Abdulkareem, M. A. Mohammed et al., “A new intelligent multilayer framework for insider threat detection,” Computers & Electrical Engineering, vol. 97, pp. 107597, 2022.
23. R. Gopi, P. Muthusamy, P. Suresh, C. G. G. S. Kumar, I. V. Pustokhina et al., “Optimal confidential mechanisms in smart city healthcare,” Computers, Materials & Continua, vol. 70, no. 3, pp. 4883–4896, 2022.
24. A. Muthumari, J. Banumathi, S. Rajasekaran, P. Vijayakarthik, K. Shankar et al., “High security for de-duplicated big data using optimal simon cipher,” Computers, Materials & Continua, vol. 67, no. 2, pp. 1863–1879, 2021.
25. I. V. Pustokhina, D. A. Pustokhin, E. L. Lydia, P. Garg, A. Kadian et al., “Hyperparameter search based convolution neural network with Bi-LSTM model for intrusion detection system in multimedia big data environment,” Multimedia Tools and Applications, vol. 13, no. 5, pp. 111, 2021.
26. E. Wulczyn, N. Thain and L. Dixon, “Ex machina: Personal attacks seen at scale,” in Proc. of the 26th Int. Conf. on World Wide Web, Perth, Australia, pp. 1391–1399, 2017.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.