Open Access
ARTICLE
Optimal Bottleneck-Driven Deep Belief Network Enabled Malware Classification on IoT-Cloud Environment
1 Department of Information Systems, College of Computer Science, King Khalid University, Abha, Saudi Arabia
2 Department of Information Systems, College of Computer Science, Center of Artificial Intelligence, Unit of Cybersecurity, King Khalid University, Abha, Saudi Arabia
3 Department of Information Systems, College of Computing and Information System, Umm Al-Qura University, Saudi Arabia
4 Department of information systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
5 Department of Information Systems, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, Saudi Arabia
6 Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif, 21944, Saudi Arabia
7 Department of Digital Media, Faculty of Computers and Information Technology, Future University in Egypt, New Cairo, 11845, Egypt
8 Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam Bin Abdulaziz University, Saudi Arabia
* Corresponding Author: Mesfer Al Duhayyim. Email:
Computers, Materials & Continua 2023, 74(2), 3101-3115. https://doi.org/10.32604/cmc.2023.032969
Received 02 June 2022; Accepted 12 July 2022; Issue published 31 October 2022
Abstract
Cloud Computing (CC) is a promising technology for storing data and offering online services in an effective manner. When such fast-evolving technologies are used to protect computer-based systems from cyberattacks, they bring several advantages over conventional data protection methods. The computer-based systems concerned include Cyber-Physical Systems (CPS), Internet of Things (IoT) devices, mobile devices, desktop and laptop computers, and critical systems. Malicious software (malware) is software that targets computer-based systems in order to launch cyberattacks and threaten the integrity, secrecy, and accessibility of information. The current study focuses on the design of an Optimal Bottleneck driven Deep Belief Network-enabled Cybersecurity Malware Classification (OBDDBN-CMC) model. The presented OBDDBN-CMC model is intended to recognize and classify the malware that exists on an IoT-based cloud platform. To attain this, Z-score data normalization is used to scale the data into a uniform format. In addition, the BDDBN model is exploited for recognition and categorization of malware. To fine-tune the hyperparameters of the BDDBN model effectively, the Grasshopper Optimization Algorithm (GOA) is applied. This enhances the classification results and constitutes the novelty of the current study. The OBDDBN-CMC model was validated experimentally, and the results confirmed its enhanced performance over recent approaches.
1 Introduction
Societies have become increasingly dependent upon technology in the past few years, while the technology itself grows more complicated day by day. In today’s world, people and devices are heavily connected with each other. In particular, e-government, smart cities, smart homes, and similar data-driven technologies follow the Internet of Things (IoT) model. After the outbreak of the COVID-19 pandemic and the continuous lockdowns that followed, technology acceptance increased manifold and through diverse channels. For example, e-health applications were designed after the COVID-19 outbreak to support already-exhausted healthcare professionals and medical systems [1]. However, this comprehensive connection to the cyber-world has also increased the number of cyberattacks. These cyberattacks might reveal information that is generally categorized as confidential and secure. To be specific, Internet of Medical Things (IoMT) systems deal with huge volumes of patient data, and the transmission and storage of medical data raise severe privacy concerns [2]. As a result, certain measures have been developed to overcome these shortcomings, namely the implementation of secure data transmission protocols and secure socket layers to avoid the leakage of private data [3]. Cybercrime is defined as any unauthorized activity that is committed on a computer, or any conventional crime that targets individuals or institutions via the internet [4]. Against this background, it has become essential to build a one-stop, trusted security solution that handles information privacy and security in resource-constrained devices.
In general, IoT devices possess limited processing and memory capacity, which makes them lightweight [5]. These characteristics limit the application of many candidate security solutions. When detecting malware attacks in an IoT environment, three predominant problems are faced [6]. Fig. 1 illustrates the security problems experienced in cloud-IoT. First, most IoT devices have low computation power, which limits the complexity of the security system [7]. Second, the intensification of hidden malware attacks that target IoT systems necessitates the quick adoption of detection methods, which is in itself a complicated task [8,9]. Third, the rapid development of IoT devices and the resultant security risks must be dealt with using strong data protection methods. Signature-based detection plays a significant role in protecting systems from different types of malware. This method has gained much attention among researchers in both academia and industry, who have conducted numerous studies on its improvement [10,11]. The signature is the concept upon which this family of malware detectors is built. A signature is usually unalterable and can be identified in the early stages of propagation, although the number of available malware examples is limited [12]. In this methodology, the content of a file is scanned and compared to check whether it matches a known signature [13].
Vasan et al. [14] suggested a new classification model for detecting variants of malware families, which improved malware detection with a Convolutional Neural Network (CNN) based Deep Learning (DL) architecture named Image-based Malware Classification using Fine-tuned CNN architecture (IMCFN). Unlike existing solutions, the technique addresses multiclass classification. Further, the technique converted raw malware binaries into color images, and the fine-tuned CNN structure used these images for the detection and identification of malware families. In [15], a combined DL technique was suggested for the detection of malware-infected documents and pirated software over IoT networks. Sudhakar et al. [16] devised a new malware classifier with a fine-tuned CNN (MCFT-CNN). The proposed MCFT-CNN method identified unknown malware samples without any prior knowledge. In that study, reverse engineering was followed by binary code analysis, and even advanced evasion approaches were taken into account when detecting malware.
Jeon et al. [17] recommended a Dynamic Analysis for IoT Malware Detection (DAIMD) method to mitigate the damage caused to IoT devices through intelligent detection of both well-known IoT malware and new variants. The DAIMD method learns IoT malware with a CNN model and analyzes it dynamically in a nested cloud environment. The dynamic analysis extracts behaviors related to the virtual file system, memory, system calls, network, and processes. In [18], a model was proposed for the detection of malware attacks on the Industrial Internet of Things (MD-IIOT). For in-depth malware analysis, that study suggested a technique based on color image visualization and a deep CNN. The outcomes achieved by the suggested technique were compared with the results from other studies in terms of malware detection.
The current study focuses on the design of an Optimal Bottleneck driven Deep Belief Network-enabled Cybersecurity Malware Classification (OBDDBN-CMC) model. The aim of the presented OBDDBN-CMC model is to recognize and classify malware on an IoT-based cloud platform. To attain this, Z-score data normalization is used to scale the data into a uniform format. In addition, the Bottleneck driven Deep Belief Network (BDDBN) model is exploited for both recognition and categorization of the malware. To fine-tune the hyperparameters of the BDDBN model effectively, the Grasshopper Optimization Algorithm (GOA) is used, which in turn enhances the classification results. The proposed OBDDBN-CMC model was experimentally validated, and the results confirmed its enhanced performance over recent approaches.
2 The Proposed OBDDBN-CMC Model
In the current study, a novel OBDDBN-CMC model has been developed to recognize and classify malware on an IoT-based cloud platform. The model comprises three stages. First, Z-score data normalization scales the data into a uniform format. Second, the BDDBN model performs both recognition and categorization of malware. Third, GOA fine-tunes the hyperparameters of the BDDBN model, which in turn enhances the classification results. Fig. 2 depicts the block diagram of the OBDDBN-CMC approach.
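2.1 Data Pre-processing
At first, Z-score data normalization is utilized to scale the data into a uniform format. In order to normalize the data using Z-score, each feature value $x$ is transformed as

$$z = \frac{x - \mu}{\sigma}$$

where $\mu$ and $\sigma$ denote the mean and the standard deviation of the corresponding feature.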
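The normalization step can be illustrated with a short NumPy sketch. It is not the authors' implementation; the function name `zscore_normalize`, the `eps` guard for constant columns, and the toy feature matrix are assumptions made for illustration.

```python
import numpy as np

def zscore_normalize(X, eps=1e-12):
    """Scale each feature column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    # Guard against constant columns, whose standard deviation is zero.
    sigma = np.where(sigma < eps, 1.0, sigma)
    return (X - mu) / sigma, mu, sigma

# Hypothetical feature matrix: 4 samples, 2 features on very different scales.
X_train = np.array([[1.0, 200.0],
                    [2.0, 400.0],
                    [3.0, 600.0],
                    [4.0, 800.0]])
Z_train, mu, sigma = zscore_normalize(X_train)
```

In practice, the `mu` and `sigma` estimated on the training split would normally be reused to scale the testing split, so that no test statistics leak into training.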
2.2 BDDBN-Based Classification Process
After data pre-processing, the BDDBN model is exploited for both recognition and categorization of malware [20]. A Deep Belief Network (DBN) is a learning model with a deep structure, that is, one that contains multiple layers of non-linear computational units. Compared with existing ‘shallow’ methods, DBN has stronger representation and modelling ability and can handle real-world data such as video, natural speech, and images. Here, a shallow structure denotes a single layer of non-linear computational units. Though DBN is basically a multi-layer Artificial Neural Network (ANN), it employs a hybrid of unsupervised and supervised training to obtain the network parameters. This resolves the tendency of ANN training with Back Propagation (BP) to get trapped in local optima. The bottleneck concept is widely employed in speech recognition, and BDDBN is the result of integrating the bottleneck idea with DBN. BDDBN is generally established as a multi-layer ANN with an odd number of layers, in which the middle layer is called the ‘bottleneck layer’. Bottleneck means that the number of neurons in this layer is lower than in the other layers. The BDDBN-based technique, originally used for speech feature extraction, is executed as described below.
Step 1. Build the DBN via pre-training and fine-tuning, and construct the neural network.
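Compositionally, DBN is a cascade of Restricted Boltzmann Machines (RBMs). In general, an RBM is composed of a hidden layer $h$ and a visible layer $v$; units in different layers are fully connected, whereas units within the same layer are not connected. In the standard formulation, the energy of a joint configuration is

$$E(v,h) = -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i}\sum_{j} v_i w_{ij} h_j$$

Here, $v_i$ and $h_j$ denote the states of the $i$-th visible unit and the $j$-th hidden unit, $a_i$ and $b_j$ are their biases, and $w_{ij}$ is the connection weight between them. The joint probability of a configuration is

$$P(v,h) = \frac{e^{-E(v,h)}}{Z}$$

where $Z = \sum_{v,h} e^{-E(v,h)}$ is the normalization factor. With the abovementioned formula, it is easy to obtain the probability of the visible vector $v$:

$$P(v) = \frac{1}{Z}\sum_{h} e^{-E(v,h)}$$

Maximize the log-likelihood of the $N$ training samples with respect to the parameters $\theta = \{w_{ij}, a_i, b_j\}$:

$$\theta^{*} = \arg\max_{\theta} \sum_{n=1}^{N} \log P\left(v^{(n)}\right)$$

The derivation of the maximal log-likelihood function yields the weight update rule

$$\Delta w_{ij} = \varepsilon\left(\langle v_i h_j\rangle_{\text{data}} - \langle v_i h_j\rangle_{\text{model}}\right)$$

where $\varepsilon$ is the learning rate and $\langle\cdot\rangle_{\text{data}}$ and $\langle\cdot\rangle_{\text{model}}$ denote expectations under the data distribution and the model distribution, respectively.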
In the current study, a supervised learning mechanism of the kind used in a conventional Back Propagation Neural Network (BPNN) is then employed to build the whole DBN.
Step 2. The layers that follow the bottleneck layer are detached from the network, so that the original bottleneck layer serves as the output layer.
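The two steps above can be sketched in NumPy as a stack of RBMs trained greedily with one-step contrastive divergence (CD-1), a common approximation of the log-likelihood gradient, followed by reading out the activations of the narrowest layer. This is a minimal sketch under stated assumptions, not the authors' network: the class `RBM`, the helpers `pretrain_stack` and `bottleneck_features`, the layer sizes, and the random binary data are all hypothetical, and the supervised fine-tuning stage is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM with visible biases a, hidden biases b, and weights W."""
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)
        self.b = np.zeros(n_hidden)

    def hidden_prob(self, v):
        return sigmoid(v @ self.W + self.b)

    def visible_prob(self, h):
        return sigmoid(h @ self.W.T + self.a)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update; returns the mean squared reconstruction error."""
        ph0 = self.hidden_prob(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
        v1 = self.visible_prob(h0)                              # reconstruction
        ph1 = self.hidden_prob(v1)
        n = v0.shape[0]
        # Data-driven statistics minus reconstruction-driven statistics.
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.a += lr * (v0 - v1).mean(axis=0)
        self.b += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))

def pretrain_stack(X, layer_sizes, epochs=20, lr=0.1, seed=0):
    """Greedy layer-wise pre-training: each RBM is trained on the previous layer's output."""
    rbms, inp = [], X
    for k, n_hidden in enumerate(layer_sizes):
        rbm = RBM(inp.shape[1], n_hidden, seed=seed + k)
        for _ in range(epochs):
            rbm.cd1_step(inp, lr)
        rbms.append(rbm)
        inp = rbm.hidden_prob(inp)
    return rbms

def bottleneck_features(rbms, X):
    """Propagate X up to the last kept layer, which plays the role of the bottleneck."""
    out = X
    for rbm in rbms:
        out = rbm.hidden_prob(out)
    return out

# Hypothetical data: 16 binary samples with 8 features; the 3-unit layer is the bottleneck.
X = (np.random.default_rng(42).random((16, 8)) > 0.5).astype(float)
rbms = pretrain_stack(X, layer_sizes=[6, 3])
features = bottleneck_features(rbms, X)
```

Keeping only the layers up to the narrowest one mirrors Step 2: everything after the bottleneck is dropped and its activations become the output.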
2.3 GOA-Based Parameter Adjustment Process
To fine-tune the hyperparameters of the BDDBN model effectively, GOA is used, which in turn enhances the classification results [21–23]. GOA mimics the behaviour of grasshoppers during food search in their larval and adult stages [24]. The behaviour of grasshopper swarms is mathematically modelled as follows.
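Let $X_i$ denote the position of the $i$-th grasshopper. In the standard GOA formulation, it is expressed as

$$X_i = S_i + G_i + A_i \tag{9}$$

In Eq. (9), $S_i$ denotes the social interaction, $G_i$ the gravity force acting on the $i$-th grasshopper, and $A_i$ the wind advection. The social interaction is computed as

$$S_i = \sum_{j=1,\, j\neq i}^{N} s(d_{ij})\,\hat{d}_{ij} \tag{10}$$

where $N$ is the number of grasshoppers, $d_{ij} = |x_j - x_i|$ is the distance between the $i$-th and the $j$-th grasshoppers, and $\hat{d}_{ij} = (x_j - x_i)/d_{ij}$ is the corresponding unit vector. The function $s$ defines the strength of the social forces:

$$s(r) = f e^{-r/l} - e^{-r} \tag{11}$$

In this expression, $f$ denotes the intensity of attraction and $l$ the attractive length scale. The distance amongst the grasshoppers is standardized in the range of [1,4]. The gravity component and the wind advection are given by

$$G_i = -g\,\hat{e}_g \tag{12}$$

$$A_i = u\,\hat{e}_w \tag{13}$$

where, in Eq. (12), $g$ is the gravitational constant and $\hat{e}_g$ is the unit vector pointing towards the centre of the earth, and, in Eq. (13), $u$ is a constant drift and $\hat{e}_w$ is the unit vector in the wind direction. If $S_i$, $G_i$, and $A_i$ are substituted into Eq. (9), gravity is neglected, and the wind is assumed to point towards the target, the position update used for optimization becomes

$$X_i^d = c\left(\sum_{j=1,\, j\neq i}^{N} c\,\frac{ub_d - lb_d}{2}\, s\left(\left|x_j^d - x_i^d\right|\right)\frac{x_j - x_i}{d_{ij}}\right) + \hat{T}_d \tag{14}$$

where $ub_d$ and $lb_d$ are the upper and lower bounds in the $d$-th dimension. The initial term of this formula contemplates the position of the existing grasshopper with regards to the other grasshoppers. Next, $\hat{T}_d$ is the value of the $d$-th dimension of the target, that is, the best solution found so far. The coefficient $c$ shrinks the comfort, repulsion, and attraction zones as the iterations proceed:

$$c = c_{\max} - t\,\frac{c_{\max} - c_{\min}}{T} \tag{15}$$

In Eq. (15), $c_{\max}$ and $c_{\min}$ are the maximum and minimum values of $c$, $t$ is the current iteration, and $T$ is the maximum number of iterations.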
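GOA uses a Fitness Function (FF) to drive the search towards maximum classification efficiency. The FF returns a positive value that reflects the quality of a candidate solution. In this case, the classifier error rate, which is to be minimized, is taken as the FF, as given in Eq. (16).

$$fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100 \tag{16}$$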
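The search procedure described in this section can be sketched as follows. This is a minimal illustration of the standard GOA update, not the authors' tuning code. The names `s_func`, `goa_minimize`, `error_rate`, and `surrogate_error` are hypothetical, as are the defaults f = 0.5 and l = 1.5 and the remapping of distances with `2 + (d mod 2)`, which is one common way of keeping them inside the interval mentioned above. Because training a BDDBN inside the loop is out of scope here, a synthetic surrogate stands in for the classifier error rate.

```python
import numpy as np

def s_func(r, f=0.5, l=1.5):
    """Social force: attraction term minus repulsion term at distance r."""
    return f * np.exp(-r / l) - np.exp(-r)

def error_rate(y_true, y_pred):
    """Fitness in percent: share of misclassified samples."""
    return 100.0 * float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

def goa_minimize(fitness, lb, ub, n_agents=20, n_iter=100,
                 c_max=1.0, c_min=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    dim = lb.size
    X = lb + rng.random((n_agents, dim)) * (ub - lb)  # random initial swarm
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), float(fit.min())
    for t in range(1, n_iter + 1):
        # Shrinking coefficient: decreases linearly from c_max towards c_min.
        c = c_max - t * (c_max - c_min) / n_iter
        X_new = np.empty_like(X)
        for i in range(n_agents):
            social = np.zeros(dim)
            for j in range(n_agents):
                if i == j:
                    continue
                diff = X[j] - X[i]
                dist = np.linalg.norm(diff)
                unit = diff / (dist + 1e-12)
                d_norm = 2.0 + np.fmod(dist, 2.0)  # assumed distance remapping
                social += c * (ub - lb) / 2.0 * s_func(d_norm) * unit
            # Social term scaled by c, plus the target (best position so far).
            X_new[i] = np.clip(c * social + best, lb, ub)
        X = X_new
        fit = np.array([fitness(x) for x in X])
        if fit.min() < best_fit:
            best, best_fit = X[fit.argmin()].copy(), float(fit.min())
    return best, best_fit

# Hypothetical surrogate: pretend the error is lowest at these two hyperparameter values.
target = np.array([0.05, 64.0])

def surrogate_error(params):
    scaled = (np.asarray(params) - target) / np.array([0.1, 100.0])
    return float(np.sum(scaled ** 2))

lb, ub = [0.001, 8.0], [0.5, 256.0]
best_params, best_error = goa_minimize(surrogate_error, lb, ub, n_agents=15, n_iter=50)
```

In a real run, `surrogate_error` would be replaced by a function that trains the classifier with the candidate hyperparameters and returns `error_rate` on validation data.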
3 Experimental Validation
The presented OBDDBN-CMC model was experimentally validated using a dataset that contains 9,419 samples under two classes, as shown in Tab. 1.
Fig. 3 illustrates the confusion matrices generated by the OBDDBN-CMC model on the applied dataset with distinct training (TR) and testing (TS) splits. With 70% of TR data, the proposed OBDDBN-CMC model recognized 3,455 samples as benign class and 2,926 samples as malware class. With the corresponding 30% of TS data, the model categorized 1,476 samples under benign class and 1,276 samples under malware class. Meanwhile, with 80% of TR data, the model classified 4,032 samples under benign class and 3,428 samples under malware class. Finally, with the corresponding 20% of TS data, it recognized 989 samples as benign class and 877 samples as malware class.
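The measures usually derived from such confusion matrices follow directly from four counts. The sketch below assumes that malware is treated as the positive class; the function name `binary_metrics` and the counts in the example are hypothetical and are not taken from Fig. 3.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from a 2x2 confusion matrix (positive class = malware)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called TPR or sensitivity
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0     # false positive rate
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "fpr": fpr,
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these illustrative counts the accuracy is 0.925, the recall is 0.9, and the FPR is 0.05.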
Tab. 2 and Fig. 4 highlight the overall malware classification performance achieved by the proposed OBDDBN-CMC model under distinct aspects. With 70% of TR data, the proposed OBDDBN-CMC model offered an average
A precision-recall analysis was conducted for the OBDDBN-CMC method on the test dataset, and the results are shown in Fig. 5. The figure indicates that the proposed OBDDBN-CMC method achieved high precision-recall values under all the classes.
A Receiver Operating Characteristic (ROC) curve analysis was also conducted for the OBDDBN-CMC method on the test dataset, and the results are depicted in Fig. 6. The results indicate that the proposed OBDDBN-CMC approach is able to distinguish the distinct classes on the test dataset.
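For readers unfamiliar with how an ROC curve is obtained, the following sketch sweeps a threshold over classifier scores and integrates the curve with the trapezoidal rule. The labels and scores are synthetic, the helper names `roc_curve_points` and `auc_trapezoid` are hypothetical, and the sketch assumes that all scores are distinct.

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """Return (FPR, TPR) arrays obtained by lowering the threshold one sample at a time."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    y_sorted = y_true[order]
    tps = np.cumsum(y_sorted == 1)              # true positives accepted so far
    fps = np.cumsum(y_sorted == 0)              # false positives accepted so far
    tpr = np.concatenate(([0.0], tps / tps[-1]))
    fpr = np.concatenate(([0.0], fps / fps[-1]))
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Synthetic example: 1 = malware, 0 = benign.
y = [1, 1, 0, 1, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve_points(y, p)
auc = auc_trapezoid(fpr, tpr)
```

For this toy input the area equals 8/9, which matches the fraction of (malware, benign) pairs in which the malware sample receives the higher score.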
Tab. 3 compares the malware classification performance of the proposed OBDDBN-CMC model with that of existing models [25,26], namely Improved Naïve Bayes (INB), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Naïve Bayes (NB), DBN, TDS-GBAMD, EAMD-NF, and Two-Layer DL-Android Malware Detection using Network Traffic (Two-layer DL-AMDNT).
Fig. 7 highlights the analytical result accomplished by the proposed OBDDBN-CMC model and other existing models in terms of TPR,
A detailed FPR comparison between the proposed OBDDBN-CMC model and other existing models is shown in Fig. 8. The figure shows that the proposed OBDDBN-CMC model attained a lower FPR than the other models. To be specific, the OBDDBN-CMC model reached an FPR of 0.96%, whereas the INB, SVM, KNN, NB, DBN, TDS-GBAMD, EAMD-NF, and Two-layer DL-AMDNT models attained higher FPR values of 1.67%, 5.03%, 12.68%, 18.98%, 14.70%, 18.82%, 11.90%, and 7.35% respectively. Thus, the proposed OBDDBN-CMC model is found to classify malware effectively in an IoT-enabled cloud environment.
4 Conclusion
In the current study, a novel OBDDBN-CMC model has been developed to recognize and classify malware on an IoT-based cloud platform. To attain this, Z-score data normalization is used to scale the data into a uniform format. In addition, the BDDBN model is exploited for recognition and categorization of malware. To fine-tune the hyperparameters of the BDDBN model effectively, GOA is used, which in turn enhances the classification results. The proposed OBDDBN-CMC model was experimentally validated, and the results confirmed its enhanced performance over recent approaches. Thus, the presented OBDDBN-CMC model can be used as an effective model for malware classification. In future, feature selection approaches can be designed to improve the classification performance.
Funding Statement: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups Project under grant number (61/43). Princess Nourah Bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R319), Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4210118DSR24).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
1. Ö. Aslan, M. O. Okay and D. Gupta, “A review of cloud-based malware detection system: Opportunities, advances and challenges,” European Journal of Engineering and Technology Research, vol. 6, no. 3, pp. 1–8, 2021.
2. S. Zhao, S. Li, L. Qi and L. D. Xu, “Computational intelligence enabled cybersecurity for the Internet of Things,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 4, no. 5, pp. 666–674, 2020.
3. E. M. Dovom, A. Azmoodeh, A. Dehghantanha, D. E. Newton, R. M. Parizi et al., “Fuzzy pattern tree for edge malware detection and categorization in IoT,” Journal of Systems Architecture, vol. 97, no. 7, pp. 1–7, 2019.
4. J. C. S. Sicato, P. K. Sharma, V. Loia and J. H. Park, “VPNFilter malware analysis on cyber threat in smart home network,” Applied Sciences, vol. 9, no. 13, pp. 2763, 2019.
5. Y. Shah and S. Sengupta, “A survey on classification of Cyber-attacks on IoT and IIoT devices,” in 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conf. (UEMCON), New York, NY, USA, pp. 0406–0413, 2020.
6. A. Al-Qarafi, F. Alrowais, S. Alotaibi, N. Nemri, F. N. Al-Wesabi et al., “Optimal machine learning based privacy preserving blockchain assisted Internet of Things with smart cities environment,” Applied Sciences, vol. 12, no. 12, pp. 1–17, 2022.
7. M. Ficco, “Detecting IoT malware by Markov chain behavioral models,” in 2019 IEEE Int. Conf. on Cloud Engineering (IC2E), Prague, Czech Republic, pp. 229–234, 2019.
8. A. A. Albraikan, S. B. H. Hassine, S. M. Fati, F. N. Al-Wesabi, A. M. Hilal et al., “Optimal deep learning-based cyberattack detection and classification technique on social networks,” Computers, Materials & Continua, vol. 72, no. 1, pp. 907–923, 2022.
9. M. Chikapa and A. P. Namanya, “Towards a fast off-line static malware analysis framework,” in 2018 6th Int. Conf. on Future Internet of Things and Cloud Workshops (FiCloudW), Barcelona, pp. 182–187, 2018.
10. I. Abunadi, M. M. Althobaiti, F. N. Al-Wesabi, A. M. Hilal, M. Medani et al., “Federated learning with blockchain assisted image classification for clustered UAV networks,” Computers, Materials & Continua, vol. 72, no. 1, pp. 1195–1212, 2022.
11. U. Inayat, M. F. Zia, S. Mahmood, H. M. Khalid and M. Benbouzid, “Learning-based methods for cyberattacks detection in IoT systems: A survey on methods, analysis, and future prospects,” Electronics, vol. 11, no. 9, pp. 1502, 2022.
12. C. C. Uchenna, N. Jamil, R. Ismail, L. K. Yan and M. A. Mohamed, “Malware threat analysis techniques and approaches for IoT applications: A review,” Bulletin of Electrical Engineering and Informatics, vol. 10, no. 3, pp. 1158–1571, 2021.
13. P. Ahirao, “Proactive technique for securing smart cities against malware attacks using static and dynamic analysis,” International Research Journal of Innovations in Engineering and Technology, vol. 5, no. 2, pp. 10, 2021.
14. D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei et al., “IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture,” Computer Networks, vol. 171, no. 1, pp. 107138, 2020.
15. F. Ullah, H. Naeem, S. Jabbar, S. Khalid, M. A. Latif et al., “Cyber security threats detection in Internet of Things using deep learning approach,” IEEE Access, vol. 7, pp. 124379–124389, 2019.
16. Sudhakar and S. Kumar, “MCFT-CNN: Malware classification with fine-tune convolution neural networks using traditional and transfer learning in Internet of Things,” Future Generation Computer Systems, vol. 125, no. 5, pp. 334–351, 2021.
17. J. Jeon, J. H. Park and Y. S. Jeong, “Dynamic analysis for IoT malware detection with convolution neural network model,” IEEE Access, vol. 8, pp. 96899–96911, 2020.
18. H. Naeem, F. Ullah, M. R. Naeem, S. Khalid, D. Vasan et al., “Malware detection in industrial Internet of Things based on hybrid image visualization and deep learning model,” Ad Hoc Networks, vol. 105, no. 1, pp. 102154, 2020.
19. E. Walia and A. Pal, “Fusion framework for effective color image retrieval,” Journal of Visual Communication and Image Representation, vol. 25, no. 6, pp. 1335–1348, 2014.
20. G. H. de Rosa and J. P. Papa, “Soft-tempering deep belief networks parameters through genetic programming,” Journal of Artificial Intelligence and Systems, vol. 1, no. 1, pp. 43–59, 2019.
21. I. V. Pustokhina, D. A. Pustokhin, E. L. Lydia, P. Garg, A. Kadian et al., “Hyperparameter search based convolution neural network with Bi‐LSTM model for intrusion detection system in multimedia big data environment,” Multimedia Tools and Applications, vol. 13, no. 5, pp. 111, 2021. https://doi.org/10.1007/s11042-021-11271-7.
22. G. N. Nguyen, N. H. L. Viet, M. Elhoseny, K. Shankar, B. B. Gupta et al., “Secure blockchain enabled Cyber-physical systems in healthcare using deep belief network with ResNet model,” Journal of Parallel and Distributed Computing, vol. 153, no. 2, pp. 150–160, 2021.
23. M. N. A. Mhiqani, R. Ahmad, Z. Z. Abidin, K. H. Abdulkareem, M. A. Mohammed et al., “A new intelligent multilayer framework for insider threat detection,” Computers & Electrical Engineering, vol. 97, no. 1, pp. 107597, 2022.
24. S. Dwivedi, M. Vardhan and S. Tripathi, “Building an efficient intrusion detection system using grasshopper optimization algorithm for anomaly detection,” Cluster Computing, vol. 24, no. 3, pp. 1881–1900, 2021.
25. R. Kumar, X. Zhang, W. Wang, R. U. Khan, J. Kumar et al., “A multimodal malware detection technique for android IoT devices using various features,” IEEE Access, vol. 7, pp. 64411–64430, 2019.
26. J. Feng, L. Shen, Z. Chen, Y. Wang and H. Li, “A two-layer deep learning method for android malware detection using network traffic,” IEEE Access, vol. 8, pp. 125786–125796, 2020.
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.