Open Access
ARTICLE
Blockchain Assisted Optimal Machine Learning Based Cyberattack Detection and Classification Scheme
1 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
2 Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Abha, 62529, Saudi Arabia
3 Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam bin Abdulaziz University, Al-Kharj 16278, Saudi Arabia
4 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University Al-Kharj, 16278, Saudi Arabia
* Corresponding Author: Fahd N. Al-Wesabi. Email:
Computer Systems Science and Engineering 2023, 46(3), 3583-3598. https://doi.org/10.32604/csse.2023.037545
Received 08 November 2022; Accepted 02 February 2023; Issue published 03 April 2023
Abstract
With recent advancements in information and communication technology, a huge volume of corporate and sensitive user data is shared continuously across networks, making it vulnerable to attacks that put data availability, confidentiality, and integrity at risk. Intrusion Detection Systems (IDS) are widely deployed in various networks to help recognize intrusions promptly. Blockchain (BC) technology has recently received considerable interest as a means of sharing data without the need for a trusted third party. Therefore, this study designs a new Blockchain Assisted Optimal Machine Learning based Cyberattack Detection and Classification (BAOML-CADC) technique, whose major focus lies in identifying cyberattacks. To do so, the BAOML-CADC technique applies a thermal equilibrium algorithm-based feature selection (TEA-FS) method for the optimal choice of features and uses an extreme learning machine (ELM) model for cyberattack recognition. In addition, a BC-based integrity verification technique is developed to defend against the misrouting attack, which is the novel aspect of the work. The BAOML-CADC algorithm is validated experimentally on a benchmark cyberattack dataset. The obtained values indicate the improved performance of the BAOML-CADC algorithm over other techniques.
Data analysis methods are used extensively in the cyber security field, and the recent spread of advanced machine learning (ML) methods has made it possible to detect cyber-attacks and identify threats precisely [1], both in post-incident analysis and in real time. Both unsupervised and supervised ML techniques have been used successfully to support intrusion detection (ID) and prevention systems and to identify security breaches and system misuse [2,3]. The scenarios of interest are generally characterized by a continuous data stream (application-level or packet-level, for example) that summarizes the behaviour of the underlying system or network [4]. ML methods serve either to detect known attacks (supervised methods) or to detect anomalous behaviour (unsupervised methods). Anomaly-based approaches fit the normal operating state of the system and identify and isolate anomalies as unexpected behavioural deviations. Anomaly detection methods are therefore attractive for their capability to identify zero-day attacks, which exploit unknown vulnerabilities [5,6]. Fig. 1 presents an overview of blockchain (BC)-assisted cyberattack detection.
In recent times, deep learning (DL) has become a promising analytic paradigm because of its excellent performance in examining huge volumes of data [7]. Unlike conventional ML methods, DL supports effective feature engineering by handling feature representation and extraction automatically and reliably [8,9]. As a strong analytic tool, DL delivers better latency and accuracy than conventional ML methods, and it is well positioned for examining huge amounts of data in the 5G-based Internet of Things (IoT) [10]. This capability supports forecasting future events, recognizing attacks, and providing important data for placement and content caching in dynamic 5G-assisted IoT scenarios [11].
As an evolving technology, BC has become a promising choice for managing privacy and security in next-generation communication architectures [12,13]. It constitutes a peer-to-peer (P2P) transaction platform where data is exchanged, recorded, and authenticated in a decentralized way, delivering verification and security independently of centralized authorities [14]. The significant features of BC, including anonymity, decentralization, and security, enable secure data transactions and remove the dependence on centralized servers to support security in 5G-based IoT. Additionally, the unique properties of distributed data storage, smart contracts, and asset tracking make BC technology well suited to 5G-based IoT [15].
This study designs a new Blockchain Assisted Optimal Machine Learning based Cyberattack Detection and Classification (BAOML-CADC) technique, whose major focus lies in identifying cyberattacks. To do so, the BAOML-CADC technique applies a thermal equilibrium algorithm-based feature selection (TEA-FS) method for the optimal choice of features and uses the extreme learning machine (ELM) model for cyberattack recognition. In addition, a BC-based integrity verification technique is developed to protect against the misrouting attack. The BAOML-CADC algorithm is validated experimentally on a benchmark cyberattack dataset.
In [16], the authors identify false data injection attacks (FDIA) in microgrid systems. The Hilbert-Huang transform, along with BC-based ledger technology, was employed to improve security in smart direct current (DC) microgrids by evaluating the voltage and current signals at controller and smart sensor nodes and deriving signal information from them. Ajayi et al. [17] devised a BC-based solution that guarantees the consistency and integrity of attack features shared in a cooperative ID mechanism. The architecture achieves this by detecting and preventing compromised intrusion detection system (IDS) nodes and fake feature injection. It also enables scalable exchange of attack features amongst IDS nodes, supports contributions from heterogeneous IDS nodes, and is robust to public IDS nodes leaving and joining the network.
Kumar et al. [18] presented a new BC system for secure clinical data management that reduces the computational and communication overhead compared with the lightweight BC architecture and the prevailing bitcoin network. Kumar et al. [19] presented a Privacy-Preserving and Secure Framework (PPSF) for IoT-driven smart cities. The method relies on two main components: a two-level privacy system and an ID system. In the two-level privacy system, a BC is used to transfer IoT information securely, and principal component analysis (PCA) is applied to convert raw IoT data into a new representation. In the ID system, a Gradient Boosting Anomaly Detector (GBAD) is implemented for training and evaluating the two-level privacy method on IoT network datasets such as BoT-IoT and ToN-IoT.
In [20], the authors modelled a new process based on DL and the Hilbert-Huang transform for cyberattack recognition in DC microgrids (DC-MGs), including the identification of attacks on distributed generation (DG) units and sensors. A selective ensemble DL approach combined with Krill Herd Optimization (KHO) was devised. The Hilbert-Huang transform was employed to extract signal features, and these features were then fed to multiple deep base models built to capture salient characteristics automatically from raw fluctuation signals. Liang et al. [21] modelled a novel distributed BC-based security system to improve the self-defensive ability of modern power systems against cyberattacks. The authors discussed how BC technology improves the security and robustness of the power grid by using meters as nodes in a distributed network and encapsulating meter measurements as blocks.
In this study, we have developed a new BAOML-CADC technique for cyberattack detection and classification. The major focus of the BAOML-CADC technique lies in identifying cyberattacks. Initially, the BAOML-CADC technique applies the TEA-FS approach for the optimal choice of features. For cyberattack recognition, the BAOML-CADC technique uses the ELM model. In addition, a BC-based integrity verification technique is developed to defend against the misrouting attack.
3.1 Algorithmic Steps of TEA-FS Technique
First, the presented BAOML-CADC technique applies the TEA-FS approach for the optimal choice of features. TEA is inspired by thermodynamic phenomena and is classified as an evolutionary technique [22]. In this approach, the coordinates of every system are mapped to thermodynamic parameters such as volume or temperature. The aim is to search for an optimum solution in terms of the problem variables. In a genetic algorithm (GA), the set of variables to be optimized is known as the chromosome; here, that term is replaced by the thermodynamic system. In the
In Eq. (1),
Here, the thermodynamic state of all the systems is determined by the following expression:
The cost of every system can be defined using
At first, the initial population can be generated by the number of
In coupling the thermodynamic systems, every system is first paired with the closest system in the function domain. The paired systems then exchange heat or perform work on each other to approach equilibrium progressively. As the temperature gradient is reduced, the search space of the optimization problem is explored and the optimum value is found. Once the systems are coupled, heat exchange and work lead to changes in the volume and temperature of each system. The volumes and temperatures of the two systems differ, but their pressures are the same. Moreover, the piston between the two systems moves freely, so every system performs either negative or positive work. In every thermodynamic process, the pressure in the two systems is equal:
In Eq. (5),
The energy change for an ideal gas is evaluated as
In Eq. (7), the subscript
Alternatively, the ideal gas law is represented as:
The overall number of moles at the equilibrium state equals the sum of the moles of the two systems, according to the law of conservation of mass:
By combining Eqs. (9) and (10),
By combining Eqs. (8) and (11), the following expression is obtained:
In addition,
By using the ideal gas law and the first law of thermodynamics, the equilibrium state of the hypothetical system is evaluated. If every system attained equilibrium instantly, the states of the two systems would be identical and they would no longer exchange energy. In that case, the function's domain would not be explored further and the optimum solution could not be attained. To avoid this, the following relationship is suggested for the overall volume and the new temperature:
Now the term
The two systems move closer to equilibrium at every step, which causes the whole domain to be explored progressively and the optimum point to be found. The relationship between entropy and the cost function is given by:
Thus, minimizing the cost function corresponds to maximizing the entropy of the system (S):
Expressing the entropy in terms of the cost function, the condition becomes:
The above conditions drive the algorithm towards the optimum of the objective function. When the criterion is not satisfied, a counter records the number of failed attempts, and the thermodynamic condition is then updated:
Whereas
Lastly, many coupled systems reached equilibrium and accumulated at any point, excluding a few that might be trapped. This technique has the benefit that only some systems have an equivalent state. Hence the cost of every system differs from the thermodynamic equilibrium method.
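For orientation, the equilibrium of two coupled ideal-gas systems described in this subsection can be worked out from the ideal gas law and the first law of thermodynamics. The following is a textbook sketch under assumptions that are not stated in the text: the pair is isolated inside a rigid enclosure (so the total volume is fixed), both gases share the same molar heat capacity $c_v$, and the symbols $T_i$, $V_i$, $n_i$ are chosen here for illustration. It is not a reproduction of the article's numbered equations.

```latex
\begin{align}
P_1 V_1 &= n_1 R T_1, \qquad P_2 V_2 = n_2 R T_2
  && \text{(ideal gas law for each system)} \\
n_{eq} &= n_1 + n_2, \qquad V_{eq} = V_1 + V_2
  && \text{(conservation of amount; fixed total volume)} \\
n_1 c_v T_1 + n_2 c_v T_2 &= n_{eq}\, c_v\, T_{eq}
  && \text{(first law for the isolated pair)} \\
T_{eq} &= \frac{n_1 T_1 + n_2 T_2}{n_1 + n_2}, \qquad
P_{eq} = \frac{n_{eq} R\, T_{eq}}{V_{eq}} = \frac{R\,(n_1 T_1 + n_2 T_2)}{V_1 + V_2}
\end{align}
```

As a numerical check under the same assumptions, $n_1 = 1$, $T_1 = 300\,\mathrm{K}$, $n_2 = 3$, $T_2 = 400\,\mathrm{K}$ gives $T_{eq} = (300 + 1200)/4 = 375\,\mathrm{K}$, which lies between the two initial temperatures as expected.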
The fitness function (FF) employed in the TEA-FS approach is designed to balance the classification accuracy (to be maximized) against the number of selected features in each solution (to be minimized). Eq. (22) gives the FF used for evaluating solutions.
Here
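The balance between accuracy and feature count described for the FF can be illustrated with a weighted-sum form that is common in wrapper-based feature selection. The sketch below is an assumption, not the article's Eq. (22): the function name `fs_fitness`, the weight `alpha`, and its default value of 0.9 are all hypothetical, and lower values are treated as better.

```python
import numpy as np

def fs_fitness(error_rate, mask, alpha=0.9):
    """Hypothetical weighted-sum fitness for feature selection (lower is better).

    error_rate: classification error (1 - accuracy) obtained with the selected features.
    mask:       binary vector, 1 where a feature is selected.
    alpha:      assumed trade-off weight between error and feature ratio.
    """
    mask = np.asarray(mask, dtype=bool)
    n_selected = int(mask.sum())
    if n_selected == 0:
        # A solution that selects no features cannot be evaluated by a classifier.
        return float("inf")
    feature_ratio = n_selected / mask.size
    return alpha * error_rate + (1.0 - alpha) * feature_ratio

# Two candidate solutions with the same error: the one with fewer features scores lower.
compact = fs_fitness(0.10, [1, 0, 1, 0])
full = fs_fitness(0.10, [1, 1, 1, 1])
print(compact, full)
```

With a weight close to 1, accuracy dominates and the feature count acts mainly as a tie-breaker; a smaller weight pushes the search towards more compact feature subsets.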
3.2 Cyberattack Detection Using ELM Model
To detect cyberattacks, the BAOML-CADC technique uses the ELM model. Huang et al. developed the ELM to increase overall performance and avoid time-consuming iterative backward training [23]. The ELM is a single hidden layer feedforward network (SLFN) with random weights and biases between the input and hidden layers. The ELM is widely applied in different areas because of its fast training speed and simple implementation. Fig. 2 depicts the framework of the ELM. Data of
The output matrix O can be expressed by the following formula.
Whereas
In Eq. (25),
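The ELM training scheme outlined in this subsection (random, untrained input-to-hidden weights followed by a closed-form solution for the output weights) can be sketched as follows. This is a minimal illustration of the standard ELM formulation, not the authors' implementation: the class name `ELMClassifier`, the sigmoid activation, the hidden-layer size, and the symbols `W`, `b`, `H`, `T`, and `beta` are assumptions introduced here.

```python
import numpy as np

class ELMClassifier:
    """Minimal single-hidden-layer ELM sketch (all names are illustrative)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output matrix H using a sigmoid activation (assumed choice).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Input weights and biases are drawn at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse: beta = pinv(H) T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        # Network output O = H beta; the predicted class is the largest output.
        O = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return self.classes_[np.argmax(O, axis=1)]

# Toy usage on two synthetic, well-separated clusters (not the benchmark dataset).
rng = np.random.default_rng(1)
X_toy = np.vstack([rng.normal(-2.0, 0.5, size=(100, 2)),
                   rng.normal(2.0, 0.5, size=(100, 2))])
y_toy = np.array([0] * 100 + [1] * 100)
model = ELMClassifier(n_hidden=50, seed=0).fit(X_toy, y_toy)
train_acc = float(np.mean(model.predict(X_toy) == y_toy))
```

Because only the output weights are solved for, and in closed form, no backward iterative training is required, which is the source of the fast training speed mentioned above.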
3.3 BC-Based Integrity Verification
Finally, a BC-based integrity verification technique is developed to defend against the misrouting attack. Over the last few decades, BC technology has provided privacy and security in different networks [24]. Despite its remarkable features, BC is still susceptible to fraudulent activity. A malicious entity might carry out fraudulent and invalid transactions using different techniques, such as double-spending attacks. In this work, BC is combined with ML algorithms to address these problems. A dataset of bitcoin transactions is used in the basic process, and the presented ML approach is trained on it. The transaction patterns stored in the dataset can be analyzed for further applications. Simultaneously, transactions can be carried out on the Ethereum network. The pattern of these transactions is the same as that of the bitcoin transactions stored in the bitcoin transaction data. The pattern of a new transaction is analysed and compared with the bitcoin transaction patterns, and based on the matching pattern the new transaction is categorized as malicious or legitimate. A double-spending attack is also performed in the basic process to further test the robustness of the model.
The BC is a building block of the reliable authentication system. The major objective is to provide a solution in which each flow issued by the controller is stored in an immutable and verifiable dataset. The BC consists of a sequence of blocks connected through hash values. In the BC system, each user holds a pair of keys: a public key that represents a unique address and a private key used to sign BC transactions. The client signs a transaction using the private key and transfers it to the other peers in the network for authentication. If the transmitted block is confirmed, it is added to the BC. Once stored, the data in a block cannot be modified without changing every subsequent block. Moreover, the data is held by multiple hosts simultaneously, so a modification is rejected by the peer hosts. Here, a private BC is used in contrast to a public BC. The private BC determines who may participate in the network, which actions they may perform, and which permissions are allocated to each identifiable participant. Hence, the need for a consensus mechanism such as Proof of Work is reduced.
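The hash-linking property described above, where a stored block cannot be altered without changing every subsequent block, can be sketched with a simple hash chain. This is an illustrative sketch only: the function names, the use of SHA-256, and the flow-rule fields (`switch`, `match`, `out_port`) are assumptions, and key-pair signing and peer confirmation are deliberately left out.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder for the first block's predecessor

def block_hash(index, prev_hash, payload):
    """SHA-256 digest over the block contents, serialized deterministically."""
    body = json.dumps({"index": index, "prev_hash": prev_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_block(chain, payload):
    """Append a block that stores the hash of its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({"index": index, "prev_hash": prev_hash, "payload": payload,
                  "hash": block_hash(index, prev_hash, payload)})
    return chain

def verify_chain(chain):
    """Return True only if every block matches its own hash and its predecessor's."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev_hash"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True

# Hypothetical controller flow rules recorded as block payloads.
ledger = []
append_block(ledger, {"switch": "s1", "match": "10.0.0.2", "out_port": 2})
append_block(ledger, {"switch": "s2", "match": "10.0.0.2", "out_port": 1})
append_block(ledger, {"switch": "s3", "match": "10.0.0.2", "out_port": 3})
intact = verify_chain(ledger)

# A misrouting attempt rewrites the output port of a stored rule.
ledger[1]["payload"]["out_port"] = 4
tampered_detected = not verify_chain(ledger)
```

Even if an attacker also recomputes the hash of the altered block, the `prev_hash` stored in the following block no longer matches, so the change remains detectable unless every later block is rewritten as well.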
In this section, the performance of the BAOML-CADC method is validated on a benchmark dataset [25], which has 1000 distinct classes of events. The dataset comprises multiclass (Attack, No event, and Natural) and binary (Attack and Natural) labels.
Table 1 demonstrates the experimental results offered by the BAOML-CADC method on the binary class dataset. In Fig. 3, a detailed
Fig. 4 presents a comprehensive
In Fig. 5, detailed
Table 2 presents the experimental results of the BAOML-CADC model on the multiclass dataset. In Fig. 6, a brief
In Fig. 7, detailed
In Fig. 8, a complete
Table 3 presents a comparative analysis of the BAOML-CADC with existing approaches [26]. The comprehensive comparative examination of the BAOML-CADC method with existing models on binary class is described in Fig. 9. The outcomes exhibited that the RF technique has reached poor performance with
The comprehensive comparative examination of the BAOML-CADC model with existing methods on multiclass is given in Fig. 10. The outcomes show that the RF method has reached poor performance with
These results demonstrate the superiority of the BAOML-CADC model in cyberattack classification.
In this study, we have developed a new BAOML-CADC technique for cyberattack detection and classification. The major focus of the BAOML-CADC technique lies in identifying cyberattacks. Initially, the BAOML-CADC technique applies the TEA-FS approach for the optimal choice of features. For cyberattack recognition, the BAOML-CADC technique uses the ELM model. In addition, a BC-based integrity verification technique is developed to protect against the misrouting attack. The BAOML-CADC technique is validated experimentally on a benchmark cyberattack dataset. The obtained values indicate the improved performance of the BAOML-CADC algorithm over other models, with a maximum accuracy of 93.17%. In the future, the performance of the BAOML-CADC technique can be improved using a hyperparameter tuning process.
Funding Statement: This work was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University, through the Research Groups Program Grant No. (RGP-1443-0051).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
References
1. D. G. Roy and S. N. Srirama, “A blockchain-based cyber attack detection scheme for decentralized internet of things using the software-defined network,” Software: Practice and Experience, vol. 51, no. 7, pp. 1540–1556, 2021.
2. M. Abdel-Basset, N. Moustafa and H. Hawash, “Privacy-preserved cyberattack detection in industrial edge of things (IEoT): A blockchain-orchestrated federated learning approach,” IEEE Transactions on Industrial Informatics, vol. 18, no. 11, pp. 7920–7934, 2022.
3. J. Zhang, L. Pan, Q. L. Han, C. Chen, S. Wen et al., “Deep learning based attack detection for cyber-physical system cybersecurity: A survey,” IEEE/CAA Journal of Automatica Sinica, vol. 9, no. 3, pp. 377–391, 2021.
4. O. Ajayi, M. Cherian and T. Saadawi, “Secured cyber-attack signatures distribution using blockchain technology,” in IEEE Int. Conf. on Computational Science and Engineering (CSE) and IEEE Int. Conf. on Embedded and Ubiquitous Computing (EUC), New York, NY, USA, pp. 482–488, 2019.
5. V. Kelli, P. Sarigiannidis, V. Argyriou, T. Lagkas and V. Vitsas, “A cyber resilience framework for NG-IoT healthcare using machine learning and blockchain,” in CC-IEEE Int. Conf. on Communications, Montreal, QC, Canada, pp. 1–6, 2021.
6. T. V. Khoa, Y. M. Saputra, D. T. Hoang, N. L. Trung, D. Nguyen et al., “Collaborative learning model for cyberattack detection systems in IoT industry 4.0,” in IEEE Wireless Communications and Networking Conf. (WCNC), Seoul, Korea (South), pp. 1–6, 2020.
7. M. Dehghani, T. Niknam, M. Ghiasi, N. Bayati and M. Savaghebi, “Cyber-attack detection in dc microgrids based on deep machine learning and wavelet singular values approach,” Electronics, vol. 10, no. 16, pp. 1914, 2021.
8. N. Waheed, X. He, M. Ikram, M. Usman, S. S. Hashmi et al., “Security and privacy in IoT using machine learning and blockchain: Threats and countermeasures,” ACM Computing Surveys (CSUR), vol. 53, no. 6, pp. 1–37, 2020.
9. A. Javadpour, P. Pinto, F. Ja’fari and W. Zhang, “DMAIDPS: A distributed multi-agent intrusion detection and prevention system for cloud IoT environments,” Cluster Computing, vol. 7, no. 3, pp. 150.936, 2022. https://doi.org/10.1007/s10586-022-03621-3
10. M. Hajizadeh, N. Afraz, M. Ruffini and T. Bauschert, “Collaborative cyber attack defense in SDN networks using blockchain technology,” in 6th IEEE Conf. on Network Softwarization (NetSoft), Ghent, Belgium, pp. 487–492, 2020.
11. N. Mhaisen, N. Fetais and A. Massoud, “Secure smart contract-enabled control of battery energy storage systems against cyber-attacks,” Alexandria Engineering Journal, vol. 58, no. 4, pp. 1291–1300, 2019.
12. M. Komar, V. Dorosh, G. Hladiy and A. Sachenko, “Deep neural network for detection of cyber attacks,” in IEEE First Int. Conf. on System Analysis & Intelligent Computing (SAIC), Kyiv, Ukraine, pp. 1–4, 2018.
13. R. M. A. Ujjan, Z. Pervez and K. Dahal, “Snort based collaborative intrusion detection system using blockchain in SDN,” in 13th Int. Conf. on Software, Knowledge, Information Management and Applications (SKIMA), Island of Ulkulhas, Maldives, pp. 1–8, 2019.
14. O. O. Malomo, D. B. Rawat and M. Garuba, “Next-generation cybersecurity through a blockchain-enabled federated cloud framework,” The Journal of Supercomputing, vol. 74, no. 10, pp. 5099–5126, 2018.
15. D. Said, M. Elloumi and L. Khoukhi, “Cyber-attack on P2P energy transaction between connected electric vehicles: A false data injection detection based machine learning model,” IEEE Access, vol. 10, pp. 63640–63647, 2022.
16. M. Ghiasi, M. Dehghani, T. Niknam, A. Kavousi-Fard, P. Siano et al., “Cyber-attack detection and cyber-security enhancement in smart DC-microgrid based on blockchain technology and Hilbert Huang transform,” IEEE Access, vol. 9, pp. 29429–29440, 2021.
17. O. Ajayi and T. Saadawi, “Blockchain-based architecture for secured cyber-attack features exchange,” in 7th IEEE Int. Conf. on Cyber Security and Cloud Computing (CSCloud)/2020 6th IEEE Int. Conf. on Edge Computing and Scalable Cloud (EdgeCom), New York, NY, USA, pp. 100–107, 2020.
18. A. Kumar, A. K. Singh, I. Ahmad, P. K. Singh, P. K. Verma et al., “A novel decentralized blockchain architecture for the preservation of privacy and data security against cyberattacks in healthcare,” Sensors, vol. 22, no. 15, pp. 5921, 2022.
19. P. Kumar, R. Kumar, G. Srivastava, G. P. Gupta, R. Tripathi et al., “PPSF: A privacy-preserving and secure framework using blockchain-based machine-learning for IoT-driven smart cities,” IEEE Transactions on Network Science and Engineering, vol. 8, no. 3, pp. 2326–2341, 2021.
20. H. Cui, X. Dong, H. Deng, M. Dehghani, K. Alsubhi et al., “Cyber attack detection process in sensor of DC micro-grids under electric vehicle based on Hilbert-Huang transform and deep learning,” IEEE Sensors Journal, vol. 21, no. 14, pp. 15885–15894, 2020.
21. G. Liang, S. R. Weller, F. Luo, J. Zhao and Z. Y. Dong, “Distributed blockchain-based data protection framework for modern power systems against cyber-attacks,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 3162–3173, 2018.
22. W. Pakdee and T. Sakkarangkoon, “Numerical study of an unsteady non-premixed flame in a porous medium based on the thermal equilibrium model,” Journal of Theoretical and Applied Mechanics, vol. 59, no. 3, pp. 401–412, 2021.
23. L. Li, Z. Liu, Y. Lu, F. Wang and S. Jeon, “Hard-rock tunnel thrust prediction with TBM construction big data using an improved two-hidden-layer extreme learning machine,” IEEE Access, vol. 10, pp. 112695–112712, 2022.
24. A. Derhab, M. Guerroumi, A. Gumaei, L. Maglaras, M. A. Ferrag et al., “Blockchain and random subspace learning-based IDS for SDN-enabled industrial IoT security,” Sensors, vol. 19, no. 14, pp. 3119, 2019.
25. S. Abe, Y. Uchida, M. Hori, Y. Hiraoka and S. Horata, “Cyber threat information sharing system for industrial control system (ICS),” in 2018 57th Annual Conf. of the Society of Instrument and Control Engineers of Japan (SICE), Nara, Japan, pp. 374–379, 2018.
26. M. Ragab and A. Altalbe, “A blockchain-based architecture for enabling cybersecurity in the internet-of-critical infrastructures,” Computers, Materials & Continua, vol. 72, no. 1, pp. 1579–1592, 2022.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.