In recent years, smart city applications in the healthcare sector based on IoT systems have continued to grow rapidly, and various advanced network intrusions have emerged as these IoT devices become connected. Previous studies focused on security threat detection and blocking technologies that rely on testbed data obtained from a single medical IoT device or on simulation using a well-known dataset, such as the NSL-KDD dataset. However, such approaches do not reflect the features that exist in real medical scenarios, leading to failures in detecting potential threats. To address this problem, we propose a novel intrusion classification architecture, the Multi-class Classification based Intrusion Detection Model (M-IDM), which relies on data collected by real devices and on convolutional neural networks, which exhibit better performance than conventional machine learning algorithms such as naïve Bayes and support vector machine (SVM). Unlike existing studies, the proposed architecture employs the actual healthcare IoT environment of the National Cancer Center in South Korea and actual network data from real medical devices, such as patient monitors (i.e., electrocardiograms and thermometers). The proposed architecture classifies the data into multiple classes (critical, informal, major, and minor) for intrusion detection. Further, we experimentally evaluated its performance and compared it with that of conventional machine learning algorithms, including naïve Bayes, SVM, and logistic regression.
Nowadays, information and communication technology is increasingly applied to the healthcare sector in smart city infrastructure, the foundation of which is network technology for data transmission and reception. Network flows in such infrastructure are also increasing in complexity owing to advanced technologies such as the Internet of Things (IoT), cloud computing, big data, mobile, artificial intelligence, and blockchain technologies [
Before the advent of IoT, interactions between patients and medical staff were limited to visits and telephone calls. As such, it was impossible to monitor patients continuously. The application of IoT has enhanced the connectivity of devices related to healthcare and has redefined the interaction space of devices and people when medical services are provided, significantly improving the medical sector. With the emergence of IoT-applied medical services, all members in a city, including healthy people, patients, medical staff, hospitals, and health insurance companies, can now remotely monitor a person’s health status with no distinction between inside and outside a medical institution. This capability has increased the ease and efficiency of interacting with medical staff. It not only shortens hospital stays and prevents re-hospitalization, but also substantially reduces medical costs and improves treatment outcomes [
A vast amount of data in the smart city healthcare field is actively transferred between people through devices based on edge nodes or the edge cloud. There are also various types of connectivity-based equipment. Such an environment, however, contains directly or indirectly sensitive information, which potentially exposes personal information to attacks. Unlike in other fields, cyberattacks on healthcare in smart cities can cause physical and logical confusion to individuals and society. Therefore, such an environment should be able to defend against attacks that interrupt service requests on the network [
Previous studies mostly focused on security threat detection and blocking technology (based on testbed data composed of a single medical IoT device or simulator) [
Therefore, in this study, machine learning technology was applied to classify network events into four different classes (critical, informal, major, and minor) using data collected by real devices in order to sufficiently reflect the complex network flow and characteristics of the actual healthcare IoT environment. We built real world data-based models using a neural network-based multi-class intrusion classification algorithm for these classes.
To address the above problems in healthcare IoT, we proposed a Multi-class classification based Intrusion Detection Model (M-IDM) for healthcare IoT in a smart city that relies on machine learning techniques. The contributions of this paper are as follows:
- We proposed a novel intrusion classification architecture based on machine learning techniques to overcome problems related to the detection of unknown attacks in healthcare IoT.
- A service scenario is presented to classify the security event in the network as “normal” or “anomaly (critical, major, minor)” based on various features.
- We experimentally evaluated and analyzed the proposed model architecture using a large amount of data to demonstrate its practicability and feasibility.
The structure of the rest of this paper is as follows. Section 2 discusses related works on intrusion detection and machine learning. Section 3 proposes a prediction model using machine learning algorithms for intelligent network intrusion detection. Section 4 provides analysis and comparison of the existing and proposed models for network intrusion detection. Finally, Section 5 summarizes the main findings of this study and the concluding remarks.
Intrusion detection systems are divided into network intrusion detection systems (NIDS) and host-based intrusion detection systems (HIDS) according to the detection location. The NIDS analyzes network traffic, and its results are combined with other technologies to increase detection performance and prediction speed. In particular, artificial neural network-based intrusion detection systems can recognize intrusion patterns more efficiently, which helps them analyze large amounts of data. Meanwhile, the HIDS monitors important operating system files and the inbound and outbound packets of the device, and sends alerts when suspicious activity is detected.
Classification techniques can be divided into signature-based and anomaly-based methods. Signature-based methods search for specific patterns, such as byte sequences in network traffic or known malicious instruction sequences used by malware. They can easily detect known attacks but show poor detection performance for new attacks, for which no pattern is available. In contrast, anomaly-based methods are primarily used to detect unknown attacks, which arise from the rapid development of malicious code. Essentially, a machine learning algorithm is used to create a model of reliable behavior, and observed operations are then compared against it. Although unknown attacks can be detected, this method may also result in false positives. An efficient feature selection algorithm must be used to enhance the reliability of classification [
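The contrast between the two approaches can be illustrated with a minimal Python sketch. The byte signatures, the packet-size feature, and the 3-sigma threshold are hypothetical choices made purely for illustration; they are not taken from any of the systems discussed here.

```python
from statistics import mean, stdev

# Hypothetical byte patterns standing in for a signature database.
SIGNATURES = [b"/etc/passwd", b"\x90\x90\x90\x90"]


def signature_match(payload: bytes) -> bool:
    """Signature-based: flag a payload only if it contains a known pattern."""
    return any(sig in payload for sig in SIGNATURES)


def fit_baseline(normal_sizes):
    """Anomaly-based: learn a simple profile (mean, std) of normal traffic."""
    return mean(normal_sizes), stdev(normal_sizes)


def is_anomalous(size, baseline, threshold=3.0):
    """Flag values deviating from the profile by more than `threshold` std."""
    mu, sigma = baseline
    return abs(size - mu) > threshold * sigma


# Assumed packet sizes observed during normal operation.
baseline = fit_baseline([500, 520, 480, 510, 495, 505])
```

A payload with no known pattern passes `signature_match` regardless of how unusual it is, whereas `is_anomalous` flags anything far from the learned profile, including legitimate but rare traffic, which is where false positives come from.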
In theoretical terms, machine learning is a field of artificial intelligence concerned with developing algorithms that enable machines to learn and execute operations that are not explicitly specified in code. Representation and generalization are the key elements among the many features involved in machine learning. Representation refers to the evaluation of given data, whereas generalization refers to the processing of unknown data. In practice, the three key elements of machine learning are the training set, the model, and inference. The training set is the data used for learning, the model is the output obtained from the training set, and inference is the prediction that the trained model produces for input values from actual data [
In a conventional program, data are input and the program presents the results of processing the input data. However, when machine learning processes the data, the model (algorithm) developed from the training dataset provides the prediction results of the input values in the test dataset. Hence, machine learning algorithms are suitable for solving problems where it is difficult to explain the sequence or reasoning clearly [
The machine learning model is selected based on whether the data are labeled: if the data are labeled, supervised learning models are used to perform classification and prediction, whereas if the data are unlabeled, unsupervised learning models are used to perform clustering. The two kinds of models are different, but when actual data are applied, a harmonized methodology is used because labeled data are rare [
Kabir et al. [
Wang et al. [
Farnaaz et al. [
Swarnkar et al. [
For the search strategy, Khammassi et al. [
Caminero et al. [
To identify a variety of unauthorized use, misuse, and abuse of computer systems, Liu et al. [
Handling redundant or irrelevant features in high-dimensional datasets has been a long-term challenge in network anomaly detection. Removing these features through spectral information not only speeds up the classification process but also helps classifiers make accurate decisions during instances of attack recognition.
Salo et al. [
Divyasree et al. [
Al-Jarrah et al. [
Hady et al. [
Gao et al. [
In this paper, we demonstrate that a model created using machine learning on actual data extracted from the hospital environment can respond to the security threats of IoT medical devices, which are otherwise difficult to manage. Moreover, classifying risks in detail makes it possible to focus on serious events among the large volume of events produced by heterogeneous medical devices; the model shows that threats can be classified into four labels, beyond simple binary classification, with high accuracy.
In summary, existing studies demonstrated that machine learning is a good approach to support network intrusion detection in communication and distributed infrastructure. Thus, this paper presents an M-IDM to respond to the security threats of IoT medical devices, which are difficult to manage, through a model trained by extracting actual data from the hospital environment. The proposed model shows that it is possible to classify threats of four labels beyond simple binary classification with high accuracy.
The proposed security model, M-IDM, relies on the concept of intrusion classification, in which a machine learning model is trained on a baseline dataset to distinguish anomalous behaviors from legitimate ones. Unlike existing studies, the proposed M-IDM uses the actual healthcare IoT environment of the National Cancer Center, South Korea, and actual network data from real medical devices, such as patient monitors, including electrocardiograms and thermometers. Moreover, it employs a convolutional neural network (CNN), which exhibits better performance than conventional machine learning algorithms such as naïve Bayes and SVM, to classify the data into multiple classes (critical, informal, major, and minor) for intrusion detection. This section gives an overview of the architectural design of the M-IDM, including its major modules, data description, data preprocessing, and service scenario.
The architectural design overview of the proposed M-IDM is shown in
During the input stage, raw data are accumulated, including network traffic, logs, and scans from internal medical sources; vulnerability databases and threat feeds from technical sources; and social media, forums, and the dark web from human sources. Preprocessing eliminates inappropriate, multifunctional, or noisy data that might be present in the raw data. The feature extraction component extracts and specifies the relevant features, including network security event data such as the IP, port, protocol, and severity from heterogeneous medical devices, to support security threat classification in the healthcare IoT environment. The classification module is responsible for creating a trained model with relevant features from the preprocessed data. It uses various machine learning algorithms for classification purposes.
Here, the processed data is divided into training and test data. The classification model is trained using only the training data. The trained model is then repeatedly validated using the validation data. The process either proceeds to the next stage or corrects the parameters, learning method, etc., based on the validation results, and training is repeated. The model is completed through this process. In the output stage, the actual values are input into the model completed in the previous stage to confirm the classification. The classes are normal and anomaly (critical, major, minor).
Unlike previous studies, the proposed M-IDM architecture uses the actual healthcare IoT environment of the National Cancer Center, South Korea, and actual network data from real medical devices. The dataset was collected from six medical devices in the same IP range (a patient monitor, an electrocardiogram, a thermometer, a sphygmomanometer, a hygrometer, and a fall prevention bed with an alarm watch), which are used in an isolated wireless network reserved for internal medical devices. A network TAP device, configured in mirror mode, captures all traffic transmitted and received between the medical IoT devices and the gateway. We obtained monthly logs of all traffic passing through the firewall to the gateway. Out of the 300,000 cases collected (12 months), 100,000 cases (4 months, approximately 833/day) were selected in an even distribution. For the data label, the four risk labels defined in the firewall were used: normal, critical, major, and minor.
The network event data consists of 11 features: one target variable and ten explanatory variables for machine learning, as listed in
Type | Variable type | Attributes | Data type |
---|---|---|---|
Severity | Target | Normal, critical, major, minor | Nominal |
Working hour | Explanatory | Day: 09:00–18:00 | Binary |
Date | Explanatory | 2017-01-01 00:00:00 | Redefined as working hour |
Type of source/destination IP | Explanatory | Private, public | Binary |
Source/destination IP | Explanatory | 000.000.000.000 | Redefined as type of source/destination IP |
Source/destination port | Explanatory | 1–65535 | Numeric |
Protocol | Explanatory | dns, kerberos, http, https, ssh, telnet, imap, smtp, pop3, tftp, ftp, smb, smb2, icmp, ntp, tcp, udp | Nominal |
Flag | Explanatory | URG, ACK, PSH, RST, SYN, FIN, N/A | Nominal |
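The two redefinitions in the table (date to working hour, IP address to IP type) could be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 18:00 boundary is treated as exclusive, and the 1/0 encoding (1 for daytime, 1 for private) is a choice made here, since the source only states that both attributes are binary.

```python
from datetime import datetime
from ipaddress import ip_address


def working_hour(timestamp: str) -> int:
    """Return 1 for daytime (09:00 up to, but excluding, 18:00), else 0."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
    return 1 if 9 <= hour < 18 else 0


def ip_type(address: str) -> int:
    """Return 1 for a private address and 0 for a public one (assumed encoding)."""
    return 1 if ip_address(address).is_private else 0


# Example event fields in the formats shown in the table.
example = (working_hour("2017-01-01 00:00:00"), ip_type("192.168.0.10"))
```

Note that `is_private` in the standard library also covers loopback and link-local ranges, and that `ip_address` expects dotted-decimal input without leading zeros, so zero-padded addresses would need to be normalized first.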
There are two types of dataset attributes in the proposed M-IDM: symbolic and numeric. Numeric attributes can be processed directly, whereas symbolic attributes cannot. Thus, it is necessary to convert symbolic data to numeric data.
Symbolic attributes | Symbolic values | Number of distinct values |
---|---|---|
Working hour | 1 and 0 | 2 |
Type of source/destination IP | 1 and 0 | 2 |
Protocol | dns, kerberos, http, https, ssh, telnet, imap, smtp, pop3, tftp, ftp, smb, smb2, icmp, ntp, tcp, udp | 17 |
Flag | URG, ACK, PSH, RST, SYN, FIN, N/A | 6 |
The protocol attribute has 17 unique values; similarly, the flag attribute is defined with 6 unique values. Many approaches have been proposed for handling symbolic attributes. In an experiment conducted as part of this study, we employed a method that uses conditional probability and dummy indicator variables to process protocol and flag properties [
Symbolic attributes | Protocol type | Description |
---|---|---|
PR1 | dns | Service belongs to name server |
PR2 | kerberos | Service belongs to authentication |
PR3 | http, https | Service belongs to web applications |
PR4 | ssh, telnet | Service for remote access to other machines |
PR5 | imap, smtp | Service for mail transfer |
PR6 | tftp, ftp, smb | Service for file transfer |
PR7 | Remaining protocols | All other services |
Symbolic attributes | Flag type | Description |
---|---|---|
FL1 | SYN | Connection request in TCP |
FL2 | ACK | Response in TCP |
FL3 | RST | Connection reset in TCP |
FL4 | PSH | Message push in TCP |
FL5 | URG | Urgent message in TCP |
FL6 | FIN | Connection termination in TCP |
FL7 | N/A | All other flags or blank |
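As a rough illustration of the dummy indicator variables, the following Python sketch maps a protocol and a flag to the groups in the two tables above and then to indicator vectors. The function names are hypothetical, full dummy coding (one indicator per group) is an assumption, and the conditional-probability part of the method is not shown.

```python
# Grouping copied from the protocol table; anything not listed falls into PR7.
PROTOCOL_GROUPS = {
    "dns": "PR1",
    "kerberos": "PR2",
    "http": "PR3", "https": "PR3",
    "ssh": "PR4", "telnet": "PR4",
    "imap": "PR5", "smtp": "PR5",
    "tftp": "PR6", "ftp": "PR6", "smb": "PR6",
}
# Grouping copied from the flag table; anything else (or blank) falls into FL7.
FLAG_GROUPS = {"SYN": "FL1", "ACK": "FL2", "RST": "FL3",
               "PSH": "FL4", "URG": "FL5", "FIN": "FL6"}

PROTOCOL_LEVELS = [f"PR{i}" for i in range(1, 8)]
FLAG_LEVELS = [f"FL{i}" for i in range(1, 8)]


def one_hot(level, levels):
    """Return a list with 1 at the position of `level` and 0 elsewhere."""
    return [1 if level == candidate else 0 for candidate in levels]


def encode_event(protocol: str, flag: str):
    """Encode protocol and flag as 7 + 7 dummy indicator variables."""
    pr = PROTOCOL_GROUPS.get(protocol.lower(), "PR7")
    fl = FLAG_GROUPS.get(flag.upper(), "FL7")
    return one_hot(pr, PROTOCOL_LEVELS) + one_hot(fl, FLAG_LEVELS)
```

For instance, `encode_event("https", "SYN")` sets the PR3 and FL1 indicators, while an `ntp` event with no flag falls into the catch-all groups PR7 and FL7.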
In this experimental evaluation of the proposed M-IDM architecture, the selected data (i.e., 100,000 cases or instances) were randomly sampled and divided into training or labeled data and testing or unlabeled data. The ratio of training and testing dataset was 90:10, where 90% (i.e., 90,000 instances) is training data and the remaining 10% (i.e., 10,000) is testing data.
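The 90:10 split described above can be sketched as follows. The helper name `split_events` and the fixed seed are assumptions added for reproducibility of the illustration; the source does not describe how the random sampling was implemented.

```python
import random


def split_events(events, train_fraction=0.9, seed=0):
    """Shuffle a copy of the events and split it into training and test parts."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for the 100,000 selected security events.
train, test = split_events(range(100_000))
```

With 100,000 instances this yields 90,000 training and 10,000 test instances, with no instance appearing in both parts.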
This section describes the service scenario of the proposed M-IDM, which classifies the security event data into classes of “normal” or “anomaly (critical, major, minor).”
The details of the service scenario are as follows:
➀ Data separation: All security event data collected on the healthcare network are randomly sampled and divided into training and test data. The separated data are used to generate the model through learning and to validate the reliability of the model.
➁ Model training: The learning algorithm is selected considering various conditions; then, the parameters are adjusted according to the algorithm and learning is performed using only the training data from the data separated in ➀. After assessing the precision of the learning model using the test data, this process is repeated by applying different parameters and algorithms and other methods until the desired result is obtained. The processes in ➀ and ➁ are performed in batch form.
➂ Real-time classification 1: The model generated in ➁ is applied to the classifier; then, the real IoT medical devices network security event data (the real data do not overlap with the data in ➀) are input in real-time. The input data are first classified as “normal” or “anomaly” using a trained model that is not based on rules.
➃ Real-time classification 2: The IoT medical devices security event data classified as “anomaly” in ➂ are further classified as “critical,” “major,” or “minor.” The processes in ➂ and ➃ are performed in real-time.
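The two real-time steps of this scenario can be sketched in Python as a simple two-stage cascade. The sketch assumes two separate callables, one per step; the source does not state whether steps ➂ and ➃ use separate models or a single one. The `toy_detect` and `toy_grade` functions are hypothetical stand-ins for trained models, and their thresholds carry no meaning.

```python
from typing import Callable, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def classify_event(features: Features, detect: Classifier, grade: Classifier) -> str:
    """Run step 3 (normal vs. anomaly), then step 4 (severity) for anomalies only."""
    if detect(features) == "normal":
        return "normal"
    return grade(features)  # expected to return "critical", "major", or "minor"


# Hypothetical stand-ins for trained models; real ones would come from step 2.
def toy_detect(features: Features) -> str:
    return "anomaly" if features[0] > 0.5 else "normal"


def toy_grade(features: Features) -> str:
    if features[1] > 0.8:
        return "critical"
    return "major" if features[1] > 0.4 else "minor"


events = ([0.1, 0.9], [0.9, 0.9], [0.9, 0.5], [0.9, 0.1])
labels = [classify_event(f, toy_detect, toy_grade) for f in events]
```

Keeping the severity grading behind the normal/anomaly decision means the second stage only ever sees events already judged anomalous.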
In this study, we experimentally evaluated the performance of the proposed M-IDM, which was developed using CNN algorithms in a Python 3.7.0 environment with Orange. We selected a CNN by comparing its classification performance with that of conventional machine learning algorithms such as naïve Bayes and SVM. The CNN has the structure:
The specifications of the PC used for the experimental setup are as follows: CPU i7-8700 3.2 GHz, memory 8 GB, and graphics card RTX 2060 4 GB. Several standard measures, such as precision, recall, area under the receiver operating characteristic curve (AUC), and F1-score, were used.
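For reference, precision, recall, and F1-score for a multi-class problem can be computed per class and then averaged, as in the minimal sketch below. Macro averaging is an assumption made here; the source does not state which averaging scheme was used, and AUC is omitted because it requires class scores rather than hard labels. The example labels are invented.

```python
def per_class_metrics(actual, predicted, label):
    """Return (precision, recall, F1) for one class, treating it as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == label and p == label for a, p in pairs)
    fp = sum(a != label and p == label for a, p in pairs)
    fn = sum(a == label and p != label for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(actual, predicted):
    """Average the per-class metrics with equal weight for every class."""
    labels = sorted(set(actual) | set(predicted))
    scores = [per_class_metrics(actual, predicted, lab) for lab in labels]
    return tuple(sum(column) / len(labels) for column in zip(*scores))


# Invented labels, for illustration only.
actual = ["normal", "normal", "critical", "major", "minor", "minor"]
predicted = ["normal", "critical", "critical", "major", "minor", "major"]
precision, recall, f1 = macro_average(actual, predicted)
```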
To achieve an objective comparison of the proposed algorithm against existing conventional algorithms, the precision, recall, AUC, and F1-score [
| Number of instances | Method | AUC | F1 | Precision | Recall |
|---|---|---|---|---|---|
| | Naïve Bayes | 0.957 | 0.881 | 0.906 | 0.886 |
| | Logistic regression | 0.947 | 0.871 | 0.900 | 0.875 |
| | Naïve Bayes | 0.957 | 0.863 | 0.940 | 0.815 |
| | Logistic regression | 0.929 | 0.865 | 0.894 | 0.901 |
| | Naïve Bayes | 0.957 | 0.869 | 0.939 | 0.827 |
| | Logistic regression | 0.932 | 0.897 | 0.915 | 0.923 |
*Constraints.
M-IDM (activation: ReLU, hidden layer: 100, maximal number of iterations: 200, regularization
Logistic regression (regularization type: ridge, strength:
SVM (cost: 1.0, regression loss epsilon: 0.1, iteration limit: 100).
Excluding the SVM in which the precision was significantly reduced, the naïve Bayes and logistic regression approaches (
Actual class (rows) / predicted class (columns) | Critical (%) | Informal (%) | Major (%) | Minor (%) |
---|---|---|---|---|
Critical | 98.8 | 2.4 | 0.0 | 1.4 |
Informal | 0.3 | 94.3 | 5.3 | 1.2 |
Major | 0.2 | 0.3 | 87.7 | 0.1 |
Minor | 0.6 | 3.0 | 7.0 | 97.4 |
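A quick way to read this table is to check how it is normalized. The sketch below copies the values from the table and sums each column; the interpretation that follows is an inference from the numbers, not something the text states.

```python
LABELS = ["critical", "informal", "major", "minor"]

# Rows: actual class; columns: predicted class; values in percent, from the table.
MATRIX = [
    [98.8, 2.4, 0.0, 1.4],
    [0.3, 94.3, 5.3, 1.2],
    [0.2, 0.3, 87.7, 0.1],
    [0.6, 3.0, 7.0, 97.4],
]

# Sum of each predicted-class column, rounded to the table's precision.
column_sums = [round(sum(row[j] for row in MATRIX), 1) for j in range(len(LABELS))]

# Diagonal entries: share of each predicted class that was actually that class.
diagonal = {label: MATRIX[i][i] for i, label in enumerate(LABELS)}
weakest = min(diagonal, key=diagonal.get)
```

Each column sums to roughly 100, which suggests the percentages are normalized per predicted class, so each diagonal entry is the proportion of predictions for that class that were correct. Under that reading, "major" is the weakest class at 87.7%, with most of its errors coming from events that were actually "minor" or "informal".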
This section describes the effect of the number of labels on the prediction. The same data and conditions were used in these tests as those used for the M-IDM algorithm (
The following rates were observed: Anomaly 99.3% and normal 94.4% at two classes; critical 93.5%,
All the algorithms showed good accuracy of 85.3%–99.3%. At four classes, the accuracy by class ranged from 87.7% to 98.6%, where “major” had a relatively low accuracy of 87.7% compared with the other classes.
We compared the findings of this study with those obtained in existing studies based on various aspects.
Type | Hady et al. [ | Gao et al. [ | Alrashdi et al. [ | This work (2 classes) | This work (4 classes) |
---|---|---|---|---|---|
Methodology | NN | Decision tree | Random forest | NN | NN |
Number of features | 34 | 7 | 12 | 10 | 10 |
Number of records | 16,000 | 7,000 | 257,673 | 100,000 | 100,000 |
Number of classes | 2 | 2 | 2 | 2 | 4 |
Hidden layer | 100 | – | – | 100 | 100 |
Min/max AUC | 91.45–93.42 | 87.7–90.37 | 98 | 94.3–99.4 | 87.7–98.6 |
Validation | 10-fold | – | – | 10-fold | 10-fold |
Data source | Testbed data | Testbed data | UNSW-NB15 | Real-world data | Real-world data |
Number of device types | 1 | 1 | – | 6 | 6 |
Detection range (sensor–gateway–server) | Gateway–server | Gateway–server | Gateway–server | Sensor–gateway (edge node) | Sensor–gateway (edge node) |
Existing studies mainly use binary classification, so only simple classification is possible. Moreover, because their data are acquired or generated from a small number of devices in a testbed, it is difficult to reflect the characteristics that arise in a mixed environment of heterogeneous devices. In contrast, this study classifies events into multiple classes while considering the constraints of the IoT environment: it acquires the traffic logs of multiple actual IoT medical devices and learns from data collected in an environment in which heterogeneous IoT medical devices are mixed.
We evaluated the complexity across the proposed model. As shown in
In this study, we proposed a multi-class security event classification model based on machine learning. The proposed model was built using real-world data and a neural network-based multi-class intrusion classification algorithm for four classes. This work reflects the complex network flow and characteristics of a real healthcare IoT environment, because machine learning was applied to data from real devices to classify network events into four different classes. In future work, more meaningful features should be identified in the security event data before refinement to enhance the performance of the proposed approach, and methods should be developed to improve the somewhat low accuracy for rare classes, addressing the problem of data imbalance between the classes.