A Network Security Risk Assessment Method Based on a B_NAG Model

Hui Wang; Chuanhan Zhu; Zihao Shen; Dengwei Lin; Kun Liu; MengYao Zhao

doi:10.32604/csse.2021.014680

Computer Systems Science & Engineering DOI:10.32604/csse.2021.014680
Article

Hui Wang1, Chuanhan Zhu1, Zihao Shen1,*, Dengwei Lin2, Kun Liu1 and MengYao Zhao3

1School of Computer Science & Technology, Henan Polytechnic University, Jiaozuo, 454000, China
2Office of Educational Administration, Jiaozuo University, Jiaozuo, 454000, China
3Department of Computer Science, University College London, London, United Kingdom
*Corresponding Author: Zihao Shen. Email: szh@hpu.edu.cn
Received: 08 October 2020; Accepted: 09 January 2021

Abstract: Computer networks face a variety of cyberattacks. Most network attacks are contagious and destructive, and they can harm both society and computer network security. Security evaluation is an effective way to address network security problems. To assess the vulnerabilities of computer networks accurately, this paper proposes a network security risk assessment method based on a Bayesian network attack graph (B_NAG) model. First, a new resource attack graph (RAG) and the algorithm E-Loop, which is applied to eliminate loops in the B_NAG, are proposed. Second, to disentangle the confusing relationships between attack graph nodes during the conversion process, an algorithm is proposed to generate the B_NAG model. Finally, to analyze the reachability of paths in the B_NAG, measurement indexes such as node attack complexity and node state transition are defined, and an iterative algorithm for obtaining the probability of reaching the target node is presented. On this basis, the posterior probability of related nodes can be calculated. A simulation environment is set up to evaluate the effectiveness of the B_NAG model. The experimental results indicate that the B_NAG model is realistic and effective in evaluating the vulnerabilities of computer networks and can accurately highlight the degree of vulnerability in a chaotic relationship.

Keywords: Network attack graph; Bayesian network; state transition; reachability; risk assessment

1 Introduction

Computer networks play an indispensable role in people’s productivity and daily life. However, these networks face a variety of cyberattacks, most of which are highly contagious and destructive. These attacks threaten the network security of devices, affecting the popularization of networks and even severely damaging information security [1–3].

According to the “Development Status of China Internet Sites and Security Report in 2018” [4], the National Computer Network Emergency Response Technical Team of China (CNCERT) found that over 1.254 million Internet of Things smart devices had been successfully attacked, posing a serious threat to network security. Moreover, in 2018, CNCERT discovered over 2.05 million cyberattacks, continuing a trend of high growth over the previous six years. A survey of these attacks reported that the number of applications had increased quickly and was nearly three times higher than in 2016.

In recent years, researchers have introduced methods based on Bayesian probabilities in evaluating vulnerabilities of attack graphs [5–7]. Bayesian networks are capable of representing nondeterministic relationships and can be used to quantify the correspondences within attack graphs. Therefore, methods of effectively combining a Bayesian network with an attack graph for network vulnerability assessment have become an important focus of research.

2 Related Research

Recently, many scholars have analyzed network vulnerabilities using attack graphs. Because of the information asymmetry between attackers and defenders, detecting zero-day attacks remains challenging. Revealing zero-day attacks through their attack paths is a better strategy than targeting them individually.

Sun et al. [8] implemented the ZePro system, which identifies zero-day attack paths using a probabilistic approach. With evidence of intrusion as input, the Bayesian network used in this system calculates the infection probabilities of object instances.

A dynamic defense framework was presented to select the best countermeasures against diverse attack damage costs [9]. To calculate these costs, a new defense-centric model was designed on the basis of service dependency graphs. Current approaches suffer from some limitations. For example, only static countermeasure effectiveness and static countermeasure deployment costs are considered, while the negative impacts of possible countermeasures on service quality are neglected [10]. These restrictions may lead an industrial control system (ICS) to choose improper countermeasures and deployment locations, which can in turn degrade network performance and frustrate legitimate users.

Garg et al. [11] presented the construction and analysis of inference rules for attack graphs. They developed a methodology for prioritizing individual vulnerabilities and attack paths using a PageRank model. The results were verified with a Markov model and showed that the methodology outperformed many current risk analysis techniques [12]. However, the experiment lacked specific indicators, so the results were not convincing.

As Zhang et al. [13] noted, dynamic risk analysis is an important component of network security protection. However, risk assessment methods designed for general network systems are not well suited to ICSs because of their unique characteristics. That paper proposed a multilevel Bayesian network model that includes attack functions and incidents. On this basis, it proposed a new risk incident prediction method and designed a dynamic security risk assessment method that can assess the risk caused by unknown attacks [14]. Moreover, a quantification method was presented to further calibrate the accuracy of the assessment. Finally, to test and verify the method, a simplified control system was simulated in MATLAB.

Building on previous research, this paper presents a Bayesian network attack graph (B_NAG) model and an algorithm for assessing network vulnerabilities. Probability theory is introduced into the resource attack graph (RAG) model, which is then converted into the corresponding B_NAG model. The reachability probability of each node is calculated first, followed by the final reachability probability of each attack path. Finally, the related posterior probabilities are calculated, enabling network security administrators to assess network security more accurately and effectively.

3 The RAG Model

An attack graph is a method for analyzing all sequences of vulnerabilities that attackers can exploit. Attacks can occur against any available node state and vulnerability, and all such sequences can be assembled into a directed graph. The purpose of the RAG model is to characterize the attack sequences launched in pursuit of the attacker’s intentions so that, with Bayesian probability calculations, network administrators can properly understand the security status of their networks. The RAG model is constructed as described below.

Definition 1 The graph $RAG = (S, S_{0}, A, E, \Gamma, L, O)$ is a directed graph, where the relevant notations are defined as follows:

• $S = \{s_{i} \mid i = 1, \dots, N\}$ denotes the set of resource state nodes.

• $S_{0} \in S$ denotes the initial resource state nodes, which are occupied by the attacker.

• $A = \{a_{i} \mid i = 1, \dots, N\}$ represents the set of attack behavior nodes.

• $E = E_{1} \cup E_{2}$ denotes the set of directed edges connecting all related nodes. $E_{1} \subseteq S \times A$ means that an attack can occur only if the attacker occupies certain resources; $E_{2} \subseteq A \times S$ means that an attack can allow the attacker to occupy certain resources. For a node $m$, its set of parent nodes is denoted $Pre(m)$ and its set of child nodes is denoted $Nex(m)$.

• $\Gamma$ is the node state discriminant function. $\Gamma(x)$ denotes the current status of the node $x$, with $\Gamma(x) \in \{1, 0\}$. For a resource state node $s_{i}$, $\Gamma(s_{i}) = 1$ indicates that the attacker has occupied the resource $s_{i}$, and $\Gamma(s_{i}) = 0$ indicates that the attacker has not occupied it.

• $L$ is the set of logical relationships between nodes, $L = \{and, or, ble\}$. An $and$ relationship holds among the nodes in $Pre(a_{i})$: the attack node $a_{i}$ can occur only if all of its preconditions are met. A successful attack yields $\Gamma(s_{i}) = 1$, which holds only if the attacker has occupied the resource $s_{i}$. An $or$ relationship holds between attack nodes that share a resource state node as their child. Finally, $ble$ denotes a mixed ("chaotic") logical relationship that can exist between parent nodes.

• $O = \{o_{i} \mid i = 1, 2, 3, \dots, N\}$ represents the set of resource state nodes associated with successful attacks that have been detected. Each $o_{i} \in S$ is a resource state node whose associated successful attack has been detected by the intrusion detection system (IDS).

Definition 2 Attack path: In the RAG, suppose there exists a status sequence $s_{0}, a_{0}, s_{1}, a_{1}, \dots, a_{n-1}, s_{n}$, where $s_{0}$ is the initial resource state node and $s_{n}$ is the target node. Then $Path_{k} = \langle s_{0} \to a_{0} \to s_{1} \to a_{1} \to \dots \to a_{n-1} \to s_{n} \rangle$ is defined as the $k$th attack path, where $\forall s_{i} \in S$, $\forall a_{j} \in A$ $(0 \leq i \leq n, 0 \leq j \leq n-1)$.

Definition 3 Attack behavior: An attack behavior is denoted by a four-tuple $(Src\_id, Dst\_id, Att\_code, Res)$, where $Src\_id$ is the $id$ of the host launching the attack, $Dst\_id$ is the $id$ of the host being attacked, $Att\_code$ is the number that identifies the attack behavior, and $Res$ is the result of the attack.

Definition 4 State transition: A state transition is denoted by a three-tuple $(sid, vid, r)$, where $sid$ is the number that identifies the state transition, $vid$ is the number that identifies the vulnerability used by the attacker, and $r$ is the resulting state transition caused by an attack that uses the vulnerability.
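To make Definition 1 more tangible, the following Python sketch stores an RAG as plain sets and derives $Pre(m)$ and $Nex(m)$ from the edge set. The class name, field names, and the tiny example graph are illustrative assumptions, and $S_{0}$ is modeled as a set of initial states.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RAG:
    """Resource attack graph after Definition 1 (field names are illustrative)."""

    states: set[str]                    # S: resource state nodes
    initial: set[str]                   # S0: states the attacker already occupies
    attacks: set[str]                   # A: attack behavior nodes
    edges: set[tuple[str, str]]         # E = E1 (S x A) united with E2 (A x S)
    occupied: dict[str, int] = field(default_factory=dict)  # Gamma: node -> 0 or 1
    observed: set[str] = field(default_factory=set)          # O: states flagged by the IDS

    def pre(self, m: str) -> set[str]:
        """Pre(m): the parent nodes of m."""
        return {u for (u, v) in self.edges if v == m}

    def nex(self, m: str) -> set[str]:
        """Nex(m): the child nodes of m."""
        return {v for (u, v) in self.edges if u == m}

    def is_well_formed(self) -> bool:
        """Check that every edge lies in S x A or in A x S."""
        return all(
            (u in self.states and v in self.attacks)
            or (u in self.attacks and v in self.states)
            for (u, v) in self.edges
        )


# Hypothetical fragment: s1 -> a1 -> s2, then (s2 and s3) -> a3 -> s5.
rag = RAG(
    states={"s1", "s2", "s3", "s5"},
    initial={"s1"},
    attacks={"a1", "a3"},
    edges={("s1", "a1"), ("a1", "s2"), ("s2", "a3"), ("s3", "a3"), ("a3", "s5")},
)
```

Here `rag.pre("a3")` returns the two preconditions of the attack node, which is the set that an $and$ relationship ranges over.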

4 The Algorithm E-Loop

4.1 The Method of Metrics

To remove loops in an attack graph, an attack difficulty metric is introduced. In the Common Vulnerability Scoring System (CVSS), three basic indexes are used to characterize vulnerabilities: the access vector index, the access complexity index, and the authentication index, which are denoted by Acc_vec, Acc_com and Auth, respectively. The values of these indexes associated with different levels of severity of a vulnerability are shown in Tab. 1.

Table 1: Index levels

Based on these indexes, the availability score of a vulnerability used in the CVSS is defined as

$Exp = 20 \times Acc\_vec \times Acc\_com \times Auth \quad (0 \leq Exp \leq 10)$ (1)

An attack becomes more difficult to perform successfully as the value of $Exp$ gets smaller. Thus, the attack difficulty is inversely proportional to the availability of a vulnerability. Accordingly, an attack difficulty metric $Aga\_Dif$ may be defined based on the above three indexes as shown in Eq. (2). The larger the value of $Aga\_Dif$ is for a particular node, the more difficult the node is to attack.

$Aga\_Dif = \frac{1}{2 \times Acc\_vec \times Acc\_com \times Auth} \quad (Aga\_Dif \geq 1)$ (2)
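To make Eqs. (1) and (2) concrete, here is a small Python sketch. The weights are the published CVSS v2 base-metric values, which are assumed (not confirmed) to match Tab. 1; the dictionary and function names are illustrative.

```python
# CVSS v2 base-metric weights from the public CVSS v2 specification.
# Assumption: these match the levels listed in Tab. 1 of the paper.
ACC_VEC = {"local": 0.395, "adjacent": 0.646, "network": 1.0}
ACC_COM = {"high": 0.35, "medium": 0.61, "low": 0.71}
AUTH = {"multiple": 0.45, "single": 0.56, "none": 0.704}


def exploitability(acc_vec: float, acc_com: float, auth: float) -> float:
    """Eq. (1): Exp = 20 * Acc_vec * Acc_com * Auth."""
    return 20.0 * acc_vec * acc_com * auth


def attack_difficulty(acc_vec: float, acc_com: float, auth: float) -> float:
    """Eq. (2): Aga_Dif = 1 / (2 * Acc_vec * Acc_com * Auth)."""
    return 1.0 / (2.0 * acc_vec * acc_com * auth)


# Easiest case: remotely reachable, low complexity, no authentication.
easy = (ACC_VEC["network"], ACC_COM["low"], AUTH["none"])
# Hardest case: local access, high complexity, multiple authentications.
hard = (ACC_VEC["local"], ACC_COM["high"], AUTH["multiple"])

for label, metrics in (("easy", easy), ("hard", hard)):
    print(label, round(exploitability(*metrics), 4), round(attack_difficulty(*metrics), 4))
```

Under these weights the two quantities satisfy $Exp \times Aga\_Dif = 10$, so the hardest combination has the smallest $Exp$ and the largest $Aga\_Dif$.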

4.2 The Algorithm E-Loop

During the generation of the RAG, loops may arise that lead to repeated traversals of a given node. Such loops seriously distort the Bayesian probability calculations used in network security assessment. To overcome this problem, the algorithm E-Loop is proposed to eliminate loops in the RAG. The specific steps are as follows:

[Algorithm E-Loop: the listing appears as an image in the source and is not reproduced here.]

Fig. 1 shows an RAG built as described above. There are two loops, $Path_{1} = \langle s_{2} \to a_{3} \to s_{5} \to a_{5} \to s_{2} \rangle$ and $Path_{2} = \langle a_{9} \to s_{11} \to a_{12} \to s_{12} \to a_{9} \rangle$. For $Path_{1}$, the node $a_{5}$ is eliminated by the E-Loop algorithm to remove the loop. For $Path_{2}$, the node $a_{9}$ can never be reached because $Aga\_Dif(a_{9}) \to \infty$, so this loop is removed by eliminating this node and all subsequent nodes. Fig. 2 shows the acyclic RAG ($Ac\_RAG$) obtained after the loops are eliminated by the E-Loop algorithm.
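As a rough illustration of loop elimination, the following Python sketch detects loops with a depth-first search and breaks each one by deleting the attack behavior node with the largest $Aga\_Dif$ on it. This selection rule, the function names `find_cycle` and `break_loops`, and the difficulty values are assumptions made for illustration; they are not taken from the E-Loop listing. The sketch also does not prune nodes that become unreachable after a deletion.

```python
import math


def find_cycle(nodes, edges):
    """Return one directed cycle as a list of nodes, or None if the graph is acyclic."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    color = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on the DFS stack, 2 = finished
    stack = []

    def dfs(u):
        color[u] = 1
        stack.append(u)
        for v in succ[u]:
            if color[v] == 1:  # back edge: v is on the stack, so a loop closes here
                return stack[stack.index(v):]
            if color[v] == 0:
                found = dfs(v)
                if found:
                    return found
        stack.pop()
        color[u] = 2
        return None

    for n in sorted(nodes):
        if color[n] == 0:
            found = dfs(n)
            if found:
                return found
    return None


def break_loops(nodes, edges, difficulty):
    """Delete the hardest attack node on each remaining loop (assumed rule)."""
    nodes, edges = set(nodes), set(edges)
    while (cycle := find_cycle(nodes, edges)) is not None:
        attack_nodes = [n for n in cycle if n in difficulty]
        victim = max(attack_nodes, key=lambda n: difficulty[n])
        nodes.discard(victim)
        edges = {(u, v) for (u, v) in edges if victim not in (u, v)}
    return nodes, edges


# The two loops named in the text; the difficulty values are hypothetical.
edges = {
    ("s2", "a3"), ("a3", "s5"), ("s5", "a5"), ("a5", "s2"),
    ("a9", "s11"), ("s11", "a12"), ("a12", "s12"), ("s12", "a9"),
}
nodes = {n for edge in edges for n in edge}
difficulty = {"a3": 2.0, "a5": 5.0, "a9": math.inf, "a12": 3.0}

acyclic_nodes, acyclic_edges = break_loops(nodes, edges, difficulty)
```

With these hypothetical values the sketch deletes $a_{5}$ and $a_{9}$, the same nodes the text reports as eliminated.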

Figure 1: RAG

Figure 2: Acyclic RAG

5 Probability Calculation in the B_NAG Model

In the B_NAG, the probability of each node is constrained only by its parent nodes, and the node is conditionally independent of all other nodes. In the RAG, the state transition of a node depends only on whether the relevant resource has been occupied. A child node can undergo a state transition only if its parent nodes are occupied. Thus, state transitions must be associated with conditional independence in the B_NAG.

Tab. 2 presents the correspondence between an Ac_RAG and a B_NAG. Although these graphs have corresponding structures, certain nodes differ between them. The detailed implementation described below is based on a B_NAG.

Table 2: Corresponding relationship

5.1 Implementation of the B_NAG Model

Definition 5 The resulting resource state node and the conditional resource state node: The resource state node reached when an attack has succeeded is called the resulting resource state node. The resource state node required for the attack condition to be satisfied is called the conditional resource state node.

Definition 6 $W = \{w_{ij} \mid i, j = 1, 2, \dots, N\}$, the set of weights between resource state nodes: Each element of $W$ is a two-tuple $(depcoef, cost)$, where $depcoef$ denotes the correlation coefficient between resource state nodes and $cost$ denotes the cost required to attack another resource state node from the current node. $w_{ij}$ is the weight between the node $s_{i}$ and the node $s_{j}$.

As illustrated in the example shown in Fig. 2, an RAG consists of four structures: a series structure, a parallel $or$ structure, a parallel $and$ structure, and a mixed structure. When such a graph is converted into a B_NAG, each of these structures is transformed as follows:

(1) Series structure: The attack behavior node $a_{1}$ is deleted, and the attack behavior is represented by the directed edge from $s_{1}$ to $s_{2}$:

$(s_{1} \to a_{1} \to s_{2}) \Rightarrow (s_{1} \to s_{2})$

(2) Parallel $or$ structure: The nodes $a_{10}$ and $a_{11}$ are in an $or$ relationship, meaning that the attack can occur when the resource state condition corresponding to either of the parent nodes $s_{9}$ or $s_{10}$ is satisfied. The related attack behavior nodes are removed, and the resulting resource state node is linked to each conditional resource state node by a directed edge. In the B_NAG, the resource state nodes have an $or$ relationship:

$(s_{9} \to a_{10}, s_{10} \to a_{11}, a_{10} \lor a_{11} \to s_{12}) \Rightarrow (s_{9} \lor s_{10} \to s_{12})$

(3) Parallel $and$ structure: The parent nodes $s_{2}$ and $s_{3}$ of $a_{3}$ have an $and$ relationship, meaning that the attack behavior can occur only if all resource state conditions are satisfied. After the attack node is removed, the resulting resource state node and the conditional resource state nodes are linked by directed edges, which represent the attack behavior. In the transformed B_NAG, the resource state nodes still have an $and$ relationship:

$(s_{2} \land s_{3} \to a_{3} \to s_{5}) \Rightarrow (s_{2} \land s_{3} \to s_{5})$

(4) Mixed structure: The parent nodes $s_{6}$ and $s_{4}$ of the node $a_{6}$ have an $and$ relationship, and the two nodes $a_{6}$ and $a_{7}$ that lead to the resulting resource state node have an $or$ relationship. If the node $a_{6}$ were removed directly, the structure of the RAG would become confusing and the conditional probability calculation would be inconvenient. To solve this problem, this paper defines a temporary mixed node $blend$; namely, the node $a_{6}$ is denoted as the node $blend$:

$(s_{6} \land s_{4} \to a_{6}, a_{6} \lor a_{7} \to s_{9}) \Rightarrow (s_{6} \land s_{4} \to blend, blend \lor s_{7} \to s_{9})$

After this conversion process, each edge in the converted B_NAG represents an attack behavior and has a weight that describes the correlation between the two resource state nodes connected by that edge. It can be observed from the converted B_NAG shown in Fig. 3 that $blend$ is a mixed resource state node representing the combination of $s_{4}$ and $s_{6}$, so there must be directed edges from $s_{4}$ and $s_{6}$ to $blend$, namely, $P(blend \mid s_{4}, s_{6}) = 1$. The relationships between the nodes do not change upon conversion into a B_NAG, and only the resource state nodes are included. All attack behaviors are represented by the directed edges of the B_NAG, and the only possible relationships are $and$ and $or$.
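The four transformation rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's conversion algorithm: the function name `rag_to_bnag`, the naming scheme for blend nodes, and the output format (an edge set plus a per-node relation label) are all hypothetical.

```python
def rag_to_bnag(attacks, edges):
    """Replace attack nodes by state-to-state edges, adding blend nodes for mixed cases."""

    def pre(m):
        return sorted(u for (u, v) in edges if v == m)

    def nex(m):
        return sorted(v for (u, v) in edges if u == m)

    bn_edges, relation = set(), {}
    for a in sorted(attacks):
        parents = pre(a)                                 # conditional resource state nodes
        for s in nex(a):                                 # resulting resource state nodes
            siblings = [b for b in pre(s) if b != a]     # other attacks leading to s
            if len(parents) == 1:
                # Series structure, or one branch of a parallel "or" structure.
                bn_edges.add((parents[0], s))
                if siblings:
                    relation[s] = "or"
            elif not siblings:
                # Parallel "and" structure: every precondition points at s.
                bn_edges.update((p, s) for p in parents)
                relation[s] = "and"
            else:
                # Mixed structure: collect the "and" part in a blend node first.
                blend = "blend(" + "&".join(parents) + ")"
                bn_edges.update((p, blend) for p in parents)
                bn_edges.add((blend, s))
                relation[blend] = "and"
                relation[s] = "or"
    return bn_edges, relation


# The four example structures from the text.
attacks = {"a1", "a3", "a6", "a7", "a10", "a11"}
edges = {
    ("s1", "a1"), ("a1", "s2"),                                  # series
    ("s9", "a10"), ("a10", "s12"), ("s10", "a11"), ("a11", "s12"),  # parallel or
    ("s2", "a3"), ("s3", "a3"), ("a3", "s5"),                    # parallel and
    ("s6", "a6"), ("s4", "a6"), ("a6", "s9"),                    # mixed
    ("s7", "a7"), ("a7", "s9"),
}
bn_edges, relation = rag_to_bnag(attacks, edges)
```

For the mixed structure, the sketch yields edges from $s_{4}$ and $s_{6}$ into a blend node and an $or$ relation at $s_{9}$ between the blend node and $s_{7}$, which mirrors rule (4).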

Figure 3: B_NAG

To clarify the process, the conversion algorithm is proposed as follows.

[Conversion algorithm: the listing appears as an image in the source and is not reproduced here.]

5.2 Calculation of the Probability of Reaching a Node Based on the B_NAG

The direct parent nodes of node $S$ are denoted here by $DPre(S)$, and the attack probability $P_{a}(S)$ of the target node is calculated as

$\begin{aligned} P_{a}(S) & = P(S \mid Pre(S)) P(Pre(S)) \\ & = P(S \mid DPre(S)) P(DPre(S)) \end{aligned}$ (3)

The state transition index $P_{m}(cost_{i})$ denotes the probability of the transition from $S_{i-1}$ to $S_{i}$. Because a resource is correlated with its parent nodes, the weights $W$ must be considered when the state transition indexes of the parent nodes are calculated. If a sufficiently high cost is paid, the attack is guaranteed to succeed; namely, if $cost \to \infty$, then $P_{m}(cost) = 1$. If no cost is paid, no target can be attacked successfully; that is, when $cost = 0$, $P_{m}(cost) = 0$. If an attack on a node fails, the state of this node remains unchanged. The value of the state transition index $P_{m}(cost_{i})$ follows a certain distribution, which Eq. (4) takes to be exponential. Thus, $P_{m}(cost_{i})$ is calculated as follows:

$P_{m}(cost_{i}) = P(cost_{i} < Cost) = 1 - e^{-depcoef \times cost_{i}}$ (4)

Here, $cost$ refers to the cost required to perform an attack, that is, the knowledge, experience, and resources needed to complete the attack. $Cost$ is the average cost required to complete the final attacks; it is a default value that depends on the resources, knowledge, attack tools and time available. $depcoef$ is the correlation degree:

$depcoef = \frac{1}{Aga\_Dif} \quad (0 < depcoef < 1)$ (5)

Accordingly, the state transition index $P_{m}(cost_{i})$ is calculated as

$P_{m}(cost_{i}) = 1 - e^{-\frac{cost_{i}}{Aga\_Dif}}$ (6)
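The behavior of Eq. (6) at its limits can be checked with a short Python sketch. The function name and the sample cost and difficulty values are illustrative assumptions.

```python
import math


def state_transition_index(cost: float, aga_dif: float) -> float:
    """Eq. (6): P_m(cost) = 1 - exp(-cost / Aga_Dif)."""
    if cost < 0:
        raise ValueError("cost must be non-negative")
    if aga_dif < 1:
        raise ValueError("Aga_Dif must be at least 1, as required by Eq. (2)")
    return 1.0 - math.exp(-cost / aga_dif)


# Hypothetical values: the same attack cost spent on an easy and on a hard node.
p_easy = state_transition_index(cost=2.0, aga_dif=1.5)
p_hard = state_transition_index(cost=2.0, aga_dif=8.0)
print(round(p_easy, 4), round(p_hard, 4))
```

For a fixed cost, a larger $Aga\_Dif$ gives a smaller index, and the index moves from 0 toward 1 as the cost grows.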

It can be concluded from Eq. (6) that resource state nodes in the B_NAG interact with each other, so the probability of reaching a given node cannot be analyzed by traditional inference alone; the state transitions must also be taken into account. To solve this problem, the state transition index is used to capture the probability of node state transitions when network vulnerabilities are assessed. $P_{end}$ denotes the probability of reaching a target node:

$\begin{aligned} P_{end}(s_{i}) & = P_{m}(cost_{i}) \times P_{a}(s_{i}) \\ & = \left(1 - e^{-\frac{cost_{i}}{Aga\_Dif}}\right) \times P(s_{i} \mid DPre(s_{i})) P(DPre(s_{i})) \end{aligned}$ (7)

Here, $P_{m}(cost_{i})$ is the state transition index of the target node $s_{i}$; $cost$ and $depcoef$ are the attack cost and the correlation degree between node $s_{i}$ and its parent nodes, respectively; and $P_{a}(s_{i})$ denotes the Bayesian probability that $s_{i}$ is attacked. Eq. (7) gives the probability of reaching a single target node; the probability of reaching a whole path is obtained by iterating Eq. (7). The related iterative algorithm is provided below.

In Algorithm 3, all nodes are traversed first. The parent nodes $Pre(s_{i})$ are pushed onto their respective stack $q$ according to the number of direct parent nodes $DPre(s_{i})$ of $s_{i}$, ensuring that the start node of the path through $DPre(s_{i})$ is pushed onto $q$ last. The nodes are then popped in order following the “last in, first out” principle, and their reachability probabilities are calculated to determine the probability of reaching the whole path.
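The stack-based iteration of Eq. (7) can be sketched in Python for the simplest case, a series path in which every node has a single direct parent. The function name, the input dictionaries, and all numeric values are hypothetical; this is not a transcription of Algorithm 3, and it does not handle $and$, $or$, or blend nodes.

```python
import math


def path_reach_probability(path, prior, cond, cost, aga_dif):
    """Iterate Eq. (7) along a series path given from start node to target node."""
    stack = []
    for node in reversed(path):   # push the target first so the start node ends on top
        stack.append(node)

    start = stack.pop()           # last in, first out: the start node comes off first
    p = prior[start]
    reach = {start: p}
    while stack:
        node = stack.pop()
        p_m = 1.0 - math.exp(-cost[node] / aga_dif[node])   # Eq. (6)
        p = p_m * cond[node] * p                            # Eq. (7) with one direct parent
        reach[node] = p
    return reach


# Hypothetical inputs for a three-node chain s1 -> s2 -> s6.
path = ["s1", "s2", "s6"]
prior = {"s1": 0.2}
cond = {"s2": 0.8, "s6": 0.9}        # P(s_i = 1 | parent = 1)
cost = {"s2": 2.0, "s6": 3.0}
aga_dif = {"s2": 2.0, "s6": 1.5}

reach = path_reach_probability(path, prior, cond, cost, aga_dif)
certain = path_reach_probability(path, {"s1": 1.0}, cond, cost, aga_dif)
```

Because each step multiplies by factors no larger than 1, the reachability probability cannot increase along a series path, and raising the prior of the start node to 1 scales every downstream value by the same factor.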

[Algorithm 3: the listing appears as an image in the source and is not reproduced here.]

In the example shown in Fig. 3, if $s_{12}$ is the attack target, it can be reached via the following three paths:

$Path_{1} = \langle s_{1} \to s_{2} \to s_{6} \to blend(s_{6} \land s_{4}) \to s_{9} \to s_{12} \rangle$

$Path_{2} = \langle s_{4} \to s_{7} \to s_{9} \to s_{12} \rangle$

$Path_{3} = \langle s_{4} \to s_{7} \to s_{10} \to s_{12} \rangle$

Suppose that the weights $W$, that is, the values of $depcoef$ and the attack costs, are as shown in Fig. 3, and that the prior probabilities of $s_{4}$ and $s_{1}$ are 0.3 and 0.2, respectively. For instance, the steps for $Path_{1}$ are as follows:

$P(s_{2}) = P_{m}(cost_{1}) \times P_{a}(s_{2}) = P_{m}(cost_{1}) \times P(\Gamma(s_{2}) = 1 \mid \Gamma(s_{1}) = 1) \times P(s_{1}) = 0.0866$

$P(s_{6}) = P_{m}(cost_{2}) \times P_{a}(s_{6}) = P_{m}(cost_{2}) \times P(\Gamma(s_{6}) = 1 \mid \Gamma(s_{2}) = 1) \times P(s_{2}) = 0.0684$

$P(blend) = P_{m}(cost_{6}) \times P(\Gamma(blend) = 1 \mid \Gamma(s_{6}) = 1, \Gamma(s_{4}) = 1) \times P(s_{6}) \times P(s_{4}) = 0.0382$

In a similar way, the following is obtained from Eq. (7): the reachability probabilities of $s_{9}$ and $s_{12}$ along $Path_{1}$ are $P(s_{9}) = 0.0272$ and $P_{end}(s_{12}) = 0.0201$, respectively; the reachability probability of $s_{12}$ along $Path_{2}$ is $P_{end}(s_{12}) = 0.0648$; and the reachability probability of $s_{12}$ along $Path_{3}$ is $P_{end}(s_{12}) = 0.0573$. If the administrator knows that $s_{1}$ has been targeted, namely, $P(s_{1}) = 1$, then the reachability probability of $s_{12}$ along $Path_{1}$ can be recalculated as $P_{end}(s_{12}) = 0.0631$. This indicates that when the resource state condition corresponding to $s_{1}$ is met, $s_{12}$ is more likely to be attacked, as expected.

5.3 Posterior Probability Calculation Based on the B_NAG

In a B_NAG, changes in network security conditions cannot be monitored in real time when only the probabilities of resource state nodes being attacked are calculated. Based on the detected preconditions and the available information about security incidents, the posterior probabilities should be calculated, and the related node probabilities can then be updated to achieve real-time monitoring. The equation for calculating a posterior probability is as follows:

$P_{o} (S_{i} | O) = \frac{P (O | S_{i}) \times P (S_{i})}{P (O)}$ (8)

Suppose that $O_{1}$ in Fig. 3 is detected, so that the probability of $s_{12}$ is 1. Then, by Eq. (8), the posterior probability of $s_{9}$ is calculated as follows:

$\begin{aligned} P_{o}(s_{9} \mid s_{12}) & = P(s_{12} \mid s_{9}) \times P(s_{9}) / P(s_{12}) \\ & = (P(s_{12}, s_{10} \mid s_{9}) + P(s_{12}, \lnot s_{10} \mid s_{9})) \times P(s_{9}) / P(s_{12}) \\ & = (P(s_{12} \mid s_{10}, s_{9}) \times P(s_{10} \mid s_{9}) + P(s_{12} \mid \lnot s_{10}, s_{9}) \times P(\lnot s_{10} \mid s_{9})) \times P(s_{9}) / P(s_{12}) \\ & = (P(s_{12} \mid s_{10}, s_{9}) \times P(s_{10}) + P(s_{12} \mid \lnot s_{10}, s_{9}) \times P(\lnot s_{10})) \times P(s_{9}) / P(s_{12}) \\ & = 0.082 \end{aligned}$

In this case, the probability of reaching node $s_{9}$ changes from 0.0272 to 0.082. When certain attacks occur, the corresponding posterior probabilities in the B_NAG can effectively reveal the potential risk. The real-time calculation of the risk values of nodes in the B_NAG is of great significance for the assessment of vulnerabilities.
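The structure of this posterior update, Bayes' rule with the second parent $s_{10}$ marginalized out, can be sketched in Python. The conditional probability table and the prior of $s_{10}$ below are hypothetical (the values from Fig. 3 are not available here), so the sketch does not attempt to reproduce the value 0.082; only the prior 0.0272 of $s_{9}$ is taken from the text.

```python
from itertools import product


def posterior_parent(p_s9, p_s10, cpt):
    """Return P(s9 = 1 | s12 = 1) for independent parents s9 and s10 of s12.

    cpt[(x9, x10)] holds P(s12 = 1 | s9 = x9, s10 = x10).
    """
    joint = {}
    for x9, x10 in product((0, 1), repeat=2):
        w9 = p_s9 if x9 else 1.0 - p_s9
        w10 = p_s10 if x10 else 1.0 - p_s10
        joint[(x9, x10)] = cpt[(x9, x10)] * w9 * w10   # P(s12 = 1, s9 = x9, s10 = x10)
    evidence = sum(joint.values())                      # P(s12 = 1)
    return (joint[(1, 0)] + joint[(1, 1)]) / evidence


# Hypothetical "or"-style table: either parent alone can lead to s12.
cpt = {(0, 0): 0.0, (1, 0): 0.7, (0, 1): 0.6, (1, 1): 0.88}
prior_s9 = 0.0272     # reachability of s9 along Path 1, from the text
prior_s10 = 0.05      # hypothetical

post = posterior_parent(prior_s9, prior_s10, cpt)
print(round(post, 4))
```

As in the worked example, observing the child node raises the probability of its parent above the prior.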

6 Experimental Analysis

6.1 Experimental Network Environment

To verify that the proposed method is feasible and effective, the experimental environment shown in Fig. 4 was created. The experimental network includes five hosts: the attacking machine, a web server, a file server, an e-mail server, and a database server. For ease of description, these hosts are represented by the letters A, W, F, E and D, respectively. W opens the telnet service, F opens the File Transfer Protocol (FTP) service, E opens the FTP and Hypertext Transfer Protocol (HTTP) services, and D opens the Oracle service. The final aim of attacker A is to obtain root permissions on host D, but the firewall allows the external host A to access only the telnet service of host W and denies all other external access. Similarly, host E is allowed to access only the Oracle service of host D, while the other three hosts can freely access each other’s services. Host W can directly access host E; once it obtains access to the two services provided by host E, it can, in turn, gain direct access to the Oracle service of host D.

Figure 4: Topological graph of the experimental network

Information about the internal hosts is shown in Tab. 3.

Table 3: Information about the internal host

6.2 Experimental Results and Analysis

The RAG is generated in accordance with the attack graph model and the topological graph of the experimental network, and its loops are removed as described above. The corresponding descriptions of the attack behavior nodes are shown in Tab. 4. These attacks are related to the services provided by the hosts and their vulnerabilities.

Table 4: Attack behavior information of the experimental attack graph

The conversion algorithm is then applied, based on the topological graph of the experimental network, to replace the attack behavior nodes listed in Tab. 4 with corresponding edges. The converted B_NAG is shown in Fig. 5.

As shown in Fig. 5, each host node must win the trust of another host through a service provided by that other host, corresponding to a parallel “and” structure in the graph. When one host opens two services, the trust of that host can be obtained by gaining access to either one of its services, so the relationship between the possible attacks against that host is “or”. A node with a mixed relationship can directly access the service provided by another host by crossing over the host it is attacking once it gains access to both services of the target host. A blend node is introduced to address the corresponding mixed relationship in this graph.

Figure 5: Example of a Bayesian network attack graph

There are 5 paths in Fig. 5 through which the target host D can be reached. The attack path information and the probabilities of reaching each whole path are shown in Tab. 5. P1 denotes the probability of reaching the whole path as calculated by considering the state transition index as proposed in this paper, while P2 is the probability of reaching the whole path calculated without considering the state transition index.

Table 5: Attack path information for the example graph

Based on Tab. 5, the probability of reaching each host node is plotted in Fig. 6.

Figure 6: Path probabilities under P1 and P2 (a) P1 (b) P2

As shown in Fig. 7, the hosts attacked on Path1 and Path2 (and on Path3 and Path4) are the same; the only difference lies in the service of host E that is accessed. Path1 accesses the FTP service of host E, while Path2 accesses the HTTP service of host E. The final probabilities of reaching the whole path for Path1 and Path2 are 0.03247 and 0.05784, respectively, as calculated using the proposed algorithm based on the state transition index. Obvious differences can be seen between the two paths in terms of the probability of the attack successfully proceeding from host F to host E, as shown in Fig. 7a. By contrast, when the state transition index is not considered, the final probabilities for Path1 and Path2 are 0.4768 and 0.4789, respectively, and there is no meaningful difference in the probability of proceeding from host F to host E, as shown in Fig. 7b. With the proposed algorithm, although the reference value of the probability for each node decreases, the differences in probability associated with attacking different nodes are fully apparent. Therefore, this approach is effective in enabling network security administrators to perform useful analyses.
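The contrast described above can be checked with a few lines of arithmetic. The Python sketch below computes the relative gap between the Path1 and Path2 probabilities reported in the text, with and without the state transition index; the helper name `relative_gap` is illustrative.

```python
def relative_gap(p_a: float, p_b: float) -> float:
    """Relative difference between two path probabilities, measured against the smaller one."""
    return abs(p_a - p_b) / min(p_a, p_b)


# Final probabilities for Path1 and Path2 as reported in the text.
gap_with_index = relative_gap(0.03247, 0.05784)     # P1: state transition index considered
gap_without_index = relative_gap(0.4768, 0.4789)    # P2: state transition index ignored

print(f"with index:    {gap_with_index:.1%}")
print(f"without index: {gap_without_index:.1%}")
```

With the state transition index the two paths differ by roughly 78%, whereas without it they differ by less than 0.5%, which is why the two paths are hard to tell apart under P2.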

Figure 7: Probabilities of Path1,2 under P1 and P2 (a) P1 (b) P2

As shown in Fig. 8, the traditional computational method for Path5, which includes a mixed relationship, is to calculate all “and” nodes and “or” nodes individually. This not only requires a large number of calculations but also ignores the correlations between nodes.

Figure 8: Probability of Path5 under P1 and P2 (a) P1 (b) P2

The mixed node approach introduced herein provides better calculation results than the traditional method, and it does so with fewer calculations. For the mixed relationship identified when host W attempts to gain access to host E, the probability calculated by considering the state transition index effectively reflects the degree of hazard of the associated vulnerability, making this type of vulnerability more likely to be noticed by the network security administrator.

7 Conclusion

Improving the accuracy of network vulnerability assessments is an important topic in the field of network security. This paper presents a B_NAG model and an associated vulnerability assessment algorithm, as well as the algorithm E-Loop for eliminating loops in an attack graph. To effectively capture mixed relationships between nodes during the conversion of an RAG into a B_NAG, the Alg-AGTrans algorithm is also proposed. In addition, the indexes of node attack complexity and node state transition are introduced into the calculation of the probability of reaching each node, and the posterior probabilities are calculated on this basis. The results of an experimental evaluation show that the proposed model can provide an accurate and effective assessment of network vulnerability. However, the proposed algorithm also has some shortcomings that should be addressed. For example, the effects of some factors, such as risk costs, are not considered when calculating the probability of reaching a node.

Funding Statement: This work was partially supported by the National Natural Science Foundation of China (61300216, Wang, H, www.nsfc.gov.cn).

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.

References

1. V. Varadharajan, K. Karmakar, U. Tupakula and M. Hitchens. (2019). “A policy-based security architecture for software-defined networks,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 4, pp. 897–912.

2. J. Ai, H. Chen, Z. Guo, G. Cheng and T. Baker. (2020). “Mitigating malicious packets attack via vulnerability-aware heterogeneous network devices assignment,” Future Generation Computer Systems-the International Journal of Escience, vol. 111, no. 2, pp. 841–852.

3. A. J. Gallo, M. S. Turan, F. Boem, T. Parisini and G. Ferrari-Trecate. (2020). “A distributed cyber-attack detection scheme with application to DC microgrids,” IEEE Transactions on Automatic Control, vol. 65, no. 9, pp. 3800–3815.

4. Netinfo Security. (2018). “China Internet Station Development Status and Security Report (2018),” China. [Online]. Available: https://www.isc.org.cn/editor/attached/file/20180711/20180711201225_67539.pdf.

5. L. Muñoz-González, D. Sgandurra, A. Paudice and E. C. Lupu. (2017). “Efficient attack graph analysis through approximate inference,” ACM Transactions on Privacy and Security (TOPS), vol. 20, no. 3, pp. 1–30.

6. L. Muñoz-González, D. Sgandurra, M. Barrere and E. C. Lupu. (2019). “Exact inference techniques for the analysis of Bayesian attack graphs,” IEEE Transactions on Dependable and Secure Computing, vol. 16, no. 2, pp. 231–244.

7. H. Wang, Z. Chen, J. Zhao, X. Di and D. Liu. (2018). “A vulnerability assessment method in industrial internet of things based on attack graph and maximum flow,” IEEE Access, vol. 6, pp. 8599–8609.

8. X. Sun, J. Dai, P. Liu, A. Singhal and J. Yen. (2018). “Using Bayesian networks for probabilistic identification of zero-day attack paths,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 10, pp. 2506–2521.

9. A. Shameli-Sendi, H. Louafi, W. He and M. Cheriet. (2018). “Dynamic optimal countermeasure selection for intrusion response system,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 5, pp. 755–770.

10. G. Hu and P. Qiao. (2016). “Cloud belief rule base model for network security situation prediction,” IEEE Communications Letters, vol. 20, no. 5, pp. 914–917.

11. U. Garg, G. Sikka and L. K. Awasthi. (2018). “Empirical analysis of attack graphs for mitigating critical paths and vulnerabilities,” Computers & Security, vol. 77, no. 4, pp. 349–359.

12. B. Li, G. J. Sutton, B. Hu, R. P. Liu and S. Chen. (2017). “Modeling and QoS analysis of the IEEE 802.11p broadcast scheme in vehicular ad hoc networks,” Journal of Communications and Networks, vol. 19, no. 2, pp. 169–179.

13. Q. Zhang, C. Zhou, N. Xiong, Y. Qin, X. Li et al. (2016). “Multimodel-based incident prediction and risk assessment in dynamic cybersecurity protection for industrial control systems,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 10, pp. 1429–1444.

14. S. Wu, Y. Zhang and W. Cao. (2017). “Network security assessment using a semantic reasoning and graph based approach,” Computers & Electrical Engineering, vol. 64, no. 4, pp. 96–109.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.