Redefining the relationship between safety and security

The difficulty of mixing safety and security

Historically, safety and security models have been monolithic: they consider either safety or security, but not both. This is another by-product of the separation by design between safety and security discussed in the previous paragraphs. Because safety and security interact, models that can consider both are necessary.
Blending safety with security is not a trivial problem. Their objectives are different, and the techniques developed to achieve them exploit specific aspects that the two fields do not necessarily share. In safety, for example, the failure rate of a component is generally a well-known value, obtained experimentally from the failure of other components. In security, by contrast, estimating the time necessary for an attack step is a difficult problem, and the consensus is that it has not yet been addressed. In security, probabilities are instead used more to describe the choice an attacker makes between multiple options than the time taken to perform them. This raises another difference: in security, events happen because a person chose to cause them, while in safety a component failure is generally accidental.
Dedicated models capture these specific behaviours, which allows them to achieve their objective but makes any generalisation almost impossible. Gribaudo et al. [50] also point out that the order of occurrence of events is fundamental in security, whereas it can mostly be ignored in safety. They add that fault models are structurally similar to the system and therefore not much subject to change, whereas attack models need important revisions after even minor modifications.

Bayesian Networks For Security

As developed in section 2.2.2, Bayesian Networks are used to evaluate safety. The literature also shows that they can be used for security in various configurations to obtain different information. Qin and Lee [98] use Bayesian networks as a predictive tool to anticipate an attacker’s behaviour. They propose a set of procedures to transform an attack tree into a Bayesian network and to estimate the probabilities of the nodes of that network. Using the network, they can estimate the likelihood that a node is a goal of the attacker, based on observed evidence. Because they rely on BN, their results are devoid of any temporal evaluation. Gribaudo et al. [50] and Dantu et al. [37] also use BN as a mathematical tool to compute probabilities based on an attack graph.
Bayesian networks are not always derived from attack trees and can also be used directly. For example, Feng et al. [45] have used Dynamic Bayesian Networks to predict goals when events, in the form of system calls, happen. Althebyan and Panda [3] use BN in conjunction with dependency graphs and knowledge graphs to anticipate the risk that an attacker might obtain some confidential information based on a detected attack. However, they conflate risk with success probability, so they do not quantify the impact on the system.
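The kind of inference described above, estimating how likely each candidate goal is given observed evidence, can be illustrated with a very small sketch. It is not the procedure of Qin and Lee [98]: it assumes a naive Bayes structure (evidence items conditionally independent given the goal), and the goal names, evidence names and probabilities are invented for illustration.

```python
def posterior_over_goals(prior, likelihood, observed):
    """Return P(goal | observed evidence) for each candidate goal.

    Assumption: evidence items are conditionally independent given the goal,
    which is a simplification of a general Bayesian network.
    """
    unnormalised = {}
    for goal, p_goal in prior.items():
        p = p_goal
        for evidence in observed:
            p *= likelihood[goal][evidence]  # P(evidence | goal)
        unnormalised[goal] = p
    total = sum(unnormalised.values())
    return {goal: p / total for goal, p in unnormalised.items()}


# Hypothetical values, chosen only to make the example concrete.
prior = {"steal_data": 0.5, "disrupt_service": 0.5}
likelihood = {
    "steal_data": {"port_scan": 0.8, "db_login_attempt": 0.6},
    "disrupt_service": {"port_scan": 0.8, "db_login_attempt": 0.1},
}

posterior = posterior_over_goals(prior, likelihood, ["port_scan", "db_login_attempt"])
for goal, p in posterior.items():
    print(f"{goal}: {p:.3f}")
```

The port scan is equally likely under both goals and therefore does not discriminate between them; only the database login attempt shifts the posterior. As in the approaches above, nothing in this computation says when the goal would be reached.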

Boolean-logic Driven Markov Processes

Boolean-logic Driven Markov Processes (BDMP) were introduced in 2003 by Bouissou and Bon [17]. They were presented as a concise way to model complex systems while keeping the mathematical properties inherited from Markov processes. They were initially designed as a tool to model safety and were later adapted to security [93].
BDMP are a powerful graphical model with an underlying probabilistic component. The graphical model is similar to a fault tree: the leaves are events, the other nodes are logical gates and the root corresponds to the undesired event. However, a new graphical element is added: triggers. Triggers are arrows that point from an origin node to a destination node. A trigger means that the event, or combination of events, represented by the destination node cannot happen unless the origin node has been realised. It is a way of introducing sequence modelling and dependencies between events. Figure 2.7 is an example of a BDMP. Each leaf corresponds to an event that can be in several modes, and each mode can be in several states. Each mode is associated with a Markov process. Probabilistic distribution functions, called transfer functions, are used to describe how an event can switch between modes. A more thorough definition was given by Pietre-Cambacedes and Bouissou, 2010 [91]. This approach makes it possible to model the repairability of a component, represented by a transfer function from a damaged mode to a functional mode.
BDMP are used to analyse the dependability of modelled systems. They are used to compute the general probability of occurrence of a scenario as well as the importance of each fault, or atomic attack, in terms of its contribution to the general failure. Their initial contribution lies in a simpler and more readable way of modelling Markov processes while keeping their powerful mathematical outputs. In 2010, with the addition of security [93], BDMP became one of the few models to propose a semantics for representing both safety and security events. Kriaa et al., 2014 [66] show how BDMP can be used to measure the influence of security vulnerabilities in an industrial system.

Contributions to diagnosis

BDMP describe how individual events interact with each other to form complete scenarios. They have an explicit way to describe these logical interactions, and they also have a framework to express sequential temporal dependencies between events. Therefore, they are not only able to provide the usual structural information such as minimal cut sets [17], but are also able to consider sequences of events when analysing the past or generating hypotheses about the future.
From a probabilistic point of view, BDMP are capable of computing usual metrics such as reliability or MTTF [17, 93, 66], meaning that they can provide on-demand probabilistic evaluations of future events and of how much time it will take to realise some sequences of events. BDMP are able to measure the contribution of each event to the overall probability of success [17, 93, 66]. BDMP are therefore able to compute the likelihood of occurrence of past or future events, raising suspicion if relevant, as well as to give the most probable explanation from a probabilistic point of view.
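To make these time-related metrics concrete, the sketch below treats the simplest sequential case: two events where the second can only start once the first has been realised, as a trigger would impose. It assumes each event has an exponentially distributed duration with a constant rate, which is an illustrative simplification and not the full BDMP semantics (no modes, no repair). The function names and the rates are hypothetical.

```python
import math


def sequence_mean_time(rate_first, rate_second):
    """Mean time to realise both events in sequence (sum of the two means)."""
    return 1.0 / rate_first + 1.0 / rate_second


def sequence_probability_by(t, rate_first, rate_second):
    """Probability that both events have been realised, in order, by time t.

    Assumption: both durations are exponential and independent, so their sum
    follows a hypoexponential distribution (Erlang-2 when the rates are equal).
    """
    a, b = rate_first, rate_second
    if math.isclose(a, b):
        return 1.0 - math.exp(-a * t) * (1.0 + a * t)
    return 1.0 - (b * math.exp(-a * t) - a * math.exp(-b * t)) / (b - a)


# Hypothetical rates, in events per hour.
rate_intrusion, rate_sabotage = 0.5, 0.25
print("mean time to complete the sequence:", sequence_mean_time(rate_intrusion, rate_sabotage))
for hours in (1, 6, 24):
    p = sequence_probability_by(hours, rate_intrusion, rate_sabotage)
    print(f"P(sequence realised within {hours} h) = {p:.4f}")
```

The mean time plays the role of an MTTF for the sequence, and the probability curve is the complement of a reliability function for it.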

Limits

BDMP cannot model every type of scenario. In many cases, restoring a faulty component to a functioning state leads, after possible propagation, to a functioning state of the system. However, in some cases, such as a knowledge gain by an attacker, applying a countermeasure to the only abnormal component of the system does not prevent the scenario from going forward. As far as we understand the model, and based on our experiments with it, the way the triggers and the process selectors function means that such a situation cannot be handled by BDMP.
BDMP do not have an embedded metric for the impact of events or sequences of events on the system and are therefore unable to compute risk. More generally, even though BDMP require little to no knowledge beyond that already acquired by safety experts, they are quite tedious to construct and the model adapts poorly when the system is modified. Automatic generation of BDMP is possible using the FIGARO language [19] and the KB3 platform.

Integration of fault trees and attack trees

Fovino et al. [47] mixed FT and AT, creating extended fault trees (EFT). Their reasoning is that the top event of an attack tree corresponds to an undesired event, similar to a basic event in a fault tree. Therefore, they plug the attack tree in at the bottom of the fault tree, where the realisation of the attack objective could disrupt the system. To do so, they first adapt the attack tree so that its structure complies with the fault tree framework and then connect it to the corresponding event or logical gate of the FT.
EFT represent an ingenious way to model both security and safety at the same time while keeping the analysis tools of fault trees. However, they also inherit the limits of fault trees and are not able to overcome them (no time modelling, etc.).
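The plugging operation can be sketched as follows: an attack tree is evaluated with the same AND/OR rules as the fault tree and is simply used as a child of one of the fault tree's gates. This is a minimal illustration, not the construction of Fovino et al. [47]. It assumes statistically independent leaves and static probabilities, and all class names, event names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    name: str
    probability: float  # static probability of the basic event or attack step


@dataclass
class Gate:
    kind: str  # "AND" or "OR"
    children: List["Node"]


Node = Union[Leaf, Gate]


def probability(node: Node) -> float:
    """Probability of the node's event, assuming independent leaves."""
    if isinstance(node, Leaf):
        return node.probability
    child_probs = [probability(child) for child in node.children]
    if node.kind == "AND":
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        none_occurs = 1.0
        for p in child_probs:
            none_occurs *= 1.0 - p
        return 1.0 - none_occurs
    raise ValueError(f"unsupported gate: {node.kind}")


# Attack tree whose top event is "controller compromised" (hypothetical values).
attack_tree = Gate("OR", [
    Gate("AND", [Leaf("phishing", 0.3), Leaf("privilege_escalation", 0.5)]),
    Leaf("remote_exploit", 0.1),
])

# Fault tree in which the attack tree takes the place of a basic event.
extended_fault_tree = Gate("OR", [
    Leaf("pump_failure", 0.01),
    Gate("AND", [Leaf("sensor_failure", 0.02), attack_tree]),
])

print(f"P(controller compromised) = {probability(attack_tree):.4f}")
print(f"P(top event)              = {probability(extended_fault_tree):.6f}")
```

The sketch also makes the inherited limitation visible: every leaf carries a single probability, so nothing expresses when or in which order the events occur.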

Attack-Fault Trees

Attack-Fault Trees (AFT) were introduced by Kumar and Stoelinga in 2017 [67]. Their proposal is to decompose an undesired event into smaller sub-goals until reaching basic component failures, basic attack steps or instant failures. The sub-goals are described using logical gates taken from dynamic fault trees and attack trees, such as the AND, OR, SAND, VOT(k)/n, PAND, FDEP and SPARE gates. In a way, attack-fault trees can be seen as a mix between fault trees and attack trees. The resulting model is then translated into several stochastic timed automata to compute some identified safety-security metrics: probability of disruption, expected cost of malicious disruption and mean time to malicious disruption.
In mixing security and safety events, Kumar and Stoelinga take care to capture the specificities of each type of event, much as BDMP do. Where BDMP associate different Markov processes with each type of event, AFT do the same with stochastic timed automata. As such, they present a rather reasonable way of mixing safety events and security events. However, they still suffer from the inability to consider probability distributions other than the exponential one, although, as we saw in section 2.2.2, other distributions are more suited to represent some types of events [114]. They also do not consider countermeasures or events that are neither failures nor attacks, even though such events can provide meaningful context by tracking a relevant change in the system.

Semi-explicit correlation to obtain scenarios

CRIM has a correlation engine that is used to obtain complete scenarios from LAMBDA models. It draws links from postconditions of a LAMBDA model to the preconditions of another. The idea is that if the consequences of an event are also the conditions for another one, the realisation of the first event might trigger the second. It is therefore relevant to consider that there are sequential dependencies between the two events.
To express this formally, let us consider two LAMBDA models A and B. Let Post(A) and Pre(B) be, respectively, the postcondition set of A and the precondition set of B. Post(A) and Pre(B) are sets of predicates. A and B are said to be correlated if there exist predicates pred_A in Post(A) and pred_B in Pre(B) such that pred_A and pred_B are unifiable through a most general unifier (mgu) θ. Two correlated events are two events that can potentially happen in sequence. They can therefore be represented in a directed graph, for example, to visually express this sequentiality, or dependency. Using this principle, by trying to correlate every pair of LAMBDA models, CRIM is able to obtain the dependencies between every pair of events, and thus to obtain complete scenarios out of atomically modelled events.
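The correlation test can be sketched in a few lines. The code below is not CRIM's implementation and does not use actual LAMBDA syntax. It assumes that a predicate is a pair (name, arguments), that arguments are flat (constants or variables, no nested terms), that variables are strings starting with an uppercase letter, and that the two models do not share variable names. The model and predicate names are hypothetical.

```python
def is_variable(term):
    """Assumed convention: variables start with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    """Follow variable bindings until reaching a constant or a free variable."""
    while is_variable(term) and term in subst:
        term = subst[term]
    return term


def unify(pred_a, pred_b):
    """Return a most general unifier (as a dict) for two flat predicates, or None."""
    name_a, args_a = pred_a
    name_b, args_b = pred_b
    if name_a != name_b or len(args_a) != len(args_b):
        return None
    subst = {}
    for x, y in zip(args_a, args_b):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_variable(x):
            subst[x] = y
        elif is_variable(y):
            subst[y] = x
        else:
            return None  # two different constants cannot be unified
    return subst


def correlated(model_a, model_b):
    """Return an mgu linking a postcondition of A to a precondition of B, or None."""
    for pred_a in model_a["post"]:
        for pred_b in model_b["pre"]:
            theta = unify(pred_a, pred_b)
            if theta is not None:
                return theta
    return None


def correlation_graph(models):
    """Directed edges (A, B) for every ordered pair of distinct correlated models."""
    return [(a["name"], b["name"])
            for a in models for b in models
            if a is not b and correlated(a, b) is not None]


models = [
    {"name": "password_theft", "pre": [],
     "post": [("knows_password", ("Agent1", "plc"))]},
    {"name": "remote_login", "pre": [("knows_password", ("Agent2", "Target2"))],
     "post": [("has_access", ("Agent2", "Target2"))]},
    {"name": "valve_opening", "pre": [("has_access", ("Agent3", "plc"))],
     "post": [("valve_open", ("v1",))]},
]

edges = correlation_graph(models)
print(edges)
```

Chaining the resulting edges gives a candidate scenario, here the theft of a password followed by a remote login and then the opening of a valve, even though each event was modelled on its own.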

Table of contents:

1 Introduction
1.1 The new paradigm of the Industry 4.0
1.2 Redefining the relationship between safety and security
1.2.1 What are safety and security?
1.2.2 The intricacy of interactions
1.2.3 The difficulty of mixing safety and security
1.3 Different questions for diagnosis
1.3.1 The different types of diagnosis
1.4 Contributions
1.5 Organisation of the thesis
2 State of the Art
2.1 Chapter content
2.2 Safety models
2.2.1 Fault tree based models
2.2.2 Bayesian Networks
2.2.3 Petri nets
2.2.4 Safety models summary
2.3 Security models
2.3.1 Attack trees
2.3.2 Other attack graph based approaches
2.3.3 Bayesian Networks For Security
2.3.4 Dependencies models
2.3.5 Petri nets for security
2.3.6 Security models summary
2.4 Hybrid models
2.4.1 Boolean-logic Driven Markov Processes
2.4.2 Integration of fault trees and attack trees
2.4.3 Hybrid models summary
2.5 Classifying models based on their contributions to diagnosis
2.5.1 Description of categories
2.5.2 Classification of studied models
3 PROS2E: A new probabilistic representation of events
3.1 Introduction
3.2 Preliminaries: logical dependencies
3.2.1 Conditions and consequences
3.2.2 Semi-explicit correlation to obtain scenarios
3.2.3 Raising alerts for an on-line use
3.2.4 Advantages of this approach
3.3 Preliminaries: Probabilistic theory
3.3.1 Atomically modelling the time
3.3.2 Propagating the probability
3.3.3 Probabilistic Equivalences
3.3.4 Obtaining relevant values for the distributions
3.3.5 Recap
3.4 PROS2E
3.4.1 The event model
3.4.2 Obtainable information
3.5 An example on how to use PROS2E
3.5.1 Taum Sauk Hydroelectric Power Station
3.5.2 The scenario
3.6 Conclusion
4 Handling countermeasures
4.1 Introducing countermeasures
4.1.1 Defining countermeasures
4.1.2 What makes countermeasures different from regular events?
4.1.3 Countermeasures in the dependency model
4.1.4 Probabilistic representations of countermeasures in other models
4.2 Motivating example
4.3 Probabilistic modelling of countermeasures
4.3.1 Computing probabilities with countermeasures, splitting the cases
4.3.2 Simplifying the situations
4.3.3 Probabilistic expression
4.4 Application to the study case
4.4.1 Modelling events and obtaining the graph
4.4.2 Computing metrics
4.5 Conclusion
5 Improving PROS2E with enhanced and accurate representation of the time
5.1 Motivating example
5.1.1 First diagnosis example: processing new alerts
5.1.2 Second diagnosis example: Purely accidental?
5.2 Why PROS2E needs improvement
5.2.1 Needing more sequentiality
5.2.2 Knowing when time is ticking or not
5.3 Modelling sequences
5.3.1 Correlation with a SEQ-AND
5.3.2 Probabilistic expression of the SEQ-AND
5.3.3 Nature of events: not just safety and security
5.4 Managing the wear
5.5 Application to the study case
5.5.1 Modelling events and obtaining the graph
5.5.2 First diagnosis example: processing new alerts
5.5.3 Second diagnosis example: Purely accidental?
5.6 Conclusion
6 Conclusion
6.1 Contributions of this thesis
6.2 Perspectives