Intelligent behavioral based IDS: Neural networks basis


Reverse security priority in the industrial world

The cybersecurity field rests on the CIA triad: confidentiality, integrity and availability [Ross 2020] [Lee and Jang 2009]. These three security properties are defined below:
• Confidentiality means that access to data or equipment has to be restricted. Equipment and information must be accessible only to authorized persons [Kumar et al. 2015]. A violation of this security requirement could be exploited by hackers to disrupt services and steal data [Yin et al. 2002].
• Integrity implies that data remains accurate and consistent throughout its life cycle. Users' requests to services have to be made correctly. A violation of this characteristic could alter, lose or compromise data [Kumar et al. 2015]. Any alteration or modification of the data must be made by authorised persons and according to a defined policy.
• Availability signifies that data and equipment have to be accessible and services have to be provided without interruption [Yin et al. 2002]. This security characteristic sometimes depends on conditions such as time slots and the people authorised to use the resources.
To propose an efficient security mechanism, these three requirements have to be fully considered. Their priority and significance differ according to the context (the application field and the data inside the information system) [P. Williams 2001]. Between the IT and OT worlds, the priority is reversed. Confidentiality is the first priority in the IT world. Ensuring confidentiality not only gives a company a good image and a high level of credibility, but also lets it meet its legal responsibility to prevent the leak and theft of sensitive customer data. A violation of this security property could expose a company to serious legal proceedings. In the IT world, ensuring confidentiality is not an option but an obligation. While confidentiality is the primary concern for IT engineers, availability or integrity is the top priority for ICS, depending on the activity sector: the kind of industry determines which of the two comes first. For instance, in the automotive sector the availability of equipment and data matters more than integrity and confidentiality. Pharmaceutical companies, by contrast, are concerned that the proportions of raw materials used in their products could be tampered with, which would compromise the quality of their products and their reputation.

Real time constraint

IT and OT control systems have different requirements. Latency and delay are not tolerated in industry, and when they occur they can be expensive for manufacturers. Unlike IT equipment, which requires high throughput and tolerates a response delay, ICS are time-critical and, in some particular cases, require a real-time response [Stouffer et al. 2015]. In Industry 4.0, the real-time aspect is very important. This notion encompasses other concepts, which are presented below:
• Real-time alerts: Alerts can be raised, machines can be proactively maintained and operators can perform remote maintenance [i-scoop 2017]. Monitoring, predictive diagnostics and predictive maintenance have now become possible.
• Real-time capability: This characteristic consists of being able to acquire, transmit and evaluate data within a short time frame [Basler 2020]. It can be essential for applications where speed is required. Advanced analytics, IIoT and smart industrial production all contribute to the development of real-time capability.
• Real-time information: Thanks to the acquisition and analysis of real-time production information, it is now possible to optimize a production process and eliminate unnecessary tasks. This real-time data can also help predict malfunctions and plan proactive maintenance. It also supports sound decision-making in particular situations and therefore enables efficient production in the smart factory.

Convergence IT/OT

Industry 4.0 is based on the connection of the digital world to the physical world through the factory 4.0, which contains Cyber-Physical Systems (CPS). Before the advent of the fourth industrial generation, Industrial Control Systems (ICS) were isolated and better protected from the outside world. Now that this border has been opened by the emergence of the industrial internet in factories, the opening of industrial systems to the external world and the democratization of IP, the convergence of Information Technologies (IT) and Operational Technologies (OT) has become a reality. This convergence allows industries to automate the entire life cycle, from design to production, but it also brings several challenges.
OT operators are well aware of physical security threats and have implemented safety measures in industrial systems for decades. Today, however, they face threats that are potentially beyond their control. Since machines and control systems are no longer closed and isolated, the threat of hacking is introduced, and with it the endangerment of employees, for example through overheating or the disabling of emergency stops. To face these threats, the OT world has to work with the IT world, but some concerns arise. IT engineers generally have little experience with industrial systems, and their traditional security solutions are often incompatible with existing control systems. In [19], the author says that OT and IT tend to use different approaches to problem-solving: IT implements solutions using a top-down approach, while OT professionals use a bottom-up approach, starting from the individual components to build a more complex system.
In addition, attacks that were carried out in the past against IT systems could be replayed today against ICS. Furthermore, industrial competition has increased during the last decade. Therefore, data breaches have to be avoided, and the confidentiality of the transmitted data must be guaranteed with the security mechanisms proposed by IT engineers.

Equipment and protocols heterogeneity

Another challenge in Industry 4.0 is the heterogeneity of equipment. While the IT world is used to frequent and regular patches and software updates, industrial equipment is often obsolete and outdated and tends to follow a more systemic approach, which makes its configuration incompatible with the standard security offerings of IT teams. Furthermore, industry uses both wired and wireless equipment. Some of it still relies on traditional protocols such as TCP/IP and Modbus/TCP, while other equipment uses newer protocols related to the Industrial Internet of Things (IIoT), such as OPC UA, MQTT, LoRa, Zigbee, Sigfox…
In this context, industry has to put in place a good strategy for an adaptive combination of the equipment and interfaces of all these components [Zhou, Taigang Liu, and Lifeng Zhou 2015]. This heterogeneity raises further challenges for Industry 4.0: managing equipment connectivity, analysing huge heterogeneous data sources and, above all, proposing an adaptive cybersecurity strategy to face cyber threats. IT and OT convergence complicates this task, since some security techniques that suit IT equipment, such as encryption, introduce a delay that is not tolerated in an OT environment. The interoperability of the proposed security mechanisms has to be guaranteed.

Towards anomaly-based intrusion detection systems

Anomalies are part of the industrial cycle. In industry, the risk of anomaly is inherent to any manufacturing process, but efficiency must come first. It is necessary to be able to protect against any break in production continuity by anticipating possible failures and detecting anomalies early enough to avoid damage that could be costly, especially in vital sectors. These anomalies can be of intentional or unintentional origin, and the border between the two can be very thin. Currently, in the field of industrial intrusion detection systems, the main difficulty is distinguishing malfunction anomalies from real intrusions in order to reduce false positive alerts. For this purpose, the proposed approach combines two anomaly-based intrusion detection systems:
• Specification-based IDS: this is one of the two kinds of anomaly-based IDS. A specification-based IDS builds a reference model of the system from its properties, from which patterns and rules are constructed [Caselli et al. 2016]. In the proposed specification-based IDS, the MESA model extracted from the ISA-95 standard is used to define these specifications. This model is described in detail in the next chapter. The IDS takes as input 13 rules that we identified from the MESA model (see chapter 3); this rule set is used to check the nominal operation of our production line. The IDS raises an alert as output if one of these rules is not respected, and the error is reported in the IDS logs for further investigation by the operator.
• Behavioral-based IDS: this is the second type of anomaly-based IDS. It defines a reference behavior model from features and observes their changes. Any activity that differs from this reference behavior is considered abnormal [Salwa Alem et al. 2019] [Hodo et al. 2016]. Unlike in a specification-based IDS, the behavior of the features is derived in a learning phase.
This IDS takes as input the network traffic features that we pre-processed and labeled. It is an intelligent IDS based on a neural network model. From an extracted dataset, we train a neural network model to distinguish normal activity from malicious activity. The IDS raises an alert if the activity deviates from the defined reference model. The alerts are listed in the IDS logs to be used by the third module, which is responsible for deciding on the nature of the detected anomaly.
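The learning phase and alerting step of a behavioral-based IDS can be illustrated with a minimal sketch. It is not the model used in this work: the network architecture is not given in this section, so a single logistic neuron trained by gradient descent stands in for the neural network. The two features (packet rate and share of write requests), their values, the 0.5 threshold and all function names are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, labels, epochs=2000, lr=0.5):
    """Fit one logistic neuron by per-sample gradient descent on the log-loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log-loss w.r.t. the pre-activation
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def score(model, x) -> float:
    """Estimated probability that the sample is malicious."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def detect(model, x, threshold=0.5) -> str:
    return "ALERT" if score(model, x) >= threshold else "ok"

# Hypothetical pre-processed, labeled features scaled to [0, 1]:
# (packet rate, share of write requests). Label 1 = malicious, 0 = normal.
train_x = [(0.10, 0.05), (0.15, 0.10), (0.20, 0.08), (0.12, 0.12),
           (0.90, 0.80), (0.85, 0.95), (0.95, 0.70), (0.80, 0.90)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

model = train_neuron(train_x, train_y)

ids_log = []  # alerts kept for a later decision step
for sample in [(0.14, 0.09), (0.88, 0.85)]:
    verdict = detect(model, sample)
    if verdict == "ALERT":
        ids_log.append(sample)
    print(sample, verdict)
```

The log list mirrors the idea that alerts are stored for a later module rather than acted on directly; a real deployment would replace the toy features with the pre-processed traffic features and the single neuron with the trained network.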


Specification-based IDS

In [Carcano et al. 2011], the authors propose a critical state-based IDS which manages and monitors the evolution of SCADA states. This approach is composed of three main modules to detect abnormal activity that could lead a SCADA system to a critical state. The first module is the "System Description and Critical State Representation", where the system is described using the Industrial State Modelling Language (ISML). ISML formalises the condition-action behaviour of an industrial system. The second is the "State Evolution Monitor", which tracks the system states and how they evolve. Using ISML, a virtual image of the managed system is created as an in-memory model, which is fed by network traffic so that it behaves like the real system. The third is the "Critical State Detector", which establishes whether one of the monitored states passes through a Critical State (CS) by checking the CS rules defined in ISML. To these modules, the authors have added multidimensional metrics to predict the critical state: state-state distance, state-critical states distance and distance evaluation metrics. Their performance seems good, but the proposed IDS targets a particular class of attacks, those aiming to lead an industrial system to a critical state. In [Parvania et al. 2014], the authors propose a new hybrid IDS approach. The first step of this approach is the definition of a set of rules describing the normal behavior. They define five rules, some related to the network, like the master IP, and others related to the process, like operation time. After defining these rules, the authors model the normal behavior and consider any behavior which deviates from the normal model as an intrusion. They apply their approach to the Modbus protocol and use Bro to implement their rules. They test their approach in an emulated environment with a real PLC. The authors report that their IDS detects attacks accurately and performs well.
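To make the rule-based idea concrete, the following minimal sketch checks Modbus-like events against a small specification of normal behavior and logs every violated rule. It is not the rule set of [Parvania et al. 2014] and does not use Bro: the event fields, the master address, the function-code whitelist and the reading of "operation time" as a permitted window of hours are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModbusEvent:
    src_ip: str         # sender of the request
    function_code: int  # Modbus function code
    hour: int           # local hour of day (0-23)

# Hypothetical specification of normal behavior.
MASTER_IP = "192.168.0.10"
ALLOWED_FUNCTION_CODES = {3, 4, 6, 16}  # read/write register operations
OPERATION_HOURS = range(6, 22)          # 06:00 to 21:59

# Each rule is a name plus a predicate that holds for normal events.
RULES = [
    ("network: request must come from the master",
     lambda e: e.src_ip == MASTER_IP),
    ("network: function code must be whitelisted",
     lambda e: e.function_code in ALLOWED_FUNCTION_CODES),
    ("process: request must fall within operation time",
     lambda e: e.hour in OPERATION_HOURS),
]

def check(event: ModbusEvent) -> list:
    """Return the names of all rules the event violates."""
    return [name for name, holds in RULES if not holds(event)]

events = [
    ModbusEvent("192.168.0.10", 3, 9),    # conforms to every rule
    ModbusEvent("192.168.0.66", 16, 9),   # unknown sender
    ModbusEvent("192.168.0.10", 8, 23),   # unexpected code, outside hours
]

ids_log = []  # one entry per violated rule, for later investigation
for event in events:
    for violated in check(event):
        ids_log.append((event, violated))
        print("ALERT", event.src_ip, violated)
```

Keeping each rule as a named predicate means an alert can state which part of the specification was broken, which is what an operator needs when reading the IDS logs.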
In [Caselli et al. 2016], a new specification-based IDS approach is proposed using the documentation of the networked system. The documentation is processed automatically to reduce the human effort required to define the specification rules. The approach is composed of three steps: system discovery, features lookup, and rule definition. This IDS is implemented in a real environment using the Building Automation and Control network protocol (BACnet) and identifies process control errors and the level of danger caused by misconfiguration. In [Kabir-Querrec et al. 2015], the authors present a new resilient IDS implemented at the Ethernet layer. It analyses the vertical flows exchanged between the field-level devices and the SCADA system. The latter receives an alert if an intrusion is detected. This IDS is implemented for an electrical substation using the GOOSE protocol. The authors simulated Ethernet storms and GOOSE spoofing attacks. After alerts are received, the Intelligent Electronic Device (IED) control is rewritten to take the alerts into account.

Table of contents :

List of figures
List of Tables
0. Acronyms
1 Introduction
1.1 Cybersecurity in industry 4.0 : Overview
1.2 Industrial cybersecurity problematic
1.2.1 Reverse security priority in the industrial world
1.2.2 Real time constraint
1.2.3 Convergence IT/OT
1.2.4 Equipment and protocols heterogeneity
1.3 Thesis motivations and positioning
1.3.1 Thesis Motivations
1.3.2 Thesis positioning
1.3.2.1 IEC-62443 standard
1.3.2.2 Towards anomaly-based intrusion detection systems
1.3.2.3 Positioning: Manufacturing Executive System (MES)
1.4 Organisation of the dissertation
2 State of the art of industrial IDS and datasets
2.1 Intrusion Detection System (IDS)
2.2 Conventional industrial IDS
2.2.1 Signature-based IDS
2.2.2 Anomaly-based IDS
2.2.2.1 Specification-based IDS
2.2.2.2 Behavioral-based IDS
2.3 Industrial IDS for IoT equipment
2.4 Existing datasets
2.4.1 Public datasets
2.4.2 Non-public datasets
2.4.3 Industrial datasets
2.4.4 Conclusions and discussion
3 Intelligent behavioral based IDS
3.1 Introduction
3.1.1 Intelligent behavioral based IDS: principle
3.1.2 Intelligent behavioral based IDS: assumptions
3.1.3 Intelligent behavioral based IDS: Neural networks basis
3.1.4 Neural network : motivation
3.2 Experimentation platform
3.3 IDS basis: industrial dataset
3.3.1 Methodology
3.3.2 Design criteria
3.3.3 Dataset generation process
3.3.4 Dataset model
3.3.5 Attacks simulation for dataset generation
3.3.5.1 Attacks choice motivations
3.3.5.2 Attack scenarios
3.3.6 Data acquisition: Extractor
3.3.7 Labelling and pre-processing data
3.3.8 Dataset extension approach to other protocols
3.3.9 Digital description of the dataset
3.4 Results
3.4.1 Neural network: Experimental parameters
3.4.2 Neural network: Graphical interface
3.4.3 Neural network: Performance results
3.5 Conclusions and discussion
4 Specification-based IDS
4.1 Introduction
4.2 Specification-based IDS: Global view
4.2.1 Specification-based IDS principle
4.2.2 Specification-based IDS assumptions
4.2.3 Specification-based IDS: Motivations
4.3 Specification-based IDS: basis
4.3.1 The MESA Model
4.3.2 MES database: tables
4.4 Identified anomalies in the industry
4.4.1 Sequential anomalies
4.4.2 Temporal anomalies
4.4.3 Content anomalies
4.4.4 Added metrics : ISO 22400 standard
4.5 Anomalies illustration: use case
4.5.1 Context
4.5.2 Use case
4.6 IDS formalism
4.7 IDS tool: Technical specifications
4.8 Results
4.9 Conclusions and discussion
5 BI-ANOmaly-based IDS: BIANO-IDS
5.1 Introduction
5.2 Approach: Global view and principle
5.3 Approach: BIANO-IDS components
5.4 Decision Making System: DMS
5.4.1 Decision Making System: theory
5.4.1.1 Decision Making System (DMS): Definition
5.4.1.2 Decision Making System (DMS): theoretical steps and markers
5.4.2 DMS: Model and global view
5.4.3 DMS: Principle and rules
5.4.4 DMS: Programming
5.4.5 DMS: alerts classification
5.5 Results
5.6 Conclusions and discussion
6 Conclusions and perspectives
6.1 Summary
6.2 Contribution
6.3 Limitations
6.4 Perspectives and future works
Bibliography