Basic Concepts of Machine Learning – Project topics materials

Get Complete Project Material File(s) Now! »

Introduction to Cryptography

The terms Cryptography, from the Greek kryptòs (secret) and graphein (writing), and Cryptanalysis, denote two branches of a science named Cryptology, or science of the secret. Cryptography initially refers to the art of encrypting messages, which means writing meaningful messages in such a way to appear nonsense to anyone unaware of the encryption process. The readable message is referred to as plaintext, while the unintelligible output of the encryption is referred to as ciphertext. In general, cryp-tography aims to construct protocols to secure communication, while cryptanalysis studies the resistance of cryptographic techniques, developing attacks to break the cryptosystems’ security claims. These two complementary domains evolve in par-allel, since the evolution of attack techniques allows conceiving more resistant cryp-tographic algorithms, and inversely the resistance of such algorithms requires the conception of more sophisticated attacks.
The art of cryptography is very ancient, probably as ancient as the language, but only the development of information technology made cryptology take the shape of a proper science, sometimes referred to as Modern Cryptology. The last is seen as a branch of different disciplines, such as applied mathematics, computer science, elec-trical engineering, and communication science. Modern cryptosystems exploit al-gorithms based on mathematical tools and are implemented as computer programs, or electronic circuits. Their goal is to provide security functionalities for commu-nications that use insecure channels, for example the Internet. In particular, modern cryptosystems are designed in order to ensure at least one of the four following in-formation security properties:
a. confidentiality: the transmitted message must be readable only by a chosen pool of authorised entities;
4 Chapter 1. Context, Objectives and Contributions
b. authenticity: the receiver can verify the identity of the sender of a message;
c. non-repudiation: the sender of a message cannot deny having sent the message afterwards;
d. data integrity: the receiver can be convinced that the message has not been corrupted during the transmission.
Two branches of cryptography may be distinguished: the symmetric cryptogra-phy and the asymmetric cryptography. The first one historically appeared before and is based on the hypothesis that the two communicating entities share a common se-cret, or secret key; for this reason this is also called secret key cryptography. The second one, introduced around 1970, allows any entity to encrypt a message in such a way that only a unique chosen other entity could decrypt it; this is also called public key cryptography.
A general principle in cryptography, nowadays widely accepted by cryptogra-phy researchers, is the one given by Kerckhoff in 19th century: it states that cryp-tosystems should be secure even if everything about the system, except the key, is public knowledge. Following this principle, today many industrials and govern-mental agencies exploit, for their security services, cryptosystems based over stan-dardised algorithms. Such algorithms are of public domain, thus have been tested and tried to be broken by a large amount of people, before, during and after the stan-dardisation process. Resistance to many attempts of attacks is actually the strengths of standard algorithms. Low-level cryptographic routines, called primitives, are often used as building blocks to construct cryptographic protocols. We provide hereafter a description of a standard primitive, the symmetric AES, whose implementation will be the target of all experiments described in this thesis.

Description of AES

The Advanced Encryption Standard (AES) has been standardised in 2001 by the United States governmental agency National Institute of Standards and Technology (NIST), through the Federal Information Processing Standards Publication 197 (FIPS PUB 197) [NIS]. It is a block cipher, meaning that the encryption and decryption of the AES are functions that take as input a string (respectively the plaintext or the ciphertext) of fixed length over the binary alphabet. Indeed, the AES operates on blocks of 128 bits.1 There exist three versions of AES, characterized by the size of the used key:
1When a block cipher is used to encrypt a plaintext of different size, the plaintext is chunked into blocks of the appropriate one, and each block is encrypted accordingly to a so-called mode of operation.
128, 192 or 256 bits. The encryption is done by rounds. The number of executed rounds depends on the key size (10 rounds for 128 bits, 12 for 192 and 14 pour 256). The basic processing unit in AES algorithm is a byte. For AES internal operations, bytes are arranged on a two-dimensional array called the state, denoted s. Such a state has 4 rows and 4 columns, thus contains 16 bytes. The byte lying at the i-th row, j-th column of s will be denoted by si;j for i; j 2 f0; 1; 2; 3g. The 16 input bytes and the 16 output bytes are indexed column-wise as shown in Fig. 1.1. Each ele-ment si;j of the state is mathematically seen as an element of the Rjindael finite field, defined as GF (28) = Z=2Z[X]=P (X) where P (X) = X8 + X4 + X3 + X + 1. Five functions are performed during the AES, named KeySchedule, AddRoundKey, Sub-Bytes, ShiftRows and MixColumns. At high level the AES algorithm is described hereafter:
AddRoundKey
Each byte of the state is combined with the corresponding byte of the round key via an addition over the Rjindael field GF (28), i.e. a bitwise exclusive OR (XOR) operation .
SubBytes
The SubBytes transformation is a non-linear byte invertible substitution that oper-ates independently on each byte of the State using a substitution table (called Sbox). The SubBytes is composed of the following two functions:
the inversion in GF (28) where the element f00g is mapped to itself the affine transformation which maps each byte bi to: bi b(i+4)mod 8 b(i+5)mod 8 b(i+6)mod 8 b(i+7)mod 8 ci , (1.1) where ci is the ith bit of f63g = (01100011)2.
ShiftRows
The bytes in the last second, third and fourth rows of the State are cyclically shifted over 1, 2, and 3 byte(s) respectively.
MixColumns
Each column of the State is treated as a four-term polynomial. They are considered as polynomials over the Rjindael field GF (28) and multiplied modulo X4 + 1 with a fixed polynomial a(X) = f03gX3 + f01gX2 + f01gX + f02g.
KeySchedule
To lighten notations, the KeySchedule is described for the 128-bits cipher, which al-lows to fix many parameters to the value 4. For the 192-bits and 256-bits such param-eters have to be fixed respectively to 6 and 8. The key round of the initial round of AES coincides with the secret encryption key K = (k0;0; k0;1; : : : ; k0;3; k1;0; : : : ; k1;3; : : : ; k3;3). The i-th round key is given by
Ki = (k4i;0; k4i;1; : : : ; k4i;3; k4i+1;0; : : : ; k4i+1;3; : : : ; k4i+3;3); where k4i+a;b is calculated, for i > 0, a 2 f0; : : : 3g and b 2 f0; : : : ; 3g, as follows: 8k4i+a;b = k4i+a 4;b k4i+a 1;b if a 6= 0 < 4;b Sbox(k4i+a 1;(b+1)mod 4) Rcon(a) if a = 0 and b = 0 k4i+a;b = k4i+a >k4i+a;b = k4i+a 4;b Sbox(k4i+a 1;(b+1)mod 4) if a = 0 and b 6= 0 , where Rcon(a) = f02ga 1 in the Rjindael finite field,2 and Sbox is the substitution table used for the SubBytes transformation.

Secure Components

As we have seen in the previous section, modern cryptography proposes solutions to secure communications that ask for electronic computations and repose their se-curity over some secret keys. Keys are represented as long bit strings, very hard to be memorised by users. Thus, keys need to be stored in a secure medium, and never delivered in clear over insecure channels. Smart cards (or smartcards) were historically conceived as a practical solution to such a key storage issue: they con-sist in small devices a user can easily carry around with, which not only store secret keys, but also are able to internally perform cryptographic operations, in such a way that they can be involved in secure communication protocols, that do not require the delivering of the secret keys. The registrations of a first patent describing mem-ory cards by Roland Moreno in 1974 [Mor74], and of a second one describing cards equipped with microcontrollers by Michel Ugon in 1977 [Ugo77] are often referred to in order to date the smart card invention, finally produced for the first time in 1979. Smart cards are pocket-sized plastic-made cards equipped with a secure com-ponent, which is typically an integrated circuit containing some computational units and some memories.
Today, about 40 years after its invention, they still have a huge diffusion, both in terms of applicative domains and in terms of quantity of exemplars. Indeed, they serve as credit or ATM cards, healthy cards, ID cards, public transport pay-ment cards, fuel cards, identification and access badges, authorization cards for pay television, etc. Slightly changing the card support, we find other applications of the same kind of integrated circuits, for example the mobile phone SIMs (Subscriber Iden-tity Module) and the electronic passports. In terms of quantity, a marketing research [Abi] found out that in 2014 8.8 billion smart cards have been sold, i.e. the same order of magnitude of the global population.
In addition to smart cards, the recent growing and variation of security needs lead to the development and specification of other kinds of secure solutions, for example the Trusted Platform Module (TPM), which is a secure element providing cryptographic functionalities to a motherboard, or completely different solutions based on software layers, that are today in great expansions. An example is pro-vided by the Trusted Execution Environment (TEE), which is a software environment of the main processor of a smartphone or tablet, designed to assure resistance to software menaces.

READ Private Ownership of a Cultural Event: Do Attendees Perceive it as a Risky Lottery?

Embedded Cryptography Vulnerabilities

Side-Channel Attacks

Until the middle of the nineties, the security of embedded cryptosystems was con-sidered, in the public domain, as equivalent to the mathematical security of the em-bedded cryptographic algorithm. In classical cryptanalysis, an attacker usually has the knowledge of the algorithm (in accordance to Kerckhoff’s principle) and of some inputs and/or outputs. Starting from these data, his goal is to retrieve the secret key. This attack model considers the algorithm computation as a black box, in the sense that no internal variable can be observed during execution, only inputs and/or out-puts. With his seminal paper about Side-Channel Analysis in 1996, Paul Kocher showed that such a black-box model fails once the algorithm is implemented over a physical component [Koc96]: an attacker can indeed inspect its component during the execution of the cryptographic algorithm, monitor some physical quantities (e.g. the execution time [Koc96] or the instantaneous power consumption [KJJ99]) and deduct information about internal variables of the algorithm. Depending on the at-tacked algorithm, making inference over some well chosen internal variables (the so-called sensitive variables of the algorithm) is sufficient to retrieve the secret key. After these first works, it was shown that other observable physical quantities contained leakages on sensitive information; for example the electromagnetic radiation emanat-ing from the device [GMO01; QS01] and the acoustic emanations [GST14]. More-over, if until few years ago it was thought that only small devices, equipped with slow microprocessors and with small-sized architecture, such as smart cards, were vulnerable to this kind of Side-Channel Attacks, the last cited recent work about acoustic emanations, together with other works exploiting electromagnetic fluctu-ations, pointed out that much faster and bigger devices, i.e. laptops and desktop computers, are vulnerable as well [Gen+15; GPT15; Gen+16].

A Classification of the Attacks against Secure Components

The Side-Channel Attacks outlined in previous paragraph, and which are the main concern of this thesis, belong to a much bigger family of hardware attacks that can be performed to break cryptographic devices’ security claims. A classification for hardware attacks is briefly outlined in Tab. 1.1. They are commonly classified on the base of two criteria: on one hand we can distinguish passive and active attacks, on the other hand we can distinguish invasive, semi-invasive and non-invasive attacks.
Passive attacks: in passive attack, the device run respecting its specifications. The attacker observes its behaviour without provoking any alteration;
Active attacks: in active attacks a special manipulation is performed in order to corrupt the normal behaviour of the device.
Invasive attacks: in invasive attacks, the device is unpackaged and inspected at the level of the component technology. The circuit can be modified/broken, signals can be accessed via a probing station, etc. There is no limits to the manipulations an attacker can do to the component;
Semi-invasive attacks: as in invasive attacks the device is unpackaged, but in contrast to them, no direct electrical contact to the chip is done;
Non-invasive attacks: in non-invasive attacks the device is not modified and only accessible interfaces are exploited.
In the literature, the term Side-Channel Attacks (SCAs)3 commonly refers to the passive non-invasive attacks. Nevertheless, the techniques proposed under the name of SCAs, that always require the acquisition of some signals, might also in-clude attacks where the device is unpacked, in order to improve the signal ampli-tude. In this sense, SCAs belong to the semi-invasive group of attacks as well. Sim-ilarly, active non-invasive attacks are often referred to as Fault Injection Attacks, that might also be run in a semi-invasive way.
Beyond hardware attacks, there exists a second class of attacks that menaces the security of cryptographic devices: the software attacks. In contrast with hardware attacks, software attacks exploit vulnerabilities that are not related to the physical implementation of the cryptographic functionalities of the device: they are not based on hypotheses about the material execution of the cryptographic algorithms, but ex-ploit vulnerabilities of the software interfaces. A typical example of software attack consists in charging malware code into the device, enabling access to data and in-structions contained in memories (RAM or ROM), in order to retrieve, modify or destroy information they hold. In last years, together with the growing complexity of secure devices, attacks become more and more sophisticated and the boundary between hardware and software attack is more and more blurred. Moreover com-bined software/hardware attacks are being developed, e.g. [BICL11].

Table of contents :

Acknowledgements
I Context and State of the Art
1 Context, Objectives and Contributions
1.1 Introduction to Cryptography
1.1.1 Description of AES
1.2 Secure Components
1.2.1 Embedded Cryptography Vulnerabilities
1.2.1.1 Side-Channel Attacks
1.2.1.2 A Classification of the Attacks against Secure Components
1.2.2 Certification of a Secure Hardware – The Common Criteria
1.2.2.1 The actors
1.2.2.2 The Target of Evaluation and the security objectives
1.2.2.3 Evaluation Assurance Level and Security Assurance Requirements
1.2.2.4 The AVA_VAN family and the Attack Potential
1.2.2.5 The Evaluation Technical Report
1.3 This thesis objectives and contributions
1.3.1 The Preliminary Purpose of this Thesis: Research of Points of Interest
1.3.2 Dimensionality Reduction Approach
1.3.3 Towards Machine Learning and Neural Networks Approach
2 Introduction to Side-Channel Attacks
2.1 Notations and Probability and Statistics Recalls
2.2 Side-Channel Attacks: an Overview
2.3 Physical Nature of the Exploited Signals
2.4 Sensitive Variables
2.5 The Strategy Family
2.5.1 Simple Attacks
2.5.2 Collision Attacks
2.5.3 Advanced Attacks
2.6 The Shape of the Attack
2.7 The Attacker Knowledge
2.8 Efficiency of the SCAs
2.9 Advanced Attacks
2.9.1 Leakage Models
2.9.2 Distinguishers
2.10 Profiling Side-Channel Attacks
2.10.1 Template Attack
2.10.1.1 The Curse of Dimensionality
2.10.1.2 The Gaussian Hypothesis
2.10.2 Points of Interest and Dimensionality Reduction
2.11 Main Side-Channel Countermeasures
2.11.1 Hiding
2.11.2 Masking
3 Introduction to Machine Learning
3.1 Basic Concepts of Machine Learning
3.1.1 The Task, the Performance and the Experience
3.1.2 Example of Linear Regression
3.1.3 Example of Linear Model for Classification
3.1.4 Underfitting, Overfitting, Capacity, and Regularization
3.1.5 Hyper-Parameters and Validation
3.1.6 No Free Lunch Theorem
3.2 Overview of Machine Learning in Side-Channel Context
II Contributions
4 Linear Dimensionality Reduction
4.1 Introduction
4.2 Principal Component Analysis
4.2.1 Principles and algorithm description
4.2.2 Original vs Class-Oriented PCA
4.2.3 Computational Consideration
4.2.4 The Choice of the Principal Components
4.2.4.1 Explained Local Variance Selection Method
4.3 Linear Discriminant Analysis
4.3.1 Fisher’s Linear Discriminant and Terminology Remark
4.3.2 Description
4.3.3 The Small Sample Size Problem
4.3.3.1 Fisherface Method
4.3.3.2 SW Null Space Method
4.3.3.3 Direct LDA
4.3.3.4 ST Spanned Space Method
4.4 Experimental Results
4.4.1 The testing adversary
4.4.2 Scenario 1
4.4.3 Scenario 2
4.4.4 Scenario 3
4.4.5 Scenario 4
4.4.6 Overview of this Study and Conclusions
4.5 Misaligning Effects
5 Kernel Discriminant Analysis
5.1 Motivation
5.1.1 Getting information from masked implementations
5.1.2 Some strategies to perform higher-order attacks
5.1.2.1 Higher-Order Version of Projection Pursuits
5.1.3 Purpose of this Study
5.2 Feature Space, Kernel Function and Kernel Trick
5.3 Kernel Discriminant Analysis
5.3.1 KDA for dth-order masked side-channel traces
5.3.2 The implicit coefficients
5.3.3 Computational complexity analysis
5.4 Experiments over Atmega328P
5.4.1 Experimental Setup
5.4.2 The Regularisation Problem
5.4.3 The Multi-Class Trade-Off
5.4.4 Asymmetric Preprocessing/Attack Approach
5.4.5 Comparison with Projection Pursuits
5.5 Conclusions and Drawbacks
6 Convolutional Neural Networks
6.1 Motivation
6.2 Introduction
6.3 Neural Networks and Multi-Layer Perceptrons
6.4 Learning Algorithm
6.4.1 Training
6.4.2 Cross-Entropy
6.5 Attack Strategy with an MLP
6.6 Performance Estimation
6.6.1 Maximal Accuracies and Confusion Matrix
6.6.2 Side-Channel-Oriented Metrics
6.7 Convolutional Neural Networks
6.8 Data Augmentation
6.9 Experiments against Software Countermeasures
6.9.1 One Leaking Operation
6.9.2 Two Leaking Operations
6.10 Experiments against Artificial Hardware Countermeasures
6.10.1 Performances over Artificial Augmented Clock Jitter
6.11 Experiments against Real-Case Hardware Countermeasures
6.12 Conclusion
7 Conclusions and Perspectives
7.1 Conclusions
7.2 Tracks for FutureWorks
A Cross-Validation
B Artificially Simulated Jitter
C Kernel PCA construction
C.1 Kernel class-oriented PCA
Bibliography