Privacy-preserving classification based on quadratic polynomial functional encryption

Get Complete Project Material File(s) Now! »

Decisional Diﬃe-Hellman assumption

The Decisional Diﬃe-Hellman assumption (DDH) has been used to prove that the Diﬃe-Hellman protocol [DH76a] was useful for practical crypto-graphic purposes. It speaks about groups and in practice those are finite groups. The computational Diﬃe-Hellman assumption (CDH) which states that in a particular group, no eﬃcient algorithm can compute gab with only g, ga, gb, was not suﬃcient. The DDH assumption is way stronger than the CDH assumption. Assuming that DDH stands in a group means that there is no eﬃcient probabilistic algorithm that given any triplet ga, gb, gc outputs “true” if a = bc and “false” otherwise. We note that there exist groups where DDH assumption is false but for which the CDH assumption is believed to be true. We now give the formal definition of the DDH assumption. For more details about CDH and DDH please refer to [Bon98, Gal12].
Definition 1. [Bon98] In a cyclic group G of prime order q, the Decisional Diﬃe-Hellman (DDH) problem is to distinguish the distributions D0 = {(g, ga, gb, gab)|g R – G, a, b R – Zq} and D1 = {(g, ga, gb, gc)|g R – G, a, b, c R – Zq}.

Extended learning with errors assumption

The extended learning with error problem has been introduced by O’Neil, Peikert and Waters [OPW11]. It can be described as LWE↵,q, with a number m fixed. The problem is to distinguish between the following two distribu-tions (where z is sampled from a specific distribution): D 0 = {(A, A · s + e, z, he, zi)|A – Zm⇥n, s – Zn, e – Dm } q q Z,↵q and D1 = {(A, u, z, he, zi)|A – Zmq⇥n, u – Zmq, e – DZm,↵q}.
Many variants of the learning with error problem exist. One of those is the multi-hint extended-LWE problem [ALS16]. Its definition follows. Definition 2. [ALS16] The multi-hint extended-LWE problem is a variant of extended-LWE problem for which multiple hints are given for the same noise term. It exists a reduction from LWE to mheLWE [ABDCP15]. It is formal-ized as follows. Let q, m, t be integers, ↵ be a real and ⌧ be a distribution over Zt ⇥ m , all of them functions of a parameter n. The multi-hint extended-LWE problem mheLWEq,↵,m,t,⌧ is to distinguish between the distributions of the tuples (A, A · s + e, Z, Z · e) and (A, u, Z, Z · e) .

Public-key setting versus private-key setting

Because of the public-key setting, every constant in the function associated with a secret key is known by its owner. Indeed, since Bob is able to encrypt every message xi with Alice’s public key, and also to decrypt all of them with his secret key skf associated with the function f , he gets an unlimited number of couples (xi, f (xi)) which enables him to determine the constants of f . On the contrary, with a private-key setting, it is possible to have hidden values in secret key functions. There is, of course, no public key in such systems, and the encryption algorithm takes instead as input the master secret key. When it is about private-key functional encryption, we can find in the literature some inner-product functional encryption schemes [BS15, SSW09, BRS13], but also some multi-input inner-product functional encryption schemes [BKS16, DOT18, KS17]. This thesis focuses on public-key functional encryption.

Functional encryption’s security

A fundamental security requirement for functional encryption schemes is called collusion resistance. Indeed, the authority of a functional encryp-tion scheme delivers several secret keys associated with diﬀerent functions (authorized function evaluation) to the users. The idea behind collusion resistance is that if a user owns diﬀerent secret keys {skfi }i and an en-cryption of x, he cannot learn about x more than {fi(x)}i (the set of the evaluation outputs). This property is captured by two security definitions: indistinguishability-based security and simulation-based security.
The idea behind indistinguishability-based security (IND) is that an at-tacker is given two messages x0 and x1 such that, for all the secret keys {skfi } he owns, fi(x0) = fi(x1). Then, when he gets an encryption of xb, he has to determine if b = 0 or if b = 1. It shows how a single message can remain secure against an arbitrary number of users performing a collusion. However, it appears that the IND definition is not strong enough: indeed, it has been shown [BSW11, O’N10] that a trivially insecure scheme can be proved IND-secure. The simulation-based security (SIM) means that a secret key associated with a function f makes it only possible to get f (x) from an encryption of x. Some results [BSW11, BO13, AGVW13] show that the SIM definition is not always achievable. To sum up, the SIM-based security is has real security guarantee but might be impossible to achieve for som cases, and the IND-based security does not prevent a scheme from being insecure.

Diﬀerence between functional encryption and fully homomorphic encryption

Traditional public-key cryptography has been generalized in many ways, among others, Functional Encryption (FE) but also Fully Homomorphic En-cryption (FHE) [G+09, AGH10, VDGHV10, BGV12, Bra12, FV12, GSW13, BGV14, CGGI16, BDF18]. Both enable to compute algorithms over encryp-ted inputs but one diﬀerence between them is that FE’s decryption algorithm output is unencrypted, when FHE one’s remains encrypted. Figure 2.5 shows an overview of an FHE system. There is no need of a trusted authority within FHE systems. It works as a traditional public key system except that there are two additional algorithms: add and mul. The first one takes two ciphertexts cm and cm0 of two plaintexts m and m0, and it returns a new ciphertext, which is the encryption of the message m + m0. The second algorithm takes also cm and cm0 bur returns a new ciphertext, which is the encryption of the message m ⇥ m0. It means

Table of contents :

1 Introduction
I State of the art
2 Functional Encryption
2.1 Public key cryptography
2.1.1 Decisional Diffie-Hellman assumption
2.1.2 Extended learning with errors assumption
2.2 Introduction to Functional encryption
2.2.1 Public-key setting versus private-key setting
2.2.2 Functional encryption’s security
2.2.3 Difference between functional encryption and fully homomorphic encryption
2.2.4 Functional encryption’s leakage
2.3 Inner-product functional encryption
2.3.1 Inner-product functional encryption based on the DDH assumption
2.3.2 Inner-product functional encryption based on the LWE assumption
2.4 Functional encryption beyond the inner-product functionality
3 Necessary machine learning background
3.1 Machine learning basics
3.1.1 Supervised machine learning
3.1.2 Unsupervised machine learning
3.2 Classification algorithms
3.2.1 Linear classification
3.2.2 Extremely randomized trees classifier
3.3 Artificial neural networks
3.3.1 Fully connected neural network
3.3.2 Convolutional Neural Networks
3.3.3 Generative Adversarial Networks
3.3.4 Activation functions
3.3.5 Neural network measures
3.4 Other related details
3.4.1 MNIST dataset
3.4.2 Principal component analysis
II Contributions
4 Privacy-preserving classification based on functional encryption
4.1 Overview
4.2 Implementation of inner-product functional encryption schemes
4.3 Privacy-preserving classification based on inner-product functional encryption
4.3.1 Linear classification
4.3.2 Extremely randomized trees classification
4.3.3 Neural network classification
4.3.4 Experimental results
4.4 Privacy-preserving classification based on quadratic polynomial functional encryption
4.4.1 Experimental results
4.5 Discussion
5 Leakage-based attacks on functional encryption settings
5.1 Context of the attack
5.2 Attack methods
5.2.1 Principal component analysis
5.2.2 Fully connected network
5.2.3 Convolutional network
5.3 Results analysis
5.3.1 Digit comparison
5.3.2 Privacy-preserving classification and MNIST datasetuse case
5.4 Generalization to others functional encryption schemes
5.5 Discussion
6 Tradeoff between classification performance and attack efficiency
6.1 Vectors with the best classification performance
6.2 Generating vectors with different approaches
6.2.1 Random vectors
6.2.2 GAN obtained vectors
6.3 Results analysis
6.4 Discussion
7 Conclusion
8 R´esum´e en fran¸cais