Secure Enforcement for Global Process Specifications

Get Complete Project Material File(s) Now! »

Secure Enforcement for Global Process Specifications

Introduction

In this chapter, we consider high-level specifications of distributed applications, which focus on global interactions and hide local implementation details. These high-level specifications define a protocol interface, that can be used to program and verify the conformance of local implementations. In an adversarial setting, however, there is no guarantee that every local implementation complies with the protocol. In this chapter, we propose a generic defensive implementation scheme ensuring that the protocol is performed correctly.
The whole thesis concerns the construction of defensive implementation schemes; the next chapters consider lower level specifications that include local implementation details, and use a more concrete model of cryptography (computational instead of semantics). However, the higher level of abstraction of this chapter is adequate for the control-flow problem considered here and we believe that the semantics properties used to prove our theorems are easily implemented (and computationally proved) with basic cryptography.

Distributed Programming with Sessions

Distributed applications may be structured around their global interactions: the pattern in which they exchange messages (often called a protocol) is specified, but not their local implementation details. For example, we consider an e-commerce protocol, involving three participants: a customer, a merchant, and a bank. The customer wants to negotiate a good with the merchant, who wants to verify that the customer is able to pay. We can define the global interactions of this protocol :
First, the customer requests the price of the good to the merchant.
Then, the merchant responds with its price.
Then, the customer accepts or refuse the oﬀer.
Then, if the customer accepts the oﬀer, the merchant asks the bank for a wiring authorization.
Last, the bank sends an authorization or a refusal of payment.

This protocol specifies the global interactions, but gives no details on how the merchant chooses the price, how the customer estimates the oﬀer, or how the bank chooses if it authorizes the payment.

When the protocol is specified, each machine that takes part in the application is assigned a fixed initial process corresponding to its role in the protocol, e.g., customer (hence, often called a role) and may run any local code that implements its process, under the global assumption that all other participating machines will also comply with their respective assigned processes. In our example, the role of the bank specifies that it should send an authorization or refusal of payment when it receives the request of the merchant. Moreover, when the bank receives the request, it can assume that the protocol has been respected until this point. For instance, it can assume that the merchant contactes him only if the customer has accepted the transaction.

This approach yields strong static guarantees, cuts the number of cases to consider at runtime, and thus simplifies distributed programming. It is explored with binary sessions [Honda et al., 1998, Takeuchi et al., 1994, Gay and Hole, 1999] i.e., sessions with only two participants, and, more recently, multiparty sessions [Corin et al., 2007, Honda et al., 2008]. Within sessions, machines exchange messages according to fixed, pre-agreed patterns, for instance a sequence of inputs and outputs between a client and a server; these patterns may be captured by types, expressed as declarative contracts (also known as workflows), or more generally specified as processes.

Multiparty session types are an active subject of research. The multiparty session type theory is developed in process calculi [Bettini et al., 2008, Capecchi et al., 2010, Mostrous et al., 2009], and used in several diﬀerent contexts such as distributed object communication optimizations [Sivaramakrishnan et al., 2010], design by contract [Bocchi et al., 2011], parallel and web service programming [Pernet et al., 2010, Yoshida et al., 2009] and medical guidelines [Nielsen et al., 2010]. Deni´elou and Yoshida [2011a] use types to determine the maximum size of message buﬀers necessary to run the session. Deni´elou and Yoshida [2011b] study more dynamic sessions, where the number of participants can change during the session. Yoshida et al. [2010] study sessions parametrized by indexes. Dezani-Ciancaglini and de’Liguoro [2010] do an overview on session types.

The secure implementation problem

These global specifications provide an adequate level of abstraction to address distributed security concerns. Each machine is interpreted as a unit of trust, under the control of a principal authorized to run a particular role in the application. Conversely, these principals do not necessarily trust one another. Indeed, many applications commonly involve unknown parties on open networks, and a machine may participate in a session run that also involves untrustworthy machines (either malicious, or compromised, or poorly programmed). In this situation, we still expect session integrity to hold: an “honest” machine should accept a message as genuine only if it is enabled by the global specification. To this end, their implementation may perform cryptographic checks, using for example message signatures, as well as checks of causal dependencies between messages. In our example, the bank should not accept the merchant’s message if the customer did not accept the transaction. To this end, the bank may ask for the signed message of the customer as a justification for the merchant’s request.

Automated protocol synthesis for sequential sessions

Cryptographic implementation mechanisms need not appear in the global specification. They can sometimes be automatically generated from the specification. Corin et al. [2007] and Bhargavan et al. [2009] perform initial steps in this direction, by securing implementations of n-ary sequential sessions. Their sessions specify sequences of communications between n parties as paths in directed graphs. A global session graph is compiled down to secure local implementations for each role, using a custom cryptographic protocol that protects messages and monitors their dependencies. Their main security result is that session integrity, that is, the respect of the session specification, is guaranteed for all session runs—even those that involve compromised participants. They also report on prototype implementations, showing that the protocol overhead is small. Although their sequential sessions are simple and intuitive, they also lack flexibility, and are sometimes too restrictive. For instance, concurrent specifications must be decomposed into series of smaller, sequential sessions, and the programmer is left to infer any security properties that span several sub-sessions. More generally, they leave the applicability of their approach to more expressive specification languages as an open problem.

In this chapter, we consider a much larger class of session specifications that enable concurrency and synchronization within session runs; and we systematically explore their secure implementa-tion mechanisms, thereby enforcing more global properties in a distributed calculus. We aim at supporting arbitrary processes, and as such we depart significantly from prior work: our specifica-tion language is closer to a generic process algebraic setting, such as CCS [Milner, 1989]. On the other hand, we leave the cryptography implicit and do not attempt to generalize their concrete protocol design. The example below illustrates our session calculus, with concurrent specification.

Example 1 To illustrate our approach, consider a simple model of an election, with three voting machines (V1, V2, and V3) and one election oﬃcer machine (E) in charge of counting the votes for two possible candidates (c1 and c2) and sending the result (r1 or r2) to a receiver machine (R).

We specify the possible actions of each machine using CCS-like processes (defined in Section 2.2), as follows: V1 = V2 = V3 = c1 + c2 E = c1.c1.r1 | c2.c2.r2 R = r1 + r2.

Machines running process Vi for i = 1, 2, 3 may vote for one of the two candidates by emitting one of the two messages c1 or c2; this is expressed as a choice between two output actions. The machine running process E waits for two votes for the same candidate and then fires the result; this is expressed as a parallel composition of two inputs followed by an output action. Finally, the machine running process R waits for one of the two outcomes; this is expressed as a choice between two input actions.

Protocols and applications

We now give an overview of the protocols and the attacker model. Our processes represent protocol specifications, rather than the application code that would run on top of the protocol. Accord-ingly, our protocol implementation monitors and protects high-level actions driven by an abstract application. The implementation immediately delivers to the application any input action enabled by the specification, and sends to other machines any action output enabled by the specification and selected by the application. The application is in charge of resolving internal choices between diﬀerent message outputs enabled by the specification, and to provide the message payloads. In our e-commerce example, the application on the bank machine is in charge of accepting or refus-ing the wiring, while our implementation is in charge of verifying that the merchant’s request is legitimate. And in Example 1, the application on the voters machine is in charge of choosing a vote.

We intend that our implementations meet two design constraints: (1) they rely only on the global specification—not on the knowledge of which machines have been compromised, or the mediation of a trusted third party; and (2) they do not introduce any extra message: each high-level communication is mapped to an implementation message. This message transparency constraint excludes unrealistic implementations, such as a global, synchronized implementation that reaches a distributed consensus on each action. For instance, the implementation of our e-commerce example cannot add a message from the customer to the bank confirming that he accepted the trade. And in Example 1, the voters cannot directly send a confirmation to the receiver machine.

Attacker Model

We are interested in the security of any subset of machines that run our implementation, under the assumption that the other machines may be corrupted. In Example 1, for instance, the receiver machine should never accept a result from the election oﬃcer if no voter has cast its vote.

Our implementations provide protection against active adversaries that controls parts of the session run: We assume an unsafe network, so the adversary may intercept, reorder, and inject messages—this is in line with classic symbolic models of cryptography pioneered by Dolev and Yao [1983]. We also assume partial compromise, so the adversary may control some of the machines supposed to run our implementation, and run instead arbitrary code; hence, those machines need not follow their prescribed roles in the session specification and may instead collude to forge messages in an attempt to confuse the remaining compliant, honest machines.

We focus on global control flow integrity, rather than cryptographic wire formats; thus, we assume that the message components produced by the compliant machines cannot be forged by our adversary. (This can be achieved by standard cryptographic integrity mechanisms, such as digital signatures.) Conversely, the adversary controls the network and may forge any message component from compromised machines. For instance, our implementation of the e-commerce example should ensures that a corrupt merchant cannot forge a convincing message to the bank without the customer agreement. And in Example 1, a corrupt election oﬃcer should not be able to forge a result without the corresponding votes. Also, we do not address other security properties of interest, such as payload confidentiality or anonymity.

Contents

Section 2.2 defines a global semantics for specifications. Section 2.3 describes their generic imple-mentations, and states their soundness and completeness properties. Section 2.4 considers binary specifications. Section 2.5 explains our diﬃculties with implementing multiparty specifications. Section 2.6 defines our type system to avoid unimplementability. Section 2.7 presents our history-tracking implementation and establishes its correctness for well-typed specifications. Section 2.8 considers sequential n-ary sessions. Section 2.9 concludes.

Global process specifications

We consider specifications that consist of distributed parallel compositions of local processes, each process running on its own machine. In the remainder of the chapter, we let n be a fixed number of machines, and let P range over global process specifications, that is, n-tuples of local processes (P0, . . . , Pi, . . . , Pn−1).

Syntax and informal semantics

Our local processes use a CCS syntax, given below. For simplicity, we choose not to include message payloads in our description of the protocol, since the enforcement of their security is mostly independent to the enforcement of the session control flow (our focus in this chapter). Also, our message outputs are asynchronous (i.e., no action can directly depend on the correct emission of a message) since, in a distributed protocol, it is not possible to know whether a message has been received (without an extra acknowledgment message).

Distributed process implementations

We describe distributed implementations of the specifications of Section 2.2, each process being mapped to one machine. A distributed implementation defines the messages the machine can send or receive depending on its implementation state, but it also sets their assumptions on the adversary, i.e., what the adversary is able to do depending on which machines are compromised. Hence, we first specify how we separate compliant (honest) machines from compromised (dishonest) machines, then we define their implementations, their semantics, and their properties (soundness and completeness). The implementation of the compliant machines does not depend on knowing which machines are compromised, as discussed in Section 2.1. However, the global implementation semantics that defines the possible executions in our adversarial model is necessarily parametrized by the set of compliant machines.

We let C range over subsets of {0, . . . , n − 1}, representing the indexes of P whose machines are compliant. We let P range over tuples of processes indexed by C. Intuitively, these machines follow our implementation semantics, whereas the other machines are assumed to be compromised, and may jointly attempt to make the compliant machines deviate from the specification P . For instance, if the election oﬃcer of Example 1 is compromised, it may immediately issue a result r1 without waiting for two votes c1. Similarly, a compromised voter may attempt to vote twice. We are interested in protecting (the implementations of) compliant machines from such actions.

Next, we give a generic definition of process implementations. In the following sections, we show how to instantiate this definition for a given specification. Informally, an implementation is a set of programs, one for each specification process to implement, plus a definition of the messages that the adversary may be able to construct, in the spirit of symbolic models for cryptography [Dolev and Yao, 1983].
Definition 1 (Distributed implementation) A distributed implementation is a triple γ (abbreviated Q, −→, ⊢C ) where, for each i ∈ 0..n − 1 and each (Qi)i<n, (−→i)i<n, (⊢C )C⊆0..n−1 C ⊆ 0..n − 1, Qi is an implementation process; γ • −→i is a labeled transition relation between implementation processes; • ⊢C is a relation between message sets and messages.

In the definition, each Qi is an initial implementation processes, and each −→i is a transition relation between implementation processes of machine i, labeled with either the input (M ) or output (M ) of a message M (the syntax of M is left abstract). We use the notation Qi only for implementation processes. Further processes are denoted Q′i, Q′′i,. . . . We let γ range over M and M , and let ψ range over sequences of γ, representing low-level traces. For each implementation process, these transitions define the messages that may be received and accepted as genuine and the messages that may be sent. We assume that every message M implements a single high-level communication α. (Intuitively, M is an unambiguous wire format for α, carrying additional information.) We write ρ(γ) for the corresponding high-level communication α, obtained by parsing (with, ρ(M ) = ρ(M )).

The relations ⊢C model the capabilities of an adversary that controls all machines outside C to produce (or forge) a message M after receiving (or eavesdropping) the set of messages M. For instance, if the implementation relies on digital signatures, the definition of ⊢C may reflect the capability of signing arbitrary messages on behalf of the non-compliant machines.

The transition relation on machine 0 allows its local program to send a message a to machine 1; then, it transmits the first message b received from machine 1. In particular, we choose not to allow the reception of b before the emission of a since this behavior is not possible in the global specification. Conversely, the transition relation on machine 1 transmits the first message a received from machine 0 to the program on machine 1; then, it allows it to send a message b to machine 0. Concretely, the first adversary rule models an open network, where the adversary can replay messages. The second adversary rule is reminiscent of symbolic cryptography models. Suppose that each participant has its private key and the public key of the other participant. If every participant sign its messages and verify the signature of the messages it receives, the adversary can produces fake messages for compromised participants, as modeled in the second rule.
(A0, A1), (−→0, −→1), (⊢0,1, ⊢0, ⊢1) is a distributed implementation.

Distributed semantics (→−I)

For a given distributed implementation Q, →−, ⊢C and a given set of compliant machines C, we define a global implementation semantics that collects all message exchanges between compliant machines on the network. In our model, compromised processes do not have implementations; instead, the environment represents an abstract adversary with the capabilities specified by ⊢C .

Thus, the global implementation transitions −→I relate tuples of compliant implementation processes Q′ and also collects all compliant messages M exchanged on the network (starting with an γ M ′, Q′, with the following two rules:

Rule (SENDI) simply records the message sent by a compliant participant, after ensuring that its index i matches the sender recorded in the corresponding high-level communication i a j (so that no compliant participant may send a message on behalf of another). Rule (RECEIVEI) enables a compliant participant to input any message M that the adversary may produce from M (including, for instance, any intercepted message) if its index matches the receiver i in the corresponding high-level communication j a i; this does not aﬀect M, so the adversary may a priori replay M several.

Table of contents :

1 Introduction
1.1 Enforcing security by construction
1.2 How to specify security?
1.3 How to enforce security?
1.4 Our work
1.5 Roadmap
2 Secure Enforcement for Global Process Specifications
2.1 Introduction
2.2 Global process specifications
2.3 Distributed process implementations
2.4 Implementing two-party specifications (application)
2.5 Unimplementability
2.6 Implementability by typing
2.7 History-tracking implementations
2.8 Sequential multiparty sessions (application)
2.9 Conclusion
2.10 Proofs
3 Information-Flow and Non-Interference
3.1 An imperative probabilistic While language
3.2 Security policies
3.3 Information-flow and Non-interference
3.4 Declassifications and endorsements
3.5 Properties of program transformations
3.6 Proofs
4 Cryptographic Information Flow Compiler
4.1 Cryptographic Information Flow Compiler
4.2 Cryptographic assumptions
4.3 Security and completeness of the compiler
5 Information-Flow Types for Homomorphic Encryptions
5.1 Introduction
5.2 Datatypes for encryptions and keys
5.3 Typing Encryptions
5.4 Blinding Schemes: Security and Typing
5.5 Homomorphic encryptions
5.6 Computational Soundness
5.7 Private Search on Data Streams
5.8 Bootstrapping Homomorphic Encryptions
5.9 Proof of Theorem 10
6 Compiling Information-Flow Security to Minimal Trusted Computing Bases
6.1 Programming with TPMs
6.2 An imperative higher-order language
6.3 Information flow security
6.4 Command semantics for secure instructions
6.5 Attested boot sequences
6.6 Implementing virtual hosts on TPMs
6.7 Experimental results
6.8 Proofs
Bibliography