Introduced in [Cas00], the quasi-synchronous approach is a set of techniques for building distributed control systems. It is a formalization of practices that Paul Caspi observed while consulting in the 1990s at Airbus, where engineers were deploying Lustre/SCADE [HCRP91] designs onto networks of non-synchronized nodes communicating via shared memories with bounded transmission delays.
In contrast to the Time-Triggered Architecture (TTA) [Kop11], the quasi-synchronous approach does not rely on clock synchronization. Processors execute periodically with the same nominal period and communicate via a reliable network, but the activation periods of the processors and communication delays are subject to jitter. We call this a quasi-periodic architecture. This is a classic model, also called synchronous real-time model in the distributed systems community [ADLS94, Cri96]. Such systems arise naturally as soon as two or more microcontrollers running periodic tasks are interconnected. They are common in aerospace, power generation, and railway systems.
This chapter presents the quasi-periodic architectures that are central to this thesis. The aim of this chapter is to provide a global framework to describe embedded applications running on such architectures that can adapted and refined in the following chapters. We adopt a classic approach of modeling distributed systems using a discrete synchronous formalism and give a complete discrete model of the architecture: computing nodes, communication media, and delayed transmission.
Traditionally, the link between the discrete mode and real time is ‘pushed outside’ the model: real-time constraints are modeled with additional inputs—like activation signals or regular clock signals—and the model does not formally provide any information on how to produce these inputs. We, however, implement our model in Zélus—introduced in chapter 2— which allows us to express both the discrete synchronous model and the real-time constraints of the architecture in an executable language.
Outline In section 3.1 we define the concept of a quasi-periodic architecture. Running an application on such an architecture introduces sampling artifacts that are described in section 3.2. Then, in section 3.3, we show how to model quasi-periodic architectures in a discrete synchronous formalism. Using Zélus, we link this discrete model to the real-time characteristics of the architecture in section 3.4. Finally, in section 3.5, we show how to adapt our approach to other modeling tools, namely, Ptolemy and Simulink.
Figure 3.1: Example of a real-time trace of a quasi-periodic architecture with three processors. Rectangles represent tasks and arrows denote message transmissions. Note the jitter both on node activation periods and transmission delays.
In this section we give an abstract model of quasi-periodic architectures. The goal is to account for all sources of timing nondeterminism—hardware, operating system, communication network—in one simple model.
Traces of distributed systems are typically described as a set of tasks and transmissions. Figure 3.1 shows an example of such a trace. In our model, we assume that individual processes are synchronous: reactions triggered by a local clock execute in zero time (atomically with respect to the local environment). The only events are thus processor activations and message transmissions.
To abstract from the execution time, the easiest solution—illustrated in figure 3.2—is to consider instantaneous activations and capture the execution time as part of the communication delay. This abstraction imposes that a processor cannot receive a message during the execution of a task. For instance, in figure 3.1, the first message sent by processor A is only read by processor B at its third activation. It is, however, safe to assume that a message received exactly when a consumer activates can be read immediately.
For each processor, an execution is now an infinite sequence of instantaneous activations triggered by a local clock.
We assume without loss of generality that all nodes start executing at t = 0. Initial phase diﬀerences between nodes can be modeled by a succession of mute activations before the actual start of the system. The margins encompass all sources of divergence between nominal and actual values, including relative clock jitter, interrupt latencies, and scheduling delays.
Figure 3.2: We abstract a task as an instantaneous activation and a communication delay τ . The communication delay τ encompasses both the execution time τEXEC and the transmission delay τTRANS : τ = τEXEC + τTRANS .
Communication by Sampling
Messages sent by nodes are stored at receivers in local memories which are updated atomically. A producer can send the same message to several receivers. A memory is only sampled when the corresponding node is activated by its local clock. There is no synchronization between producers and receivers since local clocks are unsynchronized. This communication model is called Communication by Sampling (CbS) [BCDN+ 07].
To illustrate this communication scheme one imagine a sleepy student in a class room. Periodically, the student wakes up and reads the blackboard. Then he falls asleep until he wakes up and reads the board again. In this example, there is no synchronization between the teacher and the student. During a nap, information written on the board may or may not change. Due to this analogy, this simple communication scheme is sometimes called blackboard communication [Ber89, §3].
Finally we assume a reliable communication network, that is, the network guarantees message delivery and preserves message order. If a message m1 is sent before m2 then m2 is never received before m1 . This is necessarily the case when τMAX ≤ TMIN + τMIN . Otherwise, with a lossless network, it is always possible to number messages and stall those that arrive out of sequence.
Value duplication and loss
The lack of synchronization means that successive variable values may be duplicated or lost. For instance, if a consumer of a variable is activated twice between the arrivals of two successive messages from the producer, it will oversample the buﬀered value. On the other hand, if two messages of the producer are received between two activations of the consumer, the second value overwrites the first which is then never read.
These eﬀects occur for any non-zero jitter ε, regardless of how small. The timing bounds of definition 3.1 mean, however that the maximum numbers of consecutive oversamplings and overwritings are functions of the bounds on node periods and transmission delays (∀x ∈ R, ⌈x⌉ denotes the smallest integer i such that x ≤ i).
Property 3.1. Given a pair of nodes executing and communicating according to definition 3.1, the maximum number of consecutive oversamplings and overwritings is nOS = nOW = TMAX + τMAX − τMIN − 1. (3.1) TMIN Proof. Consider a pair of nodes A and B with B receiving messages from A. In the best case, illustrated in figure 3.3a, a message sent by A at time t arrives in B’s shared memory at t + τMIN .
Figure 3.3: Witnesses for the maximum number of oversamplings (a) and overwritings (b) arrives in B’s shared memory at worst at t + TMAX + τMAX . The maximal delay between two successive arrivals is thus TMAX + τMAX − τMIN .
At best, B is activated every TMIN . The maximum number of executions n of B is thus: nTMIN < TMAX + τMAX − τMIN .
Each execution of B that occurs between the two arrivals samples the last received value. The maximum number of oversamplings nOS = n − 1 is thus given by equation (3.1).
The proof for the number of consecutive overwritings, illustrated in figure 3.3b, is similar.
Property 3.1 implies that data loss can be prevented by activating a consumer more frequently than the corresponding producer. This can be achieved by introducing mute activations of the receiver at the cost of higher oversampling. Quasi-periodic architectures involving such producer-consumer pairs are studied in [BCLG+ 02].
Quasi-periodic architectures are a natural fit for continuous control applications where the error due to sampling artifacts can be computed and compensated for. On the other hand, discrete control systems, like state machines, are generally intolerant to data duplication and loss.
There is another obstacle to implementing discrete applications on a quasi-periodic architecture: naively combining variables can give results that diverge from the reference semantics. Consider the classic example of figure 3.4 [Cas00, CB08, BBC10]. A node C reads two boolean inputs a and b, produced by nodes A and B, respectively, and computes the conjunction, c = a ∧ b. The first two lines of figure 3.4 show the content of the local memories of node C corresponding to variables a and b.
Suppose that a is false for three activations of A before becoming true and b is true for three activations of B before becoming false. In a synchronous semantics, with simultaneous activations of A, B, and C, node C should return false at each activation. But, as figure 3.4 shows, since nodes are not synchronized, it is probable that the values of a and b do not change at exactly the same time. There is thus a small interval where both a and b are true.
If node C does not sample the content of the memories during this interval, the output is always false, as expected (case 1). On the other hand, if node C does sample the content of the memories during this interval, the output c is set to true for one period of C (case 2).
In other words, the outputs of this simple application depend on the relative activations of nodes A, B, and C. This phenomenon cannot be avoided by changing the frequency of node activations.
Buﬀers We consider here one-place buﬀers, but the previous remarks can be generalized to bounded buﬀers of arbitrary size. Since nodes are not synchronized it is impossible to ignore the case of producer/consumer pairs where the producer runs as quickly as possible, that is every TMIN , and the consumer runs as slow as possible, that is every TMAX . As soon as the nodes are not perfectly synchronous, that is, TMIN < TMAX , the diﬀerence between the cumulative tick counts of nodes can diverge without bound. Without any additional synchronization mechanism it is thus impossible to bound the size of the buﬀers while ensuring that no data will be lost or oversampled.
Table of contents :
1.1 Quasi-periodic systems
1.2 A synchronous approach
2 A Brief Introduction to Zélus
2.1 A synchronous language ..
2.2 … extended with continuous time
2.3 A complete example: the Zélus clock
3 Quasi-Periodic Architectures
3.2 Communication by Sampling
3.3 Discrete model
3.4 Real-time model
3.5 Other modeling tools
3.6 Bibliographic notes
4 The Quasi-Synchronous Abstraction
4.1 A discrete abstraction
4.2 Traces and causality
4.3 Unitary discretization
4.4 Quasi-synchronous systems
4.5 Multirate systems
4.6 Bibliographic notes
5 Loosely Time-Triggered Architectures
5.1 Synchronous applications
5.2 General framework
5.3 Back-pressure LTTA
5.4 Time-based LTTA
5.5 Round-based LTTA
5.6 Clock synchronization
5.7 Comparative evaluation
6 Symbolic Simulation
6.2 Related work
6.3 Difference-Bound Matrices
6.4 ZSy: an extended subset of Zélus
6.5 Static typing
7.2 Open questions
7.3 Concluding remark