Interactive multimedia systems are real-time systems; as they deal with audio and video streams, they have to provide audio or video frames periodically before a precise deadline. We will first present the main concepts used to tackle real-time programming, and then, how scheduling works in some IMS.
Interactive multimedia systems are software applications running within an operating system, and scheduling is carried out in several layers (see Fig. 1): first the multimedia software, then a multimedia library, and ultimately the OS scheduler.
Here, we present several principles for real-time scheduling, some important algorithms, and how to handle both periodic and aperiodic actions and events, which is particularly crucial for IMS, which have to react to input from users. We also discuss overload, arising from the multimedia system as well as from the surrounding operating system, since multimedia systems are most of the time deployed on mainstream operating systems, which do not facilitate hard real-time programming.
Real-time scheduling approaches can be characterised by their behaviour into several categories:
Dedicated to hard or soft real-time. We distinguish between hard real-time and soft real-time. Hard real-time tasks are tasks whose deadlines must absolutely be respected, because missing one would, for instance, lead to the crash of a drone. On the contrary, a soft real-time task is a task for which respecting the deadline is preferable, but exceeding it only entails a degraded quality of service: for instance, clicks in an audio system, or late delivery in a communication system.
Static or dynamic scheduling (also called offline and online scheduling). Static scheduling prepares a scheduling sequence for all tasks before starting execution of the system. It requires knowing in advance the tasks that are going to be executed, and their arrival and release times. It only induces a low overhead because the scheduling scheme is simply written in a table read at runtime.
Dynamic scheduling has more overhead, and is less precise than the offline approach, but can deal with the unpredictability of sporadic or aperiodic tasks, for instance.
Preemptive or non-preemptive. Tasks that can be stopped in the middle of a computation, in order to let other tasks run, and be resumed later, are called preemptible tasks. On the other hand, some tasks, such as interrupt handlers, must not be preempted and are non-preemptive tasks (also called immediate tasks).
Dependencies or not between tasks. Real-time tasks can depend on each other, that is to say, one task needs the output of another task to start executing. This is the case in digital signal processing graphs, for instance, where the audio signal flows from one node to others and is processed through effects.
Periodic, aperiodic, or sporadic tasks. A task can be periodic, executed with period T; aperiodic (GUI events, for instance); or sporadic, that is to say, a task that repeats without a fixed period but with a minimum delay between two consecutive activations.
Utilization and overload. Utilization is the ratio of processor time used by the tasks being executed. If this ratio is more than one, the system is overloaded.
Latency is the time between input and output for a processed piece of data.
Jitter is the variation of latency (tolerated before the deadline, but not after).
Reaction to permanent or transient overload. A transient overload, contrary to a permanent overload, is a temporary burst in processor utilization, during which the utilization exceeds 1.
We introduce common approaches to real-time scheduling deployed in the majority of systems, and present the concept of a schedulability test, which checks whether a set of tasks can be scheduled with a given scheduling algorithm.
Rate Monotonic (RM)
RM is usually used for tasks that execute periodically, without dependencies. It is online and uses fixed priorities, assigned inversely to the periods: the shorter the period, the higher the priority. It is optimal with respect to feasibility among fixed-priority policies: if there exists a fixed-priority assignment that yields a feasible schedule, then RM yields a feasible schedule. A sufficient schedulability test is:

U = C1/T1 + C2/T2 + … + Cn/Tn ≤ n(2^(1/n) − 1)

where the Ti are the periods of the tasks and the Ci their worst-case execution times.
When the number of tasks tends to infinity, this bound tends to a maximum theoretical utilization of ln 2 ≈ 69%. As this schedulability test is a sufficient condition, and not a necessary one, there may exist task sets above the bound that RM still schedules, as well as scheduling schemes that achieve a better utilization than RM.
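The sufficient test above can be sketched in a few lines of Python (the task set used here is made up for the example):

```python
def rm_bound(n):
    """Liu & Layland utilization bound for n tasks under RM."""
    return n * (2 ** (1 / n) - 1)

def rm_schedulable(tasks):
    """Sufficient RM test. tasks: list of (C, T) pairs, where C is the
    worst-case execution time and T the period. True means schedulable;
    False is inconclusive (the test is sufficient, not necessary)."""
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_bound(len(tasks))

# Made-up task set: U = 1/10 + 2/20 + 3/30 = 0.3
print(rm_schedulable([(1, 10), (2, 20), (3, 30)]))  # True (0.3 <= ~0.78)
```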
Earliest deadline first (EDF)
EDF is an online scheduling algorithm with dynamic priorities (the same task can have different priorities across activations). In the EDF policy, the task with the nearest absolute deadline is scheduled first; that is to say, priorities are inversely related to deadlines. This policy is preemptive and is used for periodic or sporadic tasks without dependencies. EDF minimizes the maximal lateness and is optimal for feasibility.
A sufficient schedulability test for EDF exists when dealing with periodic tasks:

C1/D1 + C2/D2 + … + Cn/Dn ≤ 1
where the Di are the deadlines of the tasks and the Ci their worst-case execution times. Contrary to RM, EDF can ensure a 100% utilization.
If deadlines are equal to periods, the condition (with the Di replaced by the Ti) is also a necessary condition.
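The EDF test can be sketched similarly; note how the made-up task set below exceeds the RM sufficient bound yet is accepted by EDF:

```python
def edf_schedulable(tasks):
    """EDF test. tasks: list of (C, D) pairs (worst-case execution
    time, relative deadline). When deadlines equal periods, the
    condition sum(C/T) <= 1 is both sufficient and necessary."""
    return sum(c / d for c, d in tasks) <= 1.0

# Made-up task set with deadlines equal to periods: U = 0.9.
# This is above the RM bound for 3 tasks (~0.78) but passes under EDF.
print(edf_schedulable([(3, 10), (3, 10), (3, 10)]))  # True
```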
Tasks can also depend on other tasks: a task must not start executing before the tasks it depends on have finished. However, plain EDF is not optimal in the presence of precedence constraints. In EDF* by Chetto, deadlines are modified as follows:

d*_i = min(d_i, min_{j ∈ D(i)} (d*_j − e_j))

where D(i) is the set of tasks that depend on task i, d_i its original deadline, e_j the execution time of task j, and d*_i the modified deadline.
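The deadline modification can be computed by a backward traversal of the precedence graph; the sketch below assumes an acyclic graph and uses made-up task names:

```python
def edf_star_deadlines(deadline, exec_time, dependents):
    """Chetto's EDF* deadline modification.
    deadline[i]: original deadline d_i of task i;
    exec_time[i]: execution time e_i;
    dependents[i]: the set D(i) of tasks that depend on task i.
    Returns the modified deadlines d*_i (assumes an acyclic graph)."""
    d_star = {}

    def compute(i):
        if i not in d_star:
            d = deadline[i]
            for j in dependents.get(i, []):
                # A predecessor must leave room for each dependent's
                # execution time before that dependent's own deadline.
                d = min(d, compute(j) - exec_time[j])
            d_star[i] = d
        return d_star[i]

    for i in deadline:
        compute(i)
    return d_star

# B depends on A: A's deadline is tightened to min(10, 12 - 5) = 7.
d = edf_star_deadlines({'A': 10, 'B': 12}, {'A': 2, 'B': 5}, {'A': ['B']})
print(d['A'], d['B'])  # 7 12
```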
Dealing with sporadic and aperiodic events
In a system with periodic and aperiodic tasks, periodic tasks can interfere with aperiodic tasks and make them execute too late. There are three main ways of dealing with this interaction between aperiodic and periodic tasks:
Background scheduling. In this approach, aperiodic tasks are scheduled only when no periodic task is ready to execute. If there are several aperiodic tasks, they are executed in FIFO order. A disadvantage is the potentially very long response time.
7 Release time is not to be confused with start time: the release time is when the task is given to the scheduler, and the start time, when the scheduler actually launches the task.
Server. A server is a periodic task whose purpose is to serve aperiodic requests. A server has a period and a granted computation time (its server capacity, or budget) that it can use before letting the other periodic tasks resume execution. In some cases, servers can degrade utilization.
Slack stealing and joint scheduling. This consists of stealing processing time from the periodic tasks, without making them miss their deadlines, by using laxity, i.e. the time left before a task's deadline once its ready time and remaining run time are accounted for.
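A minimal sketch (with made-up numbers) of one kind of server, a polling server, which replenishes its budget every period and serves pending aperiodic work up to that budget:

```python
def polling_server(period, budget, arrivals, horizon):
    """Toy discrete-time polling server. arrivals maps a time step to
    the amount of aperiodic work arriving then. At every multiple of
    `period`, the budget is replenished and pending aperiodic work is
    served, at most `budget` units; the rest of each period is left to
    the periodic tasks. Returns (time, served) pairs."""
    pending = 0
    log = []
    for t in range(horizon):
        pending += arrivals.get(t, 0)
        if t % period == 0:  # server activation: budget replenished
            served = min(pending, budget)
            pending -= served
            log.append((t, served))
    return log

# 3 units of aperiodic work arrive at t=0; with a budget of 2 per
# period of 5, one unit is left over for the next server period.
print(polling_server(5, 2, {0: 3}, 11))  # [(0, 2), (5, 1), (10, 0)]
```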
In case of (unpredictable) overload, several policies can be used. There are two main strategies: having several versions of a task that do not yield the same quality, or a way of cancelling some tasks in favour of others.
For the first policy, there can be a primary program P1 and an alternate program P2; P1 has a good quality of service but takes an unknown duration to complete; P2, on the contrary, is deterministic and has a lower quality of service, but a known worst-case execution time. Two strategies can be used to choose between the two tasks, first chance and last chance: either the good-quality task P1 is stopped at the latest moment in order to launch P2, or P2 is executed first and, if it has terminated and there is still time, P1 is executed, while the results of P2 are kept in case P1 exceeds its deadline.
Another method is the imprecise computation model: multimedia processing is separated into a mandatory part, and an optional part that can be dropped.
For the second policy, tasks are given an importance value and, in case of overload, the tasks with the least importance are killed until there is no more overload. To handle the termination of such tasks gracefully, tasks present two modes, a normal mode and a survival mode, with an abortion or an adjournment property (such as freeing memory, or saving partial computations to be continued later).
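The importance-based policy can be sketched as follows (task names and utilization figures are made up for the example):

```python
def shed_least_important(tasks):
    """tasks: (name, importance, utilization) triples. Kill the least
    important tasks until the total utilization drops to 1 or below.
    Returns the names of the killed tasks (those that would switch to
    survival mode) and the remaining utilization."""
    alive = sorted(tasks, key=lambda t: t[1])  # least important first
    total = sum(u for _, _, u in alive)
    killed = []
    while total > 1.0 and alive:
        name, _, u = alive.pop(0)
        killed.append(name)
        total -= u
    return killed, total

# Made-up overload: total utilization 1.4, so two tasks must go.
killed, left = shed_least_important(
    [("reverb", 1, 0.5), ("synth", 3, 0.6), ("gui", 0, 0.3)])
print(killed)  # ['gui', 'reverb']
```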
Scheduling in interactive multimedia systems
Interactive multimedia systems deal with audio and video streams, where a buffer has to be filled periodically, and with controls that can be aperiodic (a GUI change) or periodic (a low-frequency oscillator). The audio and video streams can use different samplerates, and the controls yet other rates, which all have to be accommodated together, with several priorities. Moreover, audio streams and controls are connected in a dataflow graph, which can also involve spectral processing, or granular synthesis using a corpus of sounds stored in a database. How to articulate control and processing is one of the challenges of scheduling IMS: control can take too much time and delay audio processing; audio processing often proceeds by chunks, to use the vector instructions of the targeted processor, so that control cannot occur inside such a chunk. We will see that in the following IMS, control indeed happens at the boundaries of a buffer, which entails that control is not sample-accurate.
Patching as graphical programming environment
The Patcher family, with Max/MSP , PureData , jMax , and MAX/FTS  originally developed by Miller Puckette at Ircam , emphasises visual programming for dataflow processing. Functions are represented by boxes placed on a screen called the canvas. Objects are connected together with links, and data flows from one object to another through these links. Here, we present more specifically PureData, which is open source.
In PureData, actions can be timestamped, for instance with the boxes metro, delay, line, or timer. GUI updates are also timestamped. Other incoming events are MIDI events, with notein, and OSC. The scheduling in PureData is block-synchronous, that is to say, control occurs at the boundaries (at the beginning) of the audio processing of a block of 64 samples, at a given samplerate. For a usual samplerate of 44.1 kHz, a scheduling tick lasts for about 1.45 ms. Depending on the audio backend,9 the scheduler can be a polling scheduler (see Fig. 2) or use the audio callback of the backend.
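The duration of a tick follows directly from the block size and the samplerate:

```python
# One PureData scheduling tick = one block of 64 samples.
block_size = 64
samplerate = 44100  # Hz
tick_ms = 1000 * block_size / samplerate
print(round(tick_ms, 2))  # 1.45
```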
Samplerate and block size can be changed locally in a patch using the block box.
SuperCollider

SuperCollider  is an interactive multimedia system mainly used for audio synthesis and for live coding.10 SuperCollider uses a client/server architecture (see Fig. 3): on the client side, a Smalltalk-like language makes it possible to compose the piece; it generates OSC messages which are sent to the server. OSC messages can be timestamped, in which case they are put in a priority queue and executed on time, or can be sent without a timestamp and are then processed upon reception. All events whose timestamps are less than the tick size (or scheduling period) away are executed at once.
The server uses three threads: a network thread, a real-time thread for signal processing, and a non-real-time thread for purposes such as reading files. Commands are communicated among threads via lock-free FIFOs.
There are three clocks in the language: AppClock, a clock for non-critical events, which is allowed to drift; SystemClock, for more precise timing, which is not allowed to drift; and TempoClock, whose ticks follow a tempo.
9 Jack or ALSA on Linux, CoreAudio on macOS, or WASAPI on Windows, for example.
10 Live coding consists of creating a musical piece at the time of the performance by coding it in a DSL in front of the audience: the lines of code being edited are usually projected as the musical piece is created.
ChucK

ChucK  is a strongly timed (i.e. with an explicit manipulation of time, thanks to the variable now), synchronous, concurrent, and on-the-fly music programming language. It is mainly used for live coding. It is compiled to bytecode and executed in a virtual machine.
Time and event-based programming mechanism. Time can be represented as a duration (type dur) or a date (type time). The keyword now makes it possible to read the current time, and to go forward in time. Time can also be advanced until an event occurs, such as a MIDI event or an OSC event. Thus now allows time to be controlled at any desired rate: a musical rate (beats), the sample or subsample rate, or a control rate.
ChucK makes it possible to use an arbitrary control rate (control occurs as soon as the assignment to now stops blocking) and a concurrent control flow of execution.
Cooperative scheduling. ChucK processes are cooperative lightweight user-space threads called shreds (and are scheduled by a shreduler); they share data and timing. When time is advanced with => now, the current shred is blocked until now reaches the desired time, giving room to other shreds to execute.
Shreds are synchronous (driven by a sample-based clock, of 44.1 kHz in general) and shreduling is deterministic: the instantaneous instructions of every shred are executed until reaching an instruction that advances time.
Shreds are shreduled using a queue of shreds sorted by waiting time before wake up (see Fig. 4).
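The shreduler's behaviour can be approximated with Python generators standing in for shreds (a sketch with a made-up API, not ChucK's implementation): each shred yields the duration it advances now by, and shreds due at the same time run deterministically, in insertion order.

```python
import heapq

def metro(period):
    """A made-up shred: wakes up every `period` samples, forever."""
    while True:
        yield period

def shredule(shreds, horizon):
    """Run shreds until `horizon` samples, returning (time, shred_id)
    wake-ups. The queue is ordered by wake-up time, with insertion
    order breaking ties deterministically."""
    queue = [(0, i, s) for i, s in enumerate(shreds)]
    heapq.heapify(queue)
    trace = []
    while queue and queue[0][0] < horizon:
        now, i, shred = heapq.heappop(queue)
        trace.append((now, i))   # instantaneous instructions run here
        wait = next(shred)       # the shred advances time: wait => now
        heapq.heappush(queue, (now + wait, i, shred))
    return trace

print(shredule([metro(100), metro(150)], 301))
# [(0, 0), (0, 1), (100, 0), (150, 1), (200, 0), (300, 0), (300, 1)]
```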
Schedulability. The language is type-checked, but no schedulability analysis is performed: a very complex musical performance could well lead to lateness for some audio processing units. This absence of analysis is quite understandable given the live-coding purpose of the language, and such problems must be dealt with by the musician at performance time.
Faust

Faust  is a synchronous domain-specific language dedicated to digital signal processing, which compiles to C++. It also aims at facilitating the creation of audio plugins for digital audio workstations, and thus provides ways of defining a graphical interface or of communicating with the audio process via OSC.
Audio is processed sample by sample in the language, but after compilation, it is processed on buffers (32 samples by default) and control only occurs at boundaries of the buffers.
In this part, we present Antescofo, an IMS which is embedded in a host IMS (typically PureData or Max/MSP). Antescofo harnesses the audio effects of its host environment. Our goal is to embed digital signal processing directly in Antescofo, and to formalize the interaction of control and audio computations using buffer types.
Antescofo  is an IMS for musical score following and mixed music. It is an artificial musician that can play with human musicians in real time during a performance, given an augmented score prepared by the composer. Since 2008, Antescofo has been featured in more than 40 original mixed electronic pieces by world-renowned ensembles and composers, such as Pierre Boulez, Philippe Manoury, Marco Stroppa, the Radio France orchestra and the Berlin Philharmonic.
Fig. 5 shows the architecture of the system: it is composed of two subsystems, a listening machine and a reactive machine. The listening machine processes an audio or MIDI stream and estimates in real time the tempo and the position of the live performers in the score, trying to reconcile performance time and score time. Position and tempo are sent to the reactive machine to trigger the actions specified in the mixed score. These actions are emitted to an audio environment, either Max/MSP or PureData, in which Antescofo is embedded as a performance patch. Like human musicians, Antescofo relies on adapted synchronization strategies informed by a shared knowledge of the tempo and the structure of the score.
The mixed scores of Antescofo are written in a dedicated reactive and timed synchronous language, in which events of the physical world – notes and rhythms played by musicians, for instance –, the parts of the artificial musician, and the synchronization between them are specified. The language has to take into account the temporal organization of musical objects and musical clocks (tempi, which can fluctuate during the performance), which are actualized in physical, performance time. This performative dimension raises particular challenges for a real-time language:
Instrumental events The augmented score details events played by human musicians (pitches, silences, chords, trills, glissandi, improvisation boxes) and their duration (absolute, in seconds, or relative to the tempo – quaver, semiquaver…). An ideal tempo can also be given.
Actions Actions (see Fig. 6) are triggered by an event, or follow other actions after a delay. Actions follow the synchronous hypothesis: they are executed in zero time, in the specified order.
Atomic actions An atomic action is a variable assignment, an external command sent to the audio environment, or the killing of another action.
Compound actions A group creates polyphonic phrases: several sequential actions which share common properties of tempo, synchronization, and error-handling strategy. A loop is similar to a group, but its actions execute periodically. A curve is also similar to a group, but interpolates some parameters during the execution of the actions. Compound actions can be nested, and inherit the properties of the surrounding blocks unless otherwise explicitly indicated.
An action can be launched only if some condition is verified, using guard statements, and it can be delayed for some time after the detection of an event (or previous action). The delay is expressed in physical time (milliseconds) or relative time (beats).
Processes Processes are parametrized actions that are dynamically instantiated and performed. Calling a process executes one instance of it, and several instances can execute in parallel.
Expressions and variables Expressions evaluate to booleans, ints, floats, strings (atomic values) or to data structures such as maps and interpolated functions (non-atomic values).
Variables are either global, or declared local to a group. The assignment of a variable is an event, to which a logical timestamp is associated. $v is the last value of the stream of values associated with v, and [date]:$v its value at date date.
whenever statement The whenever statement makes it possible to launch actions whenever a predicate is satisfied, contrary to the other actions, which are linked to events sequentially notated in the augmented score.
Temporal patterns, i.e. a sequence of events with particular temporal constraints, can also be matched by a whenever.
Synchronization strategies and fault tolerance One strategy, the loose strategy, schedules actions according to the tempo only. Another strategy, the tight strategy, schedules not only according to timing positions, but also with respect to the triggering of relative event positions. There are also strategies that synchronize according to a target in the future: the tempo adapts so that the electronic part and the followed part arrive on this target at the same time.
In case of errors (missed events), two strategies are possible: if a group is local, the group is dismissed in the absence of its triggering event; if it is global, it is executed as soon as the system realizes the absence of the triggering event.
A logical instant in Antescofo, which entails an advance in time for the reactive engine, is either the recognition of a musical event, or the external assignment by the environment of a variable, or the expiration of a delay. In one logical instant, all synchronous actions are executed in the order of their declaration.
The reactive engine maintains a list of actions to be notified upon one of these three types of events. For actions waiting for the expiration of a delay, three waiting queues are maintained: a static timing-static order queue, for actions delayed by a physical time (in seconds, for instance); a dynamic timing-static order queue, for delays relative to the musician's tempo (in beats); and a dynamic timing-dynamic order queue, for delays depending on local dynamic tempo expressions. Delays are reevaluated when the tempo changes, so the order of actions in a queue can change. The next action to execute is the one with the minimum waiting delay among the three queues.
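As an illustration, the sketch below models only the first two of the three queues (the class and method names are made up, and the real engine is more involved): tempo-relative delays are converted to seconds with the current tempo when the queues are compared, so a tempo change reorders the candidates.

```python
import heapq

class WaitingQueues:
    """Simplified sketch of the waiting queues for delayed actions."""

    def __init__(self, tempo_bpm):
        self.tempo = tempo_bpm
        self.physical = []  # static timing-static order: (seconds, action)
        self.relative = []  # dynamic timing-static order: (beats, action)

    def add_physical(self, seconds, action):
        heapq.heappush(self.physical, (seconds, action))

    def add_relative(self, beats, action):
        heapq.heappush(self.relative, (beats, action))

    def next_action(self):
        """Pop the action with the minimum waiting delay in seconds."""
        best, queue = None, None
        if self.physical:
            best, queue = self.physical[0][0], self.physical
        if self.relative:
            secs = self.relative[0][0] * 60.0 / self.tempo
            if best is None or secs < best:
                best, queue = secs, self.relative
        return heapq.heappop(queue)[1] if queue else None

# At 120 BPM, 1 beat = 0.5 s, so the tempo-relative action comes first.
q = WaitingQueues(tempo_bpm=120)
q.add_physical(0.6, "fade out")
q.add_relative(1, "trigger chord")
print(q.next_action())  # trigger chord

# At 30 BPM, 1 beat = 2 s, and the physical-time action wins instead.
q = WaitingQueues(tempo_bpm=30)
q.add_physical(0.6, "fade out")
q.add_relative(1, "trigger chord")
print(q.next_action())  # fade out
```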
Listing 1 shows the beginning of Anthèmes 2, a mixed music piece for solo violin and electronics, which relies on Antescofo to coordinate the electronic effects and synthesis. The electronic part is still done inside Max or PureData, but has been ported (by Nicolás Schmidt ) to Faust effects embedded into the Antescofo augmented part (see also section 3.1).
Anthèmes 2 uses complex control structures such as curves (in the excerpt we show, in order to increase or decrease smoothly the volume level of some effects).
Audio processing architecture in Antescofo
Using a scheduling algorithm such as EDF, as presented in Part I, makes it possible to handle more precisely the interaction between audio processing and control in complex scenarios: contrary to the round-robin schedulers of section 2, control and audio can be interrupted, so as to get sample-accurate control and to handle control computations that delay audio processing too much. Audio is usually processed in fixed-size buffers; here, we present a way of practically dealing with variable-size buffers.
In Antescofo, audio processing was previously handled in PureData or Max/MSP, the host environment, or more rarely in synthesis environments such as SuperCollider, leaving the control of scheduling to these environments, without taking advantage of the additional information given by the augmented score. In recent versions, audio processing tasks are beginning to be embedded into Antescofo , with Csound , FluidSynth, or Faust . Using Faust, which compiles to optimized code, makes it possible to target small single-board computers such as the Raspberry Pi or the Udoo.
DSP network Audio signals flow through an audio processing graph which connects audio processing nodes together. Audio links, which are independent from the audio processing tasks, can adapt the data flow to the specific needs of the nodes they link. Connections between audio nodes can be changed dynamically, in response to an event detected by Antescofo, using the patch actions; this differs from PureData or Max/MSP, where connections cannot easily be changed during a performance. The audio processing nodes can be controlled using the control variables of Antescofo, so that the parameters of an effect can change according to the tempo of the score, for instance.
The input and output of audio, through a periodic audio callback, imply specific constraints on how to manage the rates of all the nodes with respect to the rate of the input/output nodes.
In the graph, data does not flow sample by sample, but buffer by buffer, and control happens at the boundaries of the processing of a buffer: this is the usual separation between control rate and audio rate.
Continuous variables Continuous variables connect the discrete variables of Antescofo to the continuous signal. A continuous variable can be assigned a constant value, or a curve. In practice, the sampling rate of the audio signal becomes the sampling rate of the curve.
Topological sort To determine the order of the audio processing, a topological sort is performed on the graph of effects. The graph is assumed to be acyclic (otherwise, an error is thrown).
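Such a sort with cycle detection can be sketched with Kahn's algorithm (the node names below are made up):

```python
from collections import deque

def dsp_order(graph):
    """Order of audio processing by topological sort (Kahn's
    algorithm). graph: node -> list of downstream nodes.
    Raises ValueError if the effect graph contains a cycle."""
    indegree = {n: 0 for n in graph}
    for succs in graph.values():
        for s in succs:
            indegree[s] = indegree.get(s, 0) + 1
    ready = deque(n for n in graph if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in graph.get(n, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(indegree):  # some nodes never became ready
        raise ValueError("cycle in the effect graph")
    return order

print(dsp_order({'mic': ['reverb'], 'reverb': ['dac'], 'dac': []}))
# ['mic', 'reverb', 'dac']
```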
Table of contents :
I Multimedia scheduling
1 Real-time scheduling
1.1 General characteristics
1.2 Common approaches
1.2.1 Rate Monotonic (RM)
1.2.2 Earliest deadline first (EDF)
1.3 Dealing with sporadic and aperiodic events
2 Scheduling in interactive multimedia systems
2.1 Patching as graphical programming environment
2.5 Challenges in multimedia scheduling
II Antescofo and audio processing
2.6 The Antescofo language
2.7 Antescofo runtime
2.8 Anthèmes 2
3 Audio processing architecture in Antescofo
3.1 General structure
4 Accommodating various rates
4.1 Related work
4.2 Buffer flow
4.2.1 Characterizing buffers
4.2.2 Concatenation of buffers
4.2.3 Type coercion
4.2.4 Buffer flow graph
III Model-based analysis
5 Paradigms for real-time systems
5.1 Synchronous languages
5.2 Giotto and xGiotto
5.2.1 Giotto: a time-triggered language
5.2.2 xGiotto: a time and event-triggered language
6 Intermediate representation
6.2 Variables and clocks
6.4 IR generation from Antescofo programs
7 Schedulability test
8 Bounds on tempo
9 Two-player safety game