Testing Mixed Music Systems: State of the Art


We dedicate this section to presenting the context and techniques related to our work. Section 2.1 introduces mixed music and its specific notion of time. We briefly list music systems before presenting Antescofo, the system under test in our case study. Then, some existing tools for assessing such music systems are presented. Thereafter, Section 2.2 defines the model-based testing formalization used throughout the document. We give our definition of a model and present existing methods and tools to perform model-based testing for real-time systems.

Testing Mixed Music Systems

Music has always been an art of creativity in which sounds are produced in time by one or several musicians. The complexity of music pieces has not ceased to increase, bringing forth through history solid and expressive notations. These notations, such as the western notation for the occidental culture, established the sustainable power of composers' creations. With the introduction of mixed music, involving electronic actions within musical pieces, a new step forward in musical expressivity was required. As a solution, the mixed score was created. It is an extension of the score document in which, aside from the musician input events, a set of output electronic actions is defined.
In this work, a mixed score is a composer requirement containing one or multiple sequences of ideal musician events, and sequences of actions timely related to these events. Moreover, a musician's play (called a performance) is considered as a sequence of event onsets (when the musician effectively begins to play an event) and a duration before the next event onset. Notice that rests are considered as event durations here.
In the remainder of this section, we detail some music notions focusing on score and performance. They both contribute substantially to the realization of a music piece and come from the two main steps of the music creation process: composition and performance.

Composition and Authoring Systems

During composition, composers invoke creative ideas and use complex tools in order to make a piece. A myriad of documents is written, explaining how the piece should be played following the creator's intentions. The mixed score is one of these documents and usually contains a specific unit for expressing its time dimension, the relative time unit. The relative time unit is an abstract time unit manipulated by composers in order to encompass every possible musician performance of the piece. A duration in such a time unit is measured in beats and is relative to the musician's pace (called tempo) during performance. It is an important musical dimension since relative time provides a freedom of interpretation for musicians.
Figure 2.1: A mixed score specifying one musician part and two electronic parts; these parts are called staves. Each stave is a coherent phrase of notes through musical time (horizontal axis). The mixed score represents the ideal sequence of these staves, played “in concurrence”, i.e. altogether during the performance.


Example 2.1.1: In Figure 2.1, three musical parts are specified. These parts, called staves, represent the ideal sequence in time (the horizontal axis) of a piece. For instance, the musician stave (on the top) specifies one note lasting one beat (called a quarter) with the pitch D5# and a label e1 (depicted above the note). The two next events last half of e1's duration (0.5 beat, called an eighth) and have the pitches A4 and C4# for e2 and e3 respectively. Notice that we deal with note pitches using the American pitch notation. This information is sufficient for musicians to play the stave (remark that usually the labels are not specified).
However, there are implicit and strongly timed synchronizations according to the vertical axis, since vertically aligned notes must be “played at the same moment”. For example, the mixed score above specifies a simultaneous play of e1 (by the musician) and ont1 (by the first electronic system).
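The stave of Example 2.1.1 can be encoded as plain data. The sketch below is our own illustration, not a format used by any of the cited systems; labels, pitches and beat durations are taken from the figure:

```python
# Hypothetical encoding of the musician stave of Figure 2.1 (the data
# structure is ours): each event is a (label, pitch, duration_in_beats) triple.
musician_stave = [
    ("e1", "D5#", 1.0),   # quarter note
    ("e2", "A4",  0.5),   # eighth note
    ("e3", "C4#", 0.5),   # eighth note
]

def onset_in_beats(stave, label):
    """Ideal onset of an event: the sum of the durations preceding it."""
    onset = 0.0
    for lab, _, dur in stave:
        if lab == label:
            return onset
        onset += dur
    raise KeyError(label)

# e1 starts the piece, e2 starts one beat later, e3 after 1.5 beats.
print(onset_in_beats(musician_stave, "e3"))  # -> 1.5
```

Such ideal onsets in beats are exactly what a performance alters when the musician's tempo varies.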
Authoring systems aim at simplifying the composer's creation by focusing on offline and static solutions. The literature provides a number of visual programming languages, Domain Specific Languages (DSL) and abstract representation tools in order to bring to composers a manner of expressing their imagination. As an example, we only cite a few of the existing authoring systems: OpenMusic [27], PatchWorks [67] and Common Music [84], all based on the programming language Common Lisp [86].
OpenMusic is a visual programming language allowing the use of symbolic music representations. The system manipulates relative time (in beats), which is not continuous like physical time, and allows the application of generic algorithms abstracting every possible performance. In order to provide more dynamism to the authoring system, recent works added the possibility to play a performance in OpenMusic [28]. These studies tackle the lack of dynamism of such authoring systems and increase composers' possibilities by providing interactions with musicians.

Interactive Music Systems for Performance

During performance, musicians interpret a piece by altering the ideal event durations written in the score. This interpretation decorates the piece with the performers' emotions and intentions, and translates the abstract time units (in beats) of the score into physical ones (in seconds). The real-time context of a performance makes it non-reproducible and unpredictable, two highly risky properties for computer systems.
Figure 2.2: The vertical line depicts the position of each musician or electronic system on its stave. Here, the musician is going to play e2, the first electronic system should idle (rest) and the second should play ont2. Also, the line's left side depicts the past (e1 for the musician) and the right side the performance continuation (e3).
Example 2.1.2: Figure 2.2 depicts a performance using a vertical line on the score. The line symbolizes the current position of the musicians/systems on the staves. Hence, it implies that the notes before the line have been played and those after should be played. We define a sequence of played events by the dates when their onsets were played relative to the beginning of the performance, as if a timer was launched at the first onset and, for each next onset, its timer value was stamped onto it. For example, a musician can play e1 at 0 second, e2 at 1 second and e3 at 1.25 seconds; we call these values timestamps.
Each musician has a tempo in beats per minute (bpm) when playing a performance, which can be computed thanks to the translation function. Indeed, 1 beat lasts 1 second with a tempo of 60 bpm, 2 seconds with 30 bpm and 0.50 second with 120 bpm. After computing the event durations using the next event's timestamp, it is then possible to deduce the musician's pace. In our case, e1 lasted 1 second and e2 0.25 second. Hence, e1 was played at 60 bpm and e2 at 120 bpm because it lasted 0.25 second for 0.5 beat in the score. However, in real cases, a different tempo is observed for each musician; therefore, the concrete position of each one is not aligned in a vertical line with the other positions.
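The tempo computation of Example 2.1.2 can be sketched in a few lines. This is our own illustration of the arithmetic, under the assumption that consecutive onset timestamps delimit each event's physical duration:

```python
# Sketch of the tempo deduction: the tempo (bpm) of an event is derived
# from its score duration (in beats) and its physical duration (in
# seconds), the latter obtained from consecutive onset timestamps.
def tempi(timestamps, beat_durations):
    """timestamps: onset dates in seconds; beat_durations: score durations
    in beats of the corresponding events. Returns one tempo per event
    whose end is observed (the last onset has no successor)."""
    result = []
    for i in range(len(timestamps) - 1):
        seconds = timestamps[i + 1] - timestamps[i]
        result.append(60.0 * beat_durations[i] / seconds)
    return result

# e1 at 0 s, e2 at 1 s, e3 at 1.25 s; e1 lasts 1 beat, e2 0.5 beat.
print(tempi([0.0, 1.0, 1.25], [1.0, 0.5]))  # -> [60.0, 120.0]
```

The result matches the example: e1 played at 60 bpm, e2 at 120 bpm.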
Interactive Music Systems (IMS) aim at being involved in concerts. In [81] and [19] a generic definition of IMS is presented. We depict this definition in Figure 2.3. IMSs are systems interacting in real-time with performers (musicians, dancers, etc.) and audience (concert listeners, piece watchers, etc.). In order to interact with its environment, such a system uses sensors and actuators, a sensor detection inducing the system's reaction. In this thesis, we restrict this definition to a music application and define the IMS as an electronic musician mimicking human ones. It implies several actions:
* input detections,
* real-time reactions to inputs, and
* anticipation of future actions.
IMSs detect inputs by listening to or tracking the performer(s). These systems are event-triggered and must react instantaneously to an input detection. Finally, they manage timed synchronizations and/or compute the musician's pace for anticipating and adapting their reactions accordingly. In practice, pedals are used to synchronize with musicians (and compute their pace easily) and stochastic models estimate future durations for anticipating in real-time. Usually in reactions, IMSs can perform sound processing or message sending to other audio applications.
The most demanding problem of IMSs is temporal reliability. Indeed, they consider time as a critical resource since an output may have a precise value and a precise date (not only the “what” but also the “when” is primordial). An IMS's reaction must not be undertaken too early or too late.
As examples, we cite some IMSs from the literature: Formula [4], MAX-MSP [78], PureData [79] and Chuck [48].
MAX-MSP [78] and PureData [79] are visual programming language environments. They are long-established and provide a simple way to construct dynamic and reactive systems (called patches). A patch consists in plugging functions or sub-patches into each other, which are sequentially linked and executed in real-time. These visual data-flow languages are particularly effective for hierarchical control and the creation of signal processing modules.
Chuck [48] is a strongly timed programming language that provides a linker keyword called the Chuck operator => and an explicit notion of time, mapped onto a number of audio samples, which can be manipulated in the language. At runtime, a virtual machine, called the Shreduler, runs all the Shreds (Chuck programs) in concurrence with the audio management (the audio Unit Generators (UGen), a synthesis network). Explicit time eases Chuck's synchronization and concurrency management; indeed, the Shreduler knows at runtime when the Shreds wake up and stop. This feature is called time-mediated concurrency and is appreciated by people enjoying live coding.
Jitter problem. A challenge for IMSs is to compute and produce sounds while dynamically taking into account user wishes. Indeed, when producing continuous sounds, a minimum rate (around 20 ms) of data buffers has to be provided by the system in order to avoid incoherent glitches in the output (called “clicks”). However, IMSs must sometimes compute complex algorithms for controlling and making the expected sound.
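The buffer-rate constraint can be made concrete with back-of-the-envelope arithmetic. The sample rate and buffer size below are common audio settings chosen by us for illustration, not values prescribed by any cited system:

```python
# Sketch of the jitter constraint: an audio buffer of n samples at rate
# sr lasts n / sr seconds; any control computation longer than that
# misses the deadline and produces an audible click.
def buffer_deadline_ms(buffer_samples, sample_rate_hz):
    return 1000.0 * buffer_samples / sample_rate_hz

deadline = buffer_deadline_ms(512, 44100)  # assumed common settings
print(round(deadline, 2))   # -> 11.61
print(deadline < 20.0)      # a 20 ms computation would glitch -> True
```

This is why complex control algorithms must be scheduled around, or decoupled from, the audio computation.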

The score-based Interactive Music System Antescofo

Among IMSs we distinguish score-based IMSs, which are constrained to follow a pre-specified timed scenario during performance (generally given beforehand as input). These particular systems must tackle the score following problem, which consists in localizing a musician's position on a given score during performance.
We consider for our work the IMS Antescofo as the system to test in our case study. Antescofo is a score-based IMS performing score following. In order to introduce the system, we highlight the historical evolution of musical human-system interactions:
1939 The first interactions were created with the electronic part fixed on a support (i.e. audio records) followed by musicians at performance. For instance, “Imaginary Landscape No.1” by John Cage (1939), or later, the piece “Kontakte” for piano, drums and tape by Karlheinz Stockhausen (1958-60). In the second piece, the interactions were expressed as audio timestamps on top of the score to synchronize and localize the musicians according to the records, as shown on the left of Figure 2.4.
1983 Score following algorithms were created to allow systems to compute the current human positions on a score, as in the works of Barry Vercoe and Roger Dannenberg (1983). In Dannenberg's system, an abstract MIDI input was used to follow the musician's performance.


1990s Hidden Markov Models [52] (HMM) introduced probabilities in speech and sound processing. They enabled a finer and smarter score following directly from audio waves and improved online algorithms.
2008 Algorithms enabled anticipative score following, which extracts online the position and the tempo of musicians. This idea is implemented in the system Antescofo and is a step closer to a machine mimicking human musician behaviors.
This evolution is depicted in Figure 2.4. It presents on the left an extract of “Kontakte” by Karlheinz Stockhausen (1958-60) and on the right an Antescofo mixed score.
Figure 2.4: Evolution of score-based IMS systems; on the left an extract of “Kontakte” by Karlheinz Stockhausen, on the right an example of the Antescofo visual interface Ascograph for an automatic accompaniment piece.
Antescofo was first developed as a MAX-MSP object to be embedded in a patch (it is available in PureData too). During performance, Antescofo usually waits for an audio signal or MIDI input stream and sends output messages. As a score-based IMS, Antescofo belongs both to the authoring and to the real-time IMS systems:
Authoring: Antescofo requires a mixed score written in a Domain Specific Language (DSL), specifying the events to detect and the electronic actions to send. The system provides authoring features to aid composers in writing an Antescofo mixed score, enabling possibly complex algorithms in the piece.
Performance: After processing the mixed score, Antescofo waits for an event produced by humans, detects it, processes score following and finally reacts by sending a message. All these actions are performed online during performance.
A standalone version of Antescofo was implemented. The standalone can use a virtual clock to run in a fast-forward fashion (i.e. there is no real-time management). It means that instead of waiting for a duration in real-time, the system notifies its virtual clock that an amount of time has passed and continues its execution.
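The fast-forward principle can be sketched as follows. This is a minimal illustration of the idea, our own code and not Antescofo's implementation:

```python
# Minimal sketch of a virtual clock: instead of sleeping, waiting merely
# advances a counter, so a one-hour piece can be "played" instantly.
import time

class VirtualClock:
    def __init__(self):
        self.now = 0.0
    def wait(self, duration):
        self.now += duration          # notify elapsed time, do not sleep

class RealTimeClock:
    """Same interface, but physically waits (as during a performance)."""
    def __init__(self):
        self._start = time.monotonic()
    @property
    def now(self):
        return time.monotonic() - self._start
    def wait(self, duration):
        time.sleep(duration)

clock = VirtualClock()
for dur in [1.0, 0.25, 0.5]:          # event durations in seconds
    clock.wait(dur)
print(clock.now)  # -> 1.75, computed instantly
```

Swapping one clock for the other changes nothing in the system's logic, which is what makes fast-forward testing possible.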


Testing Interactive Music Systems

The context of music imposes on us unpredictable and free input events received from the environment. Even if a musician does not make any mistake in their play, the detection process itself can erroneously detect wrong pitches or altered durations. Because IMSs are involved in concerts with human musicians, they should not crash or report errors: “the show must go on!” However, how are IMSs currently tested and verified to prevent crashes at show-time?
Rehearsals. Figure 2.5 depicts a general sketch of a piece's lifetime. First, a score is created from ideas (and magic) by composers and several formats of this score are written (paper, electronic or both). Second, rehearsals, in a concert situation, test the IMS behaviors while the composers retrieve a global feedback on the piece. Then, the performance takes place and the show starts, giving feedback for future performances.


Although rehearsals are effective to assess the global feeling of a piece, they are not actually dedicated to IMS testing. Indeed, rehearsals are costly in time and money (a concert simulation) and cannot be disrupted by system crashes. Moreover, multiple drawbacks can be highlighted:
– Only the performance played during the rehearsal is tested, which differs from the concert one.
– It is real-time, in the sense that the piece has to be played physically; hence it takes one hour to test a single performance of a complete one-hour-long piece.
– This method is tedious since the composers must listen to the result to assess the IMS.
Finally, the rehearsal is more an artistic judgement than an IMS debugging session. For these reasons, another way to find and fix IMS bugs has to be found.
Given a system to test, called the Implementation Under Test (IUT), two well-known approaches are distinguished by whether or not they inspect the IUT's source code. White-box testing [88, 29] is based on traversing the source code (or an abstraction of it) of the IUT. On the contrary, black-box testing [88, 29] focuses only on the IUT's inputs/outputs. We follow these two approaches in order to introduce and present existing tools for testing IMSs. Moreover, we highlight their pros and cons for testing timed behaviors of systems and estimate each tool's efficiency in our context.
White-box testing Usual examples of white-box testing are assertions inserted into the source code, or fuzzing techniques which traverse instructions with a set of system inputs in order to assess the absence of errors.
Assertions are code instructions that check, using predicates or boolean functions, variable values (input or output). They are useful to stop the system when an unexpected value is detected. The MAX-test package for MAX-MSP [38] developed this feature for testing MAX patches. It provides an automatic tool for testing IMSs but requires both the expected and the IUT's values, which have to be computed in another manner. Assertion techniques are effective to test whether a value is between bounds (i.e. for parameters); however, they are tied to the code and fail to provide a manner to compute precise expected values. Therefore, assertions cannot be used for testing temporal properties.
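The principle of assertion-based checking can be sketched as below. The function name and interface are our own illustration, not the MAX-test API; note that the check says nothing about when the value was produced, which is the limitation pointed out above:

```python
# Sketch of assertion-based checking: a predicate over an observed value,
# with the bounds (the "expected" side) supplied from elsewhere.
def assert_in_bounds(name, value, lo, hi):
    assert lo <= value <= hi, f"{name}={value} outside [{lo}, {hi}]"

# A gain parameter must stay in [0, 1]; the check stops the run on
# violation but is blind to the *date* of the value.
assert_in_bounds("gain", 0.7, 0.0, 1.0)   # passes silently
try:
    assert_in_bounds("gain", 1.3, 0.0, 1.0)
except AssertionError as e:
    print(e)  # -> gain=1.3 outside [0.0, 1.0]
```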
Fuzzing techniques [51] have the advantage of being easy to set up since the goal is to start with a single input sequence and mutate it, in general randomly, to generate a set of inputs. The suite of input sequences is then fed to the IUT while monitoring its behavior. In [20], a fuzzing method is used directly in production, testing an operating system. This technique allows testing a wide number of programs with many input sequences. Moreover, they use constraint solvers and keep track of the instructions tested to mutate the best input and optimize the coverage of the generated input trace suite. However, only crashes and critical errors are checked with fuzzing techniques and, similarly to assertions, these techniques fail to provide expected values during the test. Although random generation is important for testing because it always raises unexpected errors, this generation is ineffective for covering code (i.e. executing all the instructions of a large code base). Finally, this technique cannot test the IUT's temporal behaviors.
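The random-mutation step of fuzzing can be sketched as follows. The mutation operators (drop, duplicate, swap) are our own choice for illustration, picked because they mirror plausible detection faults:

```python
# Sketch of input-sequence mutation for fuzzing: start from one seed
# sequence and derive random variants to feed to the IUT.
import random

def mutate(seq, rng):
    """Return a randomly mutated copy of a list of input symbols."""
    out = list(seq)
    op = rng.choice(["drop", "dup", "swap"])
    i = rng.randrange(len(out))
    if op == "drop" and len(out) > 1:
        del out[i]                       # mimics a missed event
    elif op == "dup":
        out.insert(i, out[i])            # mimics a spurious detection
    elif op == "swap" and len(out) > 1:
        j = rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]  # mimics events out of order
    return out

rng = random.Random(0)                   # seeded for reproducibility
seed = ["i1", "i2", "i3"]
suite = [mutate(seed, rng) for _ in range(5)]
# Each variant is then fed to the IUT, monitoring only for crashes.
```

As the text notes, this checks robustness (no crash) but provides no expected output against which to compare.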
Black-box testing We focus the presentation of black-box testing techniques on Model-Based Testing (MBT). A model is a simple specification of the IUT, abstracting a few of its aspects. Models are usually human-readable graphs, in which nodes abstract IUT states and transitions abstract the IUT's observable behaviors between states. The goal of a model is to give an easy specification of some characteristics of the system, e.g. its communications, its timed behaviors, its inputs/outputs, and to enable formal reasoning.
This method is the solution we have chosen to test the IMS Antescofo.
Some works on modeling the IMS i-Score [7, 6] have been produced. They use timed automata or Petri net models for verifying the timing and communication behaviors of the IMS. However, these models of i-Score focus on reasoning about formal model properties rather than on IMS testing.
Usually, the main limitation of MBT is the manual construction of models, which is tedious and error-prone. However, MBT is effective to specify temporal behavior. For our work, we follow MBT techniques for testing an IMS and designed an automatic solution to ease the model construction.

Model of Timed System

Model-based testing (MBT) [63] is a formal method to test whether a system conforms to a specification. In the remainder of the document, we call specification or model an abstract representation of a concrete system. A model allows reasoning automatically on system properties and is usually a simpler description of this system. MBT uses models in order to generate a suite of tests for assessing an implementation of the concrete system, called the Implementation Under Test (IUT).


In the rest of this section, we present relevant models to specify real-time systems. Thereafter, the MBT method is detailed with two different implementation approaches.

Input/Output System Models

A model can be defined over a set of actions A abstracting the input/output behaviors of a system. Usually, MBT methods are based on Finite State Machines (FSM), abstract machines with finite sets of locations, edges and actions. A location abstracts a system state and its outgoing edges the possible actions the system can take between states.
Model of Interactions Real-time systems are usually specified using states (abstracting a state of the concrete system), transitions (possible actions from a state) and three kinds of actions: input, output and internal actions. These actions specify an input expectation, an output emission and an internal system computation respectively. Input-Output Labeled Transition Systems (IOLTS) [87, 85] are models with finite sets of states, transitions and actions. Each transition is labeled with an action specifying an interaction with the environment. It is a simple model allowing the specification of open systems where some inputs are expected and some outputs returned.
Example 2.2.3: The example in Figure 2.6 depicts an IOLTS model. The model states are depicted with circles and the transitions with arrows, starting from a state called the source and leading to the target state. An action is depicted as ii?, for receiving an event (modeling an event detection); oi!, for emitting an output; and τ, for performing an internal action. The double-circled state (q5) designates the ending state, called the exit state, which is a state without outgoing transitions.
We specified in this figure a system reacting to two inputs i1 and i2. The system must send the corresponding output oi only once for i ∈ {1, 2}. However, if i2 is received before i1 then o1 is discarded. The model abstracts an internal action performed before returning o1.
It is a typical IMS behavior, which waits for event detections and needs to anticipate or manage a missed event during performance. One can imagine that inputs i1 and i2 are notes, chords or more complex event detections from a listening machine. The internal action can abstract an effect computation from the input i1 (for example echoing the input with a disturbance) resulting in the output o1; o2 can be a simpler atomic command, to switch off the output flow. Then, for example, a possible model might describe that if the first input is detected as expected (the normal case), the effect is computed for the output o1. Otherwise, if the musician played i2 and missed i1, the system has to omit this echo because the relevant input was not played, and the output o2 is sent directly.
Definition 2.1. An IOLTS is a 4-tuple ⟨Q, q0, A, T⟩ such that:
– Q is a non-empty and finite set of states,
– q0 ∈ Q is the unique initial state,
– A = Ain ∪ Aout ∪ {τ}, where Ain is the set of input actions, Aout is the set of output actions and τ ∉ Ain ∪ Aout is the generic internal action,
– T ⊆ Q × A × Q is the set of transitions.
We denote by i? ∈ Ain the wait for an input i and by o! ∈ Aout the emission of an output o. The set AV = Ain ∪ Aout is the set of observable actions; these actions are visible from outside of the model. We commonly use a generic letter to denote an emission of, or a wait for, a symbol in A, and likewise in AV.
An IOLTS is efficient for specifying input-output relations but time is not considered in the model. Therefore, an infinite amount of time can elapse in each state, as for example between ii and oi in Figure 2.6.
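An IOLTS is simple enough to encode directly as data. The sketch below is our own encoding; its transition relation follows the textual description of Example 2.2.3, not the figure itself:

```python
# A possible Python encoding of an IOLTS: the transition relation maps a
# (state, action) pair to the target state. TAU marks the internal action.
TAU = "tau"
iolts = {
    "init": "q0",
    "transitions": {
        ("q0", "i1?"): "q1",   # normal case: i1 detected first
        ("q1", TAU):   "q2",   # internal effect computation
        ("q2", "o1!"): "q3",   # the echo output
        ("q3", "i2?"): "q4",
        ("q4", "o2!"): "q5",   # q5 is the exit state (no outgoing edge)
        ("q0", "i2?"): "q4",   # i1 missed: o1 discarded, o2 sent directly
    },
}

def run(model, word):
    """Follow a sequence of actions; return the reached state, or None
    if the sequence is not a run of the model."""
    q = model["init"]
    for a in word:
        q = model["transitions"].get((q, a))
        if q is None:
            return None
    return q

print(run(iolts, ["i1?", TAU, "o1!", "i2?", "o2!"]))  # -> q5
print(run(iolts, ["i2?", "o2!"]))                     # -> q5
```

Note that the encoding is purely untimed: nothing constrains how long the system may stay in any state, which motivates the timed models of the next subsection.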

Time Modeling

Timed Automata (TA), defined in the 90s by Alur and Dill [2], are well-known for specifying time within a model. TAs are finite automata manipulating variables called clocks. A clock's value domain is the set of nonnegative reals R≥0, in order to abstract and manipulate time. Thus, every duration or clock value is in model time units (mtu). The principle of TAs is that all clocks advance synchronously by the same amount of time but can be independently reset to 0. In the model, clocks are used through constraints in two cases: guards, on transitions, to forbid the execution of a transition when the constraint does not hold; and invariants, on locations, to permit their occupation only while the constraint holds. We present Timed Automata with Input Output (TAIO), defined in [64], which are TAs with a set of actions A = Ain ∪ Aout ∪ {τ} and urgent locations. An urgent location is a location in which time is frozen and cannot advance.
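The interplay of guards, invariants and clock resets can be illustrated with a deliberately tiny example. The automaton below (one clock, two locations) is our own toy encoding, not a fragment of any cited formalism's tooling:

```python
# Toy timed-automaton step with a single clock x:
#   location "wait" has invariant x <= 2 mtu;
#   its only transition, to "done", has guard x >= 1 and resets x.
def step(location, x, delay):
    """Let `delay` mtu elapse in `location`, then try the transition.
    Returns the new (location, clock) pair, or None if the invariant
    is violated (the run is blocked)."""
    x += delay
    if location == "wait" and x > 2:      # invariant x <= 2 broken
        return None
    if location == "wait" and x >= 1:     # guard satisfied: fire
        return ("done", 0.0)              # clock reset on the edge
    return (location, x)                  # stay, clock has advanced

print(step("wait", 0.0, 1.5))   # -> ('done', 0.0)
print(step("wait", 0.0, 0.5))   # -> ('wait', 0.5)  guard not yet true
print(step("wait", 0.0, 2.5))   # -> None           invariant broken
```

The invariant thus bounds how long "wait" may be occupied, while the guard bounds when the transition may fire, exactly the two roles of clock constraints described above.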

Table of contents:

1 Introduction
2 Testing Mixed Music Systems: State of the Art 
2.1 Testing Mixed Music Systems
2.1.1 Composition and Authoring Systems
2.1.2 Interactive Music Systems for Performance
2.1.3 The score-based Interactive Music System Antescofo
2.1.4 Testing Interactive Music Systems
2.2 Model of Timed System
2.2.1 Input/Output System Models
2.2.2 Time Modeling
2.2.3 Network of Timed Automata
2.3 Model-Based Testing
3 Event and Time Triggered Model 
3.1 Interactive Real-Time Model
3.1.1 Principles of Interactive Real-Time Models
3.1.2 Syntax and Semantics
3.2 Correspondence with Timed Automata
3.2.1 Translation into Timed Automata
3.2.2 Soundness of the Translation
3.3 The Real-Time Virtual Machine
4 Real-Time Model-Based Testing Framework 
4.1 Automatic Model-Based Testing Workflow
4.1.1 Requirements
4.1.2 Model Construction
4.1.3 Generation of Input Test Data
4.1.4 Simulation for Generation of Reference Output
4.1.5 Test Execution
4.1.6 Comparison
4.1.7 Verdict
4.2 Model Construction Rules
4.2.1 Operators
4.2.2 Rules for requirements
4.2.3 Environment rules
4.2.4 Toy example model
4.3 Input Generation Algorithms
4.3.1 Model-Based Algorithms
4.3.2 Requirement-Based Algorithms
4.3.3 Stochastic Algorithms
5 Case Study: Application to the Interactive Music System Antescofo 
5.1 Antescofo
5.1.1 Architecture
5.1.2 Antescofo Domain Specific Language
5.2 Model-Based Testing Antescofo
5.2.1 Model Construction
5.2.2 Antescofo Models
5.2.3 Applying Test Framework
5.3 Experiments
5.3.1 Results with Offline Approach
5.3.2 Results with Online Approach
6 Conclusion and Perspectives 
6.1 Discussions
6.2 Related Work
6.3 Future Work

