Towards Writing of Time and Interaction in Computer Music

Psychology of musical expectation

In this section we briefly review the psychology of musical expectation, presented as a “natural science” approach to the perception and cognition of music, as pursued by the many researchers who model musical expectation in an effort to explain musical behavior. We begin by presenting several areas of experimental research in cognitive musicology that deal with expectation, with the aim of emphasizing the role of expectation in many aspects of music research. From section 2.1.2 onwards, we look more closely at some important concepts drawn from the literature that strongly inspire the modeling premises and definitions presented in section 2.2 and used throughout our proposals from part II of this thesis onwards.

Experimental Research Scopes

Research on the psychology of musical expectation is built on both experimental and theoretical frameworks, where the latter are most often the fruit of the former. In this thesis we are concerned with the theoretical implications of musical expectation, in order to motivate the design principles introduced in the following chapter. Because musical expectation is central to the perception and cognition of music, the range of experiments is vast and touches many key problems in music cognition. Here we briefly introduce some of the experimental setups in the music perception literature that deal with the issue, to motivate the theoretical aspects that follow. For more details the reader is referred to (Huron, 2006, Chapter 3).
An important area of research dealing with expectancy is the study of systematic variation in listeners’ judgements when exposed to a musical continuum. The main finding of these studies is that expectations are predictable from various music-theoretic and perceptual/cognitive principles of pattern classification (Schmuckler, 1990). A similar research paradigm concerns the processing and encoding of music information among listeners. For example, Bharucha and Stoeckig (1986) showed that a target chord is responded to more quickly and accurately when it follows a harmonically related prime chord than when it is preceded by a harmonically unrelated chord.
The effect of musical expectation on the production and performance of music information has also been a subject of experimental research. For example, Carlsen (1981) had listeners sing a continuation in response to a two-note interval. The time interval between the two notes as well as the response interval were analyzed for their frequency of occurrence and for the context interval. The results demonstrate that response intervals vary with context. In another experiment, Schmuckler (1989) had pianists complete different melodic and harmonic-context contours. He found that performers’ production mirrored expectancy judgements. One criticism of work in this field is that the majority of these studies address single-event anticipation in a sequence rather than multiple events. Despite this limitation, these experiments underline the importance of musical expectation in many musical activities, ranging from performance to listeners’ judgements.
Researchers working on musical memory have also contributed to the debates and experiments on musical expectation. The majority of research in this field has been dedicated to the importance of tonality in musical memory, especially in long-term memory structures. One candidate model is that expectancy and memory are positively correlated; that is, high-expectancy information is better remembered (Bartlett, 1932). Another model is the von Restorff effect, which states that isolating items from their background enhances learning and thus leads to faster access and recall in memory (Schmuckler, 1997).
The scope of musical expectation in experimental research is vast, and the above is just a glimpse of an ongoing effort in the research community. Besides revealing the importance of expectation in many aspects of music perception, such efforts have helped the emergence of various theories of auditory perception, whose key facets are studied in the coming sections.

Auditory Learning

In the music cognition literature, there is still an ongoing debate on the innate or learned nature of many auditory phenomena. Huron shows that nature does not have this preoccupation. From a biological perspective, there is a clear criterion for when it is best for a behavior to be instinctive and when it is best for it to be learned. The determining factor is the stability of the environment. When there is little environmental change, conditions favor instinctive or innate behaviors, which are usually fast and effective. On the contrary, when the environment changes quickly it is best to learn. Therefore, the difference between instinctive and learned behavior is not that the first is genetic and the second environmental. Contrary to intuition, learning involves more genetic machinery than do instinctive behaviors, and instincts reflect a longer and more profound interaction with the environment than does learning. The process of learning is thus just as much a product of evolution by natural selection as any pre-wired instinct. This evolved capacity to learn is referred to as the Baldwin Effect (see Huron, 2006, Ch. 4).
But how does auditory learning occur? Over the past half century, experimental research has shown that we are sensitive to the frequency of various stimuli in our environments. One example of such research has led to the Hick-Hyman law, which shows that reaction-time responses to known and unknown stimuli follow an orderly (logarithmic) law. In other words, perception is more efficient for expected stimuli than for unexpected stimuli.
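The logarithmic relation mentioned above is commonly written as RT = a + b·H, where H is the information (in bits) carried by the stimulus set. The following is a minimal sketch of that relation, not taken from the source; the constants `a` and `b` and the two example distributions are hypothetical values chosen only for illustration.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy (in bits) of a discrete distribution over stimuli."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def hick_hyman_rt(probabilities, a=0.2, b=0.15):
    """Predicted mean reaction time in seconds: RT = a + b * H.

    a (base time) and b (seconds per bit) are made-up illustrative values.
    """
    return a + b * entropy_bits(probabilities)


# Four equally likely stimuli versus one strongly expected stimulus.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.85, 0.05, 0.05, 0.05]

rt_uniform = hick_hyman_rt(uniform)
rt_skewed = hick_hyman_rt(skewed)
print(rt_uniform, rt_skewed)
```

Under these assumed constants, the skewed distribution carries less information on average and therefore yields the shorter predicted reaction time, which is the sense in which expected stimuli are processed more efficiently.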
One of the most important discoveries to this end in auditory learning has been that listeners are sensitive to the probabilities (or contingent frequencies) of different sound events and patterns, and that these probabilities are used to form expectations about the future. In other words, auditory learning is shaped by the frequency of occurrence of individual stimuli and groups of stimuli. Landmark evidence for this is the work of Saffran et al. (1999) at the University of Rochester. In one of their experiments, they constructed small musical vocabularies consisting of 3-note figures. Using these figures, they constructed a long (seven-minute) sequence that consisted of a random selection of six figures. Infants were exposed to this continuous succession of tones for 21 minutes. After this exposure, they were presented with 4 different 3-note sequences and their familiarity with each was assessed (using head-turning responses). As a result, infants correctly identified the three-note sequences they had been exposed to. The conclusion was that listeners were simply cueing on simple statistical properties of the various tone sequences. Saffran’s experiment has more recently been replicated by Loui et al. (2006) on adult subjects, using an uncommon tuning system (to avoid veridical and schematic expectations of familiar patterns among adult listeners), leading to similar observations. Similar observations have also been reported for the reproduction of rhythmic patterns among Western listeners, which matches the actual distribution of such rhythms in Western music (Sadakata, 2006).
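The kind of statistic a listener could be tracking in such a tone stream can be illustrated with first-order transition probabilities. The sketch below is an assumption-laden toy, not the procedure of Saffran et al.: the six three-symbol figures are placeholders with disjoint symbols (so that within-figure transitions are deterministic), and the stream length and random seed are arbitrary.

```python
import random
from collections import Counter

# Hypothetical vocabulary of six three-symbol figures (placeholders, not real stimuli).
FIGURES = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i"),
           ("j", "k", "l"), ("m", "n", "o"), ("p", "q", "r")]


def transition_probabilities(sequence):
    """Estimate P(next | current) from adjacent pairs in the sequence."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


# A continuous stream built from a random selection of figures, with no pauses.
rng = random.Random(0)
stream = [note for _ in range(300) for note in rng.choice(FIGURES)]

probs = transition_probabilities(stream)
within = probs[("a", "b")]           # transition inside a figure
across = probs.get(("c", "d"), 0.0)  # transition across a figure boundary
print(within, across)
```

In this toy stream, transitions inside a figure are far more probable than transitions across a figure boundary, so figure boundaries are recoverable from frequency information alone.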
An important property of auditory learning is that learned mechanisms can be fallible but still useful. In other words, what happens in the brain is a statistical approximation of the outside world and not the thing itself (see Huron, 2006, Ch. 4). This is due in part to how sounds and musical patterns are mentally represented in the brain, which is our next topic.

Concurrent and Competitive Representations

Following the observations of the previous section, one might ask about the information content of the mental representations over which contingencies or statistics are computed. The brain does not store sounds. Instead, it interprets, distills and represents sounds. It is suggested that the brain uses a combination of several underlying representations for musical attributes. A good mental representation would be one that captures or approximates some useful organizational property of an animal’s actual environment (Huron, 2006, Ch. 7).
But how does the brain know which representation to use? Huron suggests that expectation plays a major role. There is good evidence for a system of rewards and punishments that evaluates the accuracy of our unconscious predictions about the world. Our mental representations are perpetually tested on their ability to usefully predict ensuing events, suggesting that competing and concurrent representations may be the norm in mental functioning (Huron, 2006). This view is strongly supported by the neural Darwinism theory of Edelman (1987). According to this theory, representations compete with each other according to Darwinian principles applied to neural selection. Such neural competition is possible only if more than one representation exists in the brain. In handling different representations and their expectations, each listener will have a distinctive listening history in which some representations have proved more successful than others.
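One way to picture representations being tested on their predictive success is a toy competition between two predictors. The sketch below is purely illustrative and is neither Huron's nor Edelman's model: the two representations, the multiplicative `reward` and `penalty` factors, and the example melody (MIDI pitch numbers) are all assumptions.

```python
def repeat_pitch(history):
    """Representation 1 (absolute pitch): expect the last pitch to repeat."""
    return history[-1]


def repeat_interval(history):
    """Representation 2 (interval): expect the last melodic interval to repeat."""
    return history[-1] + (history[-1] - history[-2])


def compete(melody, predictors, reward=1.1, penalty=0.9):
    """Reward each representation when it predicts the next event, penalize it otherwise.

    Returns the normalized weights after the whole melody has been heard.
    """
    weights = {name: 1.0 for name in predictors}
    for t in range(2, len(melody)):
        history, actual = melody[:t], melody[t]
        for name, predict in predictors.items():
            weights[name] *= reward if predict(history) == actual else penalty
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# A steadily rising whole-tone line: the interval representation predicts it well.
scale = [60, 62, 64, 66, 68, 70, 72]
weights = compete(scale, {"pitch": repeat_pitch, "interval": repeat_interval})
print(weights)
```

A different listening history, for instance one dominated by repeated notes, would favor the other representation, which mirrors the point that each listener's history determines which representations have proved successful.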

Mental Representations of Expectation

According to Huron, memory does not serve for recall but for preparation. He addresses the structure rather than the content of mental representations and introduces a taxonomy of auditory memory that yields at least four sources of musical expectation, as follows (Huron, 2006, Ch. 12):
Veridical Expectation: Episodic memory is an explicit and autobiographical memory in which specific historical events from our past are held. Episodic memory is easily distorted through repeated retelling or recollection. Most importantly, our memories of familiar musical works are episodic memories that have lost most of their autobiographical history while retaining their accuracy. This sense of familiarity with, or expectation of, familiar works is referred to by Huron (2006) and Bharucha (1993) as veridical expectation.
Schematic Expectation: This type of expectation is associated with semantic memory, another type of explicit memory, which holds only declarative knowledge and is distinguished from episodic memory by the fact that it does not associate the knowledge with any historical past but holds it as stand-alone knowledge. This kind of memory is most active in first-exposure listening, where our past observations and learned schemas are generalized. These sorts of auditory generalizations are reminiscent of the learned categories characteristic of semantic memory.
Dynamic Adaptive Expectation: Expectation arising from short-term memory is dynamic adaptive expectation. It occurs when events do not conform to expectations that have been formed in the course of listening to the work itself. These expectations are updated in real time, especially during exposure to a novel auditory experience such as hearing a musical work for the first time. Patterns of repetition, form, and motivic structure are among the musical phenomena linked to dynamic expectancy.
Conscious Expectation: All three types of expectation discussed above are unconscious in origin. Another important class of expectations arises from conscious reflection and prediction. Such explicit knowledge might come from external sources of information (such as program notes), form part of a listener’s musical expertise, or even arise dynamically while listening to a novel musical work. An example of the last type, and the most important for this work, is the perception of musical form during listening.
All these expectation schemes operate concurrently and in parallel. Schematic expectations are omnipresent in all of our listening experiences. When listening to a familiar work, the dynamic-adaptive system remains at work, even though the veridical system anticipates exactly what to expect. Similarly, when listening for the first time to an unfamiliar work, the veridical system is constantly searching for a match with familiar works. The veridical system is essential for catching the rare moments of musical quotation or allusion. In short, an anticipatory effect such as surprise is the result of various types of interaction among these lower-level components of musical expectation. For a thorough discussion see (Huron, 2006, Ch. 12).

Anticipation Defined

Expectations are more than mere representations of our beliefs, and their existence is coupled with their consequent actions and effects. We saw earlier that expectations entail mental representations, whether partial, accurate or fallible. They are learned through interactions with a stable or unstable surrounding environment, and entertain our very acts of perception (by evoking attention, appraisal or emotions, as discussed in (Huron, 2006)). In this schema, it would be simplistic to separate any expectancy from its consequent effects. This is in fact at the core of every experimental study of expectancies (section 2.1.1). In other words, expectations lead to predictions, which themselves evoke actions, whether physiological, mental, or physical, whose outcomes in turn affect our beliefs and expectations. We study this active aspect of cognition pertaining to expectation under the term anticipation. With this introduction, we provide the following definition of anticipation:
Definition 2.1. Anticipation is an action that a system takes as a result of prediction, based on current belief or expectations, including actions on its own internal state or belief.
We now refine this definition from different perspectives: that of expectation, enaction and computational design.

Anticipation in view of Expectation

Expectation viewed as a cognitive process is intertwined with both biology and culture (Huron, 2006, Ch. 1). It is a biological adaptation with specialized physiological structures, and the capacity to form accurate expectations is important for survival among all species. Culture provides preeminent environments in which many expectations are acquired and assessed. When it comes to music, the context for predicting future events is dominated by the latter but does not exclude the former. From a biological perspective, the purpose of expectation is to prepare an organism for the future. It is through this preparatory physiological process that we experience arousal or attentional responses (or the reverse, reduced or inhibited responsiveness). From a phenomenological perspective, the most interesting property of expectation is that it evokes feelings and emotions. Huron proposes that emotions are evoked by expectation through five functionally distinct physiological systems that underlie his ITPRA theory: imagination, tension, prediction, reaction, and appraisal (Huron, 2006, Ch. 1). Each of these systems can evoke responses independently, involving both physiological and psychological changes.
The picture to retain here is that the mental representations evoking expectations are adaptive and in constant interaction with the living environment. Expectations imply some sort of mental representation or belief adapted to the surrounding environment. These mental representations can be partial and fallible. In all cases, these mental beliefs are in constant interaction with the living environment, evoking physiological or psychological actions. Without these interactions, which reflect the stabilities and instabilities of our environment, neither can expectations exist nor can we survive as living beings. Therefore anticipation, as defined above, constitutes not only the outcome of expectation but also expectations themselves: how they are formed, their contents and whether they exist.

Anticipation in view of Enaction

If expectations are determined by actions coupled with the environment, there is essentially no need to separate the representation from the action. In other words, perception is not a process in the brain but a kind of skillful activity and a mode of exploration of the environment, drawing on an implicit understanding of environmental regularities as depicted in section 2.1.2. This view is shared by the Enactive school of cognition (Varela et al., 1992; Noë, 2004). Such a view of cognition implies that perception is an activity of sensorimotor coupling with the environment, advocates the marriage of action and perception in cognitive studies, and rejects the idea, widespread in both philosophy and science, that perception is a process in the brain whereby the perceptual system constructs an internal representation of the world. Within this marriage of action and perception, the (internal) representation is coupled with sensorimotor actions, which for pure listening take the form of conceptual understanding (as developed in (Noë, 2004, Ch. 6)). Therefore, perceptual presence in the absence of accurate mental representations is assessed by an access controlled by patterns of expectational dependence with which we are familiar.
In other words, although our mental representations are rooted in empirical observations, we produce them by means of our active cognitive apparatus instead of passively processing the structure of the world. Perception is a kind of skillful activity on the part of an animal as a whole, and constitutes an active exploration of its environment. Representation thus consists of future potentialities of interaction with an environment. The connection of the enactive school of cognition to our work will be further developed in chapter 3.

Anticipation in view of Computation

The ideas presented above have also been exploited from a computational perspective. In fact, one of the main advantages of definition 2.1 is that it prepares the ground for a computational framework in which it becomes possible to study the interaction between a system’s beliefs about itself or its environment and the system itself. This view is shared by an emerging field in the artificial intelligence and robotics literature on anticipatory systems (see Rosen, 1985, for the original definition). The consideration of anticipation in a computational framework within this definition will be presented and detailed in chapter 3. Studies of anticipation along the lines of definition 2.1 imply explicit consideration and study of expectation. We therefore continue our review of the key elements of the expectancy literature that are helpful for modeling anticipation.
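As a rough illustration of how definition 2.1 reads in computational terms, the sketch below implements a toy loop of belief, prediction, action and belief update. It is not the framework developed in chapter 3; the class name `AnticipatoryListener`, the action labels and the symbol sequence are hypothetical choices made only for this example.

```python
from collections import Counter, defaultdict


class AnticipatoryListener:
    """Toy system: belief -> prediction -> action -> belief update."""

    def __init__(self):
        # Belief: how often each symbol has followed each context symbol.
        self.belief = defaultdict(Counter)

    def predict(self, context):
        """Most expected next symbol under the current belief, or None."""
        followers = self.belief[context]
        return followers.most_common(1)[0][0] if followers else None

    def step(self, context, outcome):
        """Act on the prediction, then revise the belief with the outcome."""
        prediction = self.predict(context)
        # External action: prepare for the predicted event, or just attend.
        action = "prepare:%s" % prediction if prediction is not None else "attend"
        # Action on the system's own internal state: update the belief.
        self.belief[context][outcome] += 1
        return action, prediction == outcome


listener = AnticipatoryListener()
sequence = list("abcabcabc")
log = [listener.step(x, y) for x, y in zip(sequence, sequence[1:])]
print(log)
```

The first steps produce no prediction because no belief has been formed yet; once the pattern has been heard, each step yields a preparatory action that matches the outcome, and every outcome in turn modifies the belief on which later predictions rest.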

Table of contents:

Acknowledgments
Abstract
Résumé en français
1 Introduction
1.1 Approach
1.2 Organization
1.3 Contributions
I From Modeling Anticipation to Anticipatory Modeling
2 Modeling Musical Anticipation
2.1 Psychology of musical expectation
2.1.1 Experimental Research Scopes
2.1.2 Auditory Learning
2.1.3 Concurrent and Competitive Representations
2.1.4 Mental Representations of Expectation
2.2 Anticipation Defined
2.2.1 Anticipation in view of Expectation
2.2.2 Anticipation in view of Enaction
2.2.3 Anticipation in view of Computation
2.3 Models of Musical Expectation
2.3.1 Music Theoretic Models
2.3.2 Automatic Learning Models
2.3.3 Information Theoretic Models
2.4 Modeling Investigations
2.4.1 Imperfect Heuristics and Naive Realism
2.4.2 Over-intellectualization of the intellect
2.4.3 Scientific pluralism
2.5 Summary
3 Anticipatory Modeling
3.1 Anticipatory Computing
3.2 General Modeling Framework
3.2.1 Markov Decision Process Framework
3.2.2 Interactive Learning in an Environment
3.3 Distinctions of Anticipatory Behavior
3.3.1 Implicit Anticipation
3.3.2 Payoff Anticipation
3.3.3 Sensorial Anticipation
3.3.4 State Anticipation
3.4 Learning Approaches
3.4.1 Reinforcement Learning
3.4.2 Learning Classifier Systems
3.5 Modeling Implications
3.5.1 Information as Available
3.5.2 Interactive and on-line Learning
3.5.3 Multimodal Interaction and Modeling
II What to Expect
4 Music Information Geometry
4.1 General Discussions
4.2 Preliminaries
4.2.1 Information Geometry of Statistical Structures
4.2.2 Elements of Bregman Geometry
4.2.3 Exponential Family of Distributions
4.2.4 Bregman Geometry and Exponential distributions
4.3 Music Information Geometry
4.3.1 Methodology
4.3.2 Data IR
4.3.3 Model IR
4.4 From Divergence to Similarity Metric
4.4.1 Symmetrized Bregman Divergences
4.4.2 Triangle Inequality
4.5 Incremental Model Formations
4.6 Discussions
5 Methods of Information Access
5.1 Incremental Clustering and Structure Discovery
5.1.1 Related Works
5.1.2 Audio Oracle Data Structure
5.1.3 Audio Oracle Learning and Construction
5.1.4 Sample Results
5.1.5 Discussions
5.2 Guidage: Fast Query-Based Information Retrieval
5.2.1 Research Scope
5.2.2 Related Works
5.2.3 General Framework
5.2.4 Search Domain and Meta Data
5.2.5 Guidage Algorithm
5.2.6 Resynthesis
5.2.7 Sample Applications and Results
5.2.8 Discussions
III How to Expect
6 Adaptive and Interactive Learning
6.1 Introduction
6.2 Background on Stochastic Music Modeling
6.2.1 Memory Models
6.2.2 Approaches to Statistical Learning
6.2.3 Approaches to Planning and Interaction
6.3 General Discussions
6.4 Active Learning Architecture
6.4.1 Audio Oracles for Memory Models
6.4.2 Guidage for Active Selection
6.5 Anticipatory Learning
6.5.1 Competitive and Collaborative learning
6.5.2 Memory-based Learning
6.6 Active Learning Algorithm
6.6.1 Model Complexity
6.7 Results and Experiments
6.7.1 Knowledge-Based Interactions
6.7.2 Anticipatory Style Imitation and Automatic Improvisation
6.8 Discussions
IV When to Expect
7 Anticipatory Synchronization
7.1 Introduction
7.2 Background
7.2.1 Score Following Research
7.2.2 Cognitive Foundations of Musical Time
7.2.3 Compositional Foundations of Time
7.2.4 Probabilistic Models of Time
7.3 General Framework
7.3.1 Anticipatory Multimodal Inference
7.3.2 Hybrid Models of Time
7.4 Inference Formulation
7.5 Stochastic model of time in music performance
7.5.1 Attentional Model of Tempo
7.5.2 Tempo Agent and Decoding
7.5.3 Survival Distribution Model
7.6 Music Score Model
7.6.1 Basic Events
7.6.2 Special timed events
7.7 Observation Model
7.8 Evaluation
7.8.1 Evaluation of Tempo Prediction
7.8.2 Evaluation over synthesized audio from score
7.8.3 Evaluation of real-time Alignment
7.9 Discussions
8 Towards Writing of Time and Interaction in Computer Music
8.1 Background
8.1.1 Computer Music Language Paradigms
8.1.2 Practical Status
8.1.3 Compositional Status
8.1.4 Research Status
8.2 Antescofo: A preliminary tool for writing of time and interaction
8.2.1 Motivations
8.2.2 General Architecture
8.3 Antescofo: A modular and concurrent synchronizer
8.4 Antescofo’s Score Semantics
8.4.1 Event Declarations
8.4.2 Control Commands
8.4.3 Action Declarations
8.5 From the Time of Composition to the Time of Performance in Antescofo
8.6 Discussions and Future Directions
8.6.1 Augmenting the Semantics of Interaction
8.6.2 Multimodal Coordination
8.6.3 Intuitive Interfaces
8.6.4 Relating to the Community
9 Conclusions
9.1 The story so far
9.2 Outlook
V Appendices
A Supplemental Material for Part II
A.1 Properties of Multinomial Manifolds
A.2 Bregman Divergence Symmetrization
A.2.1 Geodesic-walk Algorithm for Multinomial Manifolds
B Supplemental Material for Part IV
B.1 Derivation of Forward Recursion
B.2 Raphael’s Tempo Inference Model
References