Instrumental Music Skills as Input Primitive for Visual Creative Authoring 


Syntactic and semantic knowledge

Shneiderman uses the syntactic/semantic model of user behavior to explain the power of DM (Shneiderman 1983). On the one hand, syntactic knowledge relates to the arbitrary syntax that must be memorized for user interaction. One example is the syntax of programming languages, which is often unique: the same software (e.g., a calculator) can be implemented in multiple languages, each with its own syntax (e.g., one can implement a calculator in C, Python, or Java). To code the calculator, one needs to memorize the specific syntax and articulate instructions in its terms. On the other hand, semantic knowledge concerns context-specific concepts related to the domain where the interaction occurs. This knowledge is organized hierarchically, from low-level to high-level concepts, with higher levels building on lower ones. In the context of music, for example, semantic knowledge in increasing complexity could include notes, timing, dynamics, scales, chords, harmony, and genre-specific repertoire. Unlike syntactic knowledge, semantic knowledge is independent of system-specific rules.
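To make the distinction concrete, the following sketch (illustrative only, not from Shneiderman) expresses the same semantic task, evaluating the arithmetic expression "3 + 4 × 2", in two different syntaxes. The arithmetic itself (semantic knowledge) is identical; only the notation the user must memorize (syntactic knowledge) differs.

```python
def eval_rpn(tokens):
    """Evaluate a Reverse Polish Notation expression, e.g. '3 4 2 * +'."""
    stack = []
    for tok in tokens.split():
        if tok in "+-*/":
            b, a = stack.pop(), stack.pop()
            stack.append({"+": a + b, "-": a - b,
                          "*": a * b, "/": a / b}[tok])
        else:
            stack.append(float(tok))
    return stack[0]

infix_result = 3 + 4 * 2             # familiar infix syntax
rpn_result = eval_rpn("3 4 2 * +")   # same semantics, postfix syntax

assert infix_result == rpn_result == 11
```

A user who knows arithmetic can, in principle, operate either notation; what DM removes is precisely the burden of learning the second, arbitrary layer.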
For Shneiderman, the power of DM comes from abstracting away the syntactic knowledge, empowering users to focus instead on semantic knowledge, which is often familiar to them. As Shneiderman explains: “The object of interest is displayed so that actions are directly in the high-level problem domain. There is little need for decomposition into multiple commands with a complex syntactic form. On the contrary, each command produces a comprehensible action in the problem domain that is immediately visible. The closeness of the problem domain to the command action reduces operator problem-solving load and stress” (ibid.).

Gulf of execution and gulf of evaluation

Hutchins and colleagues (1985) provided a cognitive account for the efficacy of DM according to two critical problems in interacting with computer systems: the gulf of execution and the gulf of evaluation. On the one hand, the gulf of execution concerns the problem of how the user expresses intention using the available controls provided by the interface. In other words, it regards users’ expectations of how the interface controls may help them concretize their intentions. Hutchins and colleagues argue that the gulf of execution can be reduced by “making the commands and mechanics of the system match the thoughts and goals of the user” (ibid.).
On the other hand, the gulf of evaluation concerns the problem of understanding the results of users’ actions through the feedback provided by the interface. Hutchins and colleagues argue that this gulf can be reduced by “making the output displays present a good conceptual model of the system that is readily perceived, interpreted and evaluated” (ibid.).
Hutchins and colleagues argue that the wider an interface’s gulfs of execution and evaluation, the greater the cognitive effort that interface requires, making it less direct. Therefore, an effective DM system can be achieved by reducing both gulfs.

Instrumental degrees for directness

Human beings use a wide diversity of tools to interact with the physical world, augmenting the capabilities of the human body. Using this metaphor, Michel Beaudouin-Lafon generalizes DM in terms of an interaction model he calls instrumental interaction (Beaudouin-Lafon 2000). Instrumental interaction is based on two key components: the domain objects (i.e., objects of interest) and the tools (i.e., “instruments”). Domain objects are the computational representations we act upon. Tools are actionable computational mediators that enable us to interact with domain objects, providing feedback while the action happens. For example, in traditional drawing applications, the canvas represents a domain object (i.e., a computational representation of a sheet of paper), whereas the pencil and the paint bucket represent tools (i.e., one for freehand drawing, the other for painting a particular area).
Beaudouin-Lafon argues that instrumental power can be explained in terms of three properties. Degree of indirection concerns the spatial and temporal distance between the tool’s actions and their perceivable effects on the domain object. More specifically, spatial distance concerns the match between where the action occurs and where its output appears. For example, the pencil has a low spatial distance, as the place where the drawing is displayed is almost the same as where the pencil stroke occurs. Temporal distance concerns the match between when the action occurs and when its output appears. For example, the pencil has a low temporal distance, as the drawing trace is continuously displayed while drawing occurs. Therefore, the lower the degree of indirection (i.e., the lower the spatiotemporal distance), the higher the directness.
Degree of integration concerns the match between the degrees of freedom provided by the input device and those provided by the control parameters of the tool. For example, navigating multidimensional data using a two-dimensional mouse will likely result in a low degree of integration and, therefore, in lower directness.
Degree of compatibility concerns the behavioral match between the physical actions required by the tools and their perceivable effects on the domain objects. As Beaudouin-Lafon exemplifies: “Using text input fields to specify numerical values in a dialog box, e.g., the point size of a font, has a very low degree of compatibility because the input data type is different from the output data type. Similarly, specifying the margin in a text document by entering a number in a text field has a lower degree of compatibility than dragging a tab in a ruler” (Beaudouin-Lafon 2000). Therefore, the fewer mediators necessary for the interaction, the higher the degree of compatibility and the higher the directness. Beaudouin-Lafon explains that effective directness translates to a low degree of indirection, a high degree of integration, and a high degree of compatibility. Successful DM systems articulate these three characteristics.
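Beaudouin-Lafon’s margin example can be restated as a minimal sketch (my own illustration, not his code): the text-field path requires a type conversion mediating between input and output, whereas the drag path maps a spatial displacement directly onto the spatial quantity being edited.

```python
# Two ways of setting a document margin, contrasting degrees of compatibility.

class Document:
    def __init__(self, margin_px=40):
        self.margin_px = margin_px

doc = Document()

# Low compatibility: the input (a character string) differs in type from
# the output (a length in pixels); a conversion mediates the interaction.
def set_margin_from_text_field(doc, text):
    doc.margin_px = int(text)      # string -> number conversion required

# Higher compatibility: dragging a ruler tab produces a spatial displacement
# that maps one-to-one onto the spatial quantity being edited.
def drag_ruler_tab(doc, dx_px):
    doc.margin_px += dx_px         # the displacement IS the margin change

set_margin_from_text_field(doc, "60")   # type the value "60"
drag_ruler_tab(doc, 15)                 # then drag the tab 15px further
assert doc.margin_px == 75
```

The drag path needs no intermediate representation, which is exactly what the degree of compatibility captures.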

Where are we today?

While almost 40 years old, DM remains a relevant research topic in HCI. One recent research direction argues that traditional WIMP GUIs are intrinsically indirect, as these interfaces do not seem to fulfill the requirements for directness. For example, dialog boxes arguably have a high degree of spatial and temporal indirection, as they shift user attention from the object of interest to the dialog window. Similarly, “specifying the margin in a text document by entering a number in a text field has a lower degree of compatibility than dragging a tab in a ruler” (Beaudouin-Lafon 2000). In this direction, some authors have proposed powerful new ways to explore DM beyond WIMP GUIs. Here, we cover some examples.
Dragicevic and colleagues (2008) explore DM in the context of video browsing. Instead of relying on the time-oriented seeker bars traditionally used for video, the authors propose a new interaction technique focused on the visual objects of the video, showing how their trajectories flow over time via mouse dragging, implemented in a new tool called DimP. Evaluation has shown that DimP outperforms traditional seeker bars for navigating video based on specific visual elements, and participants characterized it as cleaner and easier to use. Recent Ph.D. theses have also investigated DM. Bridging research from information visualization and HCI, Charles Perin (2014) explores DM principles to introduce novel interaction techniques for information visualization. One example is ‘À Table!’, which enables users to directly manipulate sports ranking tables, focusing on how variables (e.g., number of points, victories, and defeats) evolve and compare to each other over time. Interaction occurs by mouse-dragging visual elements of the ranking table instead of relying on interface widgets. A mixed-methods user study with 143 participants suggests that ‘À Table!’ enabled participants to improve their analysis of soccer championships compared to traditional ranking tables. A similar example is the Interactive Horizon Graphs, which support parallel visualization and comparison of multiple time series. A quantitative user study with 9 participants shows that the Interactive Horizon Graphs improved task performance (in terms of time, correctness, and error) in scenarios involving a large number of time series.
Another recent Ph.D. thesis applies DM to traditional desktop interaction (Aceituno 2015). Here, Jonathan Aceituno explores different accounts of directness within the HCI community, synthesizing them into a unified theoretical framework for directness. This framework drives the development of novel interaction techniques that extend the expressive capacity of standard input devices in desktop environments. One example is subpixel interaction, a technique that enriches the mouse to enable continuous direct manipulation in between discrete screen pixels, resulting in increased mouse accuracy in constrained display areas. Subpixel interaction enables a wide range of new expressive applications, such as editing calendar events (with minute precision) and selecting video frames (with frame precision). Other examples are the push-edge and slide-edge, two novel auto-scrolling techniques that outperform the default scrolling technique of OS X in terms of reduced selection time and reduced overshooting.
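The core idea behind subpixel interaction can be sketched as follows (a simplified illustration with hypothetical parameters, not Aceituno’s implementation): the mouse sensor reports more counts than there are screen pixels, so accumulating raw counts at a finer resolution than the pixel grid lets a widget spanning few pixels address many distinct values.

```python
COUNTS_PER_PIXEL = 8   # assumed device resolution relative to the display

class SubpixelCursor:
    def __init__(self):
        self.position = 0.0        # in pixels, but with fractional precision

    def move(self, raw_counts):
        # Integer-pixel interaction would discard the fraction, collapsing
        # small movements to zero; keeping it preserves in-between positions.
        self.position += raw_counts / COUNTS_PER_PIXEL

cursor = SubpixelCursor()
for _ in range(3):
    cursor.move(1)                 # three 1-count increments
assert cursor.position == 0.375    # 3/8 of a pixel: a between-pixel state
```

A calendar widget 20 pixels tall could then distinguish 160 values rather than 20, which is the kind of gain the minute-precision calendar example relies on.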
Another context in which DM has recently been explored is end-user programming. Hempel and Chugh (2016) introduce Sketch-n-Sketch, an authoring environment for SVG drawing that brings DM to text-based structured programming. With Sketch-n-Sketch, snippets of textual program code can be created via a WIMP GUI, allowing parameter customization via mouse dragging (e.g., one can drag variables to change their values and verify the impact of the change on the SVG in real time). As a consequence, non-expert programmers can benefit from basic templates and customization, while experts can create more complex extensions to fit their needs.
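The interaction style described above can be sketched in miniature (a hypothetical toy, not the actual Sketch-n-Sketch implementation): a program parameter is exposed to direct manipulation, a drag writes the new value back into the program text, and the modified program regenerates the SVG output.

```python
import re

program = "radius = 20"            # the user's tiny 'program'

def render(source):
    """Run the program and emit SVG for the circle it defines."""
    env = {}
    exec(source, env)
    return f'<circle cx="50" cy="50" r="{env["radius"]}"/>'

def drag_radius(source, delta):
    """A mouse drag updates the variable's literal in the source text."""
    value = int(re.search(r"radius = (\d+)", source).group(1))
    return re.sub(r"radius = \d+", f"radius = {value + delta}", source)

program = drag_radius(program, +5)  # the user drags the circle outward
assert program == "radius = 25"     # the code itself was updated
assert render(program) == '<circle cx="50" cy="50" r="25"/>'
```

The point is the bidirectional link: the drag edits the code, not just the picture, so the program remains the single source of truth.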


Mainstream WIMP GUIs in music tools

Following the tradition of commercial human-computer systems, the most successful interface type for music tools is the WIMP GUI (c.f., section ‘Direct Manipulation’). I also include within this metaphor visual controls that emulate the physical input controls commonly used in hardware music equipment, such as knobs, buttons, and faders. Sometimes, even the ‘look and feel’ of this hardware is visually emulated. This metaphor is implemented by the most widely used and successful commercial music software. Today, several Digital Audio Workstations (DAWs), such as Ableton Live, Pro Tools, and Apple’s GarageBand, Virtual Studio Technology (VST) plugins, and even some audio programming environments (e.g., Reaktor), to name a few, illustrate this paradigm.

Mobile-based applications

With the popularization of mobile computing and software-distribution platforms like the Apple App Store and Google Play, a plethora of remarkable touch-based mobile musical applications are being designed today, contributing in 2015 alone to an estimated $28.5 million market of computer music applications (National Association of Music Merchants 2015). A relatively recent survey of these applications (focused only on iOS) can be found in Thor Kell’s Master’s thesis (2014), which analyzes 337 tools according to their interface metaphors and mapping strategies. Notable examples such as Loopy, Borderlands, and Ocarina illustrate how these applications can be intuitive and powerful, captivating amateur and expert musicians alike. Chapter 3 attempts to go in this same direction.

Visual programming for computer music

Visual-oriented tools are also common in computer music programming. Examples such as Max/MSP (Puckette 2002) and Puredata (Puckette 1997), two dataflow-based visual programming environments, are among the most popular tools in the computer music community. In a recent study (2018), Pošcic and Krekovic surveyed 218 users of five of these tools, including Max/MSP and Puredata, and showed that: 1) most developers are male and have extensive academic training in music; 2) most first heard of music programming during university (28%) or from friends (26%); and 3) most learn programming through online written tutorials (61%) and video tutorials (39%).

Table of contents:

1 Introduction 
1.1 Direct manipulation
1.1.1 Syntactic and semantic knowledge
1.1.2 Gulf of execution and gulf of evaluation
1.1.3 Instrumental degrees for directness
1.1.4 Where are we today?
1.2 Goal: Direct manipulation for musical interface design
1.3 Musical interfaces based on visuals
1.3.1 Iannis Xenakis’ UPIC and hand-sketching metaphor
1.3.2 Laurie Spiegel’s Music Mouse
1.3.3 Mainstream WIMP GUIs in music tools
1.3.4 Mobile-based applications
1.3.5 Visual programming for computer music
1.3.6 Golan Levin’s painterly interfaces
1.3.7 Sergi Jordà’s sonigraphical instruments
1.3.8 Jean-Michel Couturier’s graphical interaction for gestural control
1.3.9 Thor Magnusson’s screen-based instruments
1.3.10 Ge Wang’s principles of visual design
1.3.11 Virtual reality musical instruments
1.3.12 Frameworks for musical interface design
1.4 Evaluating musical interface design
1.5 Summary
1.6 Thesis structure
2 What does “Evaluation” mean for the NIME community? 
2.1 Abstract
2.2 Introduction
2.3 Background
2.4 Research questions
2.5 Methodology
2.6 Results
2.6.1 Question 1: Evaluated Target
2.6.2 Question 2: Goals of the Evaluation
2.6.3 Question 3: Criteria
2.6.4 Question 4: Approach
2.6.5 Question 5: Duration
2.7 Discussion
2.8 Problems & Limitations
2.9 Conclusion
2.10 Acknowledgments
3 Exploring Playfulness in NIME Design: The Case of Live Looping Tools 
3.1 Abstract
3.2 Introduction
3.3 Background
3.3.1 Playfulness in NIME
3.4 Methodology
3.4.1 Choose a case study
3.4.2 Survey of existing tools
3.4.3 Create a design space
3.4.4 Explore potential guidelines for achieving playfulness
3.4.5 Iteratively prototype solutions
3.5 Implementation
3.5.1 Advanced looping capacity
3.5.2 Low input capacity and direct mappings
3.5.3 Transparent and intense visual feedback
3.6 Discussion
3.7 Conclusion
3.8 Acknowledgements
4 Designing Behavioral Show-Control for Artistic Responsive Environments: The Case of the ‘Qualified-Self’ Project 
4.1 Abstract
4.2 Introduction
4.3 Related work
4.4 Design study
4.4.1 Requirements elicitation & Survey
4.4.2 Field observation & Interviews
4.4.3 Ideation
4.5 Designing behavioral show-control
4.5.1 Introducing ZenStates
4.5.2 Introducing ZenTrees
4.6 Evaluation
4.6.1 Preliminary “real world” application
4.7 Discussion
4.7.1 Contrasting ZenStates and ZenTrees
4.8 Limitations & Future work
4.9 Conclusion
4.10 Acknowledgment
5 ZenStates: Easy-to-Understand Yet Expressive Specifications for Creative Interactive Environments 
5.1 Abstract
5.2 Introduction
5.3 Related work
5.3.1 Non-programmers expressing interactive behaviors
5.3.2 Experts programming interactive behaviors
5.4 Introducing ZenStates
5.4.1 Enriched state machines as specification model
5.4.2 Key features
5.5 Implementation
5.6 Evaluation
5.6.1 Exploratory scenarios
5.6.2 User study
5.7 Limitations
5.8 Conclusion
6 Instrumental Music Skills as Input Primitive for Visual Creative Authoring 
6.1 Abstract
6.2 Introduction
6.3 Background
6.3.1 Creative authoring for music
6.3.2 Fostering creativity in authoring systems
6.4 Instrumental Music Skills as Input Primitive
6.4.1 DG1: Building upon existing musical expertise
6.4.2 DG2: Lowering specification model complexity
6.4.3 DG3: Increasing authoring environment directness
6.5 Introducing StateSynth
6.5.1 State machines, states, and tasks
6.5.2 The Blackboard
6.5.3 Transitions
6.5.4 Extensibility
6.6 Evaluation
6.6.1 Procedure
6.6.2 Data analysis
6.7 Results
6.7.1 Expressive ceilings
6.7.2 Learnability thresholds
6.8 Limitations
6.9 Summary and Perspectives
7 Conclusion 
7.1 Interface Design Strategies
7.1.1 Continuous representation of the object of interest
7.1.2 Physical actions instead of complex syntax
7.1.3 Rapid, incremental, reversible operations
7.1.4 Impact on the object of interest is immediately visible
7.2 Evaluation Approaches
7.3 Limitations and Criticisms
A Surveying Live Looping Tools 
B ZenStates User Study 
B.1 Raw data
B.2 Data analysis script
C StateSynth User Study–Raw transcriptions 
C.1 Participant 1
C.1.1 Questionnaire
C.1.2 Profiling interview
C.1.3 First meeting
C.1.4 Second meeting
C.1.5 Retrospective meeting
C.2 Participant 2
C.2.1 Questionnaire
C.2.2 Profiling interview
C.2.3 First meeting
C.2.4 Second meeting
C.2.5 Retrospective meeting
D Supplementary video
