mubu.*mm: Probabilistic Models for Designing Motion & Sound Relationships

June 29, 2014 #GMM #GMR #HMM #HMR #Mapping-by-Demonstration #Mapping #Max #MuBu

We just released the beta version of mubu.*mm, a set of Max objects for probabilistic modeling of motion and sound relationships.

Machine Learning for Sonic IxD

Designing the relationship between motion and sound is a crucial aspect of creating interactive sonic systems. In this context, Machine Learning has proved to be an efficient design support tool, allowing users to easily build, evaluate, and refine gesture recognizers, movement-sound mappings, and control strategies. This collection of objects provides a set of tools for building such mappings interactively. We propose four probabilistic models with complementary properties in terms of multimodality and temporality. We believe that a clear definition and understanding of these properties is crucial for designing interactions efficiently.

2 Criteria

We define two criteria for understanding the models' properties: multimodality and temporality.

Multimodality: movement models provide a statistical model of the input (motion) modality only; they support gesture classification, recognition, and following. Multimodal models provide a joint representation of the motion and sound parameter streams, and are therefore suited to learning continuous motion-sound mappings.

Temporality: instantaneous models implement a direct relationship between motion and sound features, meaning that the model's output depends only on the current motion features. By contrast, temporal models take into account the temporal evolution of the movement parameters, so the output depends on the history of the input, which is typically important for recognizing dynamic gestures.
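As a rough formalization (our notation, not from the original post): writing $x_t$ for the motion features and $y_t$ for the sound parameters at time $t$, the temporality criterion distinguishes

$$\hat{y}_t = f(x_t) \quad \text{(instantaneous)} \qquad \text{from} \qquad \hat{y}_t = f(x_{1:t}) \quad \text{(temporal)},$$

while the multimodality criterion distinguishes movement models, which estimate a density $p(x)$ over motion features alone, from multimodal models, which estimate the joint density $p(x, y)$ and derive from it the conditional $p(y \mid x)$ used for mapping.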

4 Models

We propose four models covering each combination of these criteria: Gaussian Mixture Models (mubu.gmm; movement, instantaneous), Gaussian Mixture Regression (mubu.gmr; multimodal, instantaneous), Hierarchical Hidden Markov Models (mubu.hhmm; movement, temporal), and Hierarchical Multimodal Hidden Markov Models (mubu.xmm; multimodal, temporal).

[Figure: schematic representation of the four probabilistic models]
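To make the multimodal/instantaneous case concrete, here is a minimal Gaussian Mixture Regression sketch in Python. This is not the MuBu implementation: the function names and the NumPy/scikit-learn dependencies are our own, and the sketch only illustrates the underlying math (fit a GMM on the joint motion-sound space, then predict the sound parameters as the conditional expectation given the motion input).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(motion, sound, n_components=3):
    """GMR training: fit a GMM on the joint [motion, sound] feature space."""
    joint = np.hstack([motion, sound])  # shape (n_frames, dx + dy)
    return GaussianMixture(n_components=n_components,
                           covariance_type='full').fit(joint)

def gmr_predict(gmm, x, dx):
    """GMR prediction: E[sound | motion = x], with dx motion dimensions."""
    K = len(gmm.weights_)
    resp = np.empty(K)                              # per-component responsibilities
    cond = np.empty((K, gmm.means_.shape[1] - dx))  # per-component conditional means
    for k in range(K):
        mu_x, mu_y = gmm.means_[k, :dx], gmm.means_[k, dx:]
        S = gmm.covariances_[k]
        S_xx, S_yx = S[:dx, :dx], S[dx:, :dx]
        diff = x - mu_x
        # Conditional mean of the sound parameters under component k
        cond[k] = mu_y + S_yx @ np.linalg.solve(S_xx, diff)
        # Weighted likelihood of x under component k's motion marginal
        resp[k] = gmm.weights_[k] * np.exp(-0.5 * diff @ np.linalg.solve(S_xx, diff)) \
                  / np.sqrt(np.linalg.det(2 * np.pi * S_xx))
    resp /= resp.sum()
    return resp @ cond  # responsibility-weighted mixture of conditional means
```

Called once per input frame, this yields a continuous motion-to-sound mapping; in the Max objects, training and per-frame prediction are of course handled by mubu.gmr itself.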

Implementation

The models are implemented in an interactive workflow in the Max programming environment. The implementation builds extensively on MuBu, a multimodal data container used to build, visualize, and annotate the training data (see the screenshots below). The models are designed for performance use: their output parameters are continuously updated, allowing continuous real-time recognition and mapping.
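As a hedged illustration of this per-frame update scheme (reusing the hypothetical gmr_predict sketch from above, not the actual MuBu API):

```python
import numpy as np

def run_mapping(model, motion_stream, dx, send_to_synth):
    """Continuous mapping loop: one prediction per incoming motion frame."""
    for frame in motion_stream:                # e.g. sensor frames at 100 Hz
        params = gmr_predict(model, np.asarray(frame), dx)
        send_to_synth(params)                  # update the synthesis engine immediately
```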

Download

The models are part of the MuBu distribution, freely available on Ircam Forumnet. Note that the models have been released as a beta version! Please provide feedback, bug reports, and feature requests on the MuBu user group.


References

This work was presented at NIME 2014; slides from the presentation are available for download.

  • Jules Françoise, Norbert Schnell, Riccardo Borghesi, and Frédéric Bevilacqua, “Probabilistic Models for Designing Motion and Sound Relationships,” in Proceedings of the 2014 International Conference on New Interfaces for Musical Expression (NIME'14), London, UK, 2014. http://julesfrancoise.com/blog/wp-content/uploads/2014/06/Françoise-et-al.-2014-Probabilistic-Models-for-Designing-Motion-and-Sound-Relationships.pdf.
    Abstract: We present a set of probabilistic models that support the design of movement and sound relationships in interactive sonic systems. We focus on a mapping-by-demonstration approach in which the relationships between motion and sound are defined by a machine learning model that learns from a set of user examples. We describe four probabilistic models with complementary characteristics in terms of multimodality and temporality. We illustrate the practical use of each of the four models with a prototype application for sound control built using our Max implementation.
    Acceptance Rate: 25%

Demo Video

[Embedded demo video]

Screenshots

[Screenshots of the Max implementation]
