Automatic characterization of ornamentation from bassoon recordings for expressive synthesis

Montserrat Puiggròs, Emilia Gómez, Rafael Ramírez, Xavier Serra
Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
{mpuiggros, egomez, rramirez, xserra}@iua.upf.edu

Roberto Bresin
Speech, Music and Hearing, Royal Institute of Technology, Stockholm, Sweden
roberto@kth.se

ABSTRACT

Expressive performance characterization is traditionally based on the analysis of the main differences between performances, players, playing styles and emotional intentions. This work addresses the characterization of expressive bassoon ornaments by analysing audio recordings played by a professional bassoonist. This characterization is then used to generate expressive ornaments from symbolic representations by means of machine learning.

INTRODUCTION

Expressive performance characterization analyses differences between performances, performers, playing styles and emotional intentions (Juslin and Sloboda 2002). Most research focuses on timing deviations, dynamics and vibrato (see for instance (Sundberg et al. 2003) and (Bresin and Friberg 2000)), whereas much less work has been devoted to ornamentation. Ornaments are indicated in the score without any explicit information about their timing and dynamics. Some studies have already examined the behaviour of ornaments in piano performances (Moore 1992). We study here how this work on the piano can be extended to other instruments, such as the bassoon, a woodwind instrument. Since no expressive MIDI extracted from bassoon performances was available, we directly analyse expressive audio recordings played by a professional musician.

METHOD

The block diagram of the system is presented in Figure 1. The study is divided into two main stages, analysis and synthesis, which correspond to the two main goals of this work: first, to study the behaviour of ornamentation by analysing timing and dynamics in bassoon recordings; then, to use the acquired knowledge to generate expressive trills in symbolic notation with machine learning tools.

In the analysis stage we describe the behaviour of ornaments, more precisely trills and appoggiaturas, by automatically extracting timing and dynamics information from bassoon recordings. The recordings used in this study belong to a sonata by Michel Corrette (an 18th-century composer). Each movement is played at three different tempi, yielding a total of 96 ornaments, including trills and appoggiaturas. The result of this analysis is a melodic description of each ornament. In the synthesis stage we first model the ornaments' behaviour from the analysis data using different machine learning methods. Finally, using a further machine learning method, we generate expressive ornaments in symbolic notation and insert them as notes in the input melody.

Figure 1: Block diagram of the system.

Analysis

The analysis stage consists of the melodic description of the sound material. As mentioned above, we characterize a set of expressive recordings of a sonata by Michel Corrette (a Baroque sonata) played by a professional bassoonist. There are three movements: Adagio, Allegro moderato and Affettuoso. Each movement is played at three different tempi: the Adagio at 50, 68 and 100 bpm, and the Allegro moderato and Affettuoso at 60, 92 and 120 bpm.
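The analysis starts from frame-level estimates of fundamental frequency and energy, as described in the next section. The following sketch is an illustration only of how such descriptors could be computed with the librosa library; the library choice, frame sizes, pitch range and file name are assumptions and not part of the system described in the paper.

# Sketch: frame-wise fundamental frequency and energy for a bassoon excerpt.
# Library, frame sizes, pitch range and file name are illustrative assumptions.
import numpy as np
import librosa

def frame_descriptors(path, frame_length=2048, hop_length=256):
    y, sr = librosa.load(path, sr=None, mono=True)

    # Instantaneous fundamental frequency (Hz), restricted to a rough bassoon range.
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("A#1"),   # ~58 Hz, near the lowest bassoon note
        fmax=librosa.note_to_hz("E5"),    # ~660 Hz, upper register
        sr=sr,
        frame_length=frame_length,
        hop_length=hop_length,
    )

    # Instantaneous energy as frame-wise RMS.
    energy = librosa.feature.rms(
        y=y, frame_length=frame_length, hop_length=hop_length
    )[0]

    times = librosa.frames_to_time(
        np.arange(len(energy)), sr=sr, hop_length=hop_length
    )
    return times, f0, energy

if __name__ == "__main__":
    t, f0, e = frame_descriptors("bassoon_ornament.wav")  # hypothetical file name
    print(len(t), "frames analysed")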

Figure 2: Steps of the analysis: fundamental frequency and energy estimation, detection of onsets from the fundamental frequency and energy, final onset selection, calculation of the final fundamental frequency for the final onsets, post-processing (correction of the fundamental frequency), and storage of the information in a file.

We have thus obtained a total of 96 ornaments (trills and appoggiaturas), a collection that allows us to study different expressive variations of the same ornaments. The analysis is carried out by the algorithm shown in Figure 2. Some of the steps have already been presented in (Gómez 2002; Gómez et al. 2003). We have adapted the algorithm parameters to the specific characteristics of the bassoon, taking into account its pitch range, the very short note durations (between 0.04 and 0.05 seconds, since trills are executed very quickly) and the small intervals between notes (1 or 2 semitones).

We first estimate the instantaneous (frame-based) fundamental frequency and energy from the audio recordings, analysing only the ornament segments of these performances. After this computation we perform a segmentation in order to obtain onset, offset and fundamental frequency information for each ornamental note. The onset detection algorithm is based on (Klapuri 1999). An example is shown in Figure 3.

Figure 3: Onsets and offsets detected from instantaneous energy and fundamental frequency. The red lines indicate the onsets and the blue lines the offsets.

After detecting all candidate onsets, we select the most suitable ones through a set of rules. We first verify that notes are consecutive, i.e. that there is no overlap between them. When there is an overlap, we move the offset so that it coincides with the next onset, as in Figure 4.

Figure 4: Correction of detected onsets. The top panel shows the estimated onsets. In the middle panel, overlapping notes have been merged, and in the bottom panel, notes that are too short have also been removed.

Figure 5: Example of the analysis results.
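As a rough illustration of the onset-selection rules just described, the sketch below merges overlapping notes and discards notes that are too short. The data structure and the 40 ms minimum duration are assumptions for illustration, not the authors' implementation.

# Sketch of the onset post-processing rules: notes are (onset, offset) pairs in seconds.
# The 40 ms minimum duration is an illustrative threshold, not the paper's exact value.

def clean_notes(notes, min_duration=0.04):
    notes = sorted(notes)          # order by onset time
    cleaned = []
    for i, (onset, offset) in enumerate(notes):
        # Rule 1: notes must be consecutive; if a note overlaps the next one,
        # move its offset so that it equals the next onset.
        if i + 1 < len(notes) and offset > notes[i + 1][0]:
            offset = notes[i + 1][0]
        # Rule 2: discard notes that end up too short to be a real trill note.
        if offset - onset >= min_duration:
            cleaned.append((onset, offset))
    return cleaned

print(clean_notes([(0.00, 0.13), (0.10, 0.22), (0.22, 0.24), (0.24, 0.36)]))
# -> [(0.0, 0.1), (0.1, 0.22), (0.24, 0.36)]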

Having obtained the final onset values, we recompute the fundamental frequency for each of the ornamental notes. We then correct the fundamental frequency values to enforce the alternation of the trill notes and to ensure that the distance between two consecutive notes is only 1 or 2 semitones. This yields the final note descriptors: onsets, offsets and fundamental frequencies. Figure 5 shows an example of the final result with all descriptors. We store the descriptors in a text file, as shown in Figure 6. Along with these descriptors we also save the context of each ornament: the previous and following notes with their respective durations, the beat, the tempo and the movement.

Figure 6: Example of melodic descriptors. Onset and offset are given in seconds and the fundamental frequency in Hz. These descriptors are used in the synthesis part.

Onset      Offset     Frequency
0.0000000  0.2449990  392.00
0.2449990  0.2958233  419.784
0.2958233  0.3773240  392.841
0.3773240  0.5108390  419.784
0.5108390  0.7183670  398.419
0.7183670  0.9171880  354.985
0.9171880  1.1494100  392.135

Synthesis

The synthesis block deals with the generation of expressive ornamentations using the results of the analysis part. Its steps are: load the MIDI melody; load the ornament contexts from an XML file; load the characteristics of the analysed ornaments from the text file; for each trill defined in the XML file, search for the most similar ornament in the text file; generate a new ornament from the characteristics of the selected ornament and the main note; adapt every note to the tonality of the ornament and correct its final offset; substitute the main note with the ornament; and generate a MIDI file of the melody containing the generated ornaments.

Given the score of a melody with indicated ornaments, we define the context of each note that carries an appoggiatura or a trill, using an XML format. Information about the current note includes its duration, pitch and metrical position, while information about its context includes the durations of the previous and following notes, the size and direction of the intervals between the note and both the previous and the following note, and the tempo of the performance. Once the context is defined, we apply a nearest-neighbour algorithm to generate the expressive ornament. The algorithm selects the most similar trill (in terms of musical context) among the training examples and adapts it to the new musical context (e.g. the key of the piece). After finding the most similar ornament, its descriptors are adapted to the characteristics of the input note (pitch and duration) and the new ornamented note is generated. Once we have the descriptors of the corresponding ornament, we take the main note's descriptors (start time, end time and fundamental frequency) and adapt them to the behaviour of the ornament already analysed. We take into account whether the ornament is ascending or descending, the duration of each note of the trill and the duration of the main note. We scale the duration and fundamental frequency information, taking into account the tonality of the new melody, and transform it into a MIDI representation. Finally, the new ornament is inserted into the symbolic representation of the new melody.

RESULTS

Statistical analysis

The melody estimation has been successfully adapted to the particular analysis of bassoon ornaments. The statistical analysis of the duration of the ornamental notes reveals a behaviour similar to that found in previous studies on the piano (Brown 2003).
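The duration statistics reported next can be derived directly from the stored descriptor files. The following sketch is a minimal illustration, assuming one text file per ornament in the three-column format of Figure 6 (onset, offset, frequency); the file naming and any layout beyond that are assumptions.

# Sketch: read an ornament descriptor file (onset, offset, frequency per line,
# as in Figure 6) and derive simple duration statistics.

def read_descriptors(path):
    notes = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 3:
                continue                    # skip blank or malformed lines
            try:
                onset, offset, freq = map(float, fields)
            except ValueError:
                continue                    # skip a header line such as "Onset Offset Frequency"
            notes.append((onset, offset, freq))
    return notes

def ornament_stats(notes):
    durations = [off - on for on, off, _ in notes]
    total = notes[-1][1] - notes[0][0]      # span of the whole ornament in seconds
    return {
        "n_notes": len(notes),
        "mean_duration": sum(durations) / len(durations),
        "notes_per_second": len(notes) / total,
    }

# Example with a hypothetical descriptor file name:
# stats = ornament_stats(read_descriptors("trill_adagio_50bpm_01.txt"))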
The speed of execution is around 8 notes per second for most of the trills. Figure 8 shows the distribution of the number of notes per second, classified by the three movements of the analysed piece: Allegro, Affettuoso and Adagio. We can observe that the majority of trills fall in the group of 8 notes per second, as mentioned above.

Figure 8: Distribution of the number of notes per second for the three movements: Allegro, Affettuoso and Adagio.

Another interesting result is that we can clearly distinguish two groups of trills. In the first group, that of slow tempi (notes with long durations), there is a difference between the two extreme notes (the initial and final notes) and the middle notes: the first and last notes are usually longer than the central ones, as shown in Figure 9. In the second group, that of fast tempi (short notes), trills are usually converted into appoggiaturas, as shown in Figure 10.

Figure 9: Duration of ornamental notes for the ten longest trills. The first and last notes have a longer duration than the rest.

Figure 10: Duration of ornamental notes for the ten shortest trills. The behaviour is the same as for an appoggiatura: the first note is shorter than the second, which acts as the main note.

Finally, we can sometimes identify a regularity in the durations of the central notes. In this situation we speak of controlled trills, as opposed to non-controlled trills. Figure 11 shows an example of a controlled trill.

Figure 11: Evolution of note duration for a controlled trill. The central notes are played with regular durations.
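The distinction between long-note trills and the notion of a controlled trill can both be expressed as simple checks over a trill's note-duration sequence. The sketch below is an illustrative heuristic only; the relative-deviation threshold is an assumption, not a value given in the paper.

# Illustrative heuristics over a list of per-note durations (seconds) of one trill.
from statistics import mean, pstdev

def has_long_outer_notes(durations):
    """Slow-tempo pattern: first and last notes longer than the central ones."""
    if len(durations) < 3:
        return False
    central = durations[1:-1]
    return durations[0] > mean(central) and durations[-1] > mean(central)

def is_controlled_trill(durations, max_relative_deviation=0.15):
    """'Controlled' trill: the central notes are played with regular durations."""
    central = durations[1:-1]
    if len(central) < 2:
        return False
    return pstdev(central) / mean(central) <= max_relative_deviation

durs = [0.12, 0.06, 0.061, 0.059, 0.06, 0.13]   # made-up example
print(has_long_outer_notes(durs), is_controlled_trill(durs))   # True True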

Generation of ornaments

Figures 13 and 14 show an example of ornaments generated with this method.

Figure 13: Original score without trills. This is a fragment of a bassoon melody from the Affettuoso movement at a tempo of 92 beats per minute.

Figure 14: Final score with the generated ornaments, indicated with a red line.

CONCLUSIONS

This study presents an approach for the automatic analysis and generation of expressive bassoon ornaments using automatic melodic description and machine learning techniques. There seem to be regularities in the trills if we distinguish two groups, for long and for short trills. Our results agree with previous studies for the piano, although it seems to be easier to perform trills on the bassoon, since it is softer to play than the piano. Ultimately, we can reproduce this behaviour in a MIDI synthesizer. Further work will centre on enlarging the analysed collection in order to obtain a more robust model and on extending it to other musical instruments.

REFERENCES

Bresin, R. & Friberg, A. (2000). Emotional Coloring of Computer-Controlled Music Performances. Computer Music Journal, 24(4), pp. 44-63.

Brown, J. C. (2003). Independent component analysis for automatic note extraction from musical trills. Journal of the Acoustical Society of America, 115, pp. 2295-2306.

Gómez, E. (2002). Melodic description of audio signals for music content processing. PhD pre-doctoral thesis, Universitat Pompeu Fabra.

Gómez, E., Grachten, M., Amatriain, X. & Arcos, J. (2003). Melodic characterization of monophonic recordings for expressive tempo transformations. Proceedings of the Stockholm Music Acoustics Conference 2003, Stockholm, Sweden.

Juslin, P. N. & Sloboda, J. A. (2002). Music and Emotion. Oxford University Press.

Klapuri, A. (1999). Sound Onset Detection by Applying Psychoacoustic Knowledge. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Moore, G. P. (1992). Piano trills. Music Perception, 9(3), pp. 351-359.

Sundberg, J., Friberg, A. & Bresin, R. (2003). Attempts to reproduce a pianist's expressive timing with Director Musices performance rules. Journal of New Music Research, 32(3), pp. 317-326.