
A Computational Model for Discriminating Music Performers

Efstathios Stamatatos
Austrian Research Institute for Artificial Intelligence
Schottengasse 3, A-1010 Vienna
stathis@ai.univie.ac.at

Abstract

In this study, a computational model that aims at the automatic discrimination of different human music performers playing the same piece is presented. The proposed model is based on the note level and does not require any deep (e.g., structural or harmonic) analysis. A set of measures that attempts to capture both the style of the performer and the style of the piece is introduced. The presented approach has been applied to a database of piano sonatas by W.A. Mozart performed by both a French and a Viennese pianist, with very encouraging preliminary results.

1 Introduction

Studying music performance is one of the most active research areas in computational musicology. Various empirical approaches attempt to model the performance of musical pieces by human experts, based mainly on elementary structure analysis of the music [1], [2]. Little attention has been paid so far to the development of computational tools able to discriminate between music performers without any external assistance. To the best of our knowledge, there is no published study dealing with this subject. However, the music performer identification problem offers a good testing ground for the development of computational musicology theories, since it is a well-defined task where the results of a given approach can be evaluated objectively. Moreover, different approaches can be compared by applying them to the same data, and reliable conclusions regarding the accuracy of each approach can be extracted. On the other hand, the conclusions drawn by a performer identification study can be taken into account in the design of other, more practical and useful tools that address traditional problems.

In this study we try to answer the following questions: Are the differences and similarities between different music performers computationally traceable? What level of analysis is required for extracting reliable classification results? What are the measures that best distinguish between different music performers? Can the existing theories of music performance be useful in the development of a performer identification system?

In this paper, a set of parameters that try to capture the stylistic properties of a given performance of a musical piece is introduced. The main idea is that information about both the performance and the musical piece itself should be taken into account. Thus, in addition to parameters dealing with the deviation of the human performer from the score in terms of timing, articulation, and dynamics, the proposed set contains piece-dependent parameters that attempt to represent the stylistic properties of the musical piece. The existing KTH set of generative rules for music performance [3], [4] is used for providing the piece-dependent information, which, in essence, consists of the deviations of a machine-generated performance from the score. The proposed approach is based on the note level and does not require any deep (e.g., structural, harmonic) analysis. Experiments on a database of piano sonatas by W.A. Mozart, performed by both a French and a Viennese pianist, show that the presented tool is able to distinguish accurately between them.

The rest of this paper is organized as follows: Section 2 describes the proposed model in detail, Section 3 includes the experimental results, and Section 4 gives the conclusions drawn by this study and proposes future work directions.
2 The Proposed Model

In order to quantify the performance of a musical piece, the relative distance between the performance and the score, in terms of timing, articulation, and dynamics, is used. Given two discrete vectors of values x = {x_1, ..., x_n} and y = {y_1, ..., y_n}, the relative distance D(x, y) between them, as used in this paper, is defined as follows:

    D(x, y) = (1/n) * sum_{i=1..n} (x_i - y_i) / x_i
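To make the definition concrete, here is a minimal Python sketch (not part of the original paper; the variable names are illustrative) of the relative distance between a vector of nominal score values and a vector of measured values:

    def relative_distance(nominal, measured):
        """Average relative deviation of measured values from nominal (score) values,
        following the definition of D(x, y) above."""
        assert len(nominal) == len(measured) and len(nominal) > 0
        return sum((x - y) / x for x, y in zip(nominal, measured)) / len(nominal)

    # Example: nominal vs. measured inter-onset intervals (in seconds) for a short excerpt
    ioi_nominal = [0.50, 0.50, 1.00, 0.50]
    ioi_measured = [0.48, 0.53, 1.10, 0.47]
    print(relative_distance(ioi_nominal, ioi_measured))  # timing parameter for this excerpt

Note that the deviations are signed, so positive and negative deviations can partly cancel out; taking absolute values instead would give the "absolute relative distance" mentioned in the conclusions as an alternative.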

The three performance-dependent parameters used in this study, which correspond to deviations in terms of timing, articulation, and dynamics, respectively, are the following:

    D(IOI_nominal, IOI_measured)   timing
    D(OTD_nominal, OTD_measured)   articulation
    D(SL_nominal,  SL_measured)    dynamics

where IOI_nominal is the nominal inter-onset interval, extracted from the score, and SL_nominal is the default sound level, while IOI_measured, OTD_measured, and SL_measured are the inter-onset interval, the off-time duration, and the sound level, respectively, as measured in the actual performance. It has to be noted that only the soprano voice is taken into account. Note also that the off-time duration of a note n_i is defined as the difference between the offset of n_i and the onset of n_{i+1}. Recent studies show that the relative amount of staccato for one tone is largely independent of the performance [5], [6]. However, the distance of OTD_measured from OTD_nominal is quite effective for discriminating between performers (see Section 3).

The values of the above parameters usually depend on the characteristics of the musical piece. For providing the classifier with appropriate information about the stylistic properties of the piece, a set of similar measures obtained from a machine-generated performance is introduced. To this end, we use a subset of the well-known KTH set of generative rules for music performance [3], [4], [7]. In more detail, only the rules that can be applied on the note level and do not require any special analysis (e.g., phrase boundary detection, harmonic analysis) are used. The rules employed in this study are given in Table 1.

Table 1. The KTH rules employed in this study (k = 1 for all rules).

    KTH rule                         Affected variables
    Durational Contrast              IOI, SL
    Double Duration                  IOI
    High Loud                        SL
    Leap Articulation                OTD
    Leap Tone Duration               IOI
    Faster Uphill                    IOI
    Repetition Articulation          OTD
    Duration Contrast Articulation   OTD
    Punctuation                      IOI, OTD

The machine-generated performance is compared with the score and the following piece-dependent parameters are obtained:

    D(IOI_nominal, IOI_rule)   timing
    D(OTD_nominal, OTD_rule)   articulation
    D(SL_nominal,  SL_rule)    dynamics

where IOI_rule, OTD_rule, and SL_rule are the inter-onset interval, the off-time duration, and the sound level, respectively, as measured in the rule-generated performance. Thus, for each performance of a musical piece a vector of six parameters is extracted. This vector can then be processed by a standard classification method to obtain the most likely performer. The proposed methodology is illustrated in Figure 1.

[Figure 1. The proposed methodology: a human expert's performance and a machine-generated performance produced by the KTH rule set are each compared with the score, yielding performance-dependent and piece-dependent parameters that are passed to the classification stage.]
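As an illustration of how the six-parameter vector could be assembled, the following sketch (hypothetical data structures, not code from the paper) assumes that per-note IOI, OTD, and SL values of the soprano voice are already available for the score, the human performance, and the rule-generated performance, and reuses relative_distance from the sketch above:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class NoteStream:
        """Per-note values of the soprano voice (hypothetical container)."""
        ioi: List[float]  # inter-onset intervals
        otd: List[float]  # off-time durations
        sl: List[float]   # sound levels

    def performance_vector(score: NoteStream, human: NoteStream, rule: NoteStream) -> List[float]:
        """Three performance-dependent plus three piece-dependent parameters.
        Zero-valued nominal entries (e.g., fully legato OTDs) would need special handling."""
        return [
            relative_distance(score.ioi, human.ioi),  # timing, performance-dependent
            relative_distance(score.otd, human.otd),  # articulation, performance-dependent
            relative_distance(score.sl, human.sl),    # dynamics, performance-dependent
            relative_distance(score.ioi, rule.ioi),   # timing, piece-dependent
            relative_distance(score.otd, rule.otd),   # articulation, piece-dependent
            relative_distance(score.sl, rule.sl),     # dynamics, piece-dependent
        ]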

3 Experiments

The ideal testing ground for the presented approach would be a database of many musical pieces performed several times by many human experts with different musical styles. The available database that best matches these requirements is a collection of piano sonatas by W.A. Mozart performed by Philippe Entremont and Roland Batik in machine-readable form. Specifically, the database we used includes parts of the sonatas KV 279, 280, 281, 282, 283, 284, and 333, played by both pianists. Each sonata movement has been divided manually into sections and repetitions, providing in total 34 samples for Entremont and 43 samples for Batik (there are more samples for Batik because more repetitions of some sections were available for him). Moreover, each sample has been matched against the score [2].

The proposed methodology has been applied to this data set, providing a six-parameter vector for each sample. Then, discriminant analysis, a standard technique of multivariate statistics [8], has been used to classify the produced vectors. The data were then cross-validated, that is, each sample was considered as an unseen case and classified based on the remaining samples (i.e., leave-one-out methodology). The results of the classification procedure are given in the confusion matrix of Table 2. The corresponding classification results when only the performance-dependent parameters are taken into account are given as well.

Table 2. Confusion matrix for the cross-validated data: comparative results using performance-dependent parameters only and the entire set of parameters. Correct guesses are those where the guess matches the actual performer.

    Parameters included                           Actual       Guess: Entremont   Guess: Batik   Total samples
    Performance-dependent parameters only         Entremont    33                 1              34
                                                  Batik        5                  38             43
    Performance- and piece-dependent parameters   Entremont    32                 2              34
                                                  Batik        1                  42             43

The total classification accuracy for both the original and the cross-validated data is given in Figure 2. Note that the original data columns refer to the application of the classification model to the training data (i.e., no unseen cases). It is clear that the performance-dependent parameters alone can give quite reliable results. However, there is a significant improvement when the piece-dependent parameters are included in the parameter vector.

[Figure 2. Accuracy of the proposed model, comparing performance-dependent parameters only with the entire set of parameters: 96.1% vs. 97.4% on the original (training) data and 92.2% vs. 96.1% on the cross-validated data.]

In order to explore the contribution of each parameter to the classification model, we applied linear regression analysis and obtained the t value for each parameter. The absolute t value is an indication of the importance of the parameter: the higher the absolute t value, the more important the contribution of the parameter to the classification model. The results are given in Table 3 and confirm the results of Table 2, since the performance-dependent parameters proved to be the most significant ones. In more detail, the articulation and dynamics parameters seem to be the ones that contribute the most to the classification model. Among the piece-dependent parameters, the dynamics parameter seems to be the most significant.

Table 3. Absolute t values for both performance-dependent and piece-dependent parameters.

    Parameter                       t value
    D(IOI_nominal, IOI_measured)    2.883
    D(OTD_nominal, OTD_measured)    8.823
    D(SL_nominal,  SL_measured)     7.951
    D(IOI_nominal, IOI_rule)        1.321
    D(OTD_nominal, OTD_rule)        1.731
    D(SL_nominal,  SL_rule)         2.245
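The classification and leave-one-out validation step described above could be reproduced along the following lines. This is a sketch using scikit-learn's linear discriminant analysis (the original study used standard discriminant analysis software, not necessarily this library), and it assumes that feature_vectors and performer_labels have been collected beforehand, e.g. with performance_vector from Section 2:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import LeaveOneOut, cross_val_predict
    from sklearn.metrics import confusion_matrix, accuracy_score

    # Assumed inputs: one six-parameter vector per sample and the matching performer names,
    # e.g. 77 samples (34 Entremont, 43 Batik) in the experiments reported here.
    X = np.array(feature_vectors)
    y = np.array(performer_labels)

    clf = LinearDiscriminantAnalysis()
    pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())  # leave-one-out cross-validation
    print(confusion_matrix(y, pred, labels=["Entremont", "Batik"]))
    print("cross-validated accuracy:", accuracy_score(y, pred))

    clf.fit(X, y)
    print("accuracy on the training data:", clf.score(X, y))  # the "original data" in Figure 2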
Moreover, to give the reader an indication of the differences between the two pianists in terms of the parameters used, Table 4 shows an interpretation of the standardized coefficients of the regression function.

Table 4. An interpretation of the standardized regression coefficients, illustrating the differences between the two pianists.

    Parameter      Entremont   Batik
    Timing         +           -
    Articulation   +           -
    Dynamics       -           +

Thus, Entremont's performances are usually characterized by a higher average deviation of timing and articulation, and a lower average deviation of dynamics, than Batik's performances. In other words, the greater the average deviation of timing and articulation and the lower the average deviation of dynamics, the more likely it is that Entremont is the performer.

In the last experiment, the contribution of each KTH rule to the classification model is examined. In this case, only one rule at a time is taken into account for producing the machine-generated performance. The measured parameters correspond to the affected variables of the rule under examination. For instance, the durational contrast rule affects both IOI and SL (see Table 1), so two parameters are obtained. This procedure is followed for each rule, providing in total eleven new piece-dependent parameters that replace the three old ones. Linear regression has been applied to the model consisting of the performance-dependent parameters and the new decomposed piece-dependent parameters. The absolute t values for each parameter are given in Table 5. As can be seen, the repetition articulation rule, the punctuation rule, and the durational contrast rule provide the most important piece-dependent parameters. On the other hand, the leap articulation rule, the leap tone duration rule, and the durational contrast articulation rule seem to contribute the least to the classification model.

Table 5. Absolute t values for both performance-dependent and decomposed piece-dependent parameters.

    Parameter                        t value
    D(IOI_nominal, IOI_measured)     2.721
    D(OTD_nominal, OTD_measured)     7.461
    D(SL_nominal,  SL_measured)      4.407
    D(IOI_nominal, IOI_rule_dc)      0.847
    D(SL_nominal,  SL_rule_dc)       0.962
    D(IOI_nominal, IOI_rule_dd)      0.496
    D(SL_nominal,  SL_rule_hl)       0.597
    D(OTD_nominal, OTD_rule_la)      0.037
    D(IOI_nominal, IOI_rule_ltd)     0.043
    D(IOI_nominal, IOI_rule_fu)      0.476
    D(OTD_nominal, OTD_rule_ra)      1.802
    D(OTD_nominal, OTD_rule_dca)     0.013
    D(IOI_nominal, IOI_rule_punc)    1.539
    D(OTD_nominal, OTD_rule_punc)    1.092
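The rule-by-rule decomposition could be sketched as follows, building on the NoteStream and relative_distance helpers above. Here apply_rule is a hypothetical stand-in for a renderer (e.g., a Director Musices-style implementation, not shown) that produces a machine-generated performance using a single rule with k = 1:

    # Affected variables per note-level KTH rule, following Table 1
    RULE_VARIABLES = {
        "durational_contrast": ("ioi", "sl"),
        "double_duration": ("ioi",),
        "high_loud": ("sl",),
        "leap_articulation": ("otd",),
        "leap_tone_duration": ("ioi",),
        "faster_uphill": ("ioi",),
        "repetition_articulation": ("otd",),
        "duration_contrast_articulation": ("otd",),
        "punctuation": ("ioi", "otd"),
    }

    def decomposed_piece_parameters(score, apply_rule):
        """Eleven piece-dependent parameters, one rule-generated performance per rule."""
        params = []
        for rule, variables in RULE_VARIABLES.items():
            rendered = apply_rule(score, rule)  # hypothetical single-rule renderer
            for var in variables:
                params.append(relative_distance(getattr(score, var), getattr(rendered, var)))
        return params  # 2+1+1+1+1+1+1+1+2 = 11 values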
4 Conclusions

In this paper we presented a computational model for automatically discriminating music performers. The proposed vector, which attempts to capture the stylistic properties of a performance, consists of both performance-dependent and piece-dependent parameters. These parameters represent average deviations in terms of timing, articulation, and dynamics for the real performance and for a machine-generated one. Alternative average parameters, e.g., the absolute relative distance, may also contribute significant information, and they will be considered in future experiments. The preliminary results that have been presented are very encouraging, since the proposed model succeeded in discriminating between two human experts playing the same piano sonatas. However, the proposed approach has to be tested on various heterogeneous data sets comprising more candidate performers in order to extract more reliable results. The requirements of the presented method are quite limited, since it can be applied on the note level and does not involve any computationally hard analysis. On the other hand, the high importance of the punctuation rule, as suggested by Table 5, is a strong indication that at least structural analysis could improve the classification results considerably. Note that this rule automatically locates small tone groups and marks them with a lengthening of the last note and a following micropause.

Another aspect that has to be examined is the possibility of segmenting a sample into parts of equal length, in notes, and applying the presented methodology to each part rather than to the whole sample. In that case, it would be possible to test the proposed model on data sets where only limited training samples are available for each performer.

Acknowledgments

This work was supported by the EC project HPRN-CT-2000-00115 (MOSART) and the START program of the Austrian Federal Ministry for Education, Science, and Culture (Grant no. Y99-INF).

References

[1] Repp, B. 1992. Diversity and Commonality in Music Performance: An Analysis of Timing Microstructure in Schumann's Träumerei. Journal of the Acoustical Society of America, 92(5), pp. 2546-2568.

[2] Widmer, G. 2001. Using AI and Machine Learning to Study Expressive Music Performance: Project Survey and First Report. AI Communications, 14.

[3] Friberg, A. 1991. Generative Rules for Music Performance: A Formal Description of a Rule System. Computer Music Journal, 15(2), pp. 56-71.

[4] Friberg, A. 1995. A Quantitative Rule System for Musical Performance. Doctoral dissertation, Royal Institute of Technology, Sweden.

[5] Bresin, R., and Battel, G.U. 2000. Articulation strategies in expressive piano performance. Analysis of legato, staccato, and repeated notes in performances of the Andante movement of Mozart's sonata in G major (K 545). Journal of New Music Research, 29(3), pp. 211-224.

[6] Bresin, R., and Widmer, G. 2000. Production of staccato articulation in Mozart sonatas played on a grand piano. Preliminary results. Speech, Music and Hearing Quarterly Progress and Status Report, Stockholm: KTH, 4, pp. 1-6.

[7] Friberg, A., Bresin, R., Frydén, L., and Sundberg, J. 1998. Musical Punctuation on the Microlevel: Automatic Identification and Performance of Small Melodic Units. Journal of New Music Research, 27(3), pp. 271-292.

[8] Eisenbeis, R., and Avery, R. 1972. Discriminant Analysis and Classification Procedures: Theory and Applications. Lexington, Mass.: D.C. Heath and Co.