Classification of Timbre Similarity


Corey Kereliuk
McGill University
March 15, 2007

Outline

1. Definition of Timbre
   - What Timbre is Not
   - What Timbre is
   - A 2-dimensional Timbre Space
2. [...]
3. [...]
   - Considerations
   - Common Approaches
4. [...]
   - Long-term Statistics
   - Modeling the Global Spectrum
5. to 8. [...]

What Timbre is Not

- A unified definition of timbre seems elusive.
- "Timbre tends to be the psychoacoustician's multidimensional waste-basket category for everything that cannot be labeled as pitch or loudness" (McAdams79)
- OED definition: "The character or quality of a musical or vocal sound (distinct from its pitch and intensity) depending upon the particular voice or instrument producing it, and distinguishing it from sounds proceeding from other sources"
- "Timbre refers to the color or quality of sounds, and is typically divorced conceptually from pitch and loudness" (Wessel79)
- All of these definitions describe timbre by saying what it is not.

What Timbre is

- "Perceptual research on timbre has demonstrated that the spectral energy distribution and temporal variation in this distribution provide the acoustical determinants of our perception of sound quality" (Wessel79)
- Wessel collected perceptual dissimilarities through a series of listening tests:
  - Listeners were played two sounds and asked to rate, on a scale of 0 to 9, how similar the two sounds were.
  - This produced n(n - 1)/2 observations (here n = 24 orchestral instruments, so 276 pairs), which were organized into a 24x24 dissimilarity matrix.
  - A multi-dimensional scaling algorithm was used to create a 2-dimensional timbre space, in which the dissimilarity between instrument timbres was proportional to their Euclidean distance.
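The scaling step can be sketched in a few lines. The slides do not say which multi-dimensional scaling variant Wessel used, so the code below assumes classical (Torgerson) MDS as a stand-in. The listening-test ratings are not available here, so the 24x24 dissimilarity matrix is synthetic. The names `classical_mds`, `true_points` and `dissim` are illustrative only.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical (Torgerson) MDS, used here as an assumed stand-in for the
    scaling algorithm mentioned on the slide.
    """
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # Keep the eigenvectors belonging to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Synthetic stand-in for the listening-test data (NOT Wessel's ratings):
# 24 hypothetical instruments placed at random in a plane.
rng = np.random.default_rng(0)
n_instruments = 24
true_points = rng.normal(size=(n_instruments, 2))
diff = true_points[:, None, :] - true_points[None, :, :]
dissim = np.sqrt((diff ** 2).sum(axis=-1))

n_pairs = n_instruments * (n_instruments - 1) // 2  # n(n - 1)/2 pairs
coords = classical_mds(dissim, n_dims=2)

# Pairwise Euclidean distances in the recovered 2-D space.
emb_diff = coords[:, None, :] - coords[None, :, :]
emb_dist = np.sqrt((emb_diff ** 2).sum(axis=-1))
```

Because the synthetic dissimilarities are exact planar distances, the embedding reproduces them up to rotation and reflection. Real perceptual ratings would only be approximated.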

A 2-dimensional Timbre Space

[Figure: the 2-dimensional timbre space]

- Psychoacoustic studies
- Musicological analyses
- Source separation
- Instrument identification
- Content-based management systems for the navigation of large catalogues
- Composition
- Identifying bird calls from the same species
- Speaker identification
- etc.

Considerations

- Whether to focus on monophonic or polyphonic timbres?
- Whether to use local or global features?
- Which local/global features to use (infinite possibilities)
- Perceptual relevance of results

Common Approaches

- Monophonic timbre similarity is relatively well understood.
- There is still much to be discovered about polyphonic timbre similarity.
- Commonly used tools:
  - Mel-Frequency Cepstrum Coefficients (MFCCs)
  - Spectral Centroid
  - Log-attack-time
  - Principal Component Analysis (PCA)
  - Spectral Flatness (degree of noisiness)
  - k-NN
  - GMMs, HMMs, GAs, NNs
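Two of the listed descriptors are simple enough to sketch directly. The code below uses the usual textbook definitions, which the slides do not spell out: the spectral centroid as the magnitude-weighted mean frequency, and spectral flatness as the ratio of the geometric to the arithmetic mean of the power spectrum. The function names, sample rate and test signals are assumptions made for illustration.

```python
import numpy as np

def spectral_centroid(frame, sr):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float((freqs * magnitude).sum() / (magnitude.sum() + 1e-12))

def spectral_flatness(frame):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1)."""
    power = np.abs(np.fft.rfft(frame)) ** 2 + 1e-12  # small floor avoids log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    return float(geometric_mean / np.mean(power))

# Illustrative signals: a 1 kHz sine and white noise, both Hann-windowed.
sr = 16000
n = 2048
window = np.hanning(n)
tone = np.sin(2 * np.pi * 1000.0 * np.arange(n) / sr) * window
noise = np.random.default_rng(0).normal(size=n) * window

centroid_tone = spectral_centroid(tone, sr)
flatness_tone = spectral_flatness(tone)
flatness_noise = spectral_flatness(noise)
```

A pure tone concentrates its energy in a few bins, so its flatness is close to zero, while noise spreads energy across the spectrum and scores much higher. This is the sense in which flatness measures noisiness.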

Long-term Statistics

- In order to get a sense of the global spectral envelope of a signal:
  - Compute the MFCC on N sequential frames
  - Average the N frames together
- One might expect the result to be flat or noisy; however, it turns out that a global shape emerges, which tends to be quite specific to a given texture.
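The two steps above can be sketched with NumPy alone. The slides give no MFCC parameters, so everything here is an assumption: frame length 1024, hop 512, 26 triangular mel filters, 13 coefficients, an unnormalized DCT-II, and a synthetic test signal in place of real audio. A dedicated audio library would normally replace the hand-written `mfcc_frames`.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc_frames(signal, sr, frame_len=1024, hop=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (assumed parameters)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    index = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[index] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, frame_len, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel bands gives the cepstral coefficients.
    dct = np.cos(np.pi / n_filters
                 * np.outer(np.arange(n_coeffs), np.arange(n_filters) + 0.5))
    return log_mel @ dct.T

# Synthetic one-second signal standing in for a recording.
sr = 22050
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr) + 0.1 * rng.normal(size=sr)

mfccs = mfcc_frames(signal, sr)      # step 1: MFCC on N sequential frames
long_term = mfccs.mean(axis=0)       # step 2: average the N frames together
```

Here `long_term` is a single 13-dimensional vector summarizing the whole signal, which is the global shape the slide refers to.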

Figure: Global Spectral Shape (Aucouturier 2005)

Modeling the Global Spectrum

- Aucouturier (2005) proposes modeling the MFCCs as a mixture of Gaussians:

    p(F_t) = \sum_{m=1}^{M} \pi_m \mathcal{N}(F_t, \mu_m, \Gamma_m)    (1)

- Here the feature vector F_t at time t (MFCCs in this case) is modeled as a weighted sum of M Gaussians with mixture weights \pi_m, means \mu_m and covariance matrices \Gamma_m.
- The GMM is initialized by k-means clustering and trained using the classic EM algorithm.
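To make equation (1) concrete, the following sketch evaluates the mixture density for given parameters. It does not perform the k-means initialization or EM training. The weights, means and covariances in the example are made-up values, and the function names are illustrative. The computation is done in the log domain for numerical stability.

```python
import numpy as np

def gaussian_log_pdf(x, mean, cov):
    """Log of the multivariate normal density for each row of x."""
    d = mean.shape[0]
    chol = np.linalg.cholesky(cov)
    # Solve chol @ z = (x - mean)^T; the squared norm of z is the
    # Mahalanobis distance of each row.
    z = np.linalg.solve(chol, (x - mean).T)
    mahalanobis = (z ** 2).sum(axis=0)
    log_det = 2.0 * np.log(np.diag(chol)).sum()
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + mahalanobis)

def gmm_log_density(x, weights, means, covs):
    """log p(F_t) = log sum_m pi_m N(F_t, mu_m, Gamma_m), one value per row."""
    log_terms = np.stack([
        np.log(w) + gaussian_log_pdf(x, mu, cov)
        for w, mu, cov in zip(weights, means, covs)
    ])
    peak = log_terms.max(axis=0)  # log-sum-exp trick
    return peak + np.log(np.exp(log_terms - peak).sum(axis=0))

# Made-up two-component model in two dimensions (M = 2).
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covs = np.array([np.eye(2), np.eye(2)])

features = np.array([[0.0, 0.0], [3.0, 3.0], [1.5, 1.5]])
log_p = gmm_log_density(features, weights, means, covs)
```

In practice the parameters would come from EM training on a song's MFCC frames, and the feature vectors would have as many dimensions as there are coefficients.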

Figure: GMM Clustering (Aucouturier 2005)

- In order to compare the timbral similarity of two songs:
  - A GMM is computed for each song.
  - A large number of points are sampled from each model and evaluated to compute the likelihood that they could have come from the song under comparison.
- This is illustrated by the following equation:

    D(A, B) = \sum_{i=1}^{N} \log p(S_i^A | A) + \sum_{i=1}^{N} \log p(S_i^B | B) - \sum_{i=1}^{N} \log p(S_i^A | B) - \sum_{i=1}^{N} \log p(S_i^B | A)

- S_i^A is the i-th point sampled from the model of song A (likewise S_i^B for song B), and N is the number of sampling points used.
- D is a probabilistic distance measure assessing the similarity between song A and song B.
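The sampling-based distance can be sketched as follows. For brevity the sketch assumes diagonal-covariance GMMs whose parameters are already known, whereas the slides fit one model per song. Each model is a hypothetical `(weights, means, variances)` tuple, and the three example "songs" are invented parameter sets, not real data.

```python
import numpy as np

def gmm_log_density(x, weights, means, variances):
    """Per-row log density under a diagonal-covariance GMM (an assumption)."""
    diff = x[:, None, :] - means                                  # (n, M, d)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)  # (M,)
    log_exp = -0.5 * (diff ** 2 / variances).sum(axis=2)           # (n, M)
    log_terms = np.log(weights) + log_norm + log_exp
    peak = log_terms.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_terms - peak).sum(axis=1, keepdims=True)))[:, 0]

def gmm_sample(rng, n, weights, means, variances):
    """Draw n points: pick a component by weight, then add scaled noise."""
    components = rng.choice(len(weights), size=n, p=weights)
    noise = rng.normal(size=(n, means.shape[1]))
    return means[components] + noise * np.sqrt(variances[components])

def timbre_distance(model_a, model_b, n_samples=2000, seed=0):
    """Self-likelihoods minus cross-likelihoods of points sampled from each model."""
    rng = np.random.default_rng(seed)
    s_a = gmm_sample(rng, n_samples, *model_a)
    s_b = gmm_sample(rng, n_samples, *model_b)
    return float(
        gmm_log_density(s_a, *model_a).sum()
        + gmm_log_density(s_b, *model_b).sum()
        - gmm_log_density(s_a, *model_b).sum()
        - gmm_log_density(s_b, *model_a).sum()
    )

# Invented models: B is a slightly shifted copy of A, C is far from both.
w = np.array([0.5, 0.5])
v = np.ones((2, 2))
song_a = (w, np.array([[0.0, 0.0], [2.0, 2.0]]), v)
song_b = (w, np.array([[0.1, 0.1], [2.1, 2.1]]), v)
song_c = (w, np.array([[5.0, 5.0], [8.0, 8.0]]), v)

d_aa = timbre_distance(song_a, song_a)
d_ab = timbre_distance(song_a, song_b)
d_ac = timbre_distance(song_a, song_c)
```

With identical models the self and cross terms cancel, so the distance is zero up to rounding. It grows as the models move apart. Because the points are sampled, the value for two different models varies slightly with the seed and with N.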

Global Timbral Similarity

- Implemented in the CUIDADO music browser.
- A query for Ahmad Jamal - L'instant de Vérité, a jazz piano recording, returns similarity results which all contain romantic-styled piano, for example New Orleans Jazz (G. Mirabassi) and Classical Piano (Schumann, Chopin).
- Some of the most interesting results are unexpected (different genres and cultural backgrounds).

- Finding an evaluation metric for this type of system would be difficult.
- The MIR community has hotly debated the subject of evaluation.
- At this time, standard test databases need to be developed in order to compare different techniques.
- There is also the question of what exactly defines similarity.
- Comparing to hand-segmented/clustered results might not be adequate, since unexpected results (false-negatives) might be missed.

The End