Perceptual dimensions of short audio clips and corresponding timbre features

Similar documents
A new tool for measuring musical sophistication: The Goldsmiths Musical Sophistication Index

Measuring the Facets of Musicality: The Goldsmiths Musical Sophistication Index. Daniel Müllensiefen Goldsmiths, University of London

The Musicality of Non-Musicians: Measuring Musical Expertise in Britain

Classification of Timbre Similarity

ABSOLUTE OR RELATIVE? A NEW APPROACH TO BUILDING FEATURE VECTORS FOR EMOTION TRACKING IN MUSIC

GCT535- Sound Technology for Multimedia Timbre Analysis. Graduate School of Culture Technology KAIST Juhan Nam

FANTASTIC: A Feature Analysis Toolbox for corpus-based cognitive research on the perception of popular music

Predicting Time-Varying Musical Emotion Distributions from Multi-Track Audio

MUSI-6201 Computational Music Analysis

Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models

MOTIVATION AGENDA MUSIC, EMOTION, AND TIMBRE CHARACTERIZING THE EMOTION OF INDIVIDUAL PIANO AND OTHER MUSICAL INSTRUMENT SOUNDS

WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG?

WORKSHOP Approaches to Quantitative Data For Music Researchers

TYING SEMANTIC LABELS TO COMPUTATIONAL DESCRIPTORS OF SIMILAR TIMBRES

Release Year Prediction for Songs

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

LOUDNESS EFFECT OF THE DIFFERENT TONES ON THE TIMBRE SUBJECTIVE PERCEPTION EXPERIMENT OF ERHU

About Giovanni De Poli. What is Model. Introduction. di Poli: Methodologies for Expressive Modeling of/for Music Performance

Exploring Relationships between Audio Features and Emotion in Music

A Study on Cross-cultural and Cross-dataset Generalizability of Music Mood Regression Models

DERIVING A TIMBRE SPACE FOR THREE TYPES OF COMPLEX TONES VARYING IN SPECTRAL ROLL-OFF

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM

Subjective Similarity of Music: Data Collection for Individuality Analysis

Modeling memory for melodies

Expressive information

INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION

Audio Feature Extraction for Corpus Analysis

Singer Traits Identification using Deep Neural Network

GYROPHONE: RECOGNIZING SPEECH FROM GYROSCOPE SIGNALS. Yan Michalevsky (1), Gabi Nakibly (2) and Dan Boneh (1)

Topics in Computer Music Instrument Identification. Ioanna Karydi

Automatic Music Clustering using Audio Attributes

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)

Recognising Cello Performers Using Timbre Models

Music Recommendation from Song Sets

Proceedings of Meetings on Acoustics

A Categorical Approach for Recognizing Emotional Effects of Music

Recognition of leitmotives in Richard Wagner's music: An item response theory approach

Enhancing Music Maps

Recognising Cello Performers using Timbre Models

Creating a Feature Vector to Identify Similarity between MIDI Files

A COMPARISON OF PERCEPTUAL RATINGS AND COMPUTED AUDIO FEATURES

Music Complexity Descriptors. Matt Stabile June 6 th, 2008

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES

Journal of Research in Personality

Music Genre Classification

Musical Instrument Identification Using Principal Component Analysis and Multi-Layered Perceptrons

Dimensional Music Emotion Recognition: Combining Standard and Melodic Audio Features

WHAT'S HOT: LINEAR POPULARITY PREDICTION FROM TV AND SOCIAL USAGE DATA Jan Neumann, Xiaodong Yu, and Mohamad Ali Torkamani Comcast Labs

Supervised Learning in Genre Classification

MusCat: A Music Browser Featuring Abstract Pictures and Zooming User Interface

Features for Audio and Music Classification

Psychophysiological measures of emotional response to Romantic orchestral music and their musical and acoustic correlates

Automatic Laughter Detection

GLM Example: One-Way Analysis of Covariance

On Human Capability and Acoustic Cues for Discriminating Singing and Speaking Voices

Improving Frame Based Automatic Laughter Detection

Singer Recognition and Modeling Singer Error

EE391 Special Report (Spring 2005) Automatic Chord Recognition Using A Summary Autocorrelation Function

PREDICTING THE PERCEIVED SPACIOUSNESS OF STEREOPHONIC MUSIC RECORDINGS

Outline. Why do we classify? Audio Classification

Music Mood Classification - an SVM based approach. Sebastian Napiorkowski

TOWARDS AFFECTIVE ALGORITHMIC COMPOSITION

Musical Instrument Identification based on F0-dependent Multivariate Normal Distribution

AudioRadar. A metaphorical visualization for the navigation of large music collections

Speech and Speaker Recognition for the Command of an Industrial Robot

An empirical field study on sing- along behaviour in the North of England

Subjective Emotional Responses to Musical Structure, Expression and Timbre Features: A Synthetic Approach

A Survey of Audio-Based Music Classification and Annotation

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

CS229 Project Report Polyphonic Piano Transcription

The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng

The Role of Time in Music Emotion Recognition: Modeling Musical Emotions from Time-Varying Music Features

LEARNING TO CONTROL A REVERBERATOR USING SUBJECTIVE PERCEPTUAL DESCRIPTORS

Music Emotion Recognition. Jaesung Lee. Chung-Ang University

Chapter 27. Inferences for Regression. Remembering Regression. An Example: Body Fat and Waist Size. Remembering Regression (cont.)

AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION

GOOD-SOUNDS.ORG: A FRAMEWORK TO EXPLORE GOODNESS IN INSTRUMENTAL SOUNDS

Psychophysical quantification of individual differences in timbre perception

Unifying Low-level and High-level Music Similarity Measures

Research & Development. White Paper WHP 232. A Large Scale Experiment for Mood-based Classification of TV Programmes BRITISH BROADCASTING CORPORATION

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. X, NO. X, MONTH Unifying Low-level and High-level Music Similarity Measures

Modelling Perception of Structure and Affect in Music: Spectral Centroid and Wishart s Red Bird

Tempo and Beat Analysis

Mood Tracking of Radio Station Broadcasts

Analytic Comparison of Audio Feature Sets using Self-Organising Maps

A Large Scale Experiment for Mood-Based Classification of TV Programmes

HIT SONG SCIENCE IS NOT YET A SCIENCE

Composer Style Attribution

TOWARD UNDERSTANDING EXPRESSIVE PERCUSSION THROUGH CONTENT BASED ANALYSIS

THE POTENTIAL FOR AUTOMATIC ASSESSMENT OF TRUMPET TONE QUALITY

ECONOMICS 351* -- INTRODUCTORY ECONOMETRICS. Queen's University Department of Economics. ECONOMICS 351* -- Winter Term 2005 INTRODUCTORY ECONOMETRICS

Melody Retrieval On The Web

An interdisciplinary approach to audio effect classification

AN ARTISTIC TECHNIQUE FOR AUDIO-TO-VIDEO TRANSLATION ON A MUSIC PERCEPTION STUDY

Singer Identification

Sound Quality Analysis of Electric Parking Brake

Discriminant Analysis. DFs

Transcription:

Perceptual dimensions of short audio clips and corresponding timbre features Jason Musil, Budr El-Nusairi, Daniel Müllensiefen Department of Psychology, Goldsmiths, University of London

Question How do listeners make similarity judgements when comparing very short music clips? Assumption: For very short clips, sound is most important

Background Related real world behaviours: Scanning the radio dial Browsing large music collection Instant recognition of favourite songs Psychological studies on short audio clips: Genre identification (Gjerdingen & Perrot, 2008; Mace et al., 2012) Identification of artist and title (Krumhansl, 2010)

A New Test The Sound Similarity Test: Part of Goldsmiths Musical Sophistication test battery* Testing ability to extract and compare information from short and unfamiliar audio clips => Familiarity with breadth of musical styles No correlation with formal musical training No use of genre labels, no use of rating scales => nonverbal similarity classification task Clips chosen as representative (All Music Guide) pieces from 4 meta-styles (Rentfrow & Gosling, 2003) * Documentation and online implementation at: http://www.gold.ac.uk/music-mind-brain/gold-msi/

Test Interface

Data Test variants: BBC implementation: 16 clips (400 ms) from 4 genres (n=138,469) Lab implementations (differ by clip length and excerpt, n ≈ 130): A400, A800, B400, B800

Data for acoustic analysis B800 data set: 800 ms clips from 4 genres, n=131 Raw data: 131 16×16 similarity matrices Aggregate congruent with genre provenance
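The aggregation step above can be sketched in Python. This is a minimal illustration with synthetic stand-in data (the 131 real sorting matrices are not reproduced here): each participant's 16×16 co-assignment matrix is averaged into a group similarity matrix, then converted to dissimilarities for scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 131 individual sorting results:
# each participant yields a symmetric 16x16 matrix of 0/1 co-assignments
# (1 = the two clips were placed in the same group).
n_participants, n_clips = 131, 16
individual = rng.integers(0, 2, size=(n_participants, n_clips, n_clips))
individual = np.minimum(individual, individual.transpose(0, 2, 1))  # force symmetry
for m in individual:
    np.fill_diagonal(m, 1)  # a clip is always grouped with itself

# Aggregate: proportion of participants who grouped each pair together,
# then convert similarity to dissimilarity for scaling.
similarity = individual.mean(axis=0)
dissimilarity = 1.0 - similarity
print(dissimilarity.shape)
```

The resulting dissimilarity matrix is symmetric with a zero diagonal, which is the input format multidimensional scaling expects.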

Data for acoustic analysis

Question How do listeners make similarity judgements when comparing very short music clips? Are there any acoustic features that explain listeners' judgements?

Analysis Plan 1. Extract main perceptual dimensions from similarity data: Multi-dimensional Scaling 2. Describe music clips by acoustic features: The Echonest timbre descriptors 3. Predict perceptual coordinates by acoustic features: Statistical regression

1. Multi-dimensional Scaling Non-metric MDS, 3-dimensional solution, stress: 6.52
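A non-metric MDS step of this kind can be sketched with scikit-learn. The dissimilarity matrix here is a random synthetic stand-in for the aggregated sorting data, so the coordinates and stress value are illustrative only:

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(1)

# Hypothetical 16x16 dissimilarity matrix standing in for the aggregated data.
d = rng.random((16, 16))
d = (d + d.T) / 2          # symmetrise
np.fill_diagonal(d, 0.0)   # zero self-dissimilarity

# Non-metric MDS, 3-dimensional solution, as in the analysis.
mds = MDS(n_components=3, metric=False, dissimilarity="precomputed",
          random_state=1)
coords = mds.fit_transform(d)
print(coords.shape)  # one 3-D coordinate per clip
```

The 16 resulting coordinates are then treated as the perceptual dimensions to be predicted from acoustic features.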

2. Echonest Timbre Descriptors Based on short audio segments (2-5) 12 coefficients per segment, partially interpretable (1 = loudness, 2 = brightness, 3 = flatness, 4 = attack, etc.) 12 means and 12 variances per clip as acoustic features, plus #segments
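Summarising per-segment coefficients into a clip-level feature vector (12 means + 12 variances + segment count = 25 features) can be sketched as follows; the toy input is invented, standing in for the analyzer's per-segment output:

```python
import numpy as np

def clip_features(segment_timbre):
    """Summarise per-segment timbre coefficients for one clip.

    segment_timbre: (n_segments, 12) array, one row per short audio
    segment with 12 timbre coefficients.
    Returns a 25-dim vector: 12 means, 12 variances, segment count.
    """
    seg = np.asarray(segment_timbre, dtype=float)
    return np.concatenate([seg.mean(axis=0), seg.var(axis=0), [len(seg)]])

# Toy clip with 4 segments of 12 coefficients each.
rng = np.random.default_rng(2)
features = clip_features(rng.normal(size=(4, 12)))
print(features.shape)  # (25,)
```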

3. Predicting Perceptual Dimensions from Acoustic Features Problems: k > n (16 objects, 25 features); (potentially) non-linear relationships Solution 1: Random Forest regression (non-linear, handles k > n, sensitive to small influences and complex interactions)

Random Forest Variable importance according to random forest Predicting dim. 1 (R² = .058) Predicting dim. 2 (R² = .215) Predicting dim. 3 (R² = .263)
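A random-forest regression with variable importance of this kind can be sketched with scikit-learn; the features and MDS coordinates below are synthetic stand-ins, so the importances are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

# Hypothetical stand-in data: 16 clips x 25 acoustic features,
# target = the clips' coordinates on one perceptual MDS dimension.
X = rng.normal(size=(16, 25))
y = rng.normal(size=16)

# Random forests cope with k > n and non-linear effects without tuning.
rf = RandomForestRegressor(n_estimators=500, random_state=3)
rf.fit(X, y)

# Variable importance: which features drive the predictions?
importances = rf.feature_importances_
top = np.argsort(importances)[::-1][:5]
print(top)  # indices of the five most important features
```

Repeating the fit once per MDS dimension yields the per-dimension importance rankings reported on the slide.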

Problems Interpretation / documentation of Echonest timbre coefficients 5 and 9 unclear No simple model for perceptual dimension 3

Solution 2 Partial Least Squares regression (handles k > n very well, linear, no interactions) Use well-documented features: Two variants of MFCCs plus stand-alone features (spectral centroid, spectral spread, flatness etc.) from Queen Mary's Vamp plug-in set

Partial Least Squares Regression Results: From CV: 27% of variance explained for perceptual dimension 1 Dimensions 2 and 3 not explained at all Both sets of MFCCs are the most important features

Summary Perceptual dimensions 2 and 3 are closely related to Echonest timbre coefficients 5 and 9. Perceptual dimension 1 is predicted by an ensemble of MFCC features. Model fits are moderate at best (R² ≈ .25)

Conclusions Human similarity judgements of short audio clips show some commonality with a statistical model using acoustic features At least one dimension isn't explained at all by low-level features => higher-order information (e.g. rhythm, harmony, instrumentation, style) or even valence and arousal? => There is a lot more in short music clips that low-level features can't capture

Next Steps Try alternatives for acoustic modelling Construct new test based on acoustic model: Select new pool of sound clips Design easy and difficult versions of the sorting task according to acoustic model distance (on dimension 1) Test participants with easy/difficult versions and in genres they are familiar/unfamiliar with.

Item-wise analysis