Beat-Synchronous Chroma Representations for Music Analysis

Beat-Synchronous Chroma Representations for Music Analysis
Dan Ellis
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
dpwe@ee.columbia.edu http://labrosa.ee.columbia.edu/

1. Chroma Features
2. Beat Tracking
3. Matching Cover Songs
4. Artist Identification

Beat-Chroma Representations - Ellis 2007-05 - 1/23

Beyond MFCCs...
- MFCCs have been useful in Audio Music IR: timbral similarity, artist ID, segmentation, thumbnailing, singing...
- Separate tradition of Symbolic MIR: melody matching, chord detection, meter analysis
- It's time to bring them together...
  - with robust audio mid-level representations...
  - that capture tonal (melodic-harmonic) content
  = beat-synchronous chroma features
[Figure: spectrograms (freq / kHz vs. time / sec) of Let It Be / Beatles / verse 1 and Let It Be / Nick Cave / verse 1]

1. Chroma Features
- Chroma features map spectral energy into one canonical octave, i.e. 12 semitone bins
- Can resynthesize as "Shepard Tones": all octaves at once
[Figure: piano chromatic scale as a spectrogram (freq / kHz vs. time / sec) and as chroma (chroma bin vs. time / frames); Shepard tone spectra (level / dB vs. freq / Hz); Shepard tone resynthesis spectrogram]

Calculating Chroma Features
- Method 1: Map every STFT bin. Blurs non-tonal energy.
- Method 2: Map only STFT peaks. Still blurry at low frequencies.
- Method 3: Instantaneous Frequency, $\partial\phi/\partial t$. Escapes the frequency resolution limit.
[Figure: for each method, spectrogram (freq / kHz and FFT bin vs. time / sec) and resulting chroma (chroma bin vs. time / frame)]
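As a rough illustration of Method 2 (mapping only spectral peaks), the sketch below folds a list of (frequency, magnitude) peaks into 12 semitone bins. The function names, the A4 = 440 Hz reference, equal temperament, and the convention that bin 0 is C are all assumptions made for this example; the slides do not specify them.

```python
import math

def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a semitone bin 0..11 (0 = C, 9 = A), assuming equal temperament."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
    return int(round(midi)) % 12

def chroma_from_peaks(peaks):
    """Fold (freq_hz, magnitude) spectral peaks into one 12-bin chroma vector."""
    chroma = [0.0] * 12
    for freq_hz, mag in peaks:
        chroma[pitch_class(freq_hz)] += mag
    return chroma

# Three octaves of A plus one C: all of the A energy lands in the same bin.
example = chroma_from_peaks([(220.0, 1.0), (440.0, 2.0), (880.0, 0.5), (261.63, 1.0)])
```

Because every octave of a note shares a bin, the vector discards register and keeps only pitch class, which is the "one canonical octave" idea from the Chroma Features slide.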

2. Beat Tracking (1)
- Goal: One feature vector per beat (tatum), for tempo normalization and efficiency
- Onset Strength Envelope: $O(t) = \sum_f \max(0, \mathrm{diff}_t(\log |X(t, f)|))$
- Autocorr. + window gives a global tempo estimate
[Figure: mel spectrogram (freq / mel vs. time / sec); onset strength envelope; windowed autocorrelation vs. lag, with a tempo estimate of 168.5 BPM]
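The onset strength envelope on this slide can be sketched in a few lines. This is a minimal version that assumes the input is already a magnitude spectrogram stored as a list of frames (the slide uses mel bands); the function name and the small `floor` used to avoid `log(0)` are assumptions of this example.

```python
import math

def onset_strength(mag, floor=1e-10):
    """mag[t][f] is a magnitude spectrogram; returns one onset value per frame."""
    env = [0.0]  # no previous frame to difference against at t = 0
    for prev, cur in zip(mag, mag[1:]):
        rise = 0.0
        for p, c in zip(prev, cur):
            diff = math.log(max(c, floor)) - math.log(max(p, floor))
            rise += max(0.0, diff)  # half-wave rectify: keep energy increases only
        env.append(rise)
    return env
```

Summing only the positive log-differences means a note onset (energy appearing in some bands) raises the envelope, while a decay contributes nothing.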

Beat Tracking (2)
Dynamic Programming finds beat times $\{t_i\}$
- optimizes $\sum_i O(t_i) + \sum_i W((t_{i+1} - t_i - \tau_p)/\beta)$
- where $O(t)$ is the onset strength envelope (local score)
- $W(t)$ is a log-Gaussian window (transition cost)
- $\tau_p$ is the default beat period per measured tempo
- incrementally find best predecessor at every time
- backtrace from largest final score to get beats

$C^*(t) = \gamma\, O(t) + (1 - \gamma) \max_\tau \{ W((\tau - \tau_p)/\beta)\, C^*(t - \tau) \}$
$P(t) = \arg\max_\tau \{ W((\tau - \tau_p)/\beta)\, C^*(t - \tau) \}$
Here $\tau$ is the lag back to a candidate previous beat, so $\tau = t_{i+1} - t_i$ in the objective above.
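The recursion and backtrace can be sketched as follows. This is an illustrative version, not the author's code: it assumes the window is a Gaussian in log(lag / period), restricts candidate lags to between half and twice the period, and picks arbitrary defaults for `gamma` and `beta`. All names are hypothetical.

```python
import math

def track_beats(onset, period, gamma=0.5, beta=0.2):
    """Return beat frame indices for an onset envelope and a target period (in frames)."""
    n = len(onset)
    score = [0.0] * n   # C*(t): best cumulative score of a beat sequence ending at t
    pred = [-1] * n     # best predecessor beat for each t (-1 means none)
    min_lag = max(1, round(period / 2))
    max_lag = round(period * 2)
    for t in range(n):
        best, best_prev = 0.0, -1
        for lag in range(min_lag, max_lag + 1):
            prev = t - lag
            if prev < 0:
                break
            # Assumed window shape: Gaussian on log(lag / period), peaking at lag == period.
            w = math.exp(-0.5 * (math.log(lag / period) / beta) ** 2)
            if w * score[prev] > best:
                best, best_prev = w * score[prev], prev
        score[t] = gamma * onset[t] + (1.0 - gamma) * best
        pred[t] = best_prev
    # Backtrace from the largest score to recover the beat sequence.
    t = max(range(n), key=score.__getitem__)
    beats = []
    while t >= 0:
        beats.append(t)
        t = pred[t]
    return beats[::-1]

# Synthetic envelope with an onset every 10 frames.
envelope = [1.0 if t % 10 == 0 else 0.0 for t in range(100)]
beats = track_beats(envelope, period=10)
```

Because every frame stores its best predecessor, the backtrace always yields a complete beat sequence, even across stretches where the onset envelope is silent.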

Beat Tracking Results
- DP will bridge gaps (non-causal): there is always a best path...
- 2nd place in MIREX 2006 Beat Tracking
- compared to McKinney & Moelants human data
[Figure: Alanis Morissette - All I Want - gap + beats (freq / Bark band vs. time / sec); test (Bragg) - McKinney + Moelants subject data (subject # vs. time / s)]

Beat-Synchronous Chroma Features
- Beat times + frame-level chroma features: average chroma within each beat
- compact; sufficient?
[Figure: frame-level chroma and beat-synchronous chroma for the same excerpt]
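Averaging within each beat is simple to express directly. The sketch below assumes chroma is a list of frames and that beat positions are given as frame indices; both the function name and that data layout are assumptions of this example.

```python
def beat_sync_chroma(frames, beat_frames):
    """Average chroma frames between consecutive beat positions (frame indices)."""
    out = []
    for start, end in zip(beat_frames, beat_frames[1:]):
        segment = frames[start:end]
        if not segment:
            continue  # skip empty intervals, e.g. repeated beat indices
        n_bins = len(segment[0])
        out.append([sum(f[c] for f in segment) / len(segment) for c in range(n_bins)])
    return out

# Six frames, beats at frames 0, 2 and 6 give two beat-level vectors.
frames = [[float(t)] * 12 for t in range(6)]
beat_chroma = beat_sync_chroma(frames, [0, 2, 6])
```

The output has one vector per beat interval regardless of tempo, which is what makes two performances at different speeds directly comparable.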

3. Cover Song Detection (with Graham Poliner)
- Cover Songs = reinterpretation of a piece
  - different instrumentation, character
  - no match with timbral features
- Need a different representation! Beat-synchronous chroma features
[Figure: Let It Be - The Beatles (verse 1) and Let It Be - Nick Cave (verse 1): spectrograms (freq / kHz vs. time / sec) and beat-sync chroma features (chroma bins vs. beats)]

Matching (1): Little Fragments
- Cover versions may change song structure: multiple local matches at different alignments
- Match query and target as many small pieces? Extract fragments from the query and cross-correlate them with the candidate.
  - how big are the pieces?
  - how do we combine individual scores?
  - do we have all day?
[Figure: query and candidate beat-chroma (chroma bins vs. beats)]

Matching (2): Global Correlation
- Cross-correlate entire beat-chroma matrices...
  - at all possible transpositions
  - implicit combination of match quality and duration
- One good matching fragment is sufficient...?
[Figure: Elliott Smith - Between the Bars and Glen Phillips - Between the Bars (chroma bins vs. beats); their cross-correlation (skew / semitones vs. skew / beats)]
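A minimal sketch of cross-correlating two beat-chroma matrices over all 12 transpositions and a range of beat skews is shown below. The data layout (a list of 12-bin frames per song), the function names, and the lack of any normalization are assumptions of this example, not details from the slides.

```python
def rotate(frame, semitones):
    """Transpose a 12-bin chroma frame upward by `semitones` (circular shift)."""
    return [frame[(c - semitones) % 12] for c in range(12)]

def cross_correlate(query, target, max_skew):
    """Return {(transpose, skew): score} over all 12 transpositions and beat skews."""
    scores = {}
    for tr in range(12):
        shifted = [rotate(f, tr) for f in query]
        for skew in range(-max_skew, max_skew + 1):
            total = 0.0
            for t, qf in enumerate(shifted):
                u = t + skew
                if 0 <= u < len(target):  # only overlapping beats contribute
                    total += sum(a * b for a, b in zip(qf, target[u]))
            scores[(tr, skew)] = total
    return scores

# Toy query: one active pitch class per beat.
seq = [0, 4, 7, 0, 5, 9, 2, 7, 11, 4, 2, 0]
query = [[1.0 if c == p else 0.0 for c in range(12)] for p in seq]
# Target: the same material transposed up 2 semitones and delayed by 3 beats.
target = [[0.0] * 12 for _ in range(3)] + [rotate(f, 2) for f in query]
scores = cross_correlate(query, target, max_skew=5)
best = max(scores, key=scores.get)
```

Leaving the sum unnormalized means a long, well-aligned stretch scores higher than a short one, which is one way to read the slide's "implicit combination of match quality and duration".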

Filtered Cross-Correlation
- Raw correlation not as important as precise local match
  - looking for large contrast at ±1 beat skew
  - i.e. high-pass filter
[Figure: cross-correlation (skew / semitones vs. skew / beats); cross-correlation at a single semitone skew (vs. skew / beats), raw and filtered]
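One simple way to emphasize contrast at ±1 beat skew is to subtract the mean of the two neighbouring skews from each value. The slides do not specify the filter, so the three-point filter below is only an assumed stand-in for the high-pass step, with hypothetical names and toy data.

```python
def highpass_skew(xcorr):
    """Subtract the mean of the +/-1 beat neighbours from each skew value."""
    n = len(xcorr)
    out = []
    for k in range(n):
        left = xcorr[max(k - 1, 0)]        # clamp at the edges
        right = xcorr[min(k + 1, n - 1)]
        out.append(xcorr[k] - 0.5 * (left + right))
    return out

# A broad hump (peak 5) followed by a narrow spike (peak 3).
raw = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 0, 3, 0, 0]
filtered = highpass_skew(raw)
```

In the raw values the broad hump wins, but after filtering the narrow spike is the maximum, which is the behaviour the slide asks for: a precise local match matters more than overall correlation level.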

Results (1): Ellis 23 set
- 23 pairs of cover songs from uspop +...
- one correct match per query
Queries:
- Take_Me_To_The_River/annie_lennox
- Let_It_Be/nick_cave
- I_Love_You/faith_hill
- I_Can_t_Get_No_Satisfaction/rolling_stones
- Hush/milli_vanilli
- Grand_Illusion/styx
- Gold_Dust_woman/sheryl_crow
- God_only_knows/brian_wilson
- Faith/limp_bizkit
- Enjoy_The_Silence/tori_amos
- Day_tripper/cheap_trick
- Come_together/beatles
- Cocaine/nazareth
- Claudette/roy_orbison
- Cecilia/simon_and_garfunkel
- Caroline_no/brian_wilson
- Blue_Collar_Man/styx
- Between_The_Bars/glen_phillips
- Before_You_Accuse_Me/eric_clapton
- America/simon_and_garfunkel
- All_Along_the_watchtower/dave_matthews_band
- Addicted_to_love/tina_turner
- Abracadabra/sugar_ray
[Figure: Cover Songs: match matrix of query vs. test tracks, with correct matches marked]

Results (2): MIREX 2006
- Cover song contest: 30 songs x 11 versions of each (!) (data has not been disclosed)
- Measure: # true covers in top 10
- 8 systems compared (4 cover song + 4 similarity)
- Found 761/3300 = 23% recall; next best: 11%; guess: 3%
[Figure: MIREX 2006 Cover Song Results: # covers retrieved per song per system; each row is one query song-set; columns are the cover song systems CS, DE, KL1, KL2 and the similarity systems KWL, KWT, LR, TP]

Where are the matches?
- Look inside global cross-correlation to find matching fragments...
- $\mathrm{xcorr} = \sum_t \sum_f C_1(t, f)\, C_2(t, f)$: view the sum along time
[Figure: beat-chroma of Let It Be / Beatles and Let It Be / Nick Cave over the matching beat range, and their per-beat product (vs. time / beats)]

What are the mistakes?
- False reject: missed true match. Cover version is too different, beat tracking wrong...
- False alarm: invalid match, e.g. Cocaine (Clapton) vs. Satisfaction (Stones)
[Figure: beat-chroma of Eric Clapton - Cocaine and Rolling Stones - Satisfaction over the falsely matching beat range, and their per-beat product]

4. Artist Identification (ID)
- Baseline system: Bag of (timbral) frames
  - MFCC frames, model as Gaussian or GMM
  - distance by likelihood or KL
- Dataset: [Mandel et al. 2006]
  - 18 artists x 5 or 6 albums each
  - 18x3 albums for training, 18x2 for test, plus a small dev set
- Artists: u2, tina_turner, roxette, rolling_stones, queen, pink_floyd, metallica, madonna, green_day, genesis, garth_brooks, fleetwood_mac, depeche_mode, dave_matthews_band, creedence_clearwater_revival, bryan_adams, beatles, aerosmith
[Figure: confusion matrix, true artist vs. recognized artist, per track]
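For the single-Gaussian case, "distance by KL" has a closed form. The standard Kullback-Leibler divergence between two $d$-dimensional Gaussians $\mathcal{N}_0 = \mathcal{N}(\mu_0, \Sigma_0)$ and $\mathcal{N}_1 = \mathcal{N}(\mu_1, \Sigma_1)$ is given below; the slides do not say which direction or symmetrization was used, so treat this as background rather than the exact baseline.

```latex
D_{\mathrm{KL}}(\mathcal{N}_0 \,\|\, \mathcal{N}_1)
  = \tfrac{1}{2}\left[
      \operatorname{tr}\!\left(\Sigma_1^{-1}\Sigma_0\right)
      + (\mu_1 - \mu_0)^{\top} \Sigma_1^{-1} (\mu_1 - \mu_0)
      - d
      + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
    \right]
```

The divergence is not symmetric, so a common choice for a distance between two pieces is the sum of both directions. For GMMs no closed form exists and the divergence has to be approximated.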

Beat Chroma Features for ID?
- Artists may use tonality in particular ways...
  - density, variety
  - particular chords
  - (influence of instruments on features)
- Try bag-of-frames on beat-chroma rep'n
  - use several consecutive beats?
  - key-normalization of each piece?
[Figure: beat-chroma excerpts from Northern Lad (1998) and Cars and Guitars (2005), each annotated with its tatum rate in BPM]

Key Normalization
- Could try matching at all possible rotations...
- ... or just transpose every piece initially
  - single Gaussian model of one piece
  - find ML rotation of other pieces
  - model all transposed pieces
  - iterate until convergence
[Figure: aligned chroma for Taxman, Eleanor Rigby, I'm Only Sleeping, Love You To, Yellow Submarine, She Said She Said, Good Day Sunshine, And Your Bird Can Sing, and the aligned global model]
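The "find ML rotation" step can be sketched as follows. This version assumes a diagonal-covariance Gaussian with a variance floor for simplicity (the slides only say "single Gaussian"), and shows one pass rather than the full iterate-until-convergence loop. All names and the toy data are hypothetical.

```python
import math

def rotate(frame, semitones):
    """Transpose a 12-bin chroma frame upward by `semitones` (circular shift)."""
    return [frame[(c - semitones) % 12] for c in range(12)]

def fit_diag_gaussian(frames, var_floor=1e-3):
    """Per-bin mean and (floored) variance of a set of chroma frames."""
    n = len(frames)
    mean = [sum(f[c] for f in frames) / n for c in range(12)]
    var = [max(sum((f[c] - mean[c]) ** 2 for f in frames) / n, var_floor)
           for c in range(12)]
    return mean, var

def log_likelihood(frames, model):
    """Total log-likelihood of the frames under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for c in range(12):
            total -= 0.5 * (math.log(2 * math.pi * var[c])
                            + (f[c] - mean[c]) ** 2 / var[c])
    return total

def best_rotation(frames, model):
    """Rotation (0..11) of the piece that is most likely under the model."""
    return max(range(12),
               key=lambda r: log_likelihood([rotate(f, r) for f in frames], model))

def chord(bins, on, off):
    return [on if c in bins else off for c in range(12)]

# Toy reference piece, and the same piece transposed up 5 semitones.
reference = [chord((0, 4, 7), 1.0, 0.1), chord((0, 5, 9), 1.0, 0.2),
             chord((2, 7, 11), 1.0, 0.1), chord((0, 4, 7), 0.8, 0.3)]
model = fit_diag_gaussian(reference)
transposed = [rotate(f, 5) for f in reference]
undo = best_rotation(transposed, model)
```

In the full procedure described on the slide, the rotated pieces would then be pooled to refit the model and the rotations re-estimated until they stop changing.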

Timbre+Chroma ID
Preliminary Mandel18 Artist ID accuracy:

Feature        Model     T win   Acc    Exec. time
MFCC           FullCov   1       8%     1 s
MFCC           6GMM      1       33%    195 s
Chroma         FullCov   1       15%    6 s
Chroma         FullCov           1%     117 s
Chroma         6GMM      1       %      85 s
Chroma         6GMM              15%    s
ChromaKN       FullCov   1       17%    11 s
ChromaKN       FullCov           1%     58 s
ChromaKN       6GMM      1       5%     533 s
ChromaKN       6GMM              16%    583 s
MFCC + Chroma  fusion            5%

Artist Fragments (with Courtenay Cotton)
- Idea: Find the most discriminant beat-chroma fragments per artist
  - k-means cluster 16 beat fragments within piece
  - keep fragments with the largest ratio (avg. similarity to same artist)/(avg. sim. to others)
  - classify test pieces by ID of best-scoring fragment!

Artist Fragment Results
- Preliminary, 5 way artist ID, ~3% correct
- need to search more fragments
- way to choose phrase beginnings?
- a basis set for all tonal content?!

Conclusions and Future Work
- Beat-synchronous chroma features are successful for matching cover songs
  - captures melody-harmony, not instruments
- Further uses: Beat-chroma fragments as musical building blocks
  - e.g. VQ over large body of music
  - find recurrent motifs
  - artist identification?
- Code available! Google "matlab chroma features"