Beat-Synchronous Chroma Representations for Music Analysis


1  Beat-Synchronous Chroma Representations for Music Analysis
Dan Ellis
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
1. Chroma Features
2. Beat Tracking
3. Matching Cover Songs
4. Artist Identification

2  Beyond MFCCs...
MFCCs have been useful in audio music IR: timbral similarity, artist ID, segmentation, thumbnailing, singing...
There is a separate tradition of symbolic MIR: melody matching, chord detection, meter analysis.
It's time to bring them together... with robust audio mid-level representations that capture tonal (melodic-harmonic) content.
[Figure: spectrograms of "Let It Be", Beatles verse 1 vs. Nick Cave verse 1] ⇒ beat-synchronous chroma features

3  1. Chroma Features
Chroma features map spectral energy into one canonical octave, i.e. 12 semitone bins.
Can resynthesize as Shepard tones: all octaves at once.
[Figure: piano chromatic scale as spectrogram and chromagram; Shepard-tone spectra and Shepard-tone resynthesis]
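
As a concrete illustration of that folding, here is a minimal numpy sketch that maps every power-spectrum bin onto 12 pitch classes (Method 1 on the next slide). The FFT size, hop, minimum frequency, A440 reference and normalization are illustrative assumptions rather than the talk's settings.

```python
import numpy as np

def stft_power(x, n_fft=2048, hop=256):
    """Hann-windowed power spectrogram, shape (n_bins, n_frames)."""
    win = np.hanning(n_fft)
    frames = np.array([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft, hop)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def chroma_from_stft(x, sr, n_fft=2048, hop=256, fmin=55.0):
    """Fold the energy of every STFT bin into one of 12 semitone classes."""
    S = stft_power(x, n_fft, hop)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    C = np.zeros((12, S.shape[1]))
    for k, f in enumerate(freqs):
        if f < fmin:
            continue                              # ignore DC and rumble
        midi = 69 + 12 * np.log2(f / 440.0)       # semitones relative to A440
        C[int(round(midi)) % 12] += S[k]          # all octaves land in one bin
    return C / (C.max() + 1e-9)                   # crude global normalization
```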

4  Calculating Chroma Features
Method 1: map every STFT bin (blurs non-tonal energy).
Method 2: map only STFT peaks (still blurry at low frequencies).
Method 3: instantaneous frequency, the phase derivative dφ/dt, escapes the frequency-resolution limit.
[Figure: spectrograms and the resulting chromagrams for each method]
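
Method 3's instantaneous frequency can be estimated from the frame-to-frame phase advance of each STFT bin; below is a rough numpy sketch of that estimate (window, hop and wrapping details are my assumptions).

```python
import numpy as np

def if_spectrogram(x, sr, n_fft=2048, hop=256):
    """Per-bin instantaneous frequency from the STFT phase derivative."""
    win = np.hanning(n_fft)
    frames = np.array([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft, hop)])
    X = np.fft.rfft(frames, axis=1)                       # (n_frames, n_bins)
    bin_freq = np.fft.rfftfreq(n_fft, 1.0 / sr)
    expected = 2 * np.pi * bin_freq * hop / sr            # nominal phase advance
    dphi = np.diff(np.angle(X), axis=0) - expected
    dphi = np.mod(dphi + np.pi, 2 * np.pi) - np.pi        # wrap to [-pi, pi)
    inst_f = bin_freq + dphi * sr / (2 * np.pi * hop)     # refined frequency/bin
    return inst_f, np.abs(X[1:])                          # frequencies, magnitudes
```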

5  2. Beat Tracking (1)
Goal: one feature vector per beat (tatum), for tempo normalization and efficiency.
Onset strength envelope: O(t) = Σ_f max(0, diff_t(log X(t, f))), computed over Mel bands.
Autocorrelation + a tempo window → global tempo estimate (BPM).
[Figure: Mel spectrogram, onset envelope, and windowed autocorrelation]
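
The two steps on this slide, the half-wave-rectified spectral difference and the windowed autocorrelation, might look like the following numpy sketch; the tempo-prior centre and width are illustrative guesses, not the talk's values.

```python
import numpy as np

def onset_strength(log_mel_spec):
    """Onset envelope: rectified frame-to-frame increase in log energy,
    summed over frequency bands (sum_f max(0, diff_t log X))."""
    d = np.diff(log_mel_spec, axis=1)
    return np.maximum(0.0, d).sum(axis=0)

def global_tempo(onset_env, frame_rate, bpm_center=120.0, octave_sd=1.0):
    """Tempo (BPM) from the autocorrelation of the onset envelope,
    weighted by a log-Gaussian preference around bpm_center."""
    env = onset_env - onset_env.mean()
    ac = np.correlate(env, env, mode='full')[len(env) - 1:]
    lags = np.arange(1, len(ac))
    bpm = 60.0 * frame_rate / lags
    weight = np.exp(-0.5 * (np.log2(bpm / bpm_center) / octave_sd) ** 2)
    best_lag = lags[np.argmax(weight * ac[1:])]
    return 60.0 * frame_rate / best_lag
```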

6  Beat Tracking (2)
Dynamic programming finds the beat times {t_i}; it optimizes
    Σ_i O(t_i) + Σ_i W((t_{i+1} − t_i − τ_p) / β)
where O(t) is the onset strength envelope (local score), W(·) is a log-Gaussian window (transition cost), and τ_p is the default beat period from the measured tempo.
Incrementally find the best predecessor at every time, then backtrace from the largest final score to get the beats:
    C*(t) = γ·O(t) + (1 − γ)·max_τ { W((t − τ − τ_p)/β)·C*(τ) }
    P(t) = argmax_τ { W((t − τ − τ_p)/β)·C*(τ) }
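
A compact numpy sketch of that recursion, under one plausible reading of the slide's log-Gaussian window; γ, β, the search range and the variable names are illustrative choices rather than the talk's tuned values.

```python
import numpy as np

def track_beats(onset_env, period, gamma=0.1, beta=0.7):
    """Dynamic-programming beat tracker in the spirit of the slide's recursion.
    `period` is the beat period in frames from the global tempo estimate."""
    n = len(onset_env)
    score = np.zeros(n)
    backptr = -np.ones(n, dtype=int)
    for t in range(n):
        score[t] = gamma * onset_env[t]
        lo, hi = max(0, t - 2 * period), max(0, t - period // 2)
        if hi > lo:
            taus = np.arange(lo, hi)
            # log-Gaussian window penalizing deviation from the ideal spacing
            w = np.exp(-0.5 * (np.log((t - taus) / period) / beta) ** 2)
            cand = w * score[lo:hi]
            best = int(np.argmax(cand))
            score[t] += (1.0 - gamma) * cand[best]
            backptr[t] = lo + best
    beats = [int(np.argmax(score))]          # backtrace from the best end point
    while backptr[beats[-1]] >= 0:
        beats.append(int(backptr[beats[-1]]))
    return beats[::-1]                       # beat frames, ascending
```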

7  Beat Tracking Results
DP will bridge gaps (non-causal): there is always a best path...
2nd place in MIREX 06 beat tracking, evaluated against McKinney & Moelants' human tapping data.
[Figure: spectrogram and tracked beats for Alanis Morissette, "All I Want", bridging a gap; a test excerpt (Bragg) with McKinney & Moelants subject tapping data]

8  Beat-Synchronous Chroma Features
Combine the beat times with the per-frame chroma features: average the chroma within each beat.
Compact; sufficient?
[Figure: frame-rate chromagram vs. its beat-synchronous version]
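
Beat-synchronizing is then just averaging the frame-level chroma between consecutive beat times; a small sketch, assuming a (12, n_frames) chroma matrix and frame timestamps:

```python
import numpy as np

def beat_sync_chroma(chroma, frame_times, beat_times):
    """One 12-dimensional column per beat: the mean of all chroma frames
    falling inside each inter-beat interval."""
    cols = []
    for start, end in zip(beat_times[:-1], beat_times[1:]):
        sel = (frame_times >= start) & (frame_times < end)
        cols.append(chroma[:, sel].mean(axis=1) if sel.any()
                    else np.zeros(chroma.shape[0]))      # empty beat: zeros
    return np.array(cols).T                              # (12, n_beats - 1)
```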

9  3. Cover Song Detection
Cover songs are reinterpretations of a piece: different instrumentation and character, so no match with timbral features.
We need a different representation ⇒ beat-synchronous chroma features.
(with Graham Poliner)
[Figure: spectrograms and beat-synchronous chroma of "Let It Be" by The Beatles and by Nick Cave]

10  Matching (1): Little Fragments
Cover versions may change the song structure, so expect multiple local matches at different alignments.
Match query and target as many small pieces: extract fragments from the query and cross-correlate each against the candidate.
But: how big are the pieces? how do we combine the individual scores? do we have all day?
[Figure: query and candidate beat-chroma matrices with cross-correlated fragments]

11  Matching (2): Global Correlation
Cross-correlate the entire beat-chroma matrices... at all possible transpositions.
Implicit combination of match quality and duration.
One good matching fragment is sufficient...?
[Figure: Elliott Smith and Glen Phillips, "Between the Bars": beat-chroma matrices and their cross-correlation over beat skew and semitone skew]
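
A sketch of the global matching score under this scheme: cross-correlate the two beat-chroma matrices along the beat axis at each of the 12 chroma rotations and keep the single largest peak. Plain numpy; the function and variable names are mine.

```python
import numpy as np

def global_xcorr(C1, C2):
    """Cross-correlation of two (12, n_beats) chroma matrices over beat skew,
    for one fixed transposition (sum of per-row 1-D correlations)."""
    n = C1.shape[1] + C2.shape[1] - 1
    xc = np.zeros(n)
    for b in range(12):
        xc += np.correlate(C1[b], C2[b], mode='full')
    return xc

def cover_match(C1, C2):
    """Best match score over all 12 transpositions and all beat skews."""
    scores = [global_xcorr(C1, np.roll(C2, rot, axis=0)).max()
              for rot in range(12)]
    return max(scores), int(np.argmax(scores))   # (score, best transposition)
```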

12  Filtered Cross-Correlation
The raw correlation value matters less than a precise local match: we look for large contrast within ±1 beat of skew, i.e. high-pass filter the cross-correlation along the beat-skew axis.
[Figure: cross-correlation over semitone and beat skew; raw vs. filtered slice at the best transposition]
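
One simple way to realize that high-pass filtering is to subtract a local average from each cross-correlation row along the beat-skew axis; the three-tap kernel below is an illustrative choice, not necessarily the filter used in the system.

```python
import numpy as np

def sharpen_xcorr(xc_row):
    """Emphasize sharp peaks (large contrast at +/-1 beat of skew) by
    high-pass filtering the cross-correlation along the skew axis."""
    kernel = np.array([-0.5, 1.0, -0.5])          # crude high-pass
    return np.convolve(xc_row, kernel, mode='same')
```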

13  Results (1): Ellis test set
23 pairs of cover songs from uspop2002 + ...; one correct match per query.
[Figure: query-vs-test match matrix. Queries: Take_Me_To_The_River/annie_lennox, Let_It_Be/nick_cave, I_Love_You/faith_hill, I_Can't_Get_No_Satisfaction/rolling_stones, Hush/milli_vanilli, Grand_Illusion/styx, Gold_Dust_Woman/sheryl_crow, God_Only_Knows/brian_wilson, Faith/limp_bizkit, Enjoy_The_Silence/tori_amos, Day_Tripper/cheap_trick, Come_Together/beatles, Cocaine/nazareth, Claudette/roy_orbison, Cecilia/simon_and_garfunkel, Caroline_No/brian_wilson, Blue_Collar_Man/styx, Between_The_Bars/glen_phillips, Before_You_Accuse_Me/eric_clapton, America/simon_and_garfunkel, All_Along_The_Watchtower/dave_matthews_band, Addicted_To_Love/tina_turner, Abracadabra/sugar_ray]

14  Results (2): MIREX 06
Cover song contest: 30 songs × 11 versions of each (!); the data has not been disclosed.
Metric: number of true covers in the top 10; 8 systems compared (cover-song systems plus general similarity systems).
Found 761/3300 = 23% recall; next best: 11%; guess: 3%.
[Figure: MIREX 06 cover song results, covers retrieved per query song for each system (CS, DE, KL1, KL2, KWL, KWT, LR, TP)]

15  Where Are the Matches?
Look inside the global cross-correlation to find the matching fragments:
    xcorr = Σ_t Σ_f C1(t, f) · C2(t, f)
View the summand along time to see which beats contribute the match.
[Figure: matching beat-chroma fragments of "Let It Be", Beatles vs. Nick Cave]

16  What Are the Mistakes?
False reject (missed true match): the cover version is too different, or the beat tracking went wrong...
False alarm (invalid match): e.g. "Cocaine" (Clapton) vs. "Satisfaction" (Stones).
[Figure: beat-chroma excerpts from Eric Clapton's "Cocaine" and the Rolling Stones' "Satisfaction"]

17  4. Artist Identification (ID)
Baseline system: a bag of (timbral) frames. MFCC frames are modeled by a Gaussian or GMM; distance by likelihood or KL divergence.
Dataset [Mandel et al. 06]: 18 artists × 5 or 6 albums each; 18×3 albums for training, with the remaining albums used for test and development.
[Figure: 18-artist confusion matrix (true vs. recognized): u2, tina_turner, roxette, rolling_stones, queen, pink_floyd, metallica, madonna, green_day, genesis, garth_brooks, fleetwood_mac, depeche_mode, dave_matthews_band, creedence_clearwater_revival, bryan_adams, beatles, aerosmith]
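
A minimal sketch of such a bag-of-frames baseline: one full-covariance Gaussian per artist over MFCC frames, compared by a symmetrized KL divergence. The regularization, symmetrization and function names are my assumptions, not the exact system from the talk.

```python
import numpy as np

def fit_gaussian(frames):
    """Single full-covariance Gaussian over a set of feature frames (rows)."""
    mu = frames.mean(axis=0)
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
    return mu, cov

def kl_gauss(m0, c0, m1, c1):
    """KL divergence KL(N0 || N1) between two Gaussians."""
    d = len(m0)
    c1_inv = np.linalg.inv(c1)
    diff = m1 - m0
    return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff - d
                  + np.log(np.linalg.det(c1) / np.linalg.det(c0)))

def classify_artist(test_frames, artist_models):
    """Pick the artist whose Gaussian is closest in symmetrized KL."""
    mt, ct = fit_gaussian(test_frames)
    dist = {artist: kl_gauss(mt, ct, m, c) + kl_gauss(m, c, mt, ct)
            for artist, (m, c) in artist_models.items()}
    return min(dist, key=dist.get)
```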

18  Beat-Chroma Features for Artist ID?
Artists may use tonality in particular ways: density, variety, particular chords (influence of instruments on the features).
Try bag-of-frames on the beat-chroma representation:
use several consecutive beats? key-normalize each piece?
[Figure: beat-chroma excerpts ("Northern Lad", "Cars and Guitars") with their tatum rates]

19  Key Normalization
Could try matching at all possible rotations... or just transpose every piece up front:
fit a single Gaussian model to one piece;
find the ML rotation of each other piece;
model all the transposed pieces;
iterate until convergence.
[Figure: per-track chroma models (Taxman, Eleanor Rigby, I'm Only Sleeping, Love You To, Yellow Submarine, She Said She Said, Good Day Sunshine, And Your Bird Can Sing) aligned to a global model]
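
A sketch of that procedure, assuming each piece is a (12, n_beats) beat-chroma matrix: fit one global Gaussian, rotate every piece to its maximum-likelihood transposition, refit, and repeat a few times. The regularization and the fixed iteration count stand in for a real convergence test.

```python
import numpy as np

def best_rotation(C, mu, cov_inv):
    """Chroma rotation of piece C that maximizes likelihood under the
    Gaussian (mu, cov_inv); returns the rotation in semitones."""
    def avg_loglik(X):                      # X: (n_beats, 12) frames
        d = X - mu
        return -np.mean(np.einsum('ij,jk,ik->i', d, cov_inv, d))
    return int(np.argmax([avg_loglik(np.roll(C, r, axis=0).T)
                          for r in range(12)]))

def key_normalize(pieces, n_iter=5):
    """Iteratively transpose all pieces to a common key."""
    rots = [0] * len(pieces)
    for _ in range(n_iter):
        frames = np.hstack([np.roll(p, r, axis=0)
                            for p, r in zip(pieces, rots)]).T
        mu = frames.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(frames, rowvar=False)
                                + 1e-6 * np.eye(12))
        rots = [best_rotation(p, mu, cov_inv) for p in pieces]
    return [np.roll(p, r, axis=0) for p, r in zip(pieces, rots)]
```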

20  Timbre + Chroma Artist ID
Preliminary Mandel 18-artist ID accuracy:

Feature                Model      T win   Acc     Exec. time
MFCC                   FullCov    1       8%      1 s
MFCC                   6 GMM      1       33%     195 s
Chroma                 FullCov    1       15%     6 s
Chroma                 FullCov            1%      117 s
Chroma                 6 GMM      1               85 s
Chroma                 6 GMM              15%
ChromaKN               FullCov    1       17%     11 s
ChromaKN               FullCov            1%      58 s
ChromaKN               6 GMM      1       5%      533 s
ChromaKN               6 GMM              16%     583 s
MFCC + Chroma fusion                      5%

21  Artist Fragments
Idea: find the most discriminant beat-chroma fragments per artist (with Courtenay Cotton):
k-means cluster the 16-beat fragments within each piece;
keep the fragments with the largest ratio of (avg. similarity to the same artist) / (avg. similarity to others);
classify test pieces by the artist ID of the best-scoring fragment!
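
The fragment-selection idea could be sketched as below: collect all 16-beat windows of a piece's beat-chroma matrix, cluster them with a plain k-means, then score each cluster centre by the ratio of average cosine similarity to same-artist fragments versus fragments from other artists. The cluster count, iteration count and similarity measure are illustrative assumptions.

```python
import numpy as np

def fragment_codebook(beat_chroma, frag_len=16, n_clusters=8, n_iter=20):
    """k-means (Lloyd iterations) over all 16-beat chroma fragments of a piece."""
    frags = np.array([beat_chroma[:, i:i + frag_len].ravel()
                      for i in range(beat_chroma.shape[1] - frag_len)])
    centers = frags[np.random.choice(len(frags), n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((frags[:, None, :] - centers[None]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = frags[labels == k].mean(axis=0)
    return centers

def discriminability(center, same_artist_frags, other_frags):
    """(avg. similarity to same artist) / (avg. similarity to others)."""
    def avg_cos(X):
        return np.mean(X @ center / (np.linalg.norm(X, axis=1)
                                     * np.linalg.norm(center) + 1e-9))
    return avg_cos(same_artist_frags) / (avg_cos(other_frags) + 1e-9)
```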

22  Artist Fragment Results
Preliminary: 5-way artist ID, ~3% correct.
Need to search more fragments.
A way to choose phrase beginnings?
A basis set for all tonal content?!

23  Conclusions and Future Work
Beat-synchronous chroma features are successful for matching cover songs: they capture melody and harmony, not instruments.
Further uses: beat-chroma fragments as musical building blocks, e.g. VQ over a large body of music; find recurrent motifs; artist identification?
Code available! Google "matlab chroma features".
