Advanced Course Computer Science
Music Processing
Summer Term 2009
Meinard Müller
Saarland University and MPI Informatik
meinard@mpi-inf.mpg.de

Music Structure Analysis

Music segmentation
  pitch content (e.g., melody, harmony)
  music texture (e.g., timbre, instrumentation, sound)
  rhythm
Detection of repeating sections, phrases, motives
  song structure (e.g., intro, verse, chorus)
  musical form (e.g., sonata, symphony, concerto)
Detection of other hidden relationships

Given: CD recording
Goal: Automatic extraction of the repetitive structure (or of the musical form)
Example: Brahms Hungarian Dance No. 5 (Ormandy)

Dannenberg/Hu (ISMIR 2002)
Peeters/Burthe/Rodet (ISMIR 2002)
Cooper/Foote (ISMIR 2002)
Goto (ICASSP 2003)
Chai/Vercoe (ACM Multimedia 2003)
Lu/Wang/Zhang (ACM Multimedia 2004)
Bartsch/Wakefield (IEEE Trans. Multimedia 2005)
Goto (IEEE Trans. Audio 2006)
Müller/Kurth (EURASIP 2007)
Rhodes/Casey (ISMIR 2007)
Peeters (ISMIR 2007)

Audio features
Cost measure and cost matrix (self-similarity matrix)
Path extraction (pairwise similarity of segments)
Global structure (clustering, grouping)

Audio features: 12-dimensional normalized chroma vectors
Local cost measure
Cost matrix: square (quadratic) self-similarity matrix
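The step from chroma features to a cost matrix can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the cost measure 1 minus the inner product of normalized chroma vectors is an assumption (the formula did not survive in the slide text), and the function names normalize_chroma and cost_matrix are hypothetical.

```python
import numpy as np

def normalize_chroma(chroma, eps=1e-9):
    """Scale each 12-dimensional chroma vector (one per row) to unit Euclidean length."""
    chroma = np.asarray(chroma, dtype=float)
    norms = np.linalg.norm(chroma, axis=1, keepdims=True)
    return chroma / np.maximum(norms, eps)

def cost_matrix(chroma):
    """Return the square N x N matrix C[n, m] = 1 - <v_n, v_m>.

    The cost measure (1 minus inner product) is an assumption of this sketch.
    """
    v = normalize_chroma(chroma)
    return 1.0 - v @ v.T

# Toy sequence with the form A B A: frames 0 and 2 share one pitch class.
frames = np.zeros((3, 12))
frames[0, 0] = 1.0
frames[1, 7] = 1.0
frames[2, 0] = 2.0  # louder, but the same pitch class as frame 0

C = cost_matrix(frames)
```

Because the vectors are normalized, the louder repetition in frame 2 has zero cost against frame 0, while the unrelated frame 1 has cost 1 against both.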
Similarity cluster

Challenge: Presence of musical variations
  Fragmented paths and gaps
  Paths of poor quality
  Regions of constant (low) cost
  Curved paths
Idea: Enhancement of path structure

Shostakovich Waltz 2, Jazz Suite No. 2 (Chailly)

Idea: Usage of contextual information (Foote 1999)
  Comparison of entire sequences
  length of sequences
  enhanced cost matrix
  smoothing effect
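One way to read the contextual-information idea above is as an average of the cost over L consecutive frame pairs along the main-diagonal direction. The sketch below assumes exactly that; the name enhance_diagonal, the parameter L, and the border handling (averaging only over offsets that stay inside the matrix) are choices of this illustration, not taken from the slides.

```python
import numpy as np

def enhance_diagonal(C, L):
    """Average the cost along the main-diagonal direction over L frames.

    E[n, m] is the mean of C[n + l, m + l] for l = 0..L-1, using only the
    offsets that stay inside the matrix (a boundary choice of this sketch).
    """
    C = np.asarray(C, dtype=float)
    N, M = C.shape
    total = np.zeros_like(C)
    count = np.zeros_like(C)
    for l in range(L):
        if l >= N or l >= M:
            break
        total[:N - l, :M - l] += C[l:, l:]
        count[:N - l, :M - l] += 1.0
    return total / count

# A low-cost path parallel to the main diagonal with one gap at cell (1, 3).
C = np.ones((6, 6))
for n in range(4):
    C[n, n + 2] = 0.0
C[1, 3] = 1.0  # the gap

E = enhance_diagonal(C, L=3)
```

In this toy matrix the gap cell drops from cost 1 to 1/3 after smoothing, while cells far from the path keep cost 1, which is the smoothing effect named on the slide.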
Cost matrix (Shostakovich)
Enhanced cost matrix (Shostakovich)

Cost matrix (Brahms)
Enhanced cost matrix (Brahms)
Problem: Relative tempo differences are smoothed out

Idea: Smoothing along various directions and minimizing over all directions
  direction of smoothing (one per slope value)
  enhanced cost matrix w.r.t. each direction
  Usage of eight slope values: tempo changes of -30 to +40 percent
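Smoothing along several directions and taking the minimum can be sketched as below. The slides give only the number of slopes (eight) and the tempo range (-30 to +40 percent); the evenly spaced values from 0.7 to 1.4, the rounding of offsets, and the function names are assumptions of this sketch.

```python
import numpy as np

def enhance_direction(C, L, slope):
    """Average C over L steps along the direction (1, slope).

    Column offsets are rounded to whole frames; offsets leaving the matrix
    are skipped (both are simplifications of this sketch).
    """
    C = np.asarray(C, dtype=float)
    N, M = C.shape
    total = np.zeros_like(C)
    count = np.zeros_like(C)
    for l in range(L):
        dn, dm = l, int(round(l * slope))
        if dn >= N or dm >= M:
            continue
        total[:N - dn, :M - dm] += C[dn:, dm:]
        count[:N - dn, :M - dm] += 1.0
    return total / count

def enhance_multi(C, L, slopes):
    """Smooth along every slope and keep the cell-wise minimum."""
    return np.min([enhance_direction(C, L, s) for s in slopes], axis=0)

# Assumed slope values: eight evenly spaced relative tempi from 0.7 to 1.4.
slopes = np.linspace(0.7, 1.4, 8)

# A low-cost path whose second occurrence runs about 40 percent faster.
C = np.ones((5, 7))
for n in range(5):
    C[n, int(round(1.4 * n))] = 0.0

E_diag = enhance_direction(C, 3, 1.0)   # main diagonal only
E_multi = enhance_multi(C, 3, slopes)   # minimum over all directions
```

Along the main diagonal alone, the start of the tilted path is smeared to a cost of 1/3; with the minimum over all directions it stays at 0, which illustrates why a single smoothing direction loses paths with relative tempo differences.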
Cost matrix
Cost matrix with filtering along main diagonal
Cost matrix with filtering along 8 different directions and minimizing

Path extraction:
  Start with initial point
  Extend path in greedy fashion
  Remove path neighborhood

Cost matrix
Enhanced cost matrix
Enhanced cost matrix
Thresholded
Thresholded, upper left
Path removal
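The path extraction steps (threshold, start with an initial point, extend greedily, remove the path neighborhood) might look roughly like the following. This is a simplified sketch under stated assumptions: the step sizes (1,1), (1,2), (2,1), the parameters min_length and radius, and the function name extract_paths are illustrative and not given in the slides. The caller is expected to pass only the region of interest, for example one triangle of the symmetric matrix with the trivial main diagonal excluded.

```python
import numpy as np

def extract_paths(C, threshold, min_length=3, radius=1):
    """Greedy path extraction on a (thresholded) cost matrix."""
    work = np.array(C, dtype=float)
    work[work > threshold] = np.inf               # thresholding
    N, M = work.shape
    steps = [(1, 1), (1, 2), (2, 1)]              # assumed step sizes
    paths = []
    while np.isfinite(work).any():
        # Initial point: remaining cell of minimal cost.
        n, m = np.unravel_index(np.argmin(work), work.shape)
        path = [(int(n), int(m))]
        # Extend greedily forwards (sign = 1) and backwards (sign = -1).
        for sign in (1, -1):
            n, m = path[-1] if sign == 1 else path[0]
            while True:
                cands = [(n + sign * dn, m + sign * dm) for dn, dm in steps]
                cands = [(a, b) for a, b in cands
                         if 0 <= a < N and 0 <= b < M and np.isfinite(work[a, b])]
                if not cands:
                    break
                n, m = min(cands, key=lambda p: work[p])
                if sign == 1:
                    path.append((n, m))
                else:
                    path.insert(0, (n, m))
        # Remove the neighborhood of the path so it is not found again.
        for a, b in path:
            work[max(a - radius, 0):a + radius + 1,
                 max(b - radius, 0):b + radius + 1] = np.inf
        if len(path) >= min_length:
            paths.append(path)
    return paths

# Toy matrix: one low-cost path in the upper-right part, everything else costly.
C = np.ones((8, 8))
for k in range(4):
    C[k, k + 4] = 0.0

paths = extract_paths(C, threshold=0.5)
```

Each extracted path is a list of (row, column) cells; its projections onto the two axes give the pair of segments that the path relates.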
Extracted paths
Extracted paths after postprocessing

How can one derive the global structure from pairwise relations?

Task: Computation of similarity clusters
  Path relations
  Problem: Missing and inconsistent path relations
  Strategy: Approximate transitive hull
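To illustrate what a transitive hull over path relations means, the sketch below groups segments with a plain union-find and an overlap test. This is a swapped-in, simplified technique and not the approximation procedure from the lecture; the overlap criterion, the min_overlap value, and all names are assumptions. Segments are (start, end) pairs in frames.

```python
def overlap_ratio(a, b):
    """Intersection length of intervals a and b, divided by the shorter length."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    shorter = min(a[1] - a[0], b[1] - b[0])
    return max(inter, 0) / shorter if shorter > 0 else 0.0

def similarity_clusters(path_relations, min_overlap=0.8):
    """Group segments linked directly by a path or by strong overlap."""
    segments = []
    for seg_a, seg_b in path_relations:
        segments.extend([seg_a, seg_b])
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Each path relates the two segments it connects.
    for k in range(0, len(segments), 2):
        union(k, k + 1)
    # Segments covering nearly the same time span count as the same segment.
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if overlap_ratio(segments[i], segments[j]) >= min_overlap:
                union(i, j)

    clusters = {}
    for i, seg in enumerate(segments):
        clusters.setdefault(find(i), set()).add(seg)
    return sorted(sorted(c) for c in clusters.values())

# The relation between (0, 10) and (40, 50) is missing and gets inferred.
relations = [((0, 10), (20, 30)), ((21, 30), (40, 50)), ((60, 70), (80, 90))]
clusters = similarity_clusters(relations)
```

A strict transitive closure like this one merges clusters as soon as a single relation links them, so inconsistent relations can chain unrelated segments together; that is one reason to approximate the hull rather than compute it exactly.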
Path relations

Example: Zager & Evans, In The Year 2525
  Final result
  Ground truth
Goto (ICASSP 2003)
  Cyclically shift chroma vectors in one sequence
  Compare shifted sequence with original sequence
  Perform a separate structure analysis for each of the twelve shifts
  Combine the results
Müller/Clausen (ISMIR 2007)
  Integrate all cyclic information in one transposition-invariant self-similarity matrix
  Perform one joint structure analysis

Example: Zager & Evans, In The Year 2525
  Panels: Original, Shifted
Minimize over all twelve matrices
Thresholded self-similarity matrix
Path extraction
Computation of similarity clusters
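Minimizing over the twelve cyclically shifted cost matrices can be sketched as follows. The cost measure 1 minus inner product and the function name are assumptions of this illustration; np.roll with axis=1 performs the cyclic shift of the chroma bins.

```python
import numpy as np

def transposition_invariant_cost(chroma):
    """Return (cost, shift_index) minimized over all twelve cyclic chroma shifts.

    cost[n, m]        = min over i of 1 - <v_n, shift_i(v_m)>
    shift_index[n, m] = the shift i that attains this minimum
    """
    v = np.asarray(chroma, dtype=float)
    v = v / np.maximum(np.linalg.norm(v, axis=1, keepdims=True), 1e-9)
    # One cost matrix per cyclic shift i = 0..11 of the second sequence.
    stack = np.stack([1.0 - v @ np.roll(v, i, axis=1).T for i in range(12)])
    return stack.min(axis=0), stack.argmin(axis=0)

# Frame 1 repeats frame 0 transposed upwards by two semitones.
frames = np.zeros((2, 12))
frames[0, 0] = 1.0
frames[1, 2] = 1.0

ti_cost, shift_index = transposition_invariant_cost(frames)
```

In the toy example the cost between the two frames becomes 0, and the minimizing shift index records which transposition made them match: 10 for the pair (0, 1) and 2 for the pair (1, 0).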
Stabilizing effect
  Self-similarity matrix (thresholded)
  Transposition-invariant self-similarity matrix (thresholded)

Transposition-invariant matrix
  Minimizing shift index
  Minimizing shift index = 0
Transposition-invariant matrix
  Minimizing shift index = 1
  Minimizing shift index = 2

Example: Beethoven Tempest
Serra/Gomez (ICASSP 2008): Used for Cover Song ID
Discrete structure suitable for indexing?

Conclusions:
Challenge: Musical variations
  Timbre, dynamics, tempo
  Musical key: cyclic chroma shifts
  Major/minor
  Differences at note level / improvisations
Transposition-invariant self-similarity matrix
Conclusions:
Strategy: Matrix enhancement
  Filtering techniques / contextual information
    Cooper/Foote (ISMIR 2002)
    Müller/Kurth (ICASSP 2006)
  Transposition-invariant similarity matrices
    Goto (ICASSP 2003)
    Müller/Clausen (ISMIR 2007)
  Higher-order similarity matrices
    Peeters (ISMIR 2007)
Challenge: Hierarchical structure of music
  Rhodes/Casey (ISMIR 2007)

System: SmartMusicKiosk (Goto)
System: SyncPlayer/AudioStructure