Music Structure Analysis

Tutorial: Music Structure Analysis

Meinard Müller, International Audio Laboratories Erlangen, Universität Erlangen-Nürnberg, meinard.mueller@audiolabs-erlangen.de
Jordan B. L. Smith, Electronic Engineering and Computer Science, Queen Mary University of London, j.smith@qmul.ac.uk

Overview
Part I: Principles & Techniques (Meinard Müller)
Coffee Break
Part II: Evaluation & Annotation (Jordan Smith)

Music Structure Analysis
[Figure: two recordings with segment annotations A1 A2 B1 B2 C A3 B3 B4 and I V1 V2 V3 V4 V5 V6 V7 B V8 O]

Music Structure Analysis
Example: Folk Song Field Recording (Nederlandse Liederenbank)
Example: Weber, Song (No. 4) from Der Freischütz (introduction, stanzas, dialogues)
[Figure: the Kleiber and Ackermann recordings of the Weber song]

General goal: Divide an audio recording into temporal segments corresponding to musical parts and group these segments into musically meaningful categories.

Examples:
Stanzas of a folk song
Intro, verse, chorus, bridge, outro sections of a pop song
Exposition, development, recapitulation, coda of a sonata
Musical form ABACADA of a rondo

Challenge: There are many different principles for creating relationships that form the basis for the musical structure.
Homogeneity: Consistency in tempo, instrumentation, key
Novelty: Sudden changes, surprising elements
Repetition: Repeating themes, motives, rhythmic patterns

Overview
Introduction
Self-Similarity Matrices
Audio Thumbnailing
Converting Path to Block Structures

Thanks: Clausen, Ewert, Kurth, Grohganz, Dannenberg, Goto, Grosche, Jiang, Paulus, Klapuri, Peeters, Kaiser, Serra, Gómez, Smith, Fujinaga, Wiering, Wand, Sunkel, Jansen

General goal: Convert an audio recording into a mid-level representation that captures certain musical properties while suppressing other properties.
Musical properties: Timbre / Instrumentation, Tempo / Rhythm, Pitch / Harmony

Example: Chromatic scale
Notes marked in the figures (MIDI note numbers): C1 = 24, C2 = 36, C3 = 48, C4 = 60, C5 = 72, C6 = 84, C7 = 96, C8 = 108
[Figure: waveform; axis: amplitude]
[Figure: spectrogram; axes: frequency (Hz), intensity (dB)]

Example: Chromatic scale (continued)
[Figure: spectrogram with the C notes marked at C3 = 131 Hz, C4 = 261 Hz, C5 = 523 Hz, C6 = 1046 Hz, C7 = 2093 Hz, C8 = 4186 Hz; intensity in dB]
[Figure: log-frequency spectrogram; axes: pitch (MIDI note number), intensity (dB); panel labels: Chroma C, Chroma C#]
[Figure: chroma representation; axes: chroma, intensity (dB)]
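The step from the log-frequency (pitch) spectrogram to the chroma representation can be illustrated with a small sketch. It assumes a pitch-indexed spectrogram with one row per MIDI note number and uses the standard equal-tempered tuning with A4 = MIDI 69 = 440 Hz; the function names `midi_to_hz` and `pitch_to_chroma` and the toy input are made up for illustration and are not taken from the slides.

```python
import numpy as np

def midi_to_hz(pitch):
    """Center frequency of a MIDI pitch, assuming A4 = pitch 69 = 440 Hz."""
    return 440.0 * 2.0 ** ((np.asarray(pitch, dtype=float) - 69.0) / 12.0)

def pitch_to_chroma(pitch_spec):
    """Fold a (128, n_frames) pitch-indexed spectrogram into (12, n_frames) chroma.

    Row p of the input holds the energy at MIDI pitch p. All pitches that
    share a pitch class (p mod 12) are summed into one chroma band.
    """
    pitch_spec = np.asarray(pitch_spec, dtype=float)
    chroma = np.zeros((12, pitch_spec.shape[1]))
    for p in range(pitch_spec.shape[0]):
        chroma[p % 12] += pitch_spec[p]
    return chroma

# Toy input (hypothetical): one frame per C note from C1 (24) to C8 (108).
c_pitches = [24, 36, 48, 60, 72, 84, 96, 108]
toy = np.zeros((128, len(c_pitches)))
for frame, p in enumerate(c_pitches):
    toy[p, frame] = 1.0

chroma = pitch_to_chroma(toy)      # all energy ends up in chroma band 0 (C)
c_freqs = midi_to_hz(c_pitches)    # center frequencies of the C notes in Hz
```

Because every C maps to pitch class 0, the octave information visible in the log-frequency spectrogram is discarded and only the chroma band C stays active.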

Chroma (Harmony) feature extraction
[Figure: chroma features for a recording with segments A1 A2 B1 B2 C A3 B3 B4; the chroma bands D, G, and B♭ are marked in the G minor parts, and D, G, and B in the G major part]

Self-Similarity Matrices
General idea: Compare each element of the feature sequence with each other element of the feature sequence based on a suitable similarity measure. The result is a quadratic (square) self-similarity matrix (SSM).
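The general idea of comparing every feature vector with every other one can be sketched as follows. The slides only ask for "a suitable similarity measure"; the cosine similarity used here is one common choice and is an assumption, as are the function name and the toy sequence.

```python
import numpy as np

def self_similarity_matrix(features):
    """Cosine self-similarity matrix of a feature sequence.

    features: array of shape (d, N), one d-dimensional feature per frame.
    Returns an (N, N) matrix S with S[n, m] = <x_n, x_m> for the
    length-normalized columns x_n.
    """
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    X = X / np.maximum(norms, 1e-12)   # guard against all-zero frames
    return X.T @ X

# Toy sequence (hypothetical): three one-hot chroma frames, played twice.
part = np.eye(12)[:, [0, 4, 7]]
features = np.concatenate([part, part], axis=1)
S = self_similarity_matrix(features)
```

In this toy case the repetition shows up as a diagonal stripe of ones parallel to the main diagonal (`S[0, 3]`, `S[1, 4]`, `S[2, 5]`), which is the path structure discussed on the following slides.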

[Figure: SSM examples; labels: G major, faster, slower]

Idealized SSM
Blocks: Homogeneity
Paths: Repetition
Corners: Novelty

Block Enhancement
Feature smoothing
Coarsening

Path Enhancement
Diagonal smoothing
Multiple filtering
Thresholding (relative)
Scaling & penalty

Further Processing
Path extraction
Pairwise relations
[Figure: extracted paths for segments 1 to 7; axis ticks from 100 to 400]
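Two of the path-enhancement steps listed above, diagonal smoothing and relative thresholding with scaling and penalty, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SM Toolbox implementation: the forward averaging window, the zero-padding at the matrix border, the `keep` fraction, and the penalty value of -2 are all choices made for this example, and the "multiple filtering" step is left out.

```python
import numpy as np

def smooth_diagonal(S, length):
    """Average S along the direction of the main diagonal.

    out[n, m] is the mean of S[n + l, m + l] for l = 0 .. length - 1,
    with zeros assumed outside the matrix.
    """
    S = np.asarray(S, dtype=float)
    N, M = S.shape
    padded = np.zeros((N + length - 1, M + length - 1))
    padded[:N, :M] = S
    out = np.zeros((N, M))
    for l in range(length):
        out += padded[l:l + N, l:l + M]
    return out / length

def threshold_relative(S, keep=0.2, penalty=0.0):
    """Keep roughly the top `keep` fraction of entries, rescale them to
    [0, 1], and set every other entry to `penalty`."""
    S = np.asarray(S, dtype=float)
    thresh = np.quantile(S, 1.0 - keep)
    out = np.full(S.shape, penalty, dtype=float)
    mask = S >= thresh
    if mask.any():
        vals = S[mask]
        lo, hi = vals.min(), vals.max()
        out[mask] = (vals - lo) / (hi - lo) if hi > lo else 1.0
    return out

# Toy SSM (hypothetical): a clean main diagonal plus one isolated noisy entry.
S = np.eye(8)
S[1, 6] = 1.0
smoothed = smooth_diagonal(S, length=4)
enhanced = threshold_relative(smoothed, keep=0.1, penalty=-2.0)
```

The isolated entry is averaged down to 0.25 by the smoothing and then falls below the relative threshold, while the diagonal path survives with value 1.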

Further Processing (continued)
Path extraction
Pairwise relations
Grouping (transitivity)
[Figure: grouped segments 1 to 7, aligned with the annotation I V1 V2 V3 V4 V5 V6 V7 B V8 O; axis ticks from 100 to 400]

Problem: Missing relations because of transposed sections
Idea: Cyclic shift of one of the chroma sequences (example: one semitone up)
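The cyclic-shift idea for transposed sections can be sketched as follows: compute one SSM per chroma shift and keep, for every cell, the maximum over all twelve shifts. The cosine similarity and the function name are assumptions made for this illustration; the enhancement steps that the slides apply around the maximization are omitted here.

```python
import numpy as np

def transposition_invariant_ssm(chroma):
    """Maximize the chroma SSM over all 12 cyclic shifts.

    chroma: array of shape (12, N).
    Returns (best, best_shift), where best[n, m] is the largest cosine
    similarity between frame n and a cyclically shifted frame m, and
    best_shift[n, m] is the shift (in semitones, upward) that achieved it.
    """
    X = np.asarray(chroma, dtype=float)
    X = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    N = X.shape[1]
    best = np.full((N, N), -np.inf)
    best_shift = np.zeros((N, N), dtype=int)
    for shift in range(12):
        # Compare the original sequence with a copy shifted up by `shift` semitones.
        S = X.T @ np.roll(X, shift, axis=0)
        better = S > best
        best[better] = S[better]
        best_shift[better] = shift
    return best, best_shift

# Toy input (hypothetical): frame 0 is a pure C, frame 1 a pure C#.
toy = np.zeros((12, 2))
toy[0, 0] = 1.0
toy[1, 1] = 1.0
best, best_shift = transposition_invariant_ssm(toy)
```

In the plain SSM the two frames have similarity 0. After maximization they match perfectly, and `best_shift` records that frame 0 must be shifted one semitone up to match frame 1.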

Idea: Cyclic shift of one of the chroma sequences (example: two semitones up)
Idea: Overlay & maximize, which yields a transposition-invariant SSM
Note: The order of the enhancement steps is important!
[Figure panels: Maximization; Smoothing & Maximization]

Similarity Matrix Toolbox
Meinard Müller, Nanzhu Jiang, Harald Grohganz
SM Toolbox: MATLAB Implementations for Computing and Enhancing Similarity Matrices
http://www.audiolabs-erlangen.de/resources/mir/smtoolbox/

Audio Thumbnailing
Thanks: Jiang, Grosche, Peeters, Cooper, Foote, Goto, Levy, Sandler, Mauch, Sapp
General goal: Determine the most representative section ("thumbnail") of a given music recording.
The thumbnail is often assumed to be the most repetitive segment.
[Figure: example recordings with annotations I V1 V2 V3 V4 V5 V6 V7 B V8 O and A1 A2 B1 B2 C A3 B3 B4]

Audio Thumbnailing
Two steps (both steps are problematic!)
1. Path extraction
   Paths of poor quality (fragmented, gaps)
   Block-like structures
   Curved paths
2. Grouping
   Noisy relations (missing, distorted, overlapping)
   Transitivity computation difficult

Main idea: Do both path extraction and grouping jointly
One optimization scheme for both steps
Stabilizing effect
Efficient

For each audio segment we define a fitness value.
This fitness value expresses how well the segment explains the entire audio recording.
The segment with the highest fitness value is considered to be the thumbnail.
As the main technical concept we introduce the notion of a path family.

[Figure: enhanced SSM with a path over a segment and its induced segment; axis ticks from 0 to 200]
Path over segment, induced segment: score is high
A second path over segment, induced segment: score is not so high

A third path over segment, induced segment: score is very low

Path family
A path family over a segment is a family of paths such that the induced segments do not overlap.
[Figure: paths with overlapping induced segments. This is not a path family!]
[Figure: paths with non-overlapping induced segments. This is a path family! (Even though not a good one)]

Optimal path family
Consider over the segment the optimal path family, i.e., the path family having maximal overall score.
Call this value: Score(segment)
Note: This optimal path family can be computed using dynamic programming.

Optimal path family (continued)
Furthermore, consider the amount covered by the induced segments.
Call this value: Coverage(segment)

Fitness
Start with:
P := Score(segment)
R := Coverage(segment)
Self-explanations are trivial! Subtract the length of the segment:
P := Score(segment) - length(segment)
R := Coverage(segment) - length(segment)
Normalize both values to the range [0,1]:
P := Normalize( Score(segment) - length(segment) )
R := Normalize( Coverage(segment) - length(segment) )
Fitness(segment): F := 2 P R / (P + R)
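The combination of score and coverage into a fitness value can be sketched as below. The slides only state that both quantities are normalized to [0,1]; dividing the score by the total length of the paths in the family and the coverage by the length of the recording is an assumption made here, and all parameter names are hypothetical.

```python
def fitness(score, coverage, seg_len, total_path_len, n_frames):
    """Harmonic mean F = 2 P R / (P + R) of normalized score and coverage.

    score:          overall score of the optimal path family
    coverage:       number of frames covered by the induced segments
    seg_len:        length of the segment (the trivial self-explanation)
    total_path_len: summed length of all paths in the family (assumed normalizer)
    n_frames:       length of the entire recording (assumed normalizer)
    """
    # Subtract the self-explanation, then normalize; clip at zero.
    p = max(0.0, (score - seg_len) / total_path_len)
    r = max(0.0, (coverage - seg_len) / n_frames)
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)

# Hypothetical numbers: a 20-frame segment in a 100-frame recording.
f_repeated = fitness(score=50.0, coverage=60, seg_len=20,
                     total_path_len=60, n_frames=100)
f_trivial = fitness(score=20.0, coverage=20, seg_len=20,
                    total_path_len=20, n_frames=100)
```

With these numbers P = 0.5 and R = 0.4, so the fitness is 4/9. A segment that only explains itself gets P = R = 0 and therefore fitness zero.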

Thumbnail: Scape Plot
Axes: segment center, segment length
Each point of the plot shows the fitness of the segment with that center and length.
Note: Self-explanations are ignored, so their fitness is zero.
[Figure: scape plot for the recording with segments A1 A2 B1 B2 C A3 B3 B4]
Thumbnail := segment having the highest fitness
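Selecting the thumbnail from a scape plot amounts to an exhaustive search over all segments. The sketch below stores the plot as a matrix indexed by (length, start) and converts to the segment center afterwards; this layout and the toy fitness function are assumptions for illustration only. In practice the fitness would come from the path-family computation.

```python
import numpy as np

def scape_plot(fitness_fn, n_frames):
    """Evaluate fitness_fn(start, length) for every segment of a recording.

    Returns a matrix sp with sp[length - 1, start] = fitness of the segment
    covering frames start .. start + length - 1. Entries for segments that
    do not fit into the recording are NaN.
    """
    sp = np.full((n_frames, n_frames), np.nan)
    for length in range(1, n_frames + 1):
        for start in range(0, n_frames - length + 1):
            sp[length - 1, start] = fitness_fn(start, length)
    return sp

def thumbnail(sp):
    """Return (start, length, center) of the segment with the highest fitness."""
    length_idx, start = np.unravel_index(np.nanargmax(sp), sp.shape)
    length = int(length_idx) + 1
    center = int(start) + (length - 1) / 2.0
    return int(start), length, center

# Toy fitness (hypothetical): peaks for the segment starting at frame 10
# with a length of 20 frames.
def toy_fitness(start, length):
    return -abs(start - 10) - abs(length - 20)

sp = scape_plot(toy_fitness, n_frames=50)
start, length, center = thumbnail(sp)
```

The double loop evaluates on the order of N² segments, which is why an efficient fitness computation matters for longer recordings.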

[Figure: thumbnail scape plots for the recording with segments A1 A2 B1 B2 C A3 B3 B4]

Scape Plot
Coloring according to clustering result (grouping)
[Figure: colored scape plot for the recording with segments A1 A2 B1 B2 C A3 B3 B4]

[Figure: thumbnail scape plot for the recording with segments I V1 V2 V3 V4 V5 V6 V7 B V8 O]

Thanks: Foote, Serra, Grosche, Arcos, Goto, Tzanetakis, Cook

General goals:
Find instances where musical changes occur.
Find transitions between subsequent musical parts.

Idea (Foote): Use a checkerboard-like kernel function to detect corner points on the main diagonal of the SSM.
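Foote's idea of sliding a checkerboard-like kernel along the main diagonal can be sketched as follows. The plain ±1 kernel without a Gaussian taper, the zero-padding at the borders, and the function names are simplifications assumed for this example.

```python
import numpy as np

def checkerboard_kernel(half):
    """(2*half, 2*half) kernel with +1 in the two within-segment quadrants
    (top-left, bottom-right) and -1 in the two cross-segment quadrants."""
    sign = np.concatenate([-np.ones(half), np.ones(half)])
    return np.outer(sign, sign)

def novelty_curve(S, half):
    """Correlate the kernel with S along the main diagonal.

    nov[n] is high when frames n - half .. n - 1 and frames n .. n + half - 1
    are each self-similar but dissimilar to each other, i.e. at a corner
    point between two blocks.
    """
    S = np.asarray(S, dtype=float)
    N = S.shape[0]
    K = checkerboard_kernel(half)
    padded = np.pad(S, half, mode="constant")   # zeros outside the matrix
    nov = np.zeros(N)
    for n in range(N):
        window = padded[n:n + 2 * half, n:n + 2 * half]
        nov[n] = np.sum(window * K)
    return nov

# Toy SSM (hypothetical): two homogeneous blocks of 10 frames each.
labels = np.array([0] * 10 + [1] * 10)
S = (labels[:, None] == labels[None, :]).astype(float)
nov = novelty_curve(S, half=4)
boundary = int(np.argmax(nov))
```

The curve peaks at frame 10, the first frame of the second block. The smaller values at the very first and last frames are artifacts of the zero-padding.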

Novelty function
[Figure: novelty function, shown in three variants]

Structure features
Idea: Find instances where structural changes occur. Combine global and local aspects within a unifying framework.

Structure features
Enhanced SSM
Time-lag SSM
Cyclic time-lag SSM
Columns as features

Example: Chopin Mazurka Op. 24, No. 1
[Figure: SSM and time-lag SSM]
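The conversion from an SSM to a cyclic time-lag SSM can be sketched as below. The layout with lag on the rows and time on the columns is an assumption chosen so that the columns can serve directly as per-frame features; the slides do not fix an orientation.

```python
import numpy as np

def cyclic_time_lag(S):
    """Cyclic time-lag representation of a square SSM.

    L[lag, n] = S[n, (n + lag) mod N]. A path parallel to the main diagonal
    of S becomes a horizontal line in L. Column n of L can then be used as
    the feature vector of frame n.
    """
    S = np.asarray(S, dtype=float)
    N = S.shape[0]
    n = np.arange(N)
    L = np.zeros_like(S)
    for lag in range(N):
        L[lag] = S[n, (n + lag) % N]
    return L

# Toy SSM (hypothetical): main diagonal plus a repetition path at lag 3.
N = 8
S = np.eye(N)
for k in range(5):
    S[k, k + 3] = 1.0
    S[k + 3, k] = 1.0
L = cyclic_time_lag(S)
```

In the toy example the main diagonal becomes row 0 of `L` and the repetition path becomes a horizontal stripe in row 3 (with its mirror image in row 5, since the lag wraps around cyclically).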

Example: Chopin Mazurka Op. 24, No. 1
[Figure: SSM, time-lag SSM, structure-based novelty function]

Converting Path to Block Structures
Thanks: Grohganz, Clausen, Kaiser, Peeters, Dubnov, Apel, Serra, Grosche, Arcos

Motivation
Perform joint analysis using repetitive as well as homogeneous aspects
Make homogeneity-based methods applicable to repetition-based analysis
[Figure: processing chain SSM, NMF, clustering, shown once for homogeneity and once for repetition]

Converting Path to Block Structures
Procedure
1. Enhanced SSM
2. Thresholding & image processing
3. Eigenvalue decomposition
4. Weighting
5. Clustering & smoothing
6. Columns as features
7. SSM from these features
The final matrix shows paths as blocks.

Conclusions
Structure Analysis
Representations: Score, Audio, MIDI
Musical aspects: Harmony, Timbre, Tempo
Segmentation principles: Repetition, Homogeneity, Novelty
Temporal and hierarchical context
Combined approaches
Hierarchical approaches
Evaluation: MIREX, SALAMI project
Explaining structure: Smith, Chew

Book Project
A First Course on Music Processing
Textbook (approx. 500 pages)
1. Music Representations
2. Fourier Analysis of Signals
3. Music Synchronization
4. Music Structure Analysis
5. Chord Recognition
6. Tempo and Beat Tracking
7. Content-based Audio Retrieval
8. Music Transcription
To appear (plan): End of 2015
Need people for proofreading and testing

References

W. CHAI AND B. VERCOE, Music thumbnailing via structural analysis, in Proceedings of the ACM International Conference on Multimedia, Berkeley, CA, USA, 2003, pp. 223-226.

M. COOPER AND J. FOOTE, Automatic music summarization via similarity analysis, in Proceedings of the International Conference on Music Information Retrieval (ISMIR), Paris, France, 2002, pp. 81-85.

R. B. DANNENBERG AND M. GOTO, Music structure analysis from acoustic signals, in Handbook of Signal Processing in Acoustics, D. Havelock, S. ...

J. FOOTE, Visualizing music and audio using self-similarity, in Proceedings of the ACM International Conference on Multimedia, Orlando, FL, USA, 1999, pp. 77-80.

J. FOOTE, Automatic audio segmentation using a measure of audio novelty, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), New York, NY, USA, 2000, pp. 452-455.

M. GOTO, A chorus section detection method for musical audio signals and its application to a music listening station, IEEE Transactions on Audio, Speech and Language Processing, 14 (2006), pp. 1783-1794.

H. GROHGANZ, M. CLAUSEN, N. JIANG, AND M. MÜLLER, Converting path structures into block structures using eigenvalue decompositions of self-similarity matrices, in Proceedings of the 14th International Conference on Music Information Retrieval (ISMIR), Curitiba, Brazil, 2013, pp. 209-214.

K. JENSEN, Multiple scale music segmentation using rhythm, timbre, and harmony, EURASIP Journal on Advances in Signal Processing, 2007 (2007).

F. KAISER AND T. SIKORA, Music structure discovery in popular music using non-negative matrix factorization, in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Utrecht, The Netherlands, 2010, pp. 429-434.

M. LEVY, M. SANDLER, AND M. A. CASEY, Extraction of high-level musical structure from audio data and its application to thumbnail generation, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France, 2006, pp. 13-16.

H. LUKASHEVICH, Towards quantitative measures of evaluating song segmentation, in Proceedings of the International Conference on Music Information Retrieval (ISMIR), Philadelphia, USA, 2008, pp. 375-380.

M. MÜLLER AND M. CLAUSEN, Transposition-invariant self-similarity matrices, in Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR), Vienna, Austria, 2007, pp. 47-50.

M. MÜLLER AND N. JIANG, A scape plot representation for visualizing repetitive structures of music recordings, in Proceedings of the 13th International Conference on Music Information Retrieval (ISMIR), Porto, Portugal, 2012, pp. 97-102.

M. MÜLLER, N. JIANG, AND H. GROHGANZ, SM Toolbox: MATLAB implementations for computing and enhancing similarity matrices, in Proceedings of the 53rd AES Conference on Semantic Audio, London, GB, 2014.

M. MÜLLER, N. JIANG, AND P. GROSCHE, A robust fitness measure for capturing repetitions in music recordings with applications to audio thumbnailing, IEEE Transactions on Audio, Speech & Language Processing, 21 (2013), pp. 531-543.

M. MÜLLER AND F. KURTH, Enhancing similarity matrices for music audio analysis, in Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, 2006, pp. 437-440.

M. MÜLLER AND F. KURTH, Towards structural analysis of audio recordings in the presence of musical variations, EURASIP Journal on Advances in Signal Processing, 2007 (2007).

J. PAULUS AND A. P. KLAPURI, Music structure analysis using a probabilistic fitness measure and a greedy search algorithm, IEEE Transactions on Audio, Speech, and Language Processing, 17 (2009), pp. 1159-1170.

J. PAULUS, M. MÜLLER, AND A. P. KLAPURI, Audio-based music structure analysis, in Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR), Utrecht, The Netherlands, 2010, pp. 625-636.

G. PEETERS, Deriving musical structure from signal analysis for music audio summary generation: sequence and state approach, in Computer Music Modeling and Retrieval, vol. 2771 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2004, pp. 143-166.

G. PEETERS, Sequence representation of music structure using higher-order similarity matrix and maximum-likelihood approach, in Proceedings of the International Conference on Music Information Retrieval (ISMIR), Vienna, Austria, 2007, pp. 35-40.

C. RHODES AND M. A. CASEY, Algorithms for determining and labelling approximate hierarchical self-similarity, in Proceedings of the International Conference on Music Information Retrieval (ISMIR), Vienna, Austria, 2007, pp. 41-46.

J. SERRÀ, M. MÜLLER, P. GROSCHE, AND J. L. ARCOS, Unsupervised detection of music boundaries by time series structure features, in Proceedings of the AAAI International Conference on Artificial Intelligence, Toronto, Ontario, Canada, 2012, pp. 1613-1619.

J. B. L. SMITH, J. A. BURGOYNE, I. FUJINAGA, D. D. ROURE, AND J. S. DOWNIE, Design and creation of a large-scale database of structural annotations, in Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Miami, FL, USA, 2011, pp. 555-560.

J. B. L. SMITH AND E. CHEW, Using quadratic programming to estimate feature relevance in structural analyses of music, in Proceedings of the ACM International Conference on Multimedia, 2013, pp. 113-122.

M. SUNKEL, S. JANSEN, M. WAND, E. EISEMANN, AND H.-P. SEIDEL, Learning Line Features in 3D Geometry, in Computer Graphics Forum (Proc. Eurographics), 2011.

D. TURNBULL, G. LANCKRIET, E. PAMPALK, AND M. GOTO, A supervised approach for detecting boundaries in music using difference features and boosting, in Proceedings of the International Conference on Music Information Retrieval (ISMIR), Vienna, Austria, 2007, pp. 51-54.

G. TZANETAKIS AND P. COOK, Multifeature audio segmentation for browsing and annotation, in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 1999, pp. 103-106.

Acknowledgement
Michael Clausen (Bonn University)
Jonathan Driedger (Universität Erlangen-Nürnberg)
Sebastian Ewert (Bonn University)
Harald Grohganz (Bonn University)
Peter Grosche (Saarland University)
Nanzhu Jiang (Universität Erlangen-Nürnberg)
Verena Konz (Saarland University)
Frank Kurth (Fraunhofer-FKIE, Wachtberg)
Thomas Prätzlich (Universität Erlangen-Nürnberg)
Joan Serrà (Artificial Intelligence Research Institute)

This work has been supported by the German Research Foundation (DFG MU 2682/5-1). The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institut für Integrierte Schaltungen IIS.