Methods for the automatic structural analysis of music. Jordan B. L. Smith CIRMMT Workshop on Structural Analysis of Music 26 March 2010

2 The problem: going from sound to structure
  A B B C B C B D
  INT. VERSE VERSE BRIDGE VERSE BRIDGE VERSE OUT.

3 My objective today:
  To describe the variety of methods out there
  To illustrate three ways of subdividing the field

4 Three ways to look at the field:
  By hypotheses about what structure is: sequences vs. states
  By hypotheses about how structure is expressed: timbre vs. harmony (vs. rhythm vs. lyrics...)
  By techniques to do structure analysis: similarity matrix vs. clustering

5 Outline
  1. Two hypotheses: states, sequences
  2. A word on features: timbre, harmony, etc.; dynamic features
  3. Two techniques: similarity matrix, clustering models

6 Features (fast)
  Timbral features: instrumentation, vocal quality, etc. (MFCC, MPEG-7 descriptors)
  Pitch features: what notes and chords are being played (chroma vector, fundamental frequency)
  Rhythmic features: pulse periods (rhythmogram)
  Lyrics
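As a rough illustration of the chroma vector named among the pitch features, the sketch below folds the magnitude spectrum of a single frame into 12 pitch classes. It is only a minimal sketch: the function name, the Hann window, the 50-5000 Hz band and the reference frequency for C are assumptions made here, not details from the talk.

```python
import numpy as np

def chroma_vector(frame, sample_rate, f_ref=261.63):
    """Fold one frame's spectrum into 12 pitch classes (0 = C, 9 = A).

    f_ref (assumed): frequency of a reference C in Hz.
    """
    windowed = frame * np.hanning(len(frame))
    mags = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Keep a band where pitch is meaningful (band limits are an assumption).
    band = (freqs > 50.0) & (freqs < 5000.0)

    # Nearest semitone relative to the reference C, wrapped to one octave.
    semitone = np.round(12 * np.log2(freqs[band] / f_ref)).astype(int)
    pitch_class = semitone % 12

    chroma = np.zeros(12)
    np.add.at(chroma, pitch_class, mags[band] ** 2)  # accumulate energy
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# Toy input: a pure 440 Hz tone (the note A).
sample_rate = 8000
t = np.arange(4096) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
chroma = chroma_vector(tone, sample_rate)
```

For the toy tone, almost all of the energy should land in the pitch class for A. Real systems usually add tuning estimation and smoothing, which are left out here.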

7 Features: What makes sections different? INTRO CHO. VERSE... (PAUL SIMON: CAN'T RUN BUT)

8 Features: What makes sections similar? ... VERSE SOLO... (THE BEATLES: BABY IT'S YOU)

9 Technique 1: Similarity Matrices

10-15 [Figure sequence: a similarity matrix whose axes run from the start to the end of the recording. Entry D(i,j) is filled in for the pair of times i and j, and D(i,k) for the pair i and k. Images: Foote 2000a]

16 Similarity Matrices (image: Foote 2000b; audio: http://youtube.com/watch?v=botw2vs4w5w)
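The idea of comparing every frame with every other frame can be sketched in a few lines. This is a minimal sketch that assumes cosine similarity between per-frame feature vectors; the slides write D(i,j) without fixing a measure, and 1 - S would be the matching distance. The toy features are invented.

```python
import numpy as np

def self_similarity(features):
    """Cosine similarity between every pair of frames.

    features: array of shape (n_frames, n_dims), one vector per frame.
    Returns S of shape (n_frames, n_frames); S[i, j] compares frames i and j.
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero frames
    return unit @ unit.T

# Toy "piece" with section layout A B A, four frames per section.
a = np.array([1.0, 0.0, 0.2])
b = np.array([0.0, 1.0, 0.2])
toy = np.vstack([np.tile(a, (4, 1)), np.tile(b, (4, 1)), np.tile(a, (4, 1))])
S = self_similarity(toy)
```

In the toy matrix, the two A sections show up as bright blocks both on and off the main diagonal, while the A-against-B regions stay dark.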

19 Similarity matrices
  They can show us stuff: points of novelty (THE BEATLES: FLYING)

20-24 Novelty detection: novelty scores (images: Foote 2000b)
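A novelty score of the kind shown in the Foote 2000b images can be sketched by sliding a checkerboard kernel along the main diagonal of the similarity matrix. This is a simplified sketch: the kernel here is a plain +1/-1 checkerboard with no tapering, the edges are zero-padded, and the function names and toy matrix are assumptions made for illustration.

```python
import numpy as np

def checkerboard_kernel(half_width):
    """Square kernel: +1 on the two within-side quadrants, -1 across sides."""
    sign = np.r_[-np.ones(half_width), np.ones(half_width)]
    return np.outer(sign, sign)

def novelty_curve(S, half_width):
    """Correlate the kernel with S along its main diagonal.

    novelty[t] is large when the frames just before t and the frames
    from t onward are each self-similar but unlike each other.
    """
    n = S.shape[0]
    kernel = checkerboard_kernel(half_width)
    padded = np.pad(S, half_width, mode="constant")  # zeros beyond the edges
    novelty = np.zeros(n)
    for t in range(n):
        window = padded[t:t + 2 * half_width, t:t + 2 * half_width]
        novelty[t] = np.sum(window * kernel)
    return novelty

# Toy similarity matrix: two sections of six frames each.
labels = np.array([0] * 6 + [1] * 6)
S = (labels[:, None] == labels[None, :]).astype(float)
novelty = novelty_curve(S, half_width=3)
boundary = int(np.argmax(novelty))
```

On the toy matrix the curve peaks at frame 6, where the second section begins. Peaks of the curve are then candidate section boundaries.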

25 STATES OR SEQUENCES?

26 STATES VIEW

27 SEQUENCES VIEW

28 THE BEATLES: YESTERDAY
[Figure on slides 25-28, with axis labels O V C V C V V I and I V V C V C V O]

29 SEQUENCE WORKFLOW image: Paulus 2009

30 STRIPE SEARCH (panels: TIME-LAG, FILTER, THRESHOLD, ERODE / DILATE, GROUND TRUTH)
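The panel names on the stripe search slide suggest a pipeline that can be sketched as follows: convert the similarity matrix to time-lag form, smooth along time, threshold, then erode and dilate to drop short fragments. The filter (a moving average), the window width, the threshold and all function names are assumptions made here; the talk does not give these details.

```python
import numpy as np

def time_lag(S):
    """L[lag, t] = S[t, t - lag]; a repeated sequence becomes a row stripe."""
    n = S.shape[0]
    L = np.zeros((n, n))
    for lag in range(n):
        L[lag, lag:] = np.diagonal(S, offset=-lag)
    return L

def smooth_rows(M, width):
    """Moving average along the time axis of each lag row."""
    kernel = np.ones(width) / width
    return np.array([np.convolve(row, kernel, mode="same") for row in M])

def open_rows(B, width):
    """Erode then dilate along time: removes runs shorter than `width`."""
    eroded = smooth_rows(B.astype(float), width) > 1 - 1e-9
    return smooth_rows(eroded.astype(float), width) > 1e-9

def stripe_mask(S, width=3, threshold=0.5):
    L = time_lag(S)
    L[0] = 0.0  # lag 0 is the trivial main diagonal
    filtered = smooth_rows(L, width)
    return open_rows(filtered > threshold, width)

# Toy piece: a six-frame sequence played twice, plus one spurious match.
piece = np.array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5])
S = (piece[:, None] == piece[None, :]).astype(float)
S[3, 1] = S[1, 3] = 1.0
mask = stripe_mask(S)
stripe_lags = np.flatnonzero(mask.any(axis=1))
```

In the toy example only the stripe at lag 6 survives, meaning the material from frame 6 onward repeats what was heard six frames earlier. The isolated spurious match is removed by the smoothing and threshold.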

31 STATE WORKFLOW image: Paulus 2009

32 BLOCK SEARCH

33 Outline
  1. Two hypotheses: states, sequences
  2. A word on features: timbre, harmony, etc.; dynamic features
  3. Two techniques: similarity matrix, clustering models

34 Technique 2: Clustering Models

35 Clustering (image: Foote 2000a)

36 Clustering: GROUP 1, GROUP 2, GROUP 3, GROUP 4 (image: Foote 2000a)

37 Clustering: 1 3 3 3 1 1 4 1 2 4 1 2 2... (image: Foote 2000a)
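Turning frames into a sequence of group numbers, as on the slide above, can be sketched with a small k-means. This is a minimal sketch and not the method of any cited paper: k-means itself, the farthest-point initialisation (chosen to keep the example deterministic) and the toy features are all assumptions made here.

```python
import numpy as np

def farthest_point_init(features, k):
    """Pick frame 0, then repeatedly the frame farthest from those chosen."""
    chosen = [0]
    for _ in range(k - 1):
        diffs = features[:, None, :] - features[chosen][None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        chosen.append(int(np.argmax(nearest)))
    return features[chosen].copy()

def kmeans_labels(features, k, n_iter=20):
    """Assign each frame to one of k groups; returns one label per frame."""
    features = np.asarray(features, dtype=float)
    centroids = farthest_point_init(features, k)
    for _ in range(n_iter):
        diffs = features[:, None, :] - centroids[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave empty clusters where they are
                centroids[c] = features[labels == c].mean(axis=0)
    return labels

# Toy piece with section layout A B A C, three frames per section.
a, b, c = np.eye(3)
toy = np.vstack([np.tile(v, (3, 1)) for v in (a, b, a, c)])
labels = kmeans_labels(toy, k=3)
```

The resulting label sequence groups the two A sections together. Note that clustering alone ignores time, so real feature data tends to give fragmented label sequences that need temporal smoothing.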

38 Clustering with HMM: A C AC AC B C B... (image: Aucouturier 2001)

39 CLUSTERING image: Paulus 2009

40 TEMPORAL CLUSTERING image: Paulus 2009

41 Clustering as mid-level representation (images: Levy and Sandler 2008)

42 Clustering as mid-level representation images: Abdallah et al. 2005

43 Clustering as mid-level representation image: Levy and Sandler 2008

44 Features again
  Most features: static
    each frame described by a vector
    no information about temporal extent
  Solution: dynamic features

45 Dynamic features
  Information about timing or context:
    Histograms (just saw)
    Frame-wise derivatives (many)
    Difference features (Turnbull et al. 2007)
    FFTs of features (Peeters 2004)
    Dynamic Texture Models (Barrington et al. 2009)
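Two of the simpler dynamic features listed above can be sketched directly. The first function takes frame-wise derivatives of a feature sequence. The second builds a histogram of cluster labels in a window around each frame, which is presumably what "just saw" refers to (an assumption here). Function names, window sizes and toy data are invented for illustration.

```python
import numpy as np

def delta_features(features):
    """Frame-wise derivative: central differences along the time axis."""
    return np.gradient(np.asarray(features, dtype=float), axis=0)

def label_histograms(labels, n_states, half_window):
    """Normalised histogram of state labels in a window around each frame."""
    labels = np.asarray(labels)
    n = len(labels)
    hists = np.zeros((n, n_states))
    for t in range(n):
        lo = max(0, t - half_window)
        hi = min(n, t + half_window + 1)
        counts = np.bincount(labels[lo:hi], minlength=n_states)
        hists[t] = counts / (hi - lo)
    return hists

# Toy data: a steadily rising 1-D feature and a short label sequence.
ramp = 2.0 * np.arange(5)[:, None]
deltas = delta_features(ramp)
states = np.array([0, 0, 1, 1, 0, 0, 2, 2])
hists = label_histograms(states, n_states=3, half_window=1)
```

Each histogram row describes the mix of states near a frame rather than the frame alone, so two frames with different labels but similar surroundings end up close together.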

46 Summary
  1. Two hypotheses: states, sequences
  2. A word on features: timbre, harmony, etc.; dynamic features
  3. Two techniques: similarity matrix, clustering models

47 Discussion
  What can supervised learning do for structure analysis?
  Is either the states or the sequences hypothesis correct?
  Which of these methods can solve the Bohemian Rhapsody problem (i.e., through-composed or ABCD music)?

48 Supervised learning
  Paulus & Klapuri 2010: applying semantic labels to analyses
  Turnbull et al. 2007: learning what boundaries look like
  (image: Paulus & Klapuri 2010)

49 Thank you! And thanks to:

50 Image credits
  Abdallah, S., K. Noland, M. Sandler, M. Casey, and C. Rhodes. 2005. Theory and evaluation of a Bayesian music structure extractor. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), London, 420-5.
  Aucouturier, J.-J. 2001, July. Segmentation of musical signals, and applications to the analysis of musical structure. Master's thesis, King's College, University of London.
  Foote, J. 2000a. Arthur: Retrieving orchestral music by long-term structure. In Proceedings of the International Symposium on Music Information Retrieval (ISMIR), Plymouth, MA, USA.
  Foote, J. 2000b. Automatic audio segmentation using a measure of audio novelty. In Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), 452-5.
  Levy, M., and M. Sandler. 2008, Feb. Structural segmentation of musical audio by constrained clustering. IEEE Transactions on Audio, Speech, and Language Processing 16 (2): 318-26.
  Paulus, J. 2009. Signal processing methods for drum transcription and music structure analysis. Ph.D. thesis, Tampere University of Technology, Tampere, Finland.

51 References
  Abdallah, S., M. Sandler, C. Rhodes, and M. Casey. 2006. Using duration models to reduce fragmentation in audio segmentation. Machine Learning 65 (2-3): 485-515.
  Aucouturier, J.-J. 2001, July. Segmentation of musical signals, and applications to the analysis of musical structure. Master's thesis, King's College, University of London.
  Barrington, L., A. Chan, and G. Lanckriet. 2009. Dynamic texture models of music. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Washington, DC, USA, 1589-92. IEEE Computer Society.
  Chai, W. 2005, September. Automated analysis of musical structure. Ph.D. thesis, Massachusetts Institute of Technology, MA, USA.
  Foote, J. 2000a. Arthur: Retrieving orchestral music by long-term structure. In Proceedings of the International Symposium on Music Information Retrieval (ISMIR), Plymouth, MA, USA.
  Foote, J. 2000b. Automatic audio segmentation using a measure of audio novelty. In Proceedings of the IEEE International Conference on Multimedia & Expo (ICME), 452-5.
  Foote, J., and M. Cooper. 2003. Media segmentation using self-similarity decomposition. In M. Yeung, R. Lienhart, and C.-S. Li (Eds.), Proceedings of the SPIE: Storage and Retrieval for Media Databases, Volume 5021, Santa Clara, CA, USA, 167-75. SPIE.
  Goto, M. 2003a. A chorus-section detecting method for musical audio signals. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Volume 5, 437-40.
  Jehan, T. 2005. Hierarchical multi-class self similarities. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, United States, 311-4.
  Levy, M., and M. Sandler. 2008, Feb. Structural segmentation of musical audio by constrained clustering. IEEE Transactions on Audio, Speech, and Language Processing 16 (2): 318-26.
  Logan, B., and S. Chu. 2000. Music summarization using key phrases. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Volume 2, Washington D.C., USA, 749-52. IEEE Computer Society.
  Maddage, N., C. Xu, M. Kankanhalli, and X. Shao. 2004. Content-based music structure analysis with applications to music semantics understanding. In Proceedings of the ACM International Conference on Multimedia, New York, NY, United States, 112-9.
  Paulus, J. 2009. Signal processing methods for drum transcription and music structure analysis. Ph.D. thesis, Tampere University of Technology, Tampere, Finland.
  Peeters, G. 2004. Deriving musical structures from signal analysis for music audio summary generation: sequence and state approach. In G. Goos, J. Hartmanis, and J. van Leeuwen (Eds.), Computer Music Modeling and Retrieval, Volume 2771, 169-85. Springer Berlin / Heidelberg.
  Shiu, Y., H. Jeong, and C.-C. J. Kuo. 2006b. Similarity matrix processing for music structure analysis. In Proceedings of the ACM Workshop on Audio and Music Computing Multimedia (AMCMM), New York, NY, USA, 69-76. ACM.
  Turnbull, D., G. Lanckriet, E. Pampalk, and M. Goto. 2007. A supervised approach for detecting boundaries in music using difference features and boosting. In Proceedings of the International Conference on Music Information Retrieval (ISMIR), Vienna, Austria, 51-4.