Lecture 11: Chroma and Chords

ELEN E4896 MUSIC SIGNAL PROCESSING
Lecture 11: Chroma and Chords

1. Features for Music Audio
2. Chroma Features
3. Chord Recognition

Dan Ellis
Dept. Electrical Engineering, Columbia University
dpwe@ee.columbia.edu
http://www.ee.columbia.edu/~dpwe/e4896/

E4896 Music Signal Processing (Dan Ellis) 2013-04-08 - 1/18

1. Features for Music Audio

Challenges of large music databases: how to find "what we want"...
Euclidean metaphor: music tracks as points in space
What are the dimensions?
  "sound" (timbre, instruments): MFCC
  melody, chords: Chroma
  rhythm, tempo: Rhythmic bases

MFCCs

The standard feature for speech recognition (Logan 2000)
Processing chain:
  Sound -> FFT X[k] -> spectra -> Mel scale freq. warp -> audspec -> log |X[k]| -> IDCT -> cepstra -> Truncate -> MFCCs

[Figure: waveform (time / s), spectrum (freq / Hz), Mel-warped spectrum (freq / Mel), log spectrum (freq / Mel), cepstrum (quefrency)]
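The processing chain above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the Mel formula, the triangular filter shapes, the parameter defaults, and the function names (`hz_to_mel`, `mel_filterbank`, `mfcc_from_magnitudes`) are all assumptions. The last step uses a DCT-II cosine basis, a common choice for the cosine-transform stage that the slide labels IDCT.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel-scale formula (assumed; the slide does not name one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters spaced evenly on the Mel axis from 0 Hz to Nyquist."""
    nyquist = sample_rate / 2.0
    step = hz_to_mel(nyquist) / (n_filters + 1)
    # Filter edge positions expressed as (fractional) FFT bin indices.
    edges = [mel_to_hz(i * step) / nyquist * (n_bins - 1)
             for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_from_magnitudes(mags, sample_rate, n_filters=20, n_coeffs=13):
    """mags: magnitude spectrum |X[k]| for bins 0..Nyquist of one frame."""
    bank = mel_filterbank(n_filters, len(mags), sample_rate)
    audspec = [sum(w * m for w, m in zip(row, mags)) for row in bank]
    log_spec = [math.log(e + 1e-10) for e in audspec]   # floor avoids log(0)
    # Cosine transform of the log Mel spectrum, truncated to n_coeffs terms.
    return [sum(log_spec[j] * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for c in range(n_coeffs)]
```

Truncating to the first few coefficients keeps only the smooth spectral envelope, which is why the resulting features describe timbre rather than individual notes.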

MFCC Example

Resynthesize by imposing spectrum on noise
MFCCs capture instruments, not notes

[Figure: Let It Be - log-freq specgram (LIB-1); MFCCs (coefficient vs. time); Noise excited MFCC resynthesis (LIB-2); axes freq / Hz, time / sec]

MFCC Artist Classification

20 Artists x 6 albums each (Ellis 2007)
  train models on 5 albums, classify tracks from last
Model as MFCC mean + covariance per artist
  "single Gaussian" model
  20 (mean) + 10 x 19 (covariance) parameters
  55% correct (guessing ~5%)

[Figure: confusion matrix, MFCCs (acc 55.13%), true vs. recognized artist; artists: aerosmith, beatles, creedence_c_r, cure, dave_matthews_b, depeche_mode, fleetwood_mac, garth_brooks, green_day, led_zeppelin, madonna, metallica, prince, queen, radiohead, roxette, steely_dan, suzanne_vega, tori_amos, u2]
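A rough sketch of the "single Gaussian per artist" idea follows. To stay dependency-free it swaps in a diagonal covariance, which is a simplification of the mean + covariance model named on the slide; the function names and the toy 2-D data are hypothetical.

```python
import math

def fit_gaussian(vectors):
    """Per-dimension mean and variance (diagonal covariance: a simplification)."""
    n = len(vectors)
    columns = list(zip(*vectors))
    mean = [sum(col) / n for col in columns]
    var = [sum((x - m) ** 2 for x in col) / n + 1e-6   # small floor for stability
           for col, m in zip(columns, mean)]
    return mean, var

def log_likelihood(x, model):
    mean, var = model
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def classify(x, models):
    """Return the label whose Gaussian gives x the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(x, models[name]))

# Toy usage with made-up 2-D "MFCC" vectors and hypothetical artist labels.
models = {
    "artist_a": fit_gaussian([[0.0, 0.0], [1.0, 1.0]]),
    "artist_b": fit_gaussian([[10.0, 10.0], [11.0, 11.0]]),
}
label = classify([0.5, 0.4], models)
```

In the setup described on the slide, the training vectors would come from five albums per artist and the test vectors from the held-out sixth album.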

2. Chroma Features

What about modeling tonal content (notes)?
  melody spotting
  chord recognition
  cover songs...
MFCCs exclude tonal content
Polyphonic transcription is too hard
  e.g. sinusoidal tracking: confused by harmonics
Chroma features as solution...

[Figure: transcription by sinusoidal tracking, MIDI note number vs. time, Recognized vs. True]

Chroma Features

Idea: Project all energy onto 12 semitones regardless of octave (Fujishima 1999)
  maintains main "musical" distinction
  invariant to musical equivalence
  no need to worry about harmonics?

$$ C(b) = \sum_{k=0}^{N_M} B(12 \log_2(k/k_0) - b)\, W(k)\, |X[k]| $$

W(k) is weighting, B(b) selects every ≈ 0 mod 12

[Figure: mapping of fft bin to chroma; spectrogram (freq / kHz vs. time / sec); chroma (chroma bin vs. time / frame)]
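As a minimal sketch of the projection, the snippet below folds one magnitude spectrum onto 12 bins. It assumes hard assignment of each FFT bin to its nearest semitone (a rectangular B), a weighting W(k) that is 1 inside a frequency range and 0 outside, and A4 = 440 Hz tuning with bin 0 = C. The function name and parameters are hypothetical.

```python
import math

# Frequency of a C (about 261.63 Hz), assuming A4 = 440 Hz tuning.
F_C = 440.0 * 2.0 ** (-9.0 / 12.0)

def chroma_vector(mags, sample_rate, n_fft, f_min=50.0, f_max=4000.0):
    """Fold a magnitude spectrum |X[k]| onto 12 chroma bins (0 = C, ..., 11 = B)."""
    chroma = [0.0] * 12
    for k, mag in enumerate(mags):
        freq = k * sample_rate / n_fft
        if not (f_min <= freq <= f_max):
            continue                              # W(k) = 0 outside the range
        semitones = 12.0 * math.log2(freq / F_C)  # distance from C in semitones
        chroma[round(semitones) % 12] += mag      # octave is discarded by mod 12
    return chroma

# Toy usage: energy near 440 Hz and 880 Hz both lands in bin 9 (A).
mags = [0.0] * 2049
mags[225] = 1.0   # about 439.5 Hz at sample_rate=8000, n_fft=4096
mags[451] = 0.5   # about 880.9 Hz
chroma = chroma_vector(mags, 8000, 4096)
```

Because harmonics at octaves of a note fall in the same bin, they reinforce the note rather than adding spurious ones; harmonics at other intervals (the fifth, for instance) still leak into other bins.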

Better Chroma

Problems:
  blurring of bins close to edges
  limitation of FFT bin resolution
Solutions:
  peak picking - only keep energy at center of peaks
  Instantaneous Frequency - high-resolution estimates
  adapt tuning center based on histogram of pitches

[Figure: spectrogram (freq / kHz vs. time / sec) and resulting chroma (chroma bin vs. time / frame)]
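Two of the listed fixes can be illustrated briefly: keeping only spectral peaks, and estimating a tuning offset from a histogram of how far peak frequencies sit from the nearest equal-tempered semitone. This is a sketch under assumptions (local-maximum peak picking, a 20-bin histogram, A4 = 440 Hz reference); the names are hypothetical and instantaneous-frequency estimation is not shown.

```python
import math

def pick_peaks(mags):
    """Zero every bin that is not a local maximum of the magnitude spectrum."""
    peaks = [0.0] * len(mags)
    for k in range(1, len(mags) - 1):
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks[k] = mags[k]
    return peaks

def estimate_tuning(peak_freqs, f_ref=440.0, n_hist_bins=20):
    """Return the most common deviation from the semitone grid, in semitones."""
    width = 1.0 / n_hist_bins
    hist = [0] * n_hist_bins
    for f in peak_freqs:
        semitones = 12.0 * math.log2(f / f_ref)
        dev = semitones - round(semitones)            # lies in [-0.5, 0.5]
        idx = min(int((dev + 0.5) / width), n_hist_bins - 1)
        hist[idx] += 1
    best = max(range(n_hist_bins), key=lambda i: hist[i])
    return -0.5 + (best + 0.5) * width                # center of the fullest bin
```

The estimated offset can then be subtracted from the semitone value of each peak before it is assigned to a chroma bin, so that a recording tuned slightly sharp or flat does not straddle bin edges.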

Chroma Resynthesis

Chroma describes the notes in an octave... but not the octave
Can resynthesize by presenting all octaves... with a smooth envelope (Ellis & Poliner 2007)
  "Shepard tones" - octave is ambiguous
  endless sequence illusion

$$ y_b(t) = \sum_{o=1}^{M} W\!\left(o + \tfrac{b}{12}\right) \cos\!\left(2^{\,o + b/12}\, w_0 t\right) $$

[Figure: Shepard tone spectra (level / dB vs. freq / Hz); Shepard tone resynth (freq / kHz vs. time / sec)]
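The Shepard-tone sum can be sketched directly from the formula. The envelope W is taken to be a Gaussian over the octave axis, and the base frequency, number of octaves, and sample rate are illustrative values; none of these are specified on the slide.

```python
import math

def shepard_tone(b, duration=0.5, sample_rate=16000, n_octaves=6, f_base=27.5):
    """Octave-spaced cosines for chroma bin b (0..11) under a smooth envelope."""
    center = (n_octaves + 1) / 2.0    # envelope peak on the octave axis (assumed)
    width = n_octaves / 4.0           # envelope width in octaves (assumed)
    octs = [o + b / 12.0 for o in range(1, n_octaves + 1)]
    weights = [math.exp(-0.5 * ((u - center) / width) ** 2) for u in octs]
    n_samples = int(duration * sample_rate)
    return [
        sum(w * math.cos(2.0 * math.pi * f_base * 2.0 ** u * n / sample_rate)
            for w, u in zip(weights, octs))
        for n in range(n_samples)
    ]

def resynthesize(chroma, **kwargs):
    """Mix the 12 Shepard tones, each scaled by its chroma bin value."""
    tones = [shepard_tone(b, **kwargs) for b in range(12)]
    return [sum(c * tone[n] for c, tone in zip(chroma, tones))
            for n in range(len(tones[0]))]
```

Calling `resynthesize` once per chroma frame and concatenating the results gives an audible rendering of the chroma sequence with no definite octave.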

Chroma Example

Simple Shepard tone resynthesis
  can also reimpose broad spectrum from MFCCs

[Figure: Let It Be - log-freq specgram (LIB-1); Chroma features (chroma bin vs. time); Shepard tone resynthesis of chroma (LIB-3); MFCC-filtered shepard tones (LIB-4); axes freq / Hz, time / sec]

Beat-Synchronous Chroma

Drastically reduce data size by recording one chroma frame per beat (Bartsch & Wakefield 2001)

[Figure: Let It Be - log-freq specgram (LIB-1); Onset envelope + beat times; Beat-synchronous chroma (chroma bin vs. time); Beat-synchronous chroma + Shepard resynthesis (LIB-6); axes freq / Hz, time / sec]
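One simple way to get a single chroma frame per beat is to average the frames that fall between consecutive beat times. The sketch below assumes that averaging rule and that beat times are already available from a beat tracker; the function name is hypothetical.

```python
def beat_sync_chroma(frames, frame_times, beat_times):
    """Average chroma frames within each interval [beat_i, beat_{i+1})."""
    n_bins = len(frames[0])
    synced = []
    for start, end in zip(beat_times[:-1], beat_times[1:]):
        segment = [f for f, t in zip(frames, frame_times) if start <= t < end]
        if segment:
            synced.append([sum(col) / len(segment) for col in zip(*segment)])
        else:
            synced.append([0.0] * n_bins)   # no frames landed inside this beat
    return synced

# Toy usage: four frames at 0.1 s spacing, beats every 0.2 s.
frames = [[1.0] + [0.0] * 11, [3.0] + [0.0] * 11,
          [0.0] * 11 + [2.0], [0.0] * 11 + [4.0]]
synced = beat_sync_chroma(frames, [0.0, 0.1, 0.2, 0.3], [0.0, 0.2, 0.4])
```

Besides shrinking the data, this makes the representation tempo-invariant: two performances at different speeds produce the same number of frames per bar.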

3. Chord Recognition

Beat synchronous chroma look like chords
  ... can we transcribe them?
Two approaches
  manual templates (prior knowledge)
  learned models (from training data)

[Figure: beat-synchronous chroma (chroma bin vs. time / sec) annotated with the notes of each chord]
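The "manual templates" approach can be illustrated with binary triad templates scored by inner product against a chroma frame. The 24-chord vocabulary (12 major, 12 minor), the label format, and the function names are assumptions for this sketch.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Binary 12-bin templates for the 12 major and 12 minor triads."""
    templates = {}
    for root in range(12):
        for quality, intervals in (("maj", (0, 4, 7)), ("min", (0, 3, 7))):
            template = [0.0] * 12
            for interval in intervals:
                template[(root + interval) % 12] = 1.0
            templates[f"{NOTE_NAMES[root]}:{quality}"] = template
    return templates

def best_chord(chroma):
    """Return the label of the template with the largest inner product."""
    templates = chord_templates()
    return max(templates,
               key=lambda name: sum(c * t for c, t in zip(chroma, templates[name])))

# Toy usage: energy on C, E and G.
c_major = [1.0, 0, 0, 0, 1.0, 0, 0, 1.0, 0, 0, 0, 0]
label = best_chord(c_major)
```

All templates here have three active bins, so the raw inner product is a fair comparison; templates of different sizes (sevenths, for example) would need normalization.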

Chord Recognition System

Analogous to speech recognition (Sheh & Ellis 2003)
  Gaussian models of features for each chord
  Hidden Markov Models for chord transitions

[Block diagram]
  Audio -> Beat track; 100-1600 Hz BPF -> Chroma; 25-400 Hz BPF -> Chroma -> beat-synchronous chroma features
  train: Labels -> Resample; Root normalize -> Gaussian -> Unnormalize -> 24 Gauss models; Count transitions -> 24x24 transition matrix
  test: beat-synchronous chroma features -> HMM Viterbi -> chord labels
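The "count transitions" step of training can be sketched as follows. Chords are assumed to be encoded as integers 0..23 (a hypothetical encoding), one label per beat, and additive smoothing is included so that unseen transitions keep a nonzero probability; the smoothing is an assumption, not something stated on the slide.

```python
def transition_matrix(label_seqs, n_states=24, smoothing=1.0):
    """Row-normalized counts of consecutive chord labels across training songs."""
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for seq in label_seqs:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev][nxt] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total > 0:
            matrix.append([c / total for c in row])
        else:
            matrix.append([1.0 / n_states] * n_states)   # state never observed
    return matrix
```

Because labels are per beat and chords usually last several beats, the diagonal (self-transition) entries dominate, which is what makes the decoded chord sequence smooth.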

HMMs

Hidden Markov Models are good for inferring hidden states
  underlying Markov "generative model"
  each state has emission distribution
  observations tell us something about state...
  infer "smoothed" state sequence

Transition probabilities p(q_{n+1} | q_n) (rows: q_n, columns: q_{n+1}):

        S    A    B    C    E
   S    0    1    0    0    0
   A    0   .8   .1   .1    0
   B    0   .1   .8   .1    0
   C    0   .1   .1   .7   .1
   E    0    0    0    0    1

[Figure: state diagram (S, A, B, C, E); emission distributions p(x|q) for q = A, B, C over observation x; example state sequence and observation sequence x_n over time step n]

HMM Inference

HMM defines emission distribution p(x|q) and transition probabilities p(q_n | q_{n-1})
Likelihood of the observations together with a state sequence:

$$ p(\{x_n\}, \{q_n\}) = \prod_n p(x_n \mid q_n)\, p(q_n \mid q_{n-1}) $$

Worked example: model M1 with states S, A, B, E and observations x_1, x_2, x_3

Transition probabilities p(q_n | q_{n-1}) (rows: q_{n-1}, columns: q_n):

        A    B    E
   S   .9   .1    -
   A   .7   .2   .1
   B    -   .8   .2

Observation likelihoods p(x|q):

        x_1   x_2   x_3
   A    2.5   0.2   0.1
   B    0.1   2.2   2.3

All possible 3-emission paths Q_k from S to E:

   Path         p(Q|M) = Π_n p(q_n|q_{n-1})     p(X|Q,M) = Π_n p(x_n|q_n)     p(X,Q|M)
   S A A A E    .9 x .7 x .7 x .1 = 0.0441      2.5 x 0.2 x 0.1 = 0.05        0.0022
   S A A B E    .9 x .7 x .2 x .2 = 0.0252      2.5 x 0.2 x 2.3 = 1.15        0.0290
   S A B B E    .9 x .2 x .8 x .2 = 0.0288      2.5 x 2.2 x 2.3 = 12.65       0.3643
   S B B B E    .1 x .8 x .8 x .2 = 0.0128      0.1 x 2.2 x 2.3 = 0.506       0.0065
                Σ = 0.1109                                                    Σ = p(X|M) = 0.4020

By dynamic programming, we can also identify the best state sequence given just the observations
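The worked example can be checked numerically. The sketch below encodes the transition and observation-likelihood values from the slide, sums the joint likelihood over every 3-emission path to get p(X|M), and then finds the single best path with a small Viterbi recursion. The dictionary layout and function names are choices made for this illustration.

```python
import itertools

# Transition probabilities p(q_n | q_{n-1}); missing entries are zero.
TRANS = {
    "S": {"A": 0.9, "B": 0.1},
    "A": {"A": 0.7, "B": 0.2, "E": 0.1},
    "B": {"B": 0.8, "E": 0.2},
}
# Observation likelihoods p(x_n | q) for x_1, x_2, x_3.
EMIT = {"A": [2.5, 0.2, 0.1], "B": [0.1, 2.2, 2.3]}

def joint_likelihood(path):
    """p(X, Q | M) for one sequence of emitting states, framed by S and E."""
    states = ["S", *path, "E"]
    p = 1.0
    for prev, nxt in zip(states[:-1], states[1:]):
        p *= TRANS.get(prev, {}).get(nxt, 0.0)
    for n, q in enumerate(path):
        p *= EMIT[q][n]
    return p

def total_likelihood(n_obs=3):
    """p(X | M): sum of the joint likelihood over all emitting-state paths."""
    return sum(joint_likelihood(p) for p in itertools.product("AB", repeat=n_obs))

def viterbi(n_obs=3):
    """Best path and its joint likelihood, by dynamic programming."""
    # best[q] = (likelihood of the best partial path ending in q, that path)
    best = {q: (TRANS["S"].get(q, 0.0) * EMIT[q][0], [q]) for q in EMIT}
    for n in range(1, n_obs):
        step = {}
        for q in EMIT:
            prob, path = max(((best[r][0] * TRANS[r].get(q, 0.0), best[r][1])
                              for r in EMIT), key=lambda cand: cand[0])
            step[q] = (prob * EMIT[q][n], path + [q])
        best = step
    return max(((best[q][0] * TRANS[q].get("E", 0.0), best[q][1]) for q in EMIT),
               key=lambda cand: cand[0])

best_prob, best_path = viterbi()
```

Enumeration costs grow exponentially with the number of observations, while the Viterbi recursion only keeps one candidate per state at each step, which is what makes decoding a whole song practical.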

Key Normalization

Chord transitions depend on key of piece
  dominant, relative minor, etc...
Chord transition probabilities should be key-relative
  estimate main key of piece
  rotate all chroma features
  learn models

[Figure: aligned chroma for Taxman, Eleanor Rigby, I'm Only Sleeping, Love You To, Yellow Submarine, She Said She Said, Good Day Sunshine, And Your Bird Can Sing; Aligned Global model]
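The rotation step can be sketched as a circular shift of each 12-bin chroma vector. How the main key is estimated is not given on the slide; the snippet below assumes a very simple stand-in that matches the song's average chroma against a binary major-scale profile, and all names are hypothetical.

```python
# Binary C major scale profile: C D E F G A B (an assumed, simplistic reference).
C_MAJOR_PROFILE = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]

def rotate(chroma, shift):
    """Circularly shift so that bin `shift` moves to bin 0."""
    return chroma[shift:] + chroma[:shift]

def estimate_shift(frames, profile=C_MAJOR_PROFILE):
    """Rotation (0..11) that best aligns the average chroma with the profile."""
    mean = [sum(col) / len(frames) for col in zip(*frames)]
    return max(range(12),
               key=lambda s: sum(m * p for m, p in zip(rotate(mean, s), profile)))

def key_normalize(frames):
    """Rotate every frame of a song by the same estimated shift."""
    shift = estimate_shift(frames)
    return [rotate(frame, shift) for frame in frames], shift

# Toy usage: a song whose energy sits on the D major scale (D E F# G A B C#).
d_major = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0]
normalized, shift = key_normalize([d_major, d_major])
```

A binary scale profile cannot tell a major key from its relative minor; that ambiguity is harmless here because both share the same rotation, so chord models trained on the rotated features are still aligned across songs.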

Chord Recognition

Often works:

[Figure: Let It Be/06-Let It Be. Panels: Audio (freq / Hz), Ground truth chord, Beat-synchronous chroma, Recognized; time axis in sec]

But only about 60% of the time

Summary

Music Audio Features
  capture information useful for classification
Chroma Features
  12 bins to robustly summarize notes
Chord Recognition
  Sometimes easy, sometimes subtle

References

B. Logan, "Mel frequency cepstral coefficients for music modeling," in Proc. Int. Symp. Music Inf. Retrieval ISMIR, Plymouth, September 2000.
D. Ellis, "Classifying Music Audio with Timbral and Chroma Features," in Proc. Int. Symp. Music Inf. Retrieval ISMIR-07, pp. 339-340, Vienna, October 2007.
T. Fujishima, "Realtime chord recognition of musical sound: A system using common lisp music," in Proc. Int. Comp. Music Conf., pp. 464-467, Beijing, 1999.
D. Ellis and G. Poliner, "Identifying Cover Songs With Chroma Features and Dynamic Programming Beat Tracking," in Proc. ICASSP-07, pp. IV-1429-1432, Hawai'i, April 2007.
M. A. Bartsch and G. H. Wakefield, "To catch a chorus: Using chroma-based representations for audio thumbnailing," in Proc. IEEE WASPAA, Mohonk, October 2001.
A. Sheh and D. Ellis, "Chord Segmentation and Recognition using EM-Trained Hidden Markov Models," in Proc. Int. Symp. Music Inf. Retrieval ISMIR-03, pp. 185-191, Baltimore, October 2003.