Music Information Retrieval


Music Information Retrieval: When Music Meets Computer Science
Meinard Müller
International Audio Laboratories Erlangen
meinard.mueller@audiolabs-erlangen.de
Berlin MIR Meetup, 20.03.2017

Meinard Müller
2001: PhD, Bonn University
2002/2003: Postdoc, Keio University, Japan
2007: Habilitation, Bonn University (Information Retrieval for Music and Motion)
2007-2012: Senior Researcher, Max-Planck Institut für Informatik, Saarland
2012: Professor, Semantic Audio Processing, Universität Erlangen-Nürnberg

Group Members
Stefan Balke, Christian Dittmar, Patricio López-Serrano, Christof Weiß, Frank Zalkow, Thomas Prätzlich


Book: Fundamentals of Music Processing
Meinard Müller: Fundamentals of Music Processing. Audio, Analysis, Algorithms, Applications
483 p., 249 illus., 30 illus. in color, hardcover
ISBN: 978-3-319-21944-8
Springer, 2015
Accompanying website: www.music-processing.de

International Audio Laboratories Erlangen
Audio: Audio Coding, 3D Audio, Psychoacoustics, Music Processing

Music Information Retrieval: music data
Sheet Music (Image), CD / MP3 (Audio), MusicXML (Text), Dance / Motion (Mocap), MIDI, Singing / Voice (Audio), Music Film (Video), Music Literature (Text)

Music Information Retrieval: contributing fields
Signal Processing, Musicology, User Interfaces, Machine Learning, Information Retrieval, Library Sciences

Piano Roll Representation
Player Piano (1900)

Piano Roll Representation (MIDI)
J.S. Bach, Fugue in C major (Well-Tempered Clavier, BWV 846)
[Figure: piano roll; axes: time, pitch]
Query: [figure]
Goal: Find all occurrences of the query
Matches: [figure]
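The query task on a piano roll can be sketched in a few lines. This is only an illustration under simple assumptions: notes are (onset, pitch) pairs with integer onsets in beats and MIDI pitch numbers, matching is exact, and the function name find_occurrences and the toy data are hypothetical (they are not taken from the fugue on the slide).

```python
# Minimal sketch: find all occurrences of a short query in a piano roll.
# Notes are (onset_in_beats, midi_pitch) pairs; the data below is a toy example.

def find_occurrences(piece, query, transposition_invariant=False):
    """Return (time_shift, pitch_shift) pairs at which every shifted
    query note is present in the piece."""
    piece_set = set(piece)
    q0_onset, q0_pitch = query[0]
    matches = []
    for onset, pitch in sorted(piece_set):
        dt = onset - q0_onset          # candidate time shift
        dp = pitch - q0_pitch          # candidate pitch shift
        if dp != 0 and not transposition_invariant:
            continue
        if all((o + dt, p + dp) in piece_set for o, p in query):
            matches.append((dt, dp))
    return matches

piece = [(0, 60), (1, 62), (2, 64), (3, 65),
         (4, 60), (5, 62), (6, 64),
         (8, 67), (9, 69), (10, 71)]
query = [(0, 60), (1, 62), (2, 64)]

exact_matches = find_occurrences(piece, query)
transposed_matches = find_occurrences(piece, query, transposition_invariant=True)
```

Real music retrieval has to tolerate tempo changes, ornaments, and missing notes, so exact set matching is only the simplest possible baseline.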

Music Retrieval (database, query, hit)
Audio-ID: Bernstein (1962), Beethoven, Symphony No. 5
Version-ID: Beethoven, Symphony No. 5: Bernstein (1962), Karajan (1982), Gould (1992)
Category-ID: Beethoven, Symphony No. 9; Beethoven, Symphony No. 3; Haydn, Symphony No. 94

Music Synchronization: Audio-Audio
Beethoven's Fifth
Orchestra (Karajan) and Piano (Scherbakov)
[Figure: two waveforms with corresponding positions linked; axis: time (seconds)]
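The slides do not name the alignment algorithm. A common choice for linking two versions of a piece is dynamic time warping (DTW), sketched below as an assumption. The feature sequences X and Y stand in for frame-wise audio features that are assumed to be already extracted; dtw_path is a hypothetical helper name.

```python
import numpy as np

def dtw_path(X, Y):
    """Align feature sequences X (N x D) and Y (M x D).
    Returns the total alignment cost and the warping path as (i, j) pairs."""
    N, M = len(X), len(Y)
    # Pairwise Euclidean distances between all frames of X and Y.
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
    # Accumulated cost with an extra border row/column of infinities.
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            D[n, m] = C[n - 1, m - 1] + min(D[n - 1, m - 1], D[n - 1, m], D[n, m - 1])
    # Backtrack from the end to the start of both sequences.
    n, m = N, M
    path = [(n - 1, m - 1)]
    while (n, m) != (1, 1):
        steps = [(n - 1, m - 1), (n - 1, m), (n, m - 1)]
        n, m = min(steps, key=lambda s: D[s])
        path.append((n - 1, m - 1))
    return D[N, M], path[::-1]

# Toy example: X is a "slow" rendition of Y (every frame played twice).
X = np.array([[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]])
Y = np.array([[0.0], [1.0], [2.0]])
cost, path = dtw_path(X, Y)
```

Each pair (i, j) on the path says that frame i of the first version corresponds to frame j of the second, which is the information an interpretation switcher needs.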

Application: Interpretation Switcher

Music Synchronization: Image-Audio
[Figure: sheet music image linked to an audio recording]

How to make the data comparable?
Image: Image Processing (Optical Music Recognition)
Audio: Audio Processing (Fourier Analysis)

Application: Score Viewer

Music Processing
Coarse Level: What do different versions have in common? What makes up a piece of music? Identify despite differences. Example tasks: Audio Matching, Cover Song Identification
Fine Level: What are the characteristics of a specific version? What makes music come alive? Identify the differences. Example tasks: Tempo Estimation, Performance Analysis

Performance Analysis
Schumann: Träumerei
[Figure: score (reference) and performance; axis: time (seconds)]
Strategy: Compute score-audio synchronization and derive tempo curve
[Figure: tempo curves of several performances; axes: musical time (measures), musical tempo (BPM)]
What can be done if no reference is available?
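Once a synchronization has assigned a physical time to each score position, a tempo curve follows from simple arithmetic: local tempo in BPM is 60 times the number of beats elapsed divided by the seconds elapsed. The sketch below assumes the alignment is given as two parallel lists; the function name tempo_curve and the numbers are made up for illustration.

```python
def tempo_curve(beat_positions, beat_times):
    """Local tempo (BPM) between consecutive aligned score positions.
    beat_positions: musical time in beats; beat_times: physical time in seconds."""
    tempi = []
    for k in range(len(beat_positions) - 1):
        beats = beat_positions[k + 1] - beat_positions[k]
        seconds = beat_times[k + 1] - beat_times[k]
        tempi.append(60.0 * beats / seconds)
    return tempi

# Toy alignment: the performer slows down after the second beat.
example_tempi = tempo_curve([0, 1, 2, 3, 4], [0.0, 0.5, 1.0, 2.0, 3.0])
```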

Music Processing
Relative: Given several versions. Comparison of extracted parameters. Extraction errors often have no consequence for the final result. Example tasks: Music Synchronization, Genre Classification
Absolute: Given one version. Direct interpretation of extracted parameters. Extraction errors immediately become evident. Example tasks: Music Transcription, Tempo Estimation

Tempo Estimation and Beat Tracking
Basic task: Tapping the foot when listening to music
Example: Queen, "Another One Bites the Dust"
[Figure: waveform with beat positions; axis: time (seconds)]

Tempo Estimation and Beat Tracking
Example: "Happy Birthday to You"
Pulse levels: Measure, Tactus (beat), Tatum (temporal atom)

Tempo Estimation and Beat Tracking
Example: Chopin, Mazurka Op. 68 No. 3
Pulse level: Quarter note
Tempo: 50-200 BPM
[Figure: tempo curve; axes: time (beats), tempo (BPM) from 50 to 200]

Tempo Estimation and Beat Tracking: challenges
Which temporal level?
Local tempo deviations
Sparse information (e.g., only note onsets available)
Vague information (e.g., extracted note onsets corrupt)

Tempo Estimation and Beat Tracking
Steps:
1. Spectrogram
2. Log Compression
3. Differentiation
4. Accumulation
5. Normalization
[Figures: spectrogram, compressed spectrogram, and difference spectrogram (axes: time in seconds, frequency in Hz); novelty curve with local average, and normalized novelty curve (axis: time in seconds)]
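The five steps can be sketched with NumPy as follows. The parameter names and values (n_fft, hop, gamma, avg_len) are illustrative assumptions, not settings from the slides. Step 5 is read here as subtracting a local average and keeping the positive part, which is an interpretation of the "Local Average" label in the figure.

```python
import numpy as np

def novelty_curve(x, sr, n_fft=1024, hop=512, gamma=100.0, avg_len=11):
    """Spectral-flux style novelty curve; returns (curve, feature rate in Hz)."""
    # 1. Spectrogram: magnitude STFT with a Hann window.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    Y = np.abs(np.fft.rfft(frames, axis=1))        # shape: (frames, bins)
    # 2. Log compression to emphasize weak components.
    Y = np.log1p(gamma * Y)
    # 3. Differentiation along time, keeping only energy increases.
    diff = np.diff(Y, axis=0)
    diff[diff < 0] = 0.0
    # 4. Accumulation over all frequency bins.
    nov = diff.sum(axis=1)
    # 5. Normalization (assumed): subtract local average, keep positive part.
    kernel = np.ones(avg_len) / avg_len
    local_avg = np.convolve(nov, kernel, mode="same")
    nov = np.maximum(nov - local_avg, 0.0)
    return nov, sr / hop

# Toy signal: 4 seconds of silence with one impulse every 0.5 s (120 BPM).
sr = 8000
x = np.zeros(4 * sr)
x[4000::4000] = 1.0
nov, feature_rate = novelty_curve(x, sr)
```

Peaks of the resulting curve indicate note onset candidates; the curve has one value per hop, so its sampling rate is sr / hop.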

Tempo Estimation and Beat Tracking
[Figures: intensity as a function of tempo (BPM); intensity over tempo (BPM) and time (seconds)]
[Figure: novelty curve and Predominant Local Pulse (PLP); axis: time (seconds)]
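To illustrate how a tempo value can be read off a novelty curve, the sketch below compares the curve with windowed sinusoids at candidate tempi and picks the strongest one. It is a simplified, global variant under stated assumptions: it analyzes the whole curve at once rather than local sections, it does not compute the PLP curve, and the name dominant_tempo and the 50-200 BPM range are illustrative.

```python
import numpy as np

def dominant_tempo(novelty, feature_rate, bpm_min=50, bpm_max=200):
    """Return the candidate tempo (BPM) whose sinusoid best matches the novelty curve,
    together with the candidate tempi and their strengths."""
    t = np.arange(len(novelty)) / feature_rate      # time axis in seconds
    bpms = np.arange(bpm_min, bpm_max + 1)
    windowed = novelty * np.hanning(len(novelty))
    strength = np.array([
        np.abs(np.sum(windowed * np.exp(-2j * np.pi * (bpm / 60.0) * t)))
        for bpm in bpms
    ])
    return int(bpms[np.argmax(strength)]), bpms, strength

# Toy novelty curve: one peak every 0.5 s at a feature rate of 100 Hz.
toy_novelty = np.zeros(1000)
toy_novelty[::50] = 1.0
tempo, bpms, strength = dominant_tempo(toy_novelty, feature_rate=100.0)
```

Applying the same comparison to short sliding sections of the curve yields a time-dependent tempo representation instead of a single number.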

Tempo Estimation and Beat Tracking: applications
Light effects
Music recommendation
DJ
Audio editing

Why is Music Processing Challenging?
Example: Chopin, Mazurka Op. 63 No. 3
[Figures: waveform (axes: time in seconds, amplitude); spectrogram (axes: time in seconds, frequency in Hz)]
Performance: tempo, dynamics, note deviations, sustain pedal
Polyphony: main melody, additional melody line, accompaniment

Source Separation
Decomposition of audio stream into different sound sources
Central task in digital signal processing
Cocktail party effect
Several input signals
Sources are assumed to be statistically independent

Source Separation (Music)
Main melody, accompaniment, drum track
Instrumental voices
Individual note events
Only mono or stereo
Sources are often highly dependent

Harmonic-Percussive Decomposition
The mixture is decomposed into three components:
Harmonic component: clearly harmonic sounds of singing voice and accompaniment
Residual component: noise-like sounds, vibrato/glissando sounds
Percussive component: drum hits, fricatives & plosives in singing voice
Literature: [Driedger/Müller/Disch, ISMIR 2014]
Demo: https://www.audiolabs-erlangen.de/resources/2014-ismir-exthpsep/
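The slides do not spell out how the three components are computed. One widely used idea, assumed here, is median filtering of the magnitude spectrogram: filtering along time keeps horizontal (harmonic) structures, filtering along frequency keeps vertical (percussive) structures, and a separation factor beta leaves everything that is neither clearly one nor the other to the residual. The helper names, the filter length, and beta=2.0 are illustrative choices, not values from the cited paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(S, length, axis):
    """Median filter of odd length along one axis of a 2-D array (reflect padding)."""
    pad = length // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(S, pad_width, mode="reflect")
    return np.median(sliding_window_view(padded, length, axis=axis), axis=-1)

def hpr_masks(S, length=17, beta=2.0):
    """Binary masks (harmonic, percussive, residual) for a magnitude
    spectrogram S of shape (frequency bins, time frames)."""
    H = median_filter_1d(S, length, axis=1)   # smooth along time
    P = median_filter_1d(S, length, axis=0)   # smooth along frequency
    eps = np.finfo(float).eps
    mask_h = H / (P + eps) > beta
    mask_p = P / (H + eps) >= beta
    mask_r = ~(mask_h | mask_p)
    return mask_h, mask_p, mask_r

# Toy spectrogram: one sustained partial (row 10) and one drum-like hit (column 20).
S = np.zeros((64, 64))
S[10, :] = 1.0
S[:, 20] = 1.0
mask_h, mask_p, mask_r = hpr_masks(S)
```

Multiplying each mask with the complex STFT of the mixture and inverting the result would give the three component signals; with beta equal to 1 the residual is essentially empty, and larger values move more energy into it.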

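Singing Voice Extraction
Original recording: singing voice + accompaniment

Singing Voice Extraction: processing pipeline
[Figure: spectrograms (axes: time, frequency); processing stages labeled HPR, MR, TR, SL; F0 annotation as additional input]
The original recording is split by HPR into a harmonic, a percussive, and a residual component.
Harmonic component: harmonic portion of singing voice / harmonic portion of accompaniment
Percussive component: fricatives of singing voice / instrument onsets of accompaniment
Residual component: vibrato & formants of singing voice / diffuse instrument sounds of accompaniment
The singing-voice portions are added up to the estimated singing voice, the accompaniment portions to the estimated accompaniment.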

Score-Informed Source Separation
Exploit musical score to support separation process
[Figure: three piano-roll panels; axes: time, pitch]

Parametric Model Approach
Rebuild spectrogram information: estimate parameters, then render
[Figure: original and rendered spectrogram; axes: time (seconds), frequency (Hz)]

NMF (Nonnegative Matrix Factorization)
Magnitude spectrogram (M × N, nonnegative) ≈ Templates (M × K, nonnegative) · Activations (K × N, nonnegative)
Templates: pitch + timbre ("How does it sound?")
Activations: onset time + duration ("When does it sound?")
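A minimal NMF sketch is given below. The slides do not state which update rule is used; the code assumes the standard multiplicative updates for the Euclidean distance (Lee and Seung). The symbols V, W, and H for spectrogram, templates, and activations are naming assumptions, and the toy matrix stands in for a real magnitude spectrogram.

```python
import numpy as np

def nmf(V, K, n_iter=200, seed=0):
    """Factorize a nonnegative M x N matrix V into W (M x K) and H (K x N)
    using multiplicative updates; random initialization."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((M, K)) + 1e-3      # templates
    H = rng.random((K, N)) + 1e-3      # activations
    eps = np.finfo(float).eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram" built from two templates with known activations.
rng = np.random.default_rng(1)
V = rng.random((30, 2)) @ rng.random((2, 40))

W0, H0 = nmf(V, K=2, n_iter=0)         # the random starting point
W, H = nmf(V, K=2, n_iter=200)         # after 200 update steps
error_start = np.linalg.norm(V - W0 @ H0)
error_end = np.linalg.norm(V - W @ H)
```

Because the updates only multiply by nonnegative factors, W and H stay nonnegative throughout, which is what makes the columns of W interpretable as spectral templates.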

NMF-Decomposition: random initialization
[Figure: initialized templates (axes: note number, frequency) and initialized activations (axes: time, note number)]
[Figure: learnt templates and learnt activations]
Result: no semantic meaning

NMF-Decomposition: constrained initialization
[Figure: initialized templates and activations; template constraint for p=55, activation constraints for p=55]
[Figure: learnt templates and learnt activations; original vs. model]
Result: NMF as refinement
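Constrained initialization works because multiplicative updates never turn a zero entry into a nonzero one: whatever the score rules out at the start stays ruled out. The sketch below builds a harmonic template and a time-limited activation row per note and then refines them. All helper names, tolerances, and the toy note list (MIDI pitches 55 and 62) are assumptions for illustration, and the random matrix V only stands in for a real spectrogram.

```python
import numpy as np

def harmonic_template(pitch, freqs, n_harmonics=8, tol=0.5):
    """Template that is nonzero only near the harmonics of a MIDI pitch
    (tol is the allowed deviation in semitones)."""
    f0 = 440.0 * 2.0 ** ((pitch - 69) / 12.0)
    template = np.zeros(len(freqs))
    for h in range(1, n_harmonics + 1):
        lo = h * f0 * 2.0 ** (-tol / 12.0)
        hi = h * f0 * 2.0 ** (tol / 12.0)
        template[(freqs >= lo) & (freqs <= hi)] = 1.0 / h
    return template

def activation_constraint(onset, offset, times, tol=0.2):
    """Activation row that is nonzero only around the note's score position."""
    return ((times >= onset - tol) & (times <= offset + tol)).astype(float)

def refine(V, W_init, H_init, n_iter=100):
    """Multiplicative NMF updates; zero entries of the initialization stay zero."""
    W, H = W_init.copy(), H_init.copy()
    eps = np.finfo(float).eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

freqs = np.arange(0.0, 2000.0, 10.0)               # bin center frequencies in Hz
times = np.arange(0.0, 3.0, 0.1)                   # frame times in seconds
notes = [(55, 0.5, 1.0), (62, 1.5, 2.5)]           # (pitch, onset, offset)
W_init = np.stack([harmonic_template(p, freqs) for p, _, _ in notes], axis=1)
H_init = np.stack([activation_constraint(on, off, times) for _, on, off in notes], axis=0)
V = np.random.default_rng(0).random((len(freqs), len(times)))
W, H = refine(V, W_init, H_init)
```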

Score-Informed Audio Decomposition
Application: Audio editing
[Figure: two pairs of spectrograms; axes: time (seconds), frequency (Hertz); zoomed panels with marked frequencies 523 Hz and 554 Hz]

Informed Drum-Sound Decomposition
Remix:
Literature: [Dittmar/Müller, IEEE/ACM-TASLP 2016]
Demo: https://www.audiolabs-erlangen.de/resources/mir/2016-ieee-taslp-drumseparation

Loop Decomposition of EDM
Decomposition into patterns and activations
Literature: [López-Serrano/Dittmar/Müller, ISMIR 2016]
Demo: https://www.audiolabs-erlangen.de/resources/mir/2016-ismir-emloop

Audio Mosaicing
Target signal: The Beatles, "Let It Be"
Source signal: Bees
Mosaic signal: "Let It Bee"
Literature: [Driedger/Müller, ISMIR 2015]
Demo: https://www.audiolabs-erlangen.de/resources/mir/2015-ismir-letitbee

NMF-Inspired Audio Mosaicing
Non-negative matrix factorization (NMF): non-negative matrix (fixed) ≈ components (learned) · activations (learned)
Proposed audio mosaicing approach: target's spectrogram (fixed) ≈ source's spectrogram (fixed) · activations (learned) = mosaic's spectrogram
[Figure: spectrogram target (frequency × time target), spectrogram source (frequency × time source), activation matrix (time source × time target), spectrogram mosaic (frequency × time target)]
Iterative updates of the activation matrix
Preserve temporal context
Core idea: support the development of sparse diagonal activation structures
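The basic building block of this approach, learning only the activations while the source spectrogram is kept fixed as the template matrix, can be sketched as follows. The additional steps that encourage sparse diagonal activation structures are deliberately left out, so this is not the full method from the cited paper. The function name learn_activations and the toy matrices are assumptions.

```python
import numpy as np

def learn_activations(V_target, W_source, n_iter=200, seed=0):
    """Learn activations H such that W_source @ H approximates V_target.
    W_source (the source spectrogram) is fixed; only H is updated."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_source.shape[1], V_target.shape[1])) + 1e-3
    eps = np.finfo(float).eps
    for _ in range(n_iter):
        H *= (W_source.T @ V_target) / (W_source.T @ W_source @ H + eps)
    return H

# Toy source "spectrogram": 6 frames, each with energy in its own frequency band.
W_source = np.kron(np.eye(6), np.ones((4, 1))) + 0.01     # shape (24, 6)
# Toy target: a sequence of source frames in a different order.
frame_order = [2, 0, 5, 2]
V_target = W_source[:, frame_order]

H = learn_activations(V_target, W_source)
V_mosaic = W_source @ H                 # magnitude spectrogram of the mosaic
chosen_frames = H.argmax(axis=0)        # strongest source frame per target frame
```

In this toy case each target frame is explained by exactly one source frame. With real audio many source frames fit almost equally well, which is why the cited approach adds steps that favor diagonal runs in H and thereby preserve the temporal context of the source.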

Audio Mosaicing
Target signal: Chic, "Good Times"
Source signal: Whales
Mosaic signal

Audio Mosaicing
Target signal: Adele, "Rolling in the Deep"
Source signal: Race car
Mosaic signal

Motivic Similarity
B A C H

Summary
Music information retrieval
Audio decomposition techniques
Machine learning
Teaching
Academic training of students
Fundamental research
Music applications & musicology
Multimedia scenarios
Web-based interfaces


MIR-Related Events in Germany
AES Conference on Semantic Audio, 22-24 June 2017, Erlangen
GI Jahrestagung (GI annual conference), 25-29 September 2017, Chemnitz
Workshop: Musik trifft Informatik (Music Meets Computer Science), 26 September 2017
Tutorial: Musikverarbeitung (Music Processing), 25 September 2017