Topic 11. Score-Informed Source Separation. (chroma slides adapted from Meinard Mueller)


Why Score-Informed Source Separation?
- Audio source separation is useful
  - Music transcription, remixing, search
- Results are unsatisfying when only audio is used
- The score provides information that one can use
  - E.g., a conductor, or learning to sing in a choir
- Lots of scores are out there

Musical Score in MIDI
[Figure: score sheet and the corresponding MIDI score]

Would It Be Trivial?
[Figure: score and audio related by "instantiation" and "abstraction"]
- Is map-informed tourism trivial (for a machine)?

Remaining Tasks
- The score tells us what musical objects to look for, but not where to look or what they sound like.
Problems
- How to align audio with score?
- How to represent them?
- How to separate the signal?

Audio/Score Representations for Alignment
- Represent audio and score in the same way
- Spectrum
  - Only good for monophonic music
- Chroma feature
  - Good for polyphonic music
- Pitch info
  - Ideal for both monophonic and polyphonic music
  - Relies on good multi-pitch estimation techniques

Chroma Feature
- Spectral energy of the 12 pitch classes
- 12-d vector: C, C#, D, D#, E, F, F#, G, G#, A, A#, B
[Figure: amplitude vs. log-frequency, with octave marks C2, C3, C4, C5, C6]
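As a rough illustration of the chroma feature described above, the following sketch folds the energy of one magnitude-spectrum frame into 12 pitch classes. The function name, the frequency limits `fmin`/`fmax`, the nearest-semitone binning, and the unit-norm scaling are assumptions made for this example and are not taken from the slides.

```python
import numpy as np

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chroma_from_magnitude(mag, sr, n_fft, fmin=65.0, fmax=2100.0):
    """Fold one magnitude-spectrum frame into a normalized 12-d chroma vector.

    fmin/fmax are assumed analysis limits (roughly C2 to C7).
    """
    freqs = np.arange(len(mag)) * sr / n_fft        # bin center frequencies in Hz
    keep = (freqs >= fmin) & (freqs <= fmax)        # also drops the 0 Hz bin
    # Nearest MIDI pitch for each kept bin (A4 = 440 Hz = MIDI 69)
    midi = np.round(69 + 12 * np.log2(freqs[keep] / 440.0)).astype(int)
    chroma = np.zeros(12)
    np.add.at(chroma, midi % 12, mag[keep] ** 2)    # sum energy per pitch class
    norm = np.linalg.norm(chroma)
    return chroma / norm if norm > 0 else chroma

# Usage: a windowed 440 Hz sinusoid should put most energy in pitch class A.
sr, n_fft = 22050, 4096
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 440.0 * t) * np.hanning(n_fft)
mag = np.abs(np.fft.rfft(frame))
chroma = chroma_from_magnitude(mag, sr, n_fft)
strongest = PITCH_CLASSES[int(np.argmax(chroma))]
```

Applying the same function to every frame of a spectrogram yields a normalized chromagram, one 12-d column per frame.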

Spectrogram

Log-frequency Spectrogram

Chromagram

Normalized Chromagram

Chromagram of Polyphonic Music

Dynamic Time Warping
- Find the path with the lowest cost through the distance matrix
- A path should
  - Be monotonic
  - Have step size 1
  - Go from (1,1) to (n,m)
- How many possible paths?

Possible Progression
- Three ways for a path to get to (n,m) in one step: from (n-1,m), from (n-1,m-1), or from (n,m-1)
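One way to answer the "how many possible paths?" question is to count them with the same three-predecessor rule: the number of paths reaching a cell is the sum of the counts of its three predecessors. The function name below is illustrative only.

```python
def count_warping_paths(n, m):
    """Count monotonic, step-size-1 paths from (1,1) to (n,m)."""
    counts = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 or j == 0:
                # Along the first row or column there is only one way to arrive.
                counts[i][j] = 1
            else:
                counts[i][j] = (counts[i - 1][j]        # step from (i-1, j)
                                + counts[i - 1][j - 1]  # step from (i-1, j-1)
                                + counts[i][j - 1])     # step from (i, j-1)
    return counts[n - 1][m - 1]

paths_3x3 = count_warping_paths(3, 3)   # 13
paths_4x4 = count_warping_paths(4, 4)   # 63
```

The count grows exponentially with the matrix size, which is why enumerating all paths is not practical.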

A Nice Property
- Let d(i, j) be the distance matrix
- Let C(n, m) be the lowest cost from (1,1) to (n,m)
- Then
$$C(1,1) = d(1,1)$$
$$C(n,m) = \min \left\{ C(n-1,m) + d(n,m),\; C(n-1,m-1) + d(n,m),\; C(n,m-1) + d(n,m) \right\}$$

Dynamic Programming!
- Calculate the lowest-cost matrix C(i, j)
  - Start from C(1,1)
  - Then calculate C(1,2), C(2,1)
  - Then C(1,3), C(2,2), C(3,1)
  - Finally, calculate C(n, m)
- Remember how each entry was calculated, and trace back to get the path
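The procedure can be sketched as follows. This is a minimal version under two simplifications that are choices of this example: the matrix is filled row by row rather than along anti-diagonals (any order that visits predecessors first works), and the traceback re-derives the best predecessor from C instead of storing back-pointers. Indices are 0-based.

```python
import numpy as np

def dtw(d):
    """Return (lowest total cost, path) through the distance matrix d."""
    n, m = d.shape
    C = np.full((n, m), np.inf)
    C[0, 0] = d[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                C[i - 1, j] if i > 0 else np.inf,
                C[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                C[i, j - 1] if j > 0 else np.inf,
            )
            C[i, j] = best + d[i, j]
    # Trace back from the end to the start by picking the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0:
            candidates.append((C[i - 1, j], (i - 1, j)))
        if i > 0 and j > 0:
            candidates.append((C[i - 1, j - 1], (i - 1, j - 1)))
        if j > 0:
            candidates.append((C[i, j - 1], (i, j - 1)))
        _, (i, j) = min(candidates, key=lambda c: c[0])
        path.append((i, j))
    return C[-1, -1], path[::-1]

# Usage: the second sequence repeats one element of the first.
a = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 1.0, 2.0, 3.0])
d = np.abs(a[:, None] - b[None, :])
cost, path = dtw(d)
```

For score-to-audio alignment, d(i, j) would hold a distance between the chroma vector of audio frame i and that of score frame j.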

Two SISS Systems for Polyphonic Music
Score-informed NMF [Ewert et al., 2009] [Ewert & Müller, 2012]
- Chroma feature to represent audio
- Dynamic time warping for alignment
- NMF-based separation
- Offline
Soundprism [Duan & Pardo, 2011]
- Multi-pitch info of audio
- Particle filtering for alignment
- Pitch-based separation
- Online

Score-informed NMF [Ewert & Müller, 2012]
[Figure: polyphonic audio, aligned MIDI score, score sheet]

When score info is not used

When the dictionary is initialized by score notes
[Figure panels: initial W, initial H, final W, final H]

When the activation is initialized by score notes
[Figure panels: initial W, initial H, final W, final H]

When both W and H are initialized
[Figure panels: initial W, initial H, final W, final H]

Also Considering Onset Models
[Figure panels: initial W, initial H, final W, final H]
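A minimal sketch of why score-based initialization works with NMF: multiplicative updates keep any entry of W or H that starts at zero at zero, so zeros placed according to the score act as hard constraints. The Euclidean-cost update rule, the toy masks, and all names below are assumptions of this example. The cited papers may use a different cost function and richer templates, and onset modeling is not covered here.

```python
import numpy as np

def score_informed_nmf(V, W0, H0, n_iter=200, eps=1e-12):
    """Factor V ~ W @ H with multiplicative updates (Euclidean cost).

    Entries that are zero in W0 or H0 stay zero in the result.
    """
    W, H = W0.astype(float).copy(), H0.astype(float).copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(0)
n_bins, n_frames = 40, 30
bins, frames = np.arange(n_bins), np.arange(n_frames)

# Hypothetical score: note 0 has partials at multiples of bin 4, note 1 at multiples of bin 5.
W_mask = np.stack([(bins % 4 == 0) & (bins > 0), (bins % 5 == 0) & (bins > 0)], axis=1)
# Note 0 sounds in frames 0-14, note 1 in frames 10-29.
H_true = np.stack([frames < 15, frames >= 10]).astype(float)
# The score-based activation mask is widened by 3 frames to tolerate alignment errors.
H_mask = np.stack([frames < 18, frames >= 7])

W_true = W_mask * rng.uniform(0.5, 1.0, size=W_mask.shape)
V = W_true @ H_true                       # synthetic magnitude spectrogram

W0, H0 = W_mask.astype(float), H_mask.astype(float)
W, H = score_informed_nmf(V, W0, H0)
err_init = np.linalg.norm(V - W0 @ H0)
err_final = np.linalg.norm(V - W @ H)
```

Given the factors, one common way to extract note k is a soft mask, `V * np.outer(W[:, k], H[k]) / (W @ H + eps)`, which is likewise an assumption here rather than something stated on the slides.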

Experiments
- MIDI-synthesized piano music with randomly imposed alignment errors
  - Audio has accurate pitch, simple timbre
- Separate left-hand and right-hand notes

Discussions
Advantages
- Smart initialization of W and H
- Detailed timbre model using NMF
- Onset modeling
Disadvantages
- May be hard to deal with multi-instrument polyphonic audio
  - The same note can have different pitch and timbre
  - How many dictionary elements do we need then?

Soundprism [Duan & Pardo, 2011]
- Multi-pitch info of audio
- Particle filtering for alignment
- Pitch-based separation
- Online

Align Audio with Score
[Figure: axes are tempo (BPM) and score position (beats)]

A State Space Model
- Observations: audio frames y_1, ..., y_{n-1}, y_n
- States: s_n = (x_n, v_n), with score position x_n and tempo v_n
- Hidden Markov process: s_1 -> ... -> s_{n-1} -> s_n
- Inference by particle filtering

Transition Model
- Dynamical system with one equation for the position x_n and one for the tempo v_n
- The tempo equation has two cases:
  - If the score position x_n just passed a score note onset
  - Otherwise
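The equations themselves did not survive in this transcript, so the following is only a sketch of one plausible form for such a transition model. It assumes that the position advances by the tempo times the frame hop plus Gaussian noise, and that the tempo receives Gaussian noise only when a note onset has just been passed and otherwise stays unchanged. The noise scales, the hop size, and the function name are hypothetical.

```python
import numpy as np

def transition(x_prev, v_prev, onsets, hop_sec, rng, sigma_x=0.01, sigma_v=5.0):
    """One step of an assumed position/tempo dynamical system.

    x is the score position in beats, v the tempo in BPM.
    sigma_x and sigma_v are illustrative noise scales, not values from the slides.
    """
    # Position: move forward by (beats per second) * (seconds per frame), plus noise.
    x = x_prev + v_prev / 60.0 * hop_sec + rng.normal(0.0, sigma_x)
    # Tempo: perturbed only if a score note onset lies in (x_prev, x].
    passed_onset = bool(np.any((onsets > x_prev) & (onsets <= x)))
    v = v_prev + rng.normal(0.0, sigma_v) if passed_onset else v_prev
    return x, v

rng = np.random.default_rng(0)
onsets = np.array([0.0, 1.0, 2.0, 3.0])   # hypothetical note onsets in beats

# No onset between 0.50 and 0.52 beats: the tempo is carried over unchanged.
x_a, v_a = transition(0.50, 120.0, onsets, hop_sec=0.01, rng=rng, sigma_x=0.0)
# The onset at beat 1.0 is passed: the tempo is allowed to change.
x_b, v_b = transition(0.99, 120.0, onsets, hop_sec=0.01, rng=rng, sigma_x=0.0)
```

Restricting tempo changes to note onsets reflects the idea that within a sustained note the audio carries no evidence about tempo.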

Observation Model
- Deterministic step: the state determines the score pitches θ_n
- Probabilistic step: p(y_n | θ_n)
- p(y_n | θ_n) is the multi-pitch estimation model trained from thousands of random chords

Online Inference by Particle Filtering
- In the n-th frame, estimate the posterior p(s_n | Y_{1:n}) from past observations Y_{1:n} = (y_1, ..., y_n)
- Update p(s_n | Y_{1:n}) from p(s_{n-1} | Y_{1:n-1}) with a fixed number of particles
  - Move by p(s_n | s_{n-1}) (i.e., the dynamic equations), resample by p(y_n | s_n)
[Figure: particles in the tempo vs. score position plane at frame n-1 and frame n]
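The move-then-resample update can be sketched with a toy problem. Everything specific below is an assumption made so the example is self-contained: a monophonic score with one note per beat, an observation that is simply the sounding MIDI pitch, a Gaussian stand-in for the trained multi-pitch likelihood, and the noise scales in `move`.

```python
import numpy as np

def particle_filter_step(particles, y, move, likelihood, rng):
    """Move particles by the dynamics, then resample by the observation likelihood."""
    moved = move(particles, rng)
    weights = likelihood(y, moved)
    weights = weights / weights.sum()
    idx = rng.choice(len(moved), size=len(moved), p=weights)
    return moved[idx]

SCORE = np.array([60, 62, 64, 65, 67, 69, 71, 72])   # hypothetical score, one note per beat
HOP_SEC = 0.05                                        # assumed frame hop in seconds

def score_pitch(x):
    """MIDI pitch sounding at score position x (in beats)."""
    idx = np.clip(np.floor(x).astype(int), 0, len(SCORE) - 1)
    return SCORE[idx]

def move(particles, rng):
    x, v = particles[:, 0], particles[:, 1]
    x_new = x + v / 60.0 * HOP_SEC + rng.normal(0.0, 0.02, len(x))
    crossed = np.floor(x_new) > np.floor(x)           # passed a note onset
    v_new = np.clip(v + crossed * rng.normal(0.0, 5.0, len(v)), 30.0, 240.0)
    return np.column_stack([x_new, v_new])

def likelihood(y, particles):
    # Toy stand-in for p(y_n | s_n): Gaussian in semitones, floored to avoid all-zero weights.
    return np.exp(-0.5 * ((y - score_pitch(particles[:, 0])) / 0.3) ** 2) + 1e-12

rng = np.random.default_rng(0)
n_particles = 500
particles = np.column_stack([np.zeros(n_particles), rng.uniform(60, 180, n_particles)])

true_tempo = 90.0                                     # BPM of the simulated performance
for n in range(1, 85):
    true_pos = n * HOP_SEC * true_tempo / 60.0
    y = score_pitch(true_pos)                         # observation for frame n
    particles = particle_filter_step(particles, y, move, likelihood, rng)

estimated_pos = np.median(particles[:, 0])
```

Because each step uses only the current frame and the previous particle set, the estimate is available online, frame by frame.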

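Source Separation
1. Accurately estimate the performed pitches \hat{\theta}_n
   - Search around the score pitches \theta_n
$$\hat{\theta}_n = \arg\max_{\theta} \; p(y_n \mid \theta) \quad \text{s.t.} \quad \theta \in [\theta_n - 50\ \text{cents},\ \theta_n + 50\ \text{cents}]$$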
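For a single pitch, the constrained search can be sketched as a grid search over deviations of up to 50 cents from the score pitch. The likelihood is passed in as a function; `harmonic_salience` below is a toy stand-in chosen for this example and is not the trained multi-pitch model the slides refer to. The 1-cent step is also an assumption.

```python
import numpy as np

def refine_pitch(frame, sr, score_f0, likelihood, max_dev_cents=50, step_cents=1):
    """Pick the candidate within +/- max_dev_cents of score_f0 that maximizes likelihood."""
    cents = np.arange(-max_dev_cents, max_dev_cents + step_cents, step_cents)
    candidates = score_f0 * 2.0 ** (cents / 1200.0)   # 1200 cents per octave
    scores = np.array([likelihood(frame, sr, f) for f in candidates])
    return candidates[int(np.argmax(scores))]

def harmonic_salience(frame, sr, f0, n_harmonics=3):
    """Toy likelihood: summed spectral magnitude at the first few harmonics of f0."""
    t = np.arange(len(frame)) / sr
    windowed = frame * np.hanning(len(frame))
    return sum(abs(np.sum(windowed * np.exp(-2j * np.pi * h * f0 * t)))
               for h in range(1, n_harmonics + 1))

# Usage: the score says A4 (440 Hz) but the note is performed about 20 cents sharp.
sr = 22050
t = np.arange(4096) / sr
performed_f0 = 445.0
frame = sum(np.sin(2 * np.pi * h * performed_f0 * t) / h for h in (1, 2, 3))
estimate = refine_pitch(frame, sr, 440.0, harmonic_salience)
```

With several simultaneous score pitches the search would be joint over all of them, which this single-pitch sketch does not attempt.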

Reconstruct Source Signals
2. Allocate the mixture's spectral energy
   - Non-harmonic bins: to all sources, evenly
   - Non-overlapping harmonic bins: to the active source, solely
   - Overlapping harmonic bins: to the active sources, in inverse proportion to the square of their harmonic numbers
3. IFFT with the mixture's phase to return to the time domain
[Figure: amplitude vs. frequency bins, with binary harmonic-position masks for Source 1 (every 2nd bin) and Source 2 (every 3rd bin)]
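The three allocation rules can be sketched for a single frame as follows. Fundamental frequencies are given in units of frequency bins, matching the figure's example of harmonics at every 2nd and every 3rd bin. The function name, the one-bin harmonic width, and the choice to apply the energy shares as square-root gains on the complex spectrum are assumptions of this example.

```python
import numpy as np

def allocation_masks(n_bins, f0_bins, n_harmonics=20, half_width=0):
    """Return an (n_sources, n_bins) array of energy shares that sum to 1 in every bin."""
    n_src = len(f0_bins)
    weights = np.zeros((n_src, n_bins))
    for k, f0 in enumerate(f0_bins):
        for h in range(1, n_harmonics + 1):
            center = int(round(h * f0))
            if center - half_width >= n_bins:
                break
            lo, hi = max(center - half_width, 0), min(center + half_width, n_bins - 1)
            for b in range(lo, hi + 1):
                if weights[k, b] == 0:
                    weights[k, b] = 1.0 / h ** 2      # inverse square of harmonic number
    total = weights.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)
    # Harmonic bins: normalize among the sources that claim them.
    # Non-harmonic bins: split evenly among all sources.
    return np.where(total > 0, weights / safe_total, 1.0 / n_src)

# Usage: 13 bins, Source 1 with f0 = 2 bins and Source 2 with f0 = 3 bins.
masks = allocation_masks(13, [2, 3])

# Bin 6 is harmonic 3 of Source 1 and harmonic 2 of Source 2,
# so the shares are (1/9) : (1/4), i.e. 4/13 and 9/13.
share_bin6 = masks[:, 6]

# Reusing the mixture's phase: scale the complex mixture spectrum by real gains, then invert.
rng = np.random.default_rng(0)
mixture_frame = rng.standard_normal(24)
X = np.fft.rfft(mixture_frame)                        # 13 bins for a 24-sample frame
source_frames = np.fft.irfft(np.sqrt(masks) * X, axis=1)
```

Scaling by a real, non-negative gain leaves each bin's phase equal to the mixture's phase, which is what step 3 asks for.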

Experiments
- 10 pieces of J.S. Bach 4-part chorales
- Audio played by violin, clarinet, saxophone and bassoon, separately recorded and then mixed
- MIDI score downloaded
- Ground-truth alignment manually annotated
- 150 combinations = 40 solos + 60 duets + 40 trios + 10 quartets (per piece: 4 solos, 6 duets, 4 trios and 1 quartet from the 4 parts)

Source Separation Results
1. Proposed
2. Ideally-aligned
3. Ganseman et al., 2010 (offline algorithm)
4. Multi-pitch estimation & streaming-based separation (without score)
5. Oracle
[Figure: results at average input SDR of 0 dB, -3 dB, and -4.78 dB]

Soundprism
[Figure: single-channel polyphonic music separated into Source 1, Source 2, ..., Source N]
Example: J. Brahms, Clarinet Quintet in B minor, op. 115, 3rd movement

Interactive Music Editing

Discussions
Advantages
- Online system, potential for real-time applications
- Can deal with multi-instrument polyphonic audio
  - Multi-pitch info is used
Disadvantages
- Multi-pitch model cannot distinguish different parts of a note
  - No onset modeling, alignment not precise
- No timbre modeling in separation