Lecture 10 Harmonic/Percussive Separation


10420CS 573100 Music Information Retrieval (音樂資訊檢索)
Lecture 10: Harmonic/Percussive Separation
Yi-Hsuan Yang, Ph.D.
http://www.citi.sinica.edu.tw/pages/yang/
yang@citi.sinica.edu.tw
Music & Audio Computing Lab, Research Center for IT Innovation, Academia Sinica

Syllabus
W9: multiple instruments separation => dictionary-based methods: nonnegative matrix factorization (NMF) and friends
W10: harmonic/percussive separation (HPSS) => median filtering and friends
W11: singing voice separation => low-rank-based methods: robust principal component analysis (RPCA) and friends

Motivation: Drum Transcription
The drum track in popular music conveys information about tempo, rhythm, style, and possibly the structure of a song.
Image source: Wikipedia

Motivation: Beat, Tempo, Rhythmic Pattern
The drum track in popular music conveys information about tempo, rhythm, style, and possibly the structure of a song.
References: "Transcription and separation of drum signals from polyphonic music", TASLP 2008; "Techniques for machine understanding of live drum performances", TR 2012

Motivation: Drum Pattern Analysis
https://youtu.be/lm_oz7p8twe?t=12m28s
References: "A corpus-based study of rhythm patterns", ISMIR 2012; "Drum transcription via classification of bar-level rhythmic patterns", ISMIR 2014

Motivation: HPSS as a Pre-Processing Step
Figure panels: (a) original, (b) harmonic, (c) percussive. Figure from [Mueller, FPM, Chapter 8, Springer 2015]

Supervised NMF Approach
ENST drum dataset: http://www.tsi.telecom-paristech.fr/aao/en/2010/02/19/enstdrums-an-extensive-audio-visual-database-for-drum-signalsprocessing/
IDMT-SMT-Drums dataset: http://www.idmt.fraunhofer.de/en/business_units/smt/drums.html
Reference: "Nonnegative matrix partial cofactorization for spectral and temporal drum source separation", JSTSP 2011

Unsupervised Median-Filtering Approach
Simple DSP techniques; no ML
Intuition: stable harmonic or stationary components form horizontal ridges on the spectrogram; percussive components form vertical ridges with a broadband frequency response
References: "Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram", EUSIPCO 2008; "Harmonic/percussive separation using median filtering", DAFx 2010

Filtering
Filtering: mean, max, median; finite filter length
e.g. filter of length 3 (one output value per full window of 3 samples)
input:  [0 0 1 0 1 0 1 1 0 1 1 1 0]
median: [  0 0 1 0 1 1 1 1 1 1 1  ]
max:    [  1 1 1 1 1 1 1 1 1 1 1  ]
Pooling: filter length = number of samples
e.g. [9 0 1 3 1 2 5] -> {mean=3, max=9, median=?} (note: to calculate the median you need to sort the values)
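The sliding-window filters on this slide can be sketched in a few lines of plain Python. This is only an illustration: the helper name `sliding_filter` is made up here, and it assumes no padding, so only windows that fit entirely inside the input produce an output value.

```python
from statistics import mean, median

def sliding_filter(signal, length, reducer):
    """Apply `reducer` (e.g. median, max, mean) to every full window of `length` samples.

    No padding is used, so the output has len(signal) - length + 1 values.
    """
    return [reducer(signal[i:i + length]) for i in range(len(signal) - length + 1)]

# The length-3 example from the slide.
x = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0]
median_out = sliding_filter(x, 3, median)
max_out = sliding_filter(x, 3, max)

# Pooling: the filter length equals the number of samples, so there is one output.
y = [9, 0, 1, 3, 1, 2, 5]
pooled = {"mean": mean(y), "max": max(y), "median": median(y)}
print(median_out, max_out, pooled)
```

Note that the median removes isolated 0s and 1s while keeping longer runs, which is the property HPSS relies on.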

HPSS via Median Filtering
Figure panels: ideal harmonic signal (violin); ideal percussive signal (castanets). Figure from [Mueller, FPM, Chapter 8, Springer 2015]

HPSS via Median Filtering Figure from [Mueller, FPM, Chapter 8, Springer 2015]


HPSS via Median Filtering
Figure panels (violin + castanets): smaller filter length; larger filter length. Figure from [Mueller, FPM, Chapter 8, Springer 2015]

HPSS via Median Filtering
Figure panels (violin + castanets): binary mask; soft mask. Figure from [Mueller, FPM, Chapter 8, Springer 2015]
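A minimal sketch of median-filtering HPSS on a toy magnitude spectrogram may help make the figures concrete. It is not the lecture's or librosa's implementation: the function names (`median_filter_1d`, `hpss_toy`), the border handling (repeating edge values), the filter lengths, and the 7 x 7 toy input are all assumptions chosen for illustration. The binary and soft masks follow the usual definitions: a bin is harmonic when the horizontally filtered value is at least as large as the vertically filtered one, and the soft mask is the relative weight of the two.

```python
from statistics import median

def median_filter_1d(values, length):
    """Median-filter a sequence with an odd `length`, repeating border values at the edges."""
    half = length // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + length]) for i in range(len(values))]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def hpss_toy(spec, length_h=3, length_p=3, eps=1e-10):
    """spec[k][n] is the magnitude at frequency bin k and time frame n."""
    # Filtering along time (each row) keeps horizontal ridges and removes clicks.
    harm = [median_filter_1d(row, length_h) for row in spec]
    # Filtering along frequency (each column) keeps vertical ridges and removes tones.
    perc = transpose([median_filter_1d(col, length_p) for col in transpose(spec)])

    # Binary mask: harmonic when the harmonic estimate is at least as large.
    binary_h = [[1 if h >= p else 0 for h, p in zip(hr, pr)]
                for hr, pr in zip(harm, perc)]
    # Soft mask: relative weight of the harmonic estimate, between 0 and 1.
    soft_h = [[(h + eps / 2) / (h + p + eps) for h, p in zip(hr, pr)]
              for hr, pr in zip(harm, perc)]

    harmonic = [[x * m for x, m in zip(xr, mr)] for xr, mr in zip(spec, binary_h)]
    percussive = [[x * (1 - m) for x, m in zip(xr, mr)] for xr, mr in zip(spec, binary_h)]
    return harmonic, percussive, soft_h

# Toy spectrogram: a steady tone in bin 3 (a row) plus a click in frame 3 (a column).
size = 7
spec = [[1.0 if k == 3 or n == 3 else 0.0 for n in range(size)] for k in range(size)]
harmonic, percussive, soft_h = hpss_toy(spec)
```

With the binary mask, the tone row ends up in `harmonic` and the rest of the click column in `percussive`; the bin where they cross is a tie and goes to the harmonic part. The soft mask gives that bin a weight of 0.5 instead.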

Parameters
Window size, hop size
Reconstruction method
Filter length
Q1: Given sampling rate = 44.1 kHz, FFT window size = 4096 samples, and hop size = 1024 samples, what is the physical meaning of using a vertical (percussive) median filter of length 17?
Q2: What is the physical meaning of using a horizontal (harmonic) median filter of length 17?

Implementation
http://bmcfee.github.io/librosa/generated/librosa.decompose.hpss.html
Answers to the questions on the previous page: Q1: 183 Hz; Q2: 0.395 seconds
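The two answers follow from the STFT grid: one frequency bin spans (sampling rate / FFT size) Hz and one frame step spans (hop size / sampling rate) seconds. A small sketch of that arithmetic, with a hypothetical helper name `filter_extent`:

```python
def filter_extent(sr, n_fft, hop, length_p, length_h):
    """Return the span of the percussive filter in Hz and of the harmonic filter in seconds."""
    bin_width_hz = sr / n_fft   # spacing between adjacent frequency bins
    frame_step_s = hop / sr     # spacing between adjacent time frames
    return length_p * bin_width_hz, length_h * frame_step_s

span_hz, span_s = filter_extent(sr=44100, n_fft=4096, hop=1024, length_p=17, length_h=17)
print(round(span_hz), "Hz,", round(span_s, 3), "s")
```

So a vertical filter of length 17 takes the median over a band about 183 Hz wide, and a horizontal filter of length 17 takes the median over about 0.395 seconds of audio.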

Limitation Supervised Method?

Extension: Adding a Residual Component
http://www.audiolabs-erlangen.de/resources/2014-ISMIR-ExtHPSep/
The harmonic component contains the violin, the percussive component contains the castanets, and the residual contains the applause.
Figure panels: Harmonic; Residual; Percussive
Reference: "Extending harmonic-percussive separation of audio signals", ISMIR 2014
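One way to obtain a residual, sketched below, is to make the binary masks stricter with a separation factor: a bin counts as harmonic (or percussive) only if its filtered estimate exceeds the other by a factor `beta`, and everything else falls into the residual. This follows my reading of the cited ISMIR 2014 paper; the function name `three_way_masks`, the default `beta`, and the toy values are assumptions for illustration only. The inputs are the horizontally and vertically median-filtered spectrograms.

```python
def three_way_masks(harm, perc, beta=2.0, eps=1e-10):
    """harm / perc: median-filtered spectrograms (lists of rows) of equal shape.

    Returns binary masks (harmonic, percussive, residual) that sum to 1 in every bin.
    """
    mask_h, mask_p, mask_r = [], [], []
    for hr, pr in zip(harm, perc):
        row_h = [1 if h / (p + eps) > beta else 0 for h, p in zip(hr, pr)]
        row_p = [1 if p / (h + eps) >= beta else 0 for h, p in zip(hr, pr)]
        # Bins that are neither clearly harmonic nor clearly percussive.
        row_r = [1 - (a + b) for a, b in zip(row_h, row_p)]
        mask_h.append(row_h)
        mask_p.append(row_p)
        mask_r.append(row_r)
    return mask_h, mask_p, mask_r

# One row with a clearly harmonic bin, a clearly percussive bin, and an ambiguous bin.
mask_h, mask_p, mask_r = three_way_masks([[4.0, 1.0, 1.0]], [[1.0, 4.0, 1.2]], beta=2.0)
```

With `beta` close to 1 the residual is nearly empty and the result approaches ordinary HPSS; larger values push more energy into the residual.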

Extension: Separating the Vocals
The vocal component (V) is smooth in time and non-smooth in frequency in the short-frame (but not the long-frame) STFT domain.

Extension: Separating the Vocals
Singing voice: an intermediate component between harmonic and percussive
Perform HPSS twice, on spectrograms with two different time-frequency resolutions
Reference: "Singing voice enhancement in monaural music signals based on two-stage harmonic/percussive sound separation on multiple resolution spectrograms", TASLP 2014


Extension: Smoothing
Frame-level prediction => note-level prediction
Methods: median filtering, or HMM
http://c4dm.eecs.qmul.ac.uk/ismir15-amt-tutorial/
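As a rough sketch of the median-filtering option, the snippet below smooths a binary frame-level activation sequence for one pitch and then groups consecutive active frames into note segments. The helper names, the filter length of 3, the border handling, and the example sequence are all assumptions for illustration.

```python
from statistics import median

def smooth_frames(active, length=3):
    """Median-filter a 0/1 frame sequence (odd `length`), repeating border values."""
    half = length // 2
    padded = [active[0]] * half + list(active) + [active[-1]] * half
    return [int(median(padded[i:i + length])) for i in range(len(active))]

def frames_to_notes(active):
    """Group consecutive active frames into (onset_frame, offset_frame) pairs, offset exclusive."""
    notes, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            notes.append((start, i))
            start = None
    if start is not None:
        notes.append((start, len(active)))
    return notes

frames = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]
raw_notes = frames_to_notes(frames)
smoothed = smooth_frames(frames)
notes = frames_to_notes(smoothed)
print(raw_notes, notes)
```

The one-frame gap is filled and the one-frame blip is removed, so three fragmented detections become a single note. An HMM would instead model note on/off transitions probabilistically.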