Lecture 15: Research at LabROSA

ELEN E4896 MUSIC SIGNAL PROCESSING
Lecture 15: Research at LabROSA
1. Sources, Mixtures, & Perception
2. Spatial Filtering
3. Time-Frequency Masking
4. Model-Based Separation
Dan Ellis, Dept. Electrical Engineering, Columbia University
dpwe@ee.columbia.edu
http://www.ee.columbia.edu/~dpwe/e4896/
E4896 Music Signal Processing (Dan Ellis), 2014-05-05

Sparse + Low-Rank + NMF (Zhuo Chen)
Optimization to decompose the spectrogram:
minimize $\|S\|_1 + \|L\|_* + D_{KL}(Y - S - L \,\|\, HW)$ s.t. $Y = S + L + HW$
[Figure panels: Y, HW, L, S]

Beta Process NMF (Liang, Hoffman)
Automatically choose how many components to use
$X = D(S \odot Z) + E$

Music Complexity (Colin Raffel)
How can we capture musical patterns in the Million Song Dataset?
Network analysis of quantized simultaneities, after Serrà et al. 2012
[Figure from Serrà, Corral, Boguña, Haro, & Arcos, 2012]

Large-Scale Cover Recognition 1 (Thierry Bertin-Mahieux)
How can we find covers in 1M songs?
At 1 sec / comparison, one search = 11.5 CPU-days
Full $N^2$ mining = 16,000 CPU-years
Need a hashing technique:
  landmark-based description of chroma patches
  Euclidean space projection?
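
The cost figures on this slide can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 1 second per comparison and that "full mining" means comparing every unordered pair of songs once; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def search_cost_days(n_songs, sec_per_comparison=1.0):
    """CPU-days needed to compare one query against every song."""
    return n_songs * sec_per_comparison / SECONDS_PER_DAY

def all_pairs_cost_years(n_songs, sec_per_comparison=1.0):
    """CPU-years needed to compare every unordered pair of songs once."""
    n_pairs = n_songs * (n_songs - 1) / 2
    return n_pairs * sec_per_comparison / SECONDS_PER_YEAR

one_search = search_cost_days(1_000_000)
full_mining = all_pairs_cost_years(1_000_000)
print(f"{one_search:.1f} CPU-days per search, {full_mining:,.0f} CPU-years for all pairs")
```

Both results land close to the rounded numbers quoted on the slide, which is why a fixed-size, hashable description of each song is needed instead of pairwise comparison.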

Large-Scale Cover Recognition 2 (Thierry Bertin-Mahieux)
2D Fourier Transform Magnitude (2DFTM): fixed-size feature to capture the essence of the chromagram
First results on finding covers in 1M songs:

Method            Average rank   Mean AP
random                 500,000     0.000
jumpcodes 2            308,369     0.002
2DFTM (50 PC)          137,117     0.020
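
A short sketch of why the 2DFTM makes a convenient fixed-size feature: circularly shifting a chroma patch along the pitch-class axis (transposition) or the time axis changes only the phase of its 2D Fourier transform, not the magnitude. The patch size of 12 x 75 and the random data are assumptions for illustration, not values taken from the slide.

```python
import numpy as np

def two_d_ftm(chroma_patch):
    """Magnitude of the 2D Fourier transform of a (12 x n_beats) chroma patch."""
    return np.abs(np.fft.fft2(chroma_patch))

rng = np.random.default_rng(0)
patch = rng.random((12, 75))             # hypothetical beat-synchronous chroma patch
transposed = np.roll(patch, 3, axis=0)   # circular shift of pitch classes (transposition)
shifted = np.roll(patch, 10, axis=1)     # circular shift in time

feat = two_d_ftm(patch)
same_under_transposition = np.allclose(feat, two_d_ftm(transposed))
same_under_time_shift = np.allclose(feat, two_d_ftm(shifted))
print(feat.shape, same_under_transposition, same_under_time_shift)
```

The "PC" in the results table refers to a further reduction by principal components, which this sketch does not attempt.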

Jazz Discography Project
How can MIR help organize jazz collections?
Our tools are quite genre-specific, e.g. the beat tracker is fine for pop, useless for jazz

Local Tagging
MFCC-statistics classifiers on 5 sec windows, trained from MajorMiner data
[Figure: spectrogram of "Soul Eyes" (freq / Hz vs. time / s) above per-window classifier outputs for MajorMiner tags such as jazz, piano, saxophone, drum, guitar, and voice]

Onset Correlation (Brian McFee)
Ahead of or behind the beat?
[Figure panels: Tony Williams, Elvin Jones]

Structural Similarity (Diego Silva, Helene Papadopoulos)
Self-similarity shows repeating structure in music
Can we find similar pieces by finding similar structures?
[Figure from Bello 2011]
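
To make "self-similarity shows repeating structure" concrete, here is a minimal sketch that builds a cosine self-similarity matrix from a toy A-B-A feature sequence. The random 12-dimensional features and the section length of 8 are assumptions for illustration; real use would start from beat-synchronous chroma or similar features.

```python
import numpy as np

def self_similarity(features):
    """Cosine similarity between every pair of columns (frames or beats)."""
    norms = np.linalg.norm(features, axis=0, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    return unit.T @ unit

# Toy piece with form A B A: the second A repeats the first exactly.
rng = np.random.default_rng(0)
section_a = rng.random((12, 8))
section_b = rng.random((12, 8))
features = np.hstack([section_a, section_b, section_a])

ssm = self_similarity(features)
repeat_stripe = np.diag(ssm, k=16)   # similarity of beat i with beat i + 16
print(ssm.shape, repeat_stripe.round(2))
```

The repeated section shows up as an off-diagonal stripe of high similarity, offset from the main diagonal by the distance between the two A sections.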

Ordinal LDA Segmentation (McFee)
Low-rank decomposition of skewed self-similarity to identify repeats
Learned weighting of multiple factors to segment
Linear Discriminant Analysis between adjacent segments
[Figure panels: Self-similarity (Beat vs. Beat), Filtered self-sim., Skewed self-sim. (Lag vs. Beat), Latent repetition (Factor vs. Beat)]
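
The "skewed" self-similarity can be sketched as a re-indexing of the matrix by lag, so that a repetition becomes a horizontal line at a fixed lag instead of a diagonal stripe. This sketch uses a circular lag for simplicity, which is an assumption and may differ from the signed-lag layout in the slide's figure; the filtering and low-rank decomposition steps are not shown.

```python
import numpy as np

def cosine_ssm(features):
    """Cosine self-similarity between columns of a feature matrix."""
    norms = np.linalg.norm(features, axis=0, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    return unit.T @ unit

def skew(ssm):
    """Re-index S[i, j] as L[lag, i] = S[i, (i + lag) mod n] (circular lag)."""
    n = ssm.shape[0]
    beats = np.arange(n)[None, :]   # column index: beat i
    lags = np.arange(n)[:, None]    # row index: lag
    return ssm[beats, (beats + lags) % n]

# Toy A B A sequence with 8 beats per section (hypothetical data).
rng = np.random.default_rng(0)
section_a = rng.random((12, 8))
section_b = rng.random((12, 8))
ssm = cosine_ssm(np.hstack([section_a, section_b, section_a]))

lag_matrix = skew(ssm)
print(lag_matrix[16, :8].round(2))   # the A...A repeat sits on the lag-16 row
```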

Lyric Recognition (Matt McVicar)
Speech recognition for songs:
  lots of interference
  atypical speech
[Figure panels: Polyphonic Audio, Acapella Audio, Natural Speech, Synthesized Speech; Frequency (kHz) vs. Time (seconds)]

Singing ASR (McVicar)
Speech recognition adapted to singing needs aligned data
Align scraped acapellas and full mix, including jumps

"Remixavier" (Raffel)
Optimal align-and-cancel of mix and acapella
Timing and channel may differ
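
A much-simplified sketch of the align-and-cancel idea: estimate the time offset of the acapella within the mix by cross-correlation, estimate a single gain by least squares, and subtract. The slide notes that timing and channel may differ; this sketch assumes only a constant integer delay and a scalar gain, so it does not represent the actual Remixavier method. All names and the toy signals are hypothetical.

```python
import numpy as np

def delay(signal, lag, length):
    """Place `signal` at sample offset `lag` inside a zero buffer of `length`."""
    out = np.zeros(length)
    if lag >= 0:
        n = min(len(signal), length - lag)
        out[lag:lag + n] = signal[:n]
    else:
        n = min(len(signal) + lag, length)
        out[:n] = signal[-lag:-lag + n]
    return out

def align_and_cancel(mix, acapella):
    """Return (lag, gain, residual) after removing the aligned acapella."""
    xcorr = np.correlate(mix, acapella, mode="full")
    lag = int(np.argmax(xcorr)) - (len(acapella) - 1)   # index len-1 is zero lag
    aligned = delay(acapella, lag, len(mix))
    gain = np.dot(mix, aligned) / np.dot(aligned, aligned)
    return lag, gain, mix - gain * aligned

# Toy example: noise stands in for both the vocal and the backing track.
rng = np.random.default_rng(0)
vocals = rng.standard_normal(2000)
backing = rng.standard_normal(2100)
mix = backing.copy()
mix[37:2037] += 0.8 * vocals

lag, gain, residual = align_and_cancel(mix, vocals)
print(lag, round(gain, 2))
```

Handling a differing channel would mean replacing the scalar gain with an estimated filter, and handling drifting timing would mean estimating the lag locally instead of once for the whole track.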

Million Song Dataset: Many Facets (Bertin-Mahieux, McFee)
Echo Nest: audio features + metadata
Echo Nest taste profile: user-song-listen count
Second Hand Song: covers
musixmatch: lyric BoW
last.fm: tags
Now with audio? Resolving artist / album / track / duration against what.cd

MIDI-to-MSD (Raffel, Shi)
Aligned MIDI to Audio is a nice transcription

De-DTMF
Problem: stationary tones confuse the speech detector
Adaptively filter sinusoids with steady amplitude
Processing chain: Input audio, Framing, LPC fit, Find roots, Transform radii, Map to zeros, Add poles, Filter audio frames, Overlap-add, Output audio
[Figure panels: Input audio (tcp_d1_2_counting_cia_irdial; Frequency vs. Time), Spectrum and LPC fit (Gain / dB vs. Freq / Hz), LPC poles and detail (Imaginary Part vs. Real Part), Mapped radius, Transformed filter and detail, Filter response & spectrum, Filtered signal]
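
The chain above can be sketched for a single frame as follows: fit an LPC model, find its poles, treat poles very close to the unit circle as stationary tones, and build a notch filter with zeros at those poles and new poles slightly inside them. The hard radius threshold, the fixed replacement pole radius, and the model order are assumptions standing in for the slide's "transform radii" mapping, whose exact form is not given; framing and overlap-add are omitted.

```python
import numpy as np
from scipy.signal import lfilter

def lpc(frame, order):
    """Autocorrelation-method LPC; returns A(z) coefficients [1, a1, ..., ap]."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(toeplitz, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def tone_notch(frame, order=2, radius_thresh=0.98, pole_radius=0.95):
    """Notch filter (b, a) cancelling LPC poles that lie near the unit circle."""
    poles = np.roots(lpc(frame, order))
    tones = poles[np.abs(poles) > radius_thresh]                  # assumed radius rule
    b = np.atleast_1d(np.real(np.poly(tones)))                    # zeros on the tone poles
    a = np.atleast_1d(np.real(np.poly(pole_radius * tones / np.abs(tones))))
    return b, a

# Toy frame: one steady tone plus a little noise (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(2000)
frame = np.sin(2 * np.pi * 0.1 * t) + 0.01 * rng.standard_normal(len(t))

b, a = tone_notch(frame)
filtered = lfilter(b, a, frame)
energy_ratio = np.sum(filtered ** 2) / np.sum(frame ** 2)
print(f"energy after / before: {energy_ratio:.3f}")
```

A real DTMF signal contains two tones over speech, so a higher LPC order would be needed, and the filter would be re-estimated per frame and recombined by overlap-add.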

Pitch-based Filtering
Resample to flatten pitch, then filter

Summary
Signal Separation: NMF, RPCA, cancellation, filtering
Music Information: beat tracking, segmentation
Large datasets: indexing & retrieval
Speech: lyric recognition; speech detection & enhancement

References
[Bello 2011] J. P. Bello, "Measuring structural similarity in music," IEEE Tr. Audio, Speech, & Lang., 19(7): 2013-2025, 2011.
[Serrà et al. 2012] J. Serrà, A. Corral, M. Boguña, M. Haro, & J. Arcos, "Measuring the evolution of contemporary western popular music," Scientific Reports, 2:521, 2012.