Video-based Vibrato Detection and Analysis for Polyphonic String Music

Video-based Vibrato Detection and Analysis for Polyphonic String Music
Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan
Audio Information Research Lab, University of Rochester
The 18th International Society for Music Information Retrieval Conference, Oct 23-27, 2017, Suzhou, China

Introduction: Vibrato in Music
- Important artistic effect
- Pitch modulation of a note in a periodic fashion
- Characterized by rate and extent
- [Figure: audio spectrograms of a non-vibrato note and a vibrato note]
Applications of vibrato analysis
- Musicological studies
- Sound synthesis
- Voice extraction

Introduction: Problem Statement
Vibrato detection and analysis for polyphonic music played by string instruments
Vibrato detection
- Note-level vibrato/non-vibrato classification
Vibrato analysis
- Vibrato rate: speed of pitch variation (1/T Hz)
- Vibrato extent: amount of pitch variation (A cents)
- [Figure: pitch versus time, with period T and amplitude A marked]
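
The two quantities defined above can be illustrated on a synthetic pitch contour. This is a minimal sketch, not the authors' code: it assumes that A denotes the amplitude of the pitch deviation around the note's center pitch (half the peak-to-peak range), and the function names and the 440 Hz / 6 Hz / 30 cents values are hypothetical.

```python
import math

def hz_to_cents(f_hz, ref_hz):
    """Deviation of f_hz from ref_hz in musical cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / ref_hz)

def synth_vibrato(center_hz, rate_hz, extent_cents, n_frames, fps):
    """Sinusoidal pitch contour: rate_hz = 1/T, extent_cents = A (assumed amplitude)."""
    return [
        center_hz * 2.0 ** (extent_cents * math.sin(2.0 * math.pi * rate_hz * k / fps) / 1200.0)
        for k in range(n_frames)
    ]

# Hypothetical note: A4 with a 6 Hz, 30-cent vibrato, sampled at 100 frames per second.
contour_hz = synth_vibrato(440.0, 6.0, 30.0, 100, 100.0)
contour_cents = [hz_to_cents(f, 440.0) for f in contour_hz]

# Extent read back from the contour as half the peak-to-peak deviation.
measured_extent = (max(contour_cents) - min(contour_cents)) / 2.0
```

The measured extent is slightly below 30 cents because the sampled frames do not land exactly on the sinusoid's peaks.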

Introduction: Prior Audio-based Methods
- Score-informed [Abeßer et al. 2015] (baseline)
- Template-based [Driedger et al. 2016]
- Harmonic partial [Hsu et al. 2010]
Major drawbacks
- Only one source from the mixture
- Fail in high polyphony

Proposed Method: Overview and Key Contribution
[Figure: panels over 0 to 1.2 sec showing the ground-truth pitch, the audio-based polyphonic spectrogram, and the video-based hand displacement]

Proposed Method: Overview
Video-based method
[Diagram: Score Alignment, Track Association, Motion Feature Extraction, Vibrato Detection, Vibrato Analysis (Rate, Extent)]

Proposed Method: Score Alignment
- Chroma feature
- Dynamic Time Warping
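
The slide names the two ingredients of the alignment step without further detail. As a rough illustration, the sketch below runs textbook dynamic time warping over two sequences of 12-dimensional chroma-like vectors. The cosine distance, the step pattern, and the one-hot toy vectors are assumptions made for this example, not settings taken from the talk.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)

def dtw_path(seq_a, seq_b, dist=cosine_distance):
    """Return (total cost, warping path) aligning seq_a to seq_b."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path

def one_hot(pitch_class):
    """Toy chroma vector with all energy in a single pitch class."""
    v = [0.0] * 12
    v[pitch_class] = 1.0
    return v

# Toy score (C, E, G) against a toy performance that holds the first and last notes longer.
score = [one_hot(0), one_hot(4), one_hot(7)]
audio = [one_hot(0), one_hot(0), one_hot(4), one_hot(7), one_hot(7)]
total_cost, path = dtw_path(score, audio)
```

Each pair in `path` maps a score frame index to an audio frame index, which is what lets score note boundaries be transferred to the recording.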

Proposed Method: Track-player Association
- Bow motion <--> Score onset
- Previous work [Li et al. 2017]

Proposed Method: Motion Feature Extraction
Hand tracking
- KLT tracker with 30 feature points
- Bounding box: 70 x 70 pixels

Proposed Method: Motion Feature Extraction
Fine-grained motion capture
- Optical flow estimation -> pixel-level motion velocities
- Frame-wise average
- Subtract moving mean
- [Figure: original frame; color-encoded optical flow; resulting motion velocity v(t)]
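
The two post-processing steps listed above can be sketched on plain Python lists. The formulas on the original slide did not survive extraction, so the centered moving-average window and its length below are assumptions, and the function names are hypothetical. Each frame is represented as a list of per-pixel (dx, dy) velocities.

```python
def framewise_mean(flow_frames):
    """Average the (dx, dy) velocities over all pixels of each frame."""
    averaged = []
    for frame in flow_frames:
        n = len(frame)
        averaged.append((sum(p[0] for p in frame) / n, sum(p[1] for p in frame) / n))
    return averaged

def subtract_moving_mean(values, window):
    """Remove slow drift by subtracting a centered moving average (window is assumed)."""
    half = window // 2
    detrended = []
    for t in range(len(values)):
        lo, hi = max(0, t - half), min(len(values), t + half + 1)
        local = values[lo:hi]
        detrended.append(values[t] - sum(local) / len(local))
    return detrended

# Toy input: two frames with two pixels each.
frames = [[(1.0, 2.0), (3.0, 4.0)], [(0.0, 0.0), (2.0, -2.0)]]
mean_velocity = framewise_mean(frames)

# A pure linear drift is removed everywhere except at the truncated edges.
drift_removed = subtract_moving_mean([0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

Subtracting the moving mean is what separates the fast vibrato oscillation from slower hand movement such as position shifts.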

Proposed Method: Vibrato Detection
Method 1: Supervised framework
- Support Vector Machine (SVM)
- 8-D feature: zero-crossing rate (4-D), frequency (2-D), auto-correlation peaks (2-D)
- Leave-one-out training strategy
- [Diagram: note segment -> 8-D feature -> classifier -> vibrato / non-vibrato]
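
The slide lists the feature types but not how each is computed or how the 8 dimensions are split up. The sketch below shows one plausible building block, the zero-crossing rate of a 1-D motion curve, together with a generic leave-one-out split. Both functions are hypothetical helpers; in particular, which unit is held out (a note, a player, or a piece) is an assumption left to the caller.

```python
def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)
    return crossings / (len(signal) - 1)

def leave_one_out(items):
    """Yield (training items, held-out item) once for every item."""
    for i in range(len(items)):
        yield items[:i] + items[i + 1:], items[i]

# An oscillating curve crosses zero often; a one-directional drift never does.
zcr_oscillating = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])
zcr_drifting = zero_crossing_rate([1.0, 2.0, 3.0])

# Hypothetical piece identifiers, held out one at a time.
splits = list(leave_one_out(["piece_a", "piece_b", "piece_c"]))
```

A classifier such as an SVM would be trained on the features of each training split and evaluated on the held-out item.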

Proposed Method: Vibrato Detection
Method 2: Unsupervised framework
- Principal Component Analysis (PCA): 1-D motion velocity curve
- Integration -> motion displacement curve X(t)
- [Figure: X(t) over 0.2 to 1.2 sec]
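
To make the two steps concrete, the sketch below projects 2-D velocities onto their principal axis and then integrates the resulting 1-D curve. It is an illustration under assumptions: the closed-form angle for a 2 x 2 covariance matrix stands in for a general PCA routine, integration is a plain cumulative sum, and the toy velocities and function names are hypothetical. How the method turns X(t) into a vibrato decision is not stated on the slide and is not shown here.

```python
import math

def principal_axis_2d(points):
    """Mean and unit vector of the largest-variance direction of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def project(points, mean, axis):
    """1-D coordinate of each centered point along the axis."""
    return [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1] for p in points]

def integrate(velocity, dt):
    """Cumulative sum of velocity * dt, giving a displacement curve."""
    total, displacement = 0.0, []
    for v in velocity:
        total += v * dt
        displacement.append(total)
    return displacement

# Toy velocities oscillating along the direction (1, 0.5).
velocity_2d = [(math.cos(0.5 * k), 0.5 * math.cos(0.5 * k)) for k in range(40)]
mean, axis = principal_axis_2d(velocity_2d)
velocity_1d = project(velocity_2d, mean, axis)
displacement = integrate(velocity_1d, 1.0 / 29.97)
```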

Proposed Method: Vibrato Analysis
Rate
- Motion rate = Vibrato rate
- Peak distance on auto-correlation of motion curve X(t)
- Quadratic interpolation
- [Figure: ground-truth pitch contour and motion displacement curve X(t), 0 to 1.2 sec]
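
The rate estimate described above can be sketched as follows: compute the auto-correlation of the motion curve, locate its first peak after lag 0, refine the peak position with a parabola through the three surrounding samples, and convert the lag to Hz. The peak-picking rule and the synthetic 6 Hz test curve are assumptions of this sketch; only the 29.97 fps frame rate comes from the dataset description.

```python
import math

def autocorrelation(x):
    """Unnormalized auto-correlation of the mean-removed signal for all lags."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(n)]

def parabolic_peak(y, k):
    """Refine integer peak index k with a parabola through y[k-1], y[k], y[k+1]."""
    denom = y[k - 1] - 2.0 * y[k] + y[k + 1]
    if denom == 0.0:
        return float(k)
    return k + 0.5 * (y[k - 1] - y[k + 1]) / denom

def estimate_rate(x, fps):
    """Oscillation rate in Hz from the first positive auto-correlation peak, or None."""
    r = autocorrelation(x)
    for k in range(1, len(r) - 1):
        if r[k] > 0 and r[k] > r[k - 1] and r[k] >= r[k + 1]:
            return fps / parabolic_peak(r, k)
    return None

# Synthetic motion curve: 6 Hz oscillation, 36 frames (about 1.2 sec) at 29.97 fps.
fps = 29.97
motion = [math.sin(2.0 * math.pi * 6.0 * k / fps) for k in range(36)]
rate_hz = estimate_rate(motion, fps)
```

At video frame rates a vibrato period spans only about five frames, so the sub-frame refinement from the interpolation matters for the accuracy of the rate.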

Proposed Method: Vibrato Analysis
Extent
- Motion extent vs. vibrato extent: pixels -> musical cents
- Scale motion curve X(t) to fit pitch contour
- [Figure: ground-truth pitch contour, estimated pitch contour, and motion displacement curve X(t), with motion extent and estimated vibrato extent marked]
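
The slide states that the motion curve is scaled to fit a pitch contour but not how the scale is chosen. One simple possibility, assumed here purely for illustration, is a least-squares scale factor between the mean-removed motion curve (pixels) and the mean-removed pitch contour (cents). All names and numbers below are hypothetical.

```python
def center(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def fit_scale(motion_px, pitch_cents):
    """Least-squares cents-per-pixel factor between the two centered curves."""
    x, p = center(motion_px), center(pitch_cents)
    denom = sum(v * v for v in x)
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, p)) / denom

def extent_in_cents(motion_px, scale):
    """Half the peak-to-peak range of the scaled motion curve."""
    scaled = [scale * v for v in center(motion_px)]
    return (max(scaled) - min(scaled)) / 2.0

# Toy curves: 2-pixel motion amplitude against a 30-cent pitch oscillation.
motion_px = [0.0, 2.0, 0.0, -2.0, 0.0, 2.0, 0.0, -2.0]
pitch_cents = [10.0, 40.0, 10.0, -20.0, 10.0, 40.0, 10.0, -20.0]
scale = fit_scale(motion_px, pitch_cents)
extent = extent_in_cents(motion_px, scale)
```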

Demo of Dataset
Dataset: URMP Dataset
- Individually recorded in sound booth
- Annotated frame-level / note-level pitch

Demo of Dataset
Dataset: URMP Dataset
- Assembled together with concert stage background

Experiments: Vibrato Detection Results
Overall evaluation
- Proposed video-based method -> 92% F-measure
- Improvement over audio-based method
- SVM > PCA
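
For reference, the F-measure reported above is the harmonic mean of precision and recall over note-level decisions. The sketch below uses the standard definition; the counts in the example are hypothetical and are not the experiment's confusion matrix.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical counts: 8 vibrato notes found, 2 false alarms, 2 missed.
example_f = f_measure(8, 2, 2)
```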

Experiments: Vibrato Detection Results
Impact of polyphony number
- [Figure: baseline vs. proposed for polyphony numbers 2, 3, 4, 5]
- Audio-based method: performance depends on the polyphony number
- Proposed video-based method: robust

Experiments: Vibrato Detection Results
Variation based on type of instrument
- [Figure: baseline vs. proposed for violin, viola, cello, bass]
- Audio-based method: performance depends on pitch range
- Proposed video-based method: robust

Experiments: Vibrato Analysis Results
Vibrato rate / extent
- 2290 vibrato notes
- Rate error: 0.38 Hz
- Extent error: 3.47 cents

Conclusions
- The proposed video-based vibrato detection/analysis offers significant improvement over conventional audio-only analysis
- Compared to audio-based methods, the proposed video-based method is robust for polyphonic sources and for different types of instruments
- The proposed method provides good estimates of vibrato rate and extent
- A powerful tool for analyzing string ensembles

Thank you!

Experiments: Dataset
URMP Dataset
- 19 string ensembles (57 tracks): 5 duets, 4 trios, 7 quartets, 3 quintets
- Audio: 48 kHz
- Video: 1080p, 29.97 fps

Demo of Dataset
Dataset: URMP Dataset
- 14 instruments, 44 piece arrangements

Experiments: Results
Potential application in musicology
- Vibrato characteristics for different instruments
- Tested on true positives (TPs) from the Vid-PCA method: 2290 vibrato notes
- Average error: 0.38 Hz / 3.47 cents
- Double bass -> lower rate / extent [1]
[1] James Paul Mick. An analysis of double bass vibrato: Rates, widths, and pitches as influenced by pitch height, fingers used, and tempo. PhD thesis, The Florida State University, 2012.