Music Alignment and Applications. Introduction


Music Alignment and Applications
Roger B. Dannenberg
Schools of Computer Science, Art, and Music
October 2010, (c) 2010 Roger B. Dannenberg

Introduction
Music information comes in many forms:
- Digital Audio
- Multi-track Audio
- Music Notation
- MIDI
- Structured meta-data (e.g. AMG)
- Unstructured meta-data (e.g. tags, blogs)

Overview
- Music Representations
- Music Alignment
  - Chromagrams
  - Dynamic Programming
- Some Applications
  - Audacity implementation
  - Onset detection

Music Audio
- Millions of files online
- Usually considered the "true" document
- What people listen to
- Details at all levels, from composition to signal
- Limitations:
  - Does not contain any explicit abstract information: notes, chords, rhythm, sections, instrumentation
  - Can't automatically extract a note-level description
  - Source separation problem (unsolved)

Multi-track Music Audio
- Most music is recorded on separate "tracks"
- Stereo has left and right
- Master (source) recordings typically have a "piano" track, "vocal" track, "bass" track, etc.
- Allow studios to manipulate audio in interesting ways without solving the source separation problem.
[Image: www.software-dungeon.co.uk/images/117796_main.gif]

Music Notation
- Mostly quantized or symbolic representation of music
- "Deep Structure" explicitly denotes much (not all) abstract information
- To derive (musical) audio requires musicians to perform the music
[Image: http://www.informatics.indiana.edu/donbyrd/interestingmusicnotation.html]

MIDI
- Musical Instrument Digital Interface
- Designed to capture music keyboard performance information: key number + velocity, key up, volume pedal, etc.
- Some MIDI files are "quantized" and contain some music notation info.
- Usually, instrument info (sound source) is available.
- Convert to audio with synthesis, but usually not great sound.
[Image: http://www.les-stooges.org/pascal/midiswing/]

Meta-Data and Text
- An interesting topic, but I will not talk about it today.

Linking/Sync'ing Different Representations
Music alignment is not trivial:
- Music is somehow "the same" at different speeds
- Performers are not exact, so no two performances have the same tempo
- Radio stations typically time-scale recordings to make them shorter(!)
- Music notation leaves exact timing to performers
- Performers take liberties with timing for expression, e.g. timing details are important to communicate emotion

Linking/Sync'ing Different Representations (2)
Music alignment is interesting:
- Requires some abstract "understanding"
  - Automatic abstraction is inherently interesting
- "Poor Man's Transcription": aligned MIDI data gives pitch, timing, and source instrument information without solving automatic transcription
- Automatic Page Turning: computers can "listen" to audio and turn pages of aligned music notation
- Compare great performances: How does Mario Lanza compare to Luciano Pavarotti?
- Search: "Let's listen to the oboe solo at measure 200"
- Editing: "Let's replace the audio from Thursday where someone coughed with the same spot recorded on Friday"

Linking/Sync'ing Different Representations (3)
Music alignment is (partially) solved (robustly). Let's see how:
- Step 1: Chromagram representation
- Step 2: Distance function
- Step 3: Dynamic programming
- Step 4: Smoothing

Chromagram Representation
- Spectrum
- Linear frequency to log frequency: "Semi vector", one bin per semitone
- Projection to pitch classes: "Chroma vector"
  C1+C2+C3+C4+C5+C6+C7, C#1+C#2+C#3+C#4+C#5+C#6+C#7, etc.
- "Distance Function": Euclidean, Cosine, etc.
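The spectrum to "semi vector" to "chroma vector" chain on the Chromagram Representation slide can be sketched roughly as follows. This is a minimal illustration using NumPy, not the implementation behind the slides: the function names, the Hann window, the frame and hop sizes, and the frequency limits `fmin`/`fmax` are all assumptions chosen for the example.

```python
import numpy as np

def chroma_vector(frame, sample_rate, fmin=27.5, fmax=4200.0):
    """Fold the magnitude spectrum of one audio frame into 12 pitch classes.

    fmin/fmax (roughly the piano range) are assumed limits, not from the slides.
    """
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Linear frequency -> log frequency: nearest MIDI semitone for each FFT bin.
    keep = (freqs >= fmin) & (freqs <= fmax)
    semitone = np.round(69 + 12 * np.log2(freqs[keep] / 440.0)).astype(int)

    # "Semi vector": one bin per semitone (indexed by MIDI note number).
    semi = np.zeros(128)
    np.add.at(semi, semitone, magnitude[keep])

    # "Chroma vector": sum all octaves of each pitch class (index 0 = C).
    return semi[:120].reshape(10, 12).sum(axis=0)

def chromagram(signal, sample_rate, frame_size=4096, hop=2048):
    """Sequence of chroma vectors, one per analysis frame."""
    starts = range(0, len(signal) - frame_size + 1, hop)
    return np.array([chroma_vector(signal[s:s + frame_size], sample_rate)
                     for s in starts])

# Usage: a 440 Hz sine should put most energy in pitch class A (index 9).
sr = 22050
t = np.arange(sr) / sr
cg = chromagram(np.sin(2 * np.pi * 440.0 * t), sr)
print(cg.shape, cg.argmax(axis=1))
```

Assigning each FFT bin to its nearest semitone is the simplest possible mapping; it is coarse at low frequencies, where several semitones share one FFT bin.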

Distance Function
- Sometimes normalize each chroma vector to a mean of 0 and a variance of 1: amplitude variations may not be consistently reproduced, so it is best to normalize them out
- Sometimes keep a "13th" vector element to indicate "silence": normalizing background noise during silence makes it hard to align silence to silence
- Euclidean distance works well
- Some use vector cosine (especially if vectors are not normalized)

Alignment: What Is It?
- The alignment path gives a mapping from time points in Audio 1 to time points in Audio 2
[Figure: alignment path plotted with the timeline for Audio 1 on one axis and the timeline for Audio 2 on the other]
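The normalization and distance choices listed under Distance Function can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, and the energy test used to decide "silence" (with its `silence_threshold` value) is an assumption, since the slides do not say how silence is detected.

```python
import numpy as np

def normalize(chroma, silence_threshold=1e-3):
    """Return a 13-element vector: 12 normalized chroma values plus a silence flag.

    silence_threshold is an assumed value; the slides do not specify one.
    """
    chroma = np.asarray(chroma, dtype=float)
    if chroma.sum() < silence_threshold or chroma.std() == 0:
        # Do not normalize background noise: mark the frame as silent instead.
        return np.append(np.zeros(12), 1.0)
    z = (chroma - chroma.mean()) / chroma.std()   # mean 0, variance 1
    return np.append(z, 0.0)

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def cosine_distance(a, b):
    """1 - cosine similarity; usable on vectors that were not normalized."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 1.0   # assumed convention when either vector is all zeros
    return float(1.0 - np.dot(a, b) / denom)

# Usage: the same chord played ten times louder should be at distance ~0.
soft = np.zeros(12)
soft[0], soft[4] = 1.0, 0.5
loud = 10 * soft
print(euclidean(normalize(soft), normalize(loud)), cosine_distance(soft, loud))
```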

Dynamic Programming
- Extract a feature vector for each frame of Audio 1 and Audio 2.
- Compare all m x n pairs of feature vectors (Euclidean, Cosine, etc.): DISTANCE MATRIX
- Find the lowest-cost path.

Dynamic Programming (2)
- Objective: find the path from [1,1] to [m,n] that minimizes the sum of distances along the way.
- Exponential number of paths: at each step you can advance along one axis, along the other, or diagonally.
- Trick: store the lowest cost from [1,1] to [i,j] and compute each cost incrementally in terms of previous solutions.
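The "trick" on the Dynamic Programming (2) slide is the standard dynamic time warping recurrence, which can be sketched as follows. This is a minimal version under assumptions: the function name `align` is hypothetical, Euclidean distance is used, all three step directions have equal weight, and indices are 0-based, so the path runs from (0, 0) to (m-1, n-1) rather than from [1,1] to [m,n].

```python
import numpy as np

def align(features_1, features_2):
    """Lowest-cost alignment path between two feature sequences.

    Returns (path, total_cost), where path is a list of (i, j) frame pairs.
    """
    a = np.asarray(features_1, dtype=float)
    b = np.asarray(features_2, dtype=float)
    m, n = len(a), len(b)

    # Distance matrix: Euclidean distance between every pair of frames.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # cost[i, j] = lowest total distance of any path from (0, 0) to (i, j).
    cost = np.full((m, n), np.inf)
    cost[0, 0] = dist[0, 0]
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0:
                best = min(best, cost[i - 1, j])
            if j > 0:
                best = min(best, cost[i, j - 1])
            if i > 0 and j > 0:
                best = min(best, cost[i - 1, j - 1])
            cost[i, j] = dist[i, j] + best

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = m - 1, n - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((cost[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path, float(cost[m - 1, n - 1])

# Usage: the second sequence is a time-stretched copy of the first.
seq_1 = [[0.0], [1.0], [2.0], [3.0]]
seq_2 = [[0.0], [0.0], [1.0], [2.0], [2.0], [3.0]]
path, total = align(seq_1, seq_2)
print(path, total)
```

Filling the table takes time and memory proportional to m x n, which is what makes the exponential path count tractable.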

Dynamic Programming (3)
[Figure: computed alignment path]

Smoothing
- Alignment tends to have some local irregularities: horizontal and vertical segments in the path correspond to small but abrupt jumps in time
- Sometimes smoothing can help: fit smooth curves to approximate the alignment path
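One simple way to approximate the path with a smoother curve is a moving average over the path coordinates, sketched below. The slides only say "fit smooth curves", so the moving average is a swapped-in technique for illustration, and the function name and `window` parameter are assumptions.

```python
import numpy as np

def smooth_path(path, window=5):
    """Moving-average smoothing of an alignment path.

    path: sequence of (i, j) frame pairs, non-decreasing in both coordinates.
    window: odd number of path points to average (an assumed parameter).
    Returns an array of the same length with fractional frame positions.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    points = np.asarray(path, dtype=float)
    half = window // 2
    # Repeat the end points so the output has the same length as the input.
    padded = np.pad(points, ((half, half), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    smooth_i = np.convolve(padded[:, 0], kernel, mode="valid")
    smooth_j = np.convolve(padded[:, 1], kernel, mode="valid")
    return np.stack([smooth_i, smooth_j], axis=1)

# Usage: a staircase path made only of horizontal and vertical steps.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2),
             (3, 3), (4, 3), (4, 4), (5, 4), (5, 5)]
smoothed = smooth_path(staircase, window=5)
print(smoothed)
```

A wider window gives a smoother time map but blurs genuine local timing changes, which is the smoothness versus local-accuracy tradeoff in practice.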

Chromagrams and MIDI
- Option 1: synthesize MIDI to audio, compute chromagrams as usual
- Option 2: set each chroma vector bin to the count of all notes (or the sum of their velocities) in that pitch class

Score Alignment in Audacity
[Figure: score alignment in Audacity]
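Option 2 under Chromagrams and MIDI can be sketched directly from a note list, without any synthesis. The note format used here, tuples of (onset seconds, offset seconds, MIDI pitch, velocity), and the function name are assumptions for the example; a real MIDI file would first have to be parsed into such a list.

```python
import numpy as np

def midi_chromagram(notes, frame_duration, num_frames, use_velocity=True):
    """Build a chromagram from note events.

    notes: iterable of (onset_sec, offset_sec, midi_pitch, velocity) tuples
           (a hypothetical format chosen for this sketch).
    Each frame's bin holds the sum of velocities (or the count) of all notes
    of that pitch class sounding during the frame.
    """
    chroma = np.zeros((num_frames, 12))
    for onset, offset, pitch, velocity in notes:
        first = max(int(onset // frame_duration), 0)
        last = min(int(np.ceil(offset / frame_duration)), num_frames)
        weight = velocity if use_velocity else 1
        chroma[first:last, pitch % 12] += weight   # pitch class 0 = C
    return chroma

# Usage: two Cs an octave apart overlap in the second frame.
notes = [(0.0, 0.5, 60, 100),   # C4
         (0.25, 0.75, 72, 50),  # C5
         (0.5, 1.0, 64, 80)]    # E4
by_velocity = midi_chromagram(notes, frame_duration=0.25, num_frames=4)
by_count = midi_chromagram(notes, frame_duration=0.25, num_frames=4,
                           use_velocity=False)
print(by_velocity[:, [0, 4]])
```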

Finding Note Onsets
- Not all attacks are clean
- Slurs do not have obvious (or fast) transitions
- We can use score alignment to get a rough idea of where the notes are (~1/10 second)
- Then, machine learning can create programs that do an even better job (bootstrap learning).

Conclusions
- Music alignment based on DP is robust, fast, and has many applications.
- Still some bothersome problems:
  - Detecting beginning and ending (local alignment) is a problem
  - Tradeoffs between smoothness, local timing accuracy, and global robustness