MidiFind: Fast and Effective Similarity Searching in Large MIDI Databases


1 MidiFind: Fast and Effective Similarity Searching in Large MIDI Databases. Gus Xia, Tongbo Huang, Yifei Ma, Roger B. Dannenberg, Christos Faloutsos. Schools of Computer Science, Carnegie Mellon University

2 Introduction: Background. What is MIDI? Musical Instrument Digital Interface. A MIDI file doesn't carry the actual sound but rather the control information. For piano pieces: pitch, velocity, start time, and ending time. What are similar MIDI files? Different performance versions of the same composition, including the pure quantized version. Why find similar MIDI files? They are important for musicians and music amateurs, and they are widely distributed online.

3 Introduction: Goal. Context: MIDI files are difficult to search by metadata due to careless or casual labeling. Idea: content-based retrieval. Our goal: Given a query MIDI file, find all different performance versions (including the pure quantized version) of the same composition. The search should be effective and fast enough to deal with 1 million MIDI files.

4 Introduction: General approach

5 Outline: Introduction, Search Quality, Search Scalability, Build MidiFind System, Experiments, Demo, Conclusion

6 Search Quality. Goal: Design features and corresponding measurements to reveal the similarity between different MIDI files. General methods: Euclidean distance for the bag-of-words feature; modified Levenshtein distance for the melody string feature.

7 Search Quality: ED for BOW feature. Bag-of-words feature: Word: a note, where its octave and duration are ignored. Word count: the normalized number of appearances of a note. BOW feature: a 12-dim vector, an empirical distribution over the pitch classes, e.g. (0.3, 0, 0.1, 0.05, 0.1, 0.1, 0.02, 0.2, 0, 0.02, 0.01, 0.1) over (C, C#, D, D#, E, F, F#, G, G#, A, A#, B). Euclidean distance for two vectors: $ED(a,b) = \sqrt{\sum_{i=1}^{12} (a_i - b_i)^2}$
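The bag-of-words feature and its Euclidean distance can be sketched in a few lines. This is only an illustrative sketch: the input format (a list of MIDI note numbers) and the function names `bow_feature` and `euclidean_distance` are assumptions, not part of the slides.

```python
import math
from collections import Counter

PITCH_CLASSES = 12  # C, C#, D, ..., B


def bow_feature(pitches):
    """Return a 12-dim empirical distribution over pitch classes.

    `pitches` is assumed to be a list of MIDI note numbers; octave and
    duration are ignored, as described on the slide.
    """
    total = len(pitches)
    if total == 0:
        return [0.0] * PITCH_CLASSES
    counts = Counter(p % PITCH_CLASSES for p in pitches)
    return [counts.get(pc, 0) / total for pc in range(PITCH_CLASSES)]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The same chord in two octaves maps to the same BOW vector.
c_major = bow_feature([60, 64, 67, 72])
c_major_low = bow_feature([48, 52, 55, 60])
distance = euclidean_distance(c_major, c_major_low)
```

Because octaves are folded together, transposing a piece by whole octaves leaves the feature unchanged.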

8 Search Quality: modified Levenshtein distance for melody string feature. Melody string feature: The melody is a distinctive element that helps people tell different pieces of music apart. We simply use the highest pitch at any given time as the melody, where the durations are ignored. Levenshtein distance for two strings:
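The formula from this slide did not survive in the transcript, so the sketch below uses the standard Levenshtein recurrence rather than the authors' modified variant. The note format `(pitch, start_time, end_time)` and the simplification of taking the highest pitch per onset time are assumptions made for illustration.

```python
def melody_string(notes):
    """Extract a melody as the highest pitch at each onset time.

    `notes` is assumed to be an iterable of (pitch, start_time, end_time)
    tuples. Simplification: only notes starting at the same time compete;
    sustained notes from earlier onsets are not considered.
    """
    highest = {}
    for pitch, start, _end in notes:
        if start not in highest or pitch > highest[start]:
            highest[start] = pitch
    return [highest[t] for t in sorted(highest)]


def levenshtein(s, t):
    """Standard Levenshtein distance using two rows of the DP matrix."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```

The cost of this recurrence is proportional to the product of the two string lengths, which is what the later slides set out to reduce.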

9 Search Quality: cons of Levenshtein. Problem: The distance correlates with the melody length. The distribution over the length of melody strings follows a power law, with a mean of 1303 and a standard deviation of 1240. [Figure: histogram of melody string lengths; x-axis: length of melody string, y-axis: count]

10 Search Quality: Lev-400. Solution: Turn melody strings into equal length by chopping and concatenating the first and last 200 notes. Don't modify strings that are shorter than 400, but scale up the Levenshtein distance. Insights: A unified length leads to a unified threshold. Similar melodies tend to agree more at the beginning and the ending parts.
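The clipping step can be sketched as below. The slide does not say how the distance is scaled up for short strings, so `scale_distance` uses a hypothetical rule (multiply by 400 divided by the shorter length); treat it as a placeholder, not the authors' formula.

```python
CLIP = 400        # unified melody length from the slide
HALF = CLIP // 2  # first and last 200 notes


def clip_melody(melody):
    """Keep the first and last 200 notes; leave short melodies untouched."""
    if len(melody) <= CLIP:
        return list(melody)
    return list(melody[:HALF]) + list(melody[-HALF:])


def scale_distance(raw_distance, len_a, len_b):
    """Scale a distance up when one of the strings is shorter than CLIP.

    Hypothetical rule (not given on the slide): stretch the distance by
    CLIP / (shorter length) so it is comparable to the 400-note case.
    """
    shorter = min(len_a, len_b, CLIP)
    if shorter == 0:
        return float(raw_distance)
    return raw_distance * CLIP / shorter
```

With every long melody reduced to 400 symbols, a single threshold on the distance applies to all pairs.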

11 Search Quality: Lev-400SC. Observation: For similar melody strings, the string editing path of smallest distance stays close to the diagonal. Idea: We don't need to fill up the whole matrix. Solution: Use a diagonal Sakoe-Chiba band. Sakoe, H. & Chiba, S. (1978). Dynamic programming algorithm optimization for spoken word recognition.
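A banded version of the Levenshtein recurrence can be sketched as follows. The parameter `half_width` (cells with |i - j| <= half_width are filled) is an assumed parameterization; the slides do not state whether their bandwidth counts the full band or half of it.

```python
def banded_levenshtein(s, t, half_width):
    """Levenshtein distance restricted to a diagonal Sakoe-Chiba band.

    Only cells with |i - j| <= half_width are computed; everything else is
    treated as infinitely expensive. The result is exact whenever the
    optimal editing path stays inside the band, and an upper bound otherwise.
    """
    m, n = len(s), len(t)
    inf = float("inf")
    if abs(m - n) > half_width:
        return inf  # the final cell lies outside the band
    prev = [j if j <= half_width else inf for j in range(n + 1)]
    for i in range(1, m + 1):
        curr = [inf] * (n + 1)
        lo = max(0, i - half_width)
        hi = min(n, i + half_width)
        if lo == 0:
            curr[0] = i
        for j in range(max(1, lo), hi + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            curr[j] = min(prev[j] + 1,          # deletion
                          curr[j - 1] + 1,      # insertion
                          prev[j - 1] + cost)   # substitution or match
        prev = curr
    return prev[n]
```

Each row touches at most `2 * half_width + 1` cells instead of the whole row, which is where the saving comes from.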


13 Search Scalability. Goal: Speed up the searching process, since naïve linear scanning is very slow. General methods: Combine different similarity measurements; use M-tree indexing.

14 Search Scalability: MF-Q. Idea: Combine ED and Lev-400. First do a linear scan for ED, filtering out most candidates. Then do a linear scan for Lev-400 on the surviving candidates. Speed-up factor: BOW filtering: if a fraction p remains, we speed up by 1/p. Clipped melody representation: $\frac{m n}{400^2}$, where m and n are the original melody string lengths.
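The filter-then-refine idea can be sketched as a two-stage scan. The database layout (a list of `(name, bow_vector, clipped_melody)` tuples) and the function name `mf_q_search` are hypothetical, and the toy example uses 2-dim vectors instead of 12-dim ones to stay short.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def levenshtein(s, t):
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if a == b else 1)))
        prev = curr
    return prev[-1]


def mf_q_search(query, database, eps_ed, eps_lev):
    """Two-stage scan: cheap ED filter first, then Levenshtein on survivors.

    `query` is (bow_vector, clipped_melody); `database` is a list of
    (name, bow_vector, clipped_melody) tuples (an assumed layout).
    """
    q_bow, q_melody = query
    survivors = [entry for entry in database
                 if euclidean_distance(q_bow, entry[1]) < eps_ed]
    return [name for name, _bow, melody in survivors
            if levenshtein(q_melody, melody) < eps_lev]


toy_db = [
    ("a", [1.0, 0.0], [1, 2, 3]),        # passes both stages
    ("b", [0.0, 1.0], [1, 2, 3]),        # rejected by the ED filter
    ("c", [1.0, 0.0], [9, 9, 9, 9, 9]),  # rejected by the Levenshtein stage
]
matches = mf_q_search(([1.0, 0.0], [1, 2, 3]), toy_db, eps_ed=0.1, eps_lev=2)
```

The expensive string comparison runs only on the fraction p of entries that pass the cheap vector test.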

15 Search Scalability: MF-SC. Idea: Combine ED and Lev-400SC. BOW filtering works the same. Use a diagonal Sakoe-Chiba band. Set the bandwidth: $b = \max\{10\% \cdot \min\{m, n, 400\},\ 20\}$. Speed-up factor: Most melody strings are longer than 400, so b = 40, giving a factor of 10.
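The bandwidth rule and the resulting factor of 10 can be checked with a short calculation. The speed-up estimate assumes the band covers roughly b cells per row of the clipped matrix, which is an approximation introduced here, and the function names are illustrative.

```python
def band_width(m, n, clip=400, floor=20):
    """Bandwidth b = max(10% of min(m, n, clip), floor), as on the slide."""
    return max(min(m, n, clip) / 10, floor)


def band_speedup(m, n, clip=400):
    """Approximate speed-up: full matrix cells divided by band cells.

    Assumes about `b` cells per row are filled, so the ratio is
    side * side / (side * b) = side / b.
    """
    side = min(m, n, clip)
    return side / band_width(m, n, clip)


# Two melodies longer than 400 notes: b = 40 and the speed-up is about 10.
b = band_width(1303, 1500)
factor = band_speedup(1303, 1500)
```

For short melodies the floor of 20 takes over, so the band never becomes too narrow to absorb small timing differences.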

16 Search Scalability: MF. Idea: Further speed up the ED computation. Use M-tree indexing for range queries. Speed-up factor: if a fraction q is searched, we speed up the ED step by 1/q.


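18 Build MidiFind System. Goal: Set the thresholds, considering both search quality and search scalability. [Figure: nested sets: the whole set S; S_ED, the files with ED < ε_ED; S_Lev, the files with Lev-400SC < ε_Lev] Method: Compute precision, recall, and F-value as functions of the thresholds, where $\text{precision} = \frac{|S_{ret} \cap S_{sim}|}{|S_{ret}|}$, $\text{recall} = \frac{|S_{ret} \cap S_{sim}|}{|S_{sim}|}$, and $F\text{-value} = \left(\frac{1}{\text{precision}} + \frac{1}{\text{recall}}\right)^{-1}$, with S_ret the retrieved set and S_sim the set of truly similar files. Choose ε_Lev = 306, which leads to the largest F-value. Choose ε_ED = 0.1, which balances a large recall against a small size of S_ED.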
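Threshold selection by F-value can be sketched as a small grid search. The F-value follows the slide's form, the inverse of (1/precision + 1/recall), which is half the usual F1 score and therefore has the same maximizer. The function names, the `distances` dictionary, and the candidate grid are illustrative assumptions.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-value) for two collections of items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 or recall == 0.0:
        return precision, recall, 0.0
    f_value = 1.0 / (1.0 / precision + 1.0 / recall)
    return precision, recall, f_value


def best_threshold(distances, relevant, candidates):
    """Pick the candidate threshold with the largest F-value.

    `distances` maps each item to its distance from the query (assumed
    input); an item is retrieved when its distance is below the threshold.
    """
    def f_at(eps):
        retrieved = [item for item, d in distances.items() if d < eps]
        return precision_recall_f(retrieved, relevant)[2]
    return max(candidates, key=f_at)


chosen = best_threshold({"a": 1, "b": 2, "c": 10}, {"a", "b"}, [1.5, 3, 20])
```

In the toy example the middle threshold retrieves exactly the two relevant items, so it is the one selected.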


20 Experiment: Dataset and machine. Small labeled dataset: 325 different MIDI files, 79 unique compositions, 2289 similar pairs of MIDI files. Much bigger unlabeled dataset: 12484 MIDI files, freely downloaded from websites. Machine: 3.06 GHz, 2-core (Intel Core i3) iMac with 4 GB memory.

21 Experiment: Search quality. [Figure panels: (a) ED, (b) Lev-400SC, (c) standard Lev, (d) MF; axes: ε_ED and ε_Lev]

22 Best thresholds and their qualities

23 Experiment: Search scalability. ED threshold (ε_ED) vs. speed-ups. [Figure: two plots against the ED threshold (ε_ED): the fraction of surviving candidates, and the ratio to linear scan for the maximum lower bound approach and the minimum sum of radii approach]

24 Experiment: Search scalability. A comparison of the searching time of different methods. [Figure: average query time (sec) vs. the size of the MIDI dataset, for MF, MF-SC, MF-Q, Lev-400 linear scan, and Lev linear scan]


26 Demo www.cmumidifind.com

27 Conclusion. We present MidiFind, a MIDI query system for effective and fast searching of MIDI databases. It is effective: It achieves 99.5% precision and 89.8% recall, compared to the pure Levenshtein distance measurement, which achieves 95.6% precision and 56.3% recall. It is fast: By using the clipped melody representation, bag-of-words filtering, the Sakoe-Chiba band, and the M-tree, we achieve speed-ups of factors of 10, 40, 10, and 1.05, respectively, which finally leads to a speed-up of about 4000.
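As a quick check of the overall figure, and assuming the four factors simply multiply (an assumption; the slides do not state how they were combined), the product is consistent with "about 4000":

```latex
\[
  \underbrace{10}_{\text{clipped melody}} \times
  \underbrace{40}_{\text{BOW filtering}} \times
  \underbrace{10}_{\text{Sakoe-Chiba band}} \times
  \underbrace{1.05}_{\text{M-tree}}
  = 4200 \approx 4000
\]
```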

28 Thanks! Q&A