A System for Acoustic Chord Transcription and Key Extraction from Audio Using Hidden Markov Models Trained on Synthesized Audio


Curriculum Vitae

Kyogu Lee
Advanced Technology Center, Gracenote Inc.
2000 Powell Street, Suite 1380, Emeryville, CA 94608, USA
Tel: 1-510-428-7296  Fax: 1-510-547-9681
klee@gracenote.com  kglee@ccrma.stanford.edu
http://ccrma.stanford.edu/~kglee

EDUCATION
2002-2008  Ph.D. in Computer-Based Music Theory and Acoustics, Center for Computer Research in Music and Acoustics (CCRMA), Stanford University, CA, USA
2005-2007  M.S. in Electrical Engineering, Stanford University, CA, USA
2000-2002  M.M. in Music Technology, New York University, NY, USA
1992-1996  Electrical Engineering, Seoul National University, Seoul, Korea

DISSERTATION
A System for Acoustic Chord Transcription and Key Extraction from Audio Using Hidden Markov Models Trained on Synthesized Audio

This dissertation presents a statistical model for automatically identifying musical chords from raw audio, and demonstrates several applications such as music segmentation, music summarization, and music similarity finding. To avoid the enormously time-consuming and laborious process of manually annotating the ground truth required by supervised learning models, symbolic data such as MIDI files are used to obtain a large amount of labeled training data. Experimental results show that the proposed system not only yields chord recognition performance comparable to or better than previously published systems, but also provides additional key and/or genre information without requiring separate algorithms or feature sets for those tasks. It is also demonstrated that the chord sequence with precise timing can be used to find cover songs from audio and to detect musical phrase boundaries by recognizing cadences, or harmonic closures.

Advisor: Julius Smith
Reading Committee: Jonathan Berger, Chris Chafe, Malcolm Slaney

CV for Kyogu Lee, Page 1/5
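The pipeline the dissertation abstract describes — an HMM whose hidden states are chords and whose observations are frame-level audio features, decoded into a timed chord sequence — can be sketched roughly as follows. This is a minimal illustration, not the dissertation's trained model: the 24-chord major/minor vocabulary, binary chroma templates used as emission scores, and the uniform self-transition matrix are all simplifying assumptions standing in for parameters that would be learned from MIDI-synthesized training audio.

```python
import numpy as np

N_CHORDS = 24  # 12 major + 12 minor triads (illustrative vocabulary)

def chord_templates():
    """Binary 12-bin chroma templates, one per major/minor triad."""
    templates = np.zeros((N_CHORDS, 12))
    for root in range(12):
        templates[root, [root, (root + 4) % 12, (root + 7) % 12]] = 1.0       # major
        templates[root + 12, [root, (root + 3) % 12, (root + 7) % 12]] = 1.0  # minor
    return templates

def viterbi(chroma, self_prob=0.9):
    """Decode the most likely chord index per frame from chroma vectors.

    chroma: (n_frames, 12) array of pitch-class energies.
    A high self-transition probability smooths out frame-level noise.
    """
    templates = chord_templates()
    # Pseudo log-emission scores: correlation of each frame with each template.
    emis = np.log(chroma @ templates.T + 1e-9)
    # Transition matrix: strong self-transition, uniform elsewhere.
    trans = np.full((N_CHORDS, N_CHORDS), (1 - self_prob) / (N_CHORDS - 1))
    np.fill_diagonal(trans, self_prob)
    log_trans = np.log(trans)

    n = chroma.shape[0]
    delta = np.zeros((n, N_CHORDS))            # best log-score ending in each state
    psi = np.zeros((n, N_CHORDS), dtype=int)   # backpointers
    delta[0] = emis[0] - np.log(N_CHORDS)      # uniform prior over chords
    for t in range(1, n):
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + emis[t]
    # Backtrace the optimal chord path.
    path = np.zeros(n, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(n - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path
```

For example, frames dominated by pitch classes C, E, and G (bins 0, 4, 7) decode to state 0, the C major triad. The dissertation's key-dependent HMMs go further — key and genre fall out of the trained models themselves — but the Viterbi decoding step is of this general shape.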

ACADEMIC AWARDS
2006       2nd place in the MIREX task on Audio Cover Song Identification
2005-2006  Dorothy Culver Haynie Fellowship, Stanford University
2002-2003  Department Fellowship, Stanford University
1994       2nd place in the first International Robot Contest, Japan

MEMBERSHIPS
2006-present  Institute of Electrical and Electronics Engineers (IEEE), Member
2008-present  Association for Computing Machinery (ACM), Member
2006-2007     Acoustical Society of America (ASA), Student Member
2005-2007     International Computer Music Association (ICMA), Student Member

REVIEW ACTIVITIES
IEEE Transactions on Speech, Audio, and Language Processing
Computer Music Journal
International Symposium on Music Information Retrieval

TEACHING/RESEARCH INTERESTS
Music/Multimedia and the Semantic Web
Music/Multimedia Information Retrieval
Multimedia Content Analysis
Machine Learning Methods for Multimedia Applications
Computational Models of Music Perception/Cognition
Complex Data Sonification
Digital Audio Signal Processing

TEACHING EXPERIENCE
Summer 2007  Instructor, CCRMA Summer Workshop in Korea, Yonsei University, Seoul, Korea
             Course: Music Information Retrieval
             - Designed the core curriculum (lectures and labs)
             - Delivered lectures and led lab sessions

2003-2005    Teaching Assistant, Music/Electrical Engineering, Stanford University, CA, USA
             - Developed weekly problem sets
             - Evaluated course performance
             - Held regular office hours
             - Conducted student advising/counseling
             Courses:
             Spring 2005  Music/Audio Applications of the FFT
             Winter 2005  Perceptual Audio Coding
             Fall 2004    Auditory Remapping of Bioinformatics
             Spring 2004  Elements of Music Theory
             Winter 2004  Compositional Algorithms, Psychoacoustics, and Spatial Processing
             Fall 2003    Introduction to Digital Signal Processing

PROFESSIONAL EXPERIENCE
Multimedia Researcher, Gracenote Inc., Emeryville, CA, USA
  Research and development in multimedia content analysis for efficient and effective search/retrieval.

2005-2007    Research Assistant, Stanford University, CA, USA
  Research topics: audio content analysis, music information retrieval.

Summer 2006  Research Engineer, Gracenote Inc., Emeryville, CA, USA
  Designed algorithms to compute musical complexity from raw audio.

Summer 2006  Research Engineer, Sennheiser Palo Alto Research Center, Palo Alto, CA, USA
  Designed algorithms and built a prototype for virtual surround using multi-driver technologies.

2003-2004    Research Assistant, Stanford University (funded by DARPA), CA, USA
  Member of a team that designed algorithms for sonification of complex data. Designed various mapping schemes using vocal synthesis models to sonify hyperspectral colon tissue images.

2001-2002    Software Engineer, Soundball Inc., NY, USA
  Designed TestSuite and Opcode Library for SAOL Compiler/Player.

PROFESSIONAL EXPERIENCE (CONTINUED)
1996-1999    Software/Hardware Engineer, MIRAE Corporation, Chonan, Korea
  Developed hardware/software for motion and I/O controllers. Conducted equipment maintenance.

PUBLICATIONS
Lee, K. (2008). A System for Acoustic Chord Transcription and Key Extraction from Audio Using Hidden Markov Models Trained on Synthesized Audio. Ph.D. thesis, Stanford University.

Lee, K., & Slaney, M. (2008). Acoustic Chord Transcription and Key Extraction from Audio Using Key-Dependent HMMs Trained on Synthesized Audio. IEEE Transactions on Audio, Speech, and Language Processing, 16(2), 291-301.

Lee, K., & Slaney, M. (2007). A Unified System for Chord Transcription and Key Extraction from Audio Using Hidden Markov Models. In Proceedings of the International Conference on Music Information Retrieval.

Lee, K. (2007). A System for Automatic Chord Transcription Using Genre-Specific HMMs. In Proceedings of the International Workshop on Adaptive Multimedia Retrieval.

Lee, K., & Slaney, M. (2006). Automatic Chord Recognition from Audio Using a Supervised HMM Trained with Audio-from-Symbolic Data. In Proceedings of the Audio and Music Computing for Multimedia Workshop, in conjunction with ACM Multimedia.

Lee, K., & Slaney, M. (2006). Automatic Chord Recognition Using an HMM with Supervised Learning. In Proceedings of the International Symposium on Music Information Retrieval.

Lee, K. (2006). Automatic Chord Recognition Using Enhanced Pitch Class Profile. In Proceedings of the International Computer Music Conference.

Lee, K., & Kim, M. (2005). Estimating the Amplitude of the Cubic Difference Tone Using a Third-Order Adaptive Volterra Filter. In Proceedings of the International Conference on Digital Audio Effects.

Lee, K., Sell, G., & Berger, J. (2005). Sonification Using Digital Waveguide and 2- and 3-Dimensional Digital Waveguide Mesh. In Proceedings of the International Conference on Auditory Display.

Master, A., & Lee, K. (2005). Explicit Onset Modeling Using Time Reassignment. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing.

Lee, K., & Smith, J. (2004). Implementation of a Highly Diffusing 2-D Digital Waveguide Mesh with a Quadratic Residue Diffuser. In Proceedings of the International Computer Music Conference.

Cassidy, R., Berger, J., & Lee, K. (2004). Analysis of Hyperspectral Colon Tissue Images Using Vocal Synthesis Models. In IEEE Asilomar Conference on Signals, Systems, and Computers.

Cassidy, R., Berger, J., & Lee, K. (2004). Auditory Display of Hyperspectral Colon Tissue Images Using Vocal Synthesis Models. In Proceedings of the International Conference on Auditory Display.

INVITED TALKS
2007  Toward Content-Based Music Information Retrieval: A System for Chord Transcription, Key Extraction, and its Applications. Korea University, Seoul, Korea
2007  Toward Content-Based Music Information Retrieval: A System for Chord Transcription, Key Extraction, and its Applications. AOL Labs, Mountain View, CA, USA
2007  A System for Chord Transcription, Key Extraction, and Cadence Recognition Using Hidden Markov Models. Chung-ang University, Seoul, Korea
2006  Automatic Chord Recognition Using Hidden Markov Models Trained on Audio-from-Symbolic Data. Korea Institute of Science and Technology, Daejeon, Korea
2006  Music Similarity Finding Using Middle-Level Harmonic Content of Musical Audio. Samsung Advanced Institute of Technology, Suwon, Korea
2005  Application of a Quadratic Residue Diffuser in a 2-Dimensional Digital Waveguide Mesh. Chung-ang University, Seoul, Korea
2005  Effective Vowel Classification Using an Auditory Model-Based Front End. Korea Institute of Science and Technology, Daejeon, Korea

REFERENCES
Prof. Julius Smith
Center for Computer Research in Music and Acoustics
Department of Music and Electrical Engineering, Stanford University
660 Lomita Drive, Stanford, CA 94305, USA
Tel: 1-650-723-4971
jos@ccrma.stanford.edu

Dr. Malcolm Slaney
Principal Scientist, Consulting Professor
Yahoo! Research and Stanford University
701 North First Street, Sunnyvale, CA 94089, USA
Tel: 1-408-242-1586
malcolm@ieee.org

Prof. Jonathan Berger
Center for Computer Research in Music and Acoustics
Department of Music, Stanford University
660 Lomita Drive, Stanford, CA 94305, USA
Tel: 1-650-723-4971
brg@ccrma.stanford.edu