Singer Identification

Singer Identification
Bertrand SCHERRER (McGill University)
March 15, 2007

Outline
1. Introduction (Applications, Challenges)
2. Feature Extraction
3. Vocal/NonVocal Region Segmentation (GMM-based methods)
4. Classification (GMM)
5. Results
6. Conclusion

Introduction: Applications
Singer identification has so far been applied mainly to pop music.

Introduction: Applications
- Automatically label recordings for which little or no information is available, i.e. recognize the singer.
- Distinguish the original version of a song from cover versions.
- Copyright enforcement: recording companies could scan bootleg sites on the internet for unauthorized recordings of a concert [Kim and Whitman, 2002; Tsai and Wang, 2006].
- Music recommendation: systems could use singer identification to group singers with similar voice characteristics.

Introduction: Challenges
- The singing voice is a hybrid between speech and a musical instrument, so it calls for specific analysis methods.
- In pop music the voice is never heard alone: accompaniment is always present.

Feature Extraction
As seen in the previous diagrams, features need to be extracted from the sounds. Features used:
- MFCC (Mel-Frequency Cepstral Coefficients)
- MDCT (Modified Discrete Cosine Transform)
- LPCC (Linear Prediction Cepstral Coefficients) and WLPCC (Warped...): cepstral coefficients of the LPC spectrum
- LPMFCC (MFCC of the LPC spectrum)
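To make the first feature in the list concrete, here is a minimal NumPy sketch of an MFCC computation: framing, Hamming window, power spectrum, triangular mel filterbank, logarithm, then a DCT-II. It is an illustration only, not the front end of any of the cited systems. The function names and every parameter value (FFT size, hop, 26 filters, 13 coefficients, 16 kHz sampling rate) are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc(signal, sr, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs). All defaults are assumed values."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the filterbank axis, keeping the first n_coeffs terms.
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_mel @ dct_basis.T

# Usage on a synthetic one-second 440 Hz tone (a stand-in for real audio).
sr = 16000
t = np.arange(sr) / sr
coeffs = mfcc(np.sin(2 * np.pi * 440.0 * t), sr)
print(coeffs.shape)
```

Each row of `coeffs` is one feature vector; the segmentation and classification stages discussed next operate on sequences of such vectors.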

Vocal/NonVocal Region Segmentation: Principle
The spectrum of vocal regions differs from that of accompaniment-only regions because of the harmonicity of the voice.

[Figure: voice and accompaniment spectra, Fig. 1 of Tsai and Wang, 2006]

Vocal/NonVocal Region Segmentation: GMM-based methods, Tsai's approach
[Figure: Fig. 1 of Tsai and Wang, 2004]
This method is reported to reach 82.3% accuracy [Tsai and Wang, 2006].

Vocal/NonVocal Region Segmentation: GMM-based methods, Fujihara's approach
[Figure: adapted from Fig. 1 of Fujihara et al., 2005]
The GMM classification between vocal and non-vocal is done on the resynthesized signal.
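The general idea behind GMM-based vocal/non-vocal segmentation can be sketched as follows: fit one GMM on vocal training frames and one on non-vocal frames, then label each new frame by whichever model gives the higher log-likelihood. This is a generic sketch, not a reproduction of Tsai's or Fujihara's pipeline (in particular it skips accompaniment reduction and resynthesis). The function names, the diagonal covariances, the two components per model and the synthetic 4-dimensional features are all assumptions made for illustration.

```python
import numpy as np

def component_log_probs(X, weights, means, variances):
    """Log of weight_k * N(x | mean_k, diag(var_k)) for every frame and component."""
    d = X.shape[1]
    diff2 = ((X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]).sum(axis=2)
    log_det = np.log(variances).sum(axis=1)
    return np.log(weights) - 0.5 * (d * np.log(2.0 * np.pi) + log_det + diff2)

def frame_log_likelihood(X, gmm):
    """Per-frame log p(x | GMM), computed with a stable log-sum-exp."""
    log_p = component_log_probs(X, *gmm)
    peak = log_p.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_p - peak).sum(axis=1, keepdims=True))).ravel()

def fit_diag_gmm(X, n_components=2, n_iter=30, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not optimized)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def label_vocal_frames(features, vocal_gmm, nonvocal_gmm):
    """True where the vocal model is more likely than the non-vocal model."""
    return frame_log_likelihood(features, vocal_gmm) > frame_log_likelihood(features, nonvocal_gmm)

# Synthetic stand-ins for labelled training features (e.g. MFCC frames).
rng = np.random.default_rng(1)
vocal_gmm = fit_diag_gmm(rng.normal(2.0, 1.0, (500, 4)))
nonvocal_gmm = fit_diag_gmm(rng.normal(-2.0, 1.0, (500, 4)))

test_frames = np.vstack([rng.normal(2.0, 1.0, (100, 4)), rng.normal(-2.0, 1.0, (100, 4))])
truth = np.array([True] * 100 + [False] * 100)
accuracy = np.mean(label_vocal_frames(test_frames, vocal_gmm, nonvocal_gmm) == truth)
print(f"frame accuracy on synthetic data: {accuracy:.2f}")
```

The synthetic classes are deliberately well separated, so the printed accuracy says nothing about real music; on real recordings the frame decisions would typically be smoothed over time before regions are extracted.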

Classification
Three main strategies:
- GMM
- SVM
- k-NN
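Of the three strategies, k-NN is the shortest to sketch. The example below labels every frame of a song with the majority singer among its k nearest training frames, then takes a majority vote over the frames to name the singer of the song. The song-level vote, the function names, the value k = 5 and the synthetic features are assumptions for illustration; the slides do not say how the cited systems combine frame decisions.

```python
from collections import Counter

import numpy as np

def knn_frame_labels(train_X, train_y, test_X, k=5):
    """Label each test frame with the majority label of its k nearest training frames."""
    # Squared Euclidean distance between every test frame and every training frame.
    d2 = ((test_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return [str(Counter(train_y[row]).most_common(1)[0][0]) for row in nearest]

def identify_singer(train_X, train_y, song_frames, k=5):
    """Majority vote over the frame-level k-NN decisions of one song."""
    votes = Counter(knn_frame_labels(train_X, train_y, song_frames, k))
    return votes.most_common(1)[0][0]

# Synthetic stand-ins for vocal-frame features of two hypothetical singers.
rng = np.random.default_rng(0)
train_X = np.vstack([rng.normal(0.0, 1.0, (200, 4)), rng.normal(3.0, 1.0, (200, 4))])
train_y = np.array(["singer_a"] * 200 + ["singer_b"] * 200)

song_frames = rng.normal(3.0, 1.0, (50, 4))  # frames drawn like singer_b
predicted = identify_singer(train_X, train_y, song_frames)
print(predicted)
```

A GMM-based classifier follows the same outer structure, with the frame-level vote replaced by a comparison of per-singer model likelihoods accumulated over the vocal frames.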

Classification: GMM
[Figure: GMM method with solo voice modeling, Fig. 3 of Tsai and Wang, 2006]

Results: Performance
- Kim and Whitman, 2002: 45%
- Liu and Huang, 2002: 80%
- Tsai and Wang, 2006; Fujihara et al., 2005: 95%

Conclusion: Good
Singer identification yields satisfactory results: the most recent systems report about 95% accuracy [Tsai and Wang, 2006; Fujihara et al., 2005].

Conclusion: But...
- Only one article tackles target singer detection or target singer tracking [Tsai and Wang, 2006]. Results are not perfect for duets, but they are better than GMM classification without solo voice modeling.
- The work is specific to pop music: what happens with a cappella singers?
- The work is specific to one geographical area (Asia): important because of the voice mix.

Bibliography
Fujihara, H., T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, 2005. Singer identification based on accompaniment sound reduction and reliable frame selection. In Proceedings of the International Conference on Music Information Retrieval.
Kim, Y. E. and B. Whitman, 2002. Singer identification in popular music recordings using voice coding features. In Proceedings of the International Conference on Music Information Retrieval.
Liu, C.-C. and C.-S. Huang, 2002. A singer identification technique for content-based classification of MP3 music objects. In Proceedings of the Eleventh International Conference on Information and Knowledge Management.
Tsai, W.-H. and H.-M. Wang, 2004. Automatic detection and tracking of target singer in multi-singer music recordings. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. 221-224.
Tsai, W.-H. and H.-M. Wang, 2006. Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals. IEEE Transactions on Audio, Speech and Language Processing, vol. 14, pp. 330-341.
Zhang, T., 2003. Automatic singer identification. In Proceedings of the 2003 International Conference on Multimedia and Expo, vol. 1, pp. 33-36.

Questions?