Singing Pitch Extraction and Singing Voice Separation

Singing Pitch Extraction and Singing Voice Separation. Advisor: Jyh-Shing Roger Jang. Presenter: Chao-Ling Hsu. Multimedia Information Retrieval Lab (MIR), Department of Computer Science, National Tsing Hua University.

Outline: Introduction; Binary Mask based Separation; System Overview; Proposed Method (Voiced Singing Separation, Unvoiced Singing Separation); Evaluation; Conclusions. 2011/1/24 MIR, CS, NTHU

Problems Caused by Music Accompaniment. Many applications run into difficulties when music accompaniment is present: the accompaniment acts like noise and interferes with analysis of the singing voice. Solution: singing voice separation.

Goal of the Dissertation. Goal: separate the singing voice from the background music.

Comparison of Speech and Singing Voice Separation. Singing voice separation is similar to speech separation and has analogous applications: speech recognition vs. lyrics recognition; speaker identification vs. singer identification; subtitle alignment vs. lyrics alignment. Differences: correlation of background noise and target (uncorrelated for speech, strongly correlated for singing); noise type (periodic/aperiodic and narrowband/broadband for speech, mostly periodic and mostly broadband for singing); target pitch range (80-500 Hz for speech, up to 1400 Hz for singing).

Binary Mask based Separation. (Figure: waveforms and cochleagrams of clean and noisy speech; applying a binary mask to the noisy cochleagram yields the separated speech.)
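The binary-mask principle above (keep the time-frequency units where the target dominates, discard the rest) can be sketched as follows. This is a toy version: arbitrary random energy maps stand in for the cochleagram, and all names are illustrative, not the system's actual code.

```python
import numpy as np

def ideal_binary_mask(target_power, noise_power):
    """Ideal binary mask: 1 where the target dominates a T-F unit, else 0.

    `target_power` and `noise_power` are (channels x frames) energy maps of
    the clean target and the interference (toy stand-ins for cochleagrams).
    """
    return (target_power > noise_power).astype(float)

def apply_mask(mixture_tf, mask):
    """Keep only the T-F units labeled target-dominant."""
    return mixture_tf * mask

# Toy demo with random energy maps.
rng = np.random.default_rng(0)
voice = rng.random((128, 50))     # energy of the clean voice
music = rng.random((128, 50))     # energy of the accompaniment
mask = ideal_binary_mask(voice, music)
separated = apply_mask(voice + music, mask)
```

The real system computes the mask on a gammatone cochleagram and resynthesizes a waveform from the masked units; the masking step itself is exactly this elementwise keep-or-discard decision.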

System Overview. Stage 1 (segmentation): time-frequency decomposition of the polyphonic song and A/U/V detection, yielding voiced and unvoiced frames. Stage 2 (grouping): voiced-dominant T-F unit identification within the voiced frames, and unvoiced-dominant T-F unit identification within the unvoiced frames. Stage 3 (resynthesis): resynthesis of the separated singing voice.

Time-Frequency Decomposition (Stage 1 of the system overview).

Time-Frequency Decomposition. The input song is passed through a gammatone filterbank, producing frequency channels, and each channel is segmented into time frames; each channel-frame cell is a T-F unit.
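A minimal numpy sketch of this decomposition, assuming a standard 4th-order gammatone impulse response on ERB-spaced center frequencies. The channel count, frequency range, and frame sizes here are illustrative choices, not necessarily those of the actual system.

```python
import numpy as np

def erb_space(low, high, n):
    """Center frequencies equally spaced on the ERB-rate scale."""
    e = lambda f: 21.4 * np.log10(4.37 * f / 1000 + 1)
    pts = np.linspace(e(low), e(high), n)
    return (10 ** (pts / 21.4) - 1) * 1000 / 4.37

def gammatone_filterbank(x, fs, n_channels=32, dur=0.025):
    """Decompose a signal into frequency channels with 4th-order gammatone filters."""
    t = np.arange(int(dur * fs)) / fs
    out = np.empty((n_channels, len(x)))
    for i, fc in enumerate(erb_space(80, fs / 2 * 0.9, n_channels)):
        b = 1.019 * 24.7 * (4.37 * fc / 1000 + 1)   # ERB-model bandwidth
        g = t ** 3 * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
        out[i] = np.convolve(x, g / np.abs(g).sum(), mode="same")
    return out

def frame_energy(channels, frame_len, hop):
    """Split each channel into frames -> a (channels x frames) grid of T-F units."""
    n = (channels.shape[1] - frame_len) // hop + 1
    return np.stack([(channels[:, k * hop:k * hop + frame_len] ** 2).sum(axis=1)
                     for k in range(n)], axis=1)

# Demo: a 1 kHz tone concentrates its energy in the channels near 1 kHz.
fs = 16000
t = np.arange(fs // 2) / fs
tone = np.sin(2 * np.pi * 1000 * t)
channels = gammatone_filterbank(tone, fs, n_channels=32)
units = frame_energy(channels, frame_len=int(0.02 * fs), hop=int(0.01 * fs))
```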

Time-Frequency Decomposition Example. (Figure: the input song signal, 1. frequency decomposition into channel signals, 2. time decomposition into frames.)

A/U/V Detection (Stage 1 of the system overview).

A/U/V Detection. This block performs Accompaniment / Unvoiced sound / Voiced sound detection. A hidden Markov model (HMM) with states s_A, s_U, and s_V is employed:

Ŝ = argmax_S ∏_t p(x_t | s_t) p(s_t | s_{t-1}),  s_t ∈ {s_A, s_U, s_V}

where p(x | s_A), p(x | s_U), and p(x | s_V) are the output likelihoods of the three states.
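The maximization above is the standard Viterbi decoding problem. A minimal sketch in log space follows; the likelihoods and transition probabilities below are toy values, not the trained HMM of the system.

```python
import numpy as np

def viterbi(log_lik, log_trans, log_init):
    """Most likely state path: argmax over S of prod_t p(x_t|s_t) p(s_t|s_{t-1}).

    log_lik:   (T x S) per-frame output log-likelihoods (here S=3 for A, U, V)
    log_trans: (S x S) log transition probabilities, row = previous state
    log_init:  (S,)    log initial-state probabilities
    """
    T, S = log_lik.shape
    delta = log_init + log_lik[0]          # best score ending in each state
    back = np.zeros((T, S), dtype=int)     # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans     # (prev state) x (next state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_lik[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):           # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

With sticky transitions (high self-transition probability), the decoded path smooths over single-frame likelihood glitches, which is why an HMM is preferred over frame-by-frame classification here.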

Examples of A/U/V Detection. (Figure: a song excerpt with segments labeled A, U, V, U, V, U, V.)

Voiced Singing Separation (Stage 2 of the system overview).

Voiced Singing Separation. Three improvements for singing pitch extraction: a) vocal component enhancement, b) trend estimation, and c) a tandem algorithm for singing pitch extraction and voice separation.

The Tandem Algorithm (1). The tandem algorithm [Hu and Wang 2010] has been shown to be very effective and robust for separating speech from a variety of noises. It performs pitch estimation and voice separation jointly and iteratively. However, it does not perform well when the intrusion is music.

The Tandem Algorithm (2). (Figure: performance over nine intrusion types: N1 white noise, N2 noise bursts, N3 cocktail-party noise, N4 rock music, N5 siren, N6 trill telephone, N7 female speech, N8 male speech, N9 female speech.)

Schematic Diagram of Voiced Singing Separation. The mixture passes through trend estimation and singing voice detection; an iterative procedure then alternates between singing pitch extraction and singing voice separation to produce the separated singing.

Trend Estimation. Objective: estimate a rough pitch range of the target singing voice. Pipeline: mixture -> vocal component enhancement -> pitch range estimation -> estimated trend.

Trend Estimation - Vocal Component Enhancement (1). We employ HPSS (Harmonic/Percussive Sound Separation) [Tachibana 2010] to enhance the singing voice. The method uses the anisotropic smoothness of the sounds: harmonic sounds are smooth in the temporal direction, because they are sustained and periodic for a while; percussive sounds are smooth in the frequency direction, because they are instantaneous and aperiodic. HPSS exploits this anisotropic smoothness of harmonic and percussive sounds to separate them. (Figure: spectrogram illustrating the two directions of smoothness.)

Trend Estimation - Vocal Component Enhancement (2). HPSS is formulated as an optimization problem that minimizes the objective function

J(H, P) = ∬ |∂H^γ(t, ω)/∂t|² dt dω + ∬ |∂P^γ(t, ω)/∂ω|² dt dω

subject to H(t, ω) + P(t, ω) = W(t, ω). H(t, ω) and P(t, ω) are the spectrograms of the harmonic and percussive sounds to be estimated, W(t, ω) is the spectrogram of the original signal, and γ is an exponential constant of approximately 0.6 (chosen to imitate the auditory system).
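As an illustration of the same anisotropic-smoothness idea, here is a median-filtering harmonic/percussive split in the style of Fitzgerald (2010): smoothing along time enhances harmonic energy, smoothing along frequency enhances percussive energy. Note this is a simpler stand-in for intuition, not the exact optimization of the objective above.

```python
import numpy as np

def median_filt_1d(a, k, axis):
    """Running median of length k along one axis (numpy-only, edge-padded)."""
    pad = k // 2
    ap = np.pad(a, [(pad, pad) if ax == axis else (0, 0) for ax in range(a.ndim)],
                mode="edge")
    win = np.stack([np.take(ap, np.arange(i, i + a.shape[axis]), axis=axis)
                    for i in range(k)])
    return np.median(win, axis=0)

def hpss_masks(W, k=9):
    """Soft harmonic/percussive split of a power spectrogram W (freq x time).

    Harmonic energy is smooth in time -> enhanced by a median filter along time;
    percussive energy is smooth in frequency -> median filter along frequency.
    Returns (harmonic, percussive), which sum back to W (Wiener-style masks).
    """
    H = median_filt_1d(W, k, axis=1)   # smooth along time
    P = median_filt_1d(W, k, axis=0)   # smooth along frequency
    total = H + P + 1e-12
    return W * H / total, W * P / total

# Toy demo: one sustained partial (horizontal line) + one transient (vertical line).
W = np.zeros((40, 60))
W[10, :] += 1.0      # harmonic: constant over time
W[:, 30] += 1.0      # percussive: spread over frequency at one instant
Hs, Ps = hpss_masks(W)
```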

Trend Estimation - Vocal Component Enhancement (3). How can the singing voice be enhanced relative to the background music? The frequency of the singing voice fluctuates more than that of instruments such as guitar and piano. With a long STFT window, the spectrogram has low temporal resolution and high frequency resolution, so steady instrument partials stay smooth in time while the fluctuating vocal partials do not, and HPSS pushes the voice toward the percussive component.
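The trade-off can be seen by computing the STFT of the same signal with both window lengths mentioned on the next slide; quadrupling the window from 64 ms to 256 ms quarters the frequency-bin spacing. The hop size and test tone here are illustrative.

```python
import numpy as np

def stft_mag(x, fs, win_ms, hop_ms=16):
    """Magnitude STFT; the window length sets the time/frequency trade-off."""
    n = int(fs * win_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    w = np.hanning(n)
    frames = [x[i:i + n] * w for i in range(0, len(x) - n + 1, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T   # (freq bins x frames)

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)                    # steady 440 Hz partial
S64 = stft_mag(x, fs, 64)    # fine time, coarse frequency: bins fs/1024 = 15.625 Hz
S256 = stft_mag(x, fs, 256)  # coarse time, fine frequency: bins fs/4096 ~ 3.9 Hz
```

With the long window, a steady partial stays a sharp horizontal line, while a vibrato-modulated vocal partial smears across many narrow bins per frame.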

Trend Estimation - Vocal Component Enhancement (4). (Figure: spectrograms of the music and vocal components computed with 64 ms and 256 ms STFT windows.)

Trend Estimation - Vocal Component Enhancement (5). (Figure: HPSS splits the mixture into a harmonic component H and a percussive component P.)

Pitch Range Estimation (1). (Figure: the spectrogram after HPSS, 0-1200 Hz over about 11 seconds.)

Pitch Range Estimation (2). (Figure: MR-FFT output on a frequency-bin axis stepped by 0.25 semitone.)

Pitch Range Estimation (3). (Figure: the result after overtone deletion.)

Pitch Range Estimation (4). (Figure.)

Pitch Range Estimation (5). (Figure: spectrogram view of the result, 0-1200 Hz over about 11 seconds.)

Pitch Extraction Result of the Tandem Algorithm (1): no HPSS, no trend estimation.

Pitch Extraction Result of the Tandem Algorithm (2): with HPSS, no trend estimation.

Pitch Extraction Result of the Tandem Algorithm (3): with HPSS and trend estimation.

Pitch Extraction Result of the Tandem Algorithm (4): with post-processing.

Mask Comparison. (Figure: masks from the ideal binary mask (IBM), the proposed method, and Li-Wang 2007, together with the 0 dB mixture.)

Evaluation of Voiced Singing Separation. Dataset: MIR-1K, 1000 clips of Chinese pop music, 4 to 13 seconds per clip, 133 minutes in total. Each clip contains two tracks: the singing voice and the music accompaniment. The two tracks are mixed at -5 dB, 0 dB, and 5 dB SNR for evaluation.
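Mixing two tracks at a target SNR amounts to rescaling the accompaniment so the voice-to-music power ratio hits the target. A sketch of how such mixtures could be produced (not MIR-1K's exact tooling; the random signals stand in for the two tracks):

```python
import numpy as np

def mix_at_snr(voice, music, snr_db):
    """Mix voice + accompaniment with the accompaniment scaled to a target SNR.

    SNR (dB) = 10 * log10(P_voice / P_music_scaled), so the required gain is
    sqrt(P_voice / (P_music * 10^(SNR/10))).
    """
    pv = np.mean(voice ** 2)
    pm = np.mean(music ** 2)
    gain = np.sqrt(pv / (pm * 10 ** (snr_db / 10)))
    return voice + gain * music

rng = np.random.default_rng(1)
voice = rng.standard_normal(16000)   # stand-in for the vocal track
music = rng.standard_normal(16000)   # stand-in for the accompaniment track
mixtures = {snr: mix_at_snr(voice, music, snr) for snr in (-5, 0, 5)}
```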

Evaluation of Singing Voice Detection. (Figure: precision, recall, and overall accuracy at (a) -5 dB, (b) 0 dB, and (c) 5 dB mixture SNR, with and without HPSS.)

Evaluation of Trend Estimation and Singing Pitch Extraction (1). (Figure: percentage of correct pitch detection at -5 dB, 0 dB, and 5 dB mixture SNR for (a) voiced frames only, (b) overall results with vocal detection, and (c) overall results without vocal detection. Methods compared: trend estimation, the proposed method before post-processing, the proposed method, Hu-Wang 2010 with and without HPSS, and Li-Wang 2007.)

Evaluation of Trend Estimation and Singing Pitch Extraction (2). The proposed method (HJ1) was the winner of the MIREX 2010 Audio Melody Extraction task. (Figure: combined results of MIREX 2009 and 2010, based on clips with vocals only, over the ADC2004 (12 clips), MIREX2005 (15 clips), INDIAN08, and MIREX09 0 dB / -5 dB / +5 dB datasets, plus the average; submissions include HJ1 (2010, proposed), TOOS1, JJY1/JJY2, SG1, and the 2009 entries.)

Evaluation of Singing Voice Separation (SNR gain). (Figure: SNR gain of the separated target at -5 dB, 0 dB, and 5 dB mixture SNR for (a) voiced frames only and (b) overall results with vocal detection. Methods compared: IBM, ideal pitch, the proposed method, Li-Wang 2007 with ideal and with estimated pitch, and Ozerov 2007.)
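The SNR-gain metric can be computed as the output SNR minus the input SNR, both measured against the clean voice. A simplified sketch (this is a direct residual-based SNR, without any scaling projection the full evaluation might use):

```python
import numpy as np

def snr_db(estimate, reference):
    """SNR of an estimate against the clean reference, in dB."""
    noise = estimate - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

def snr_gain(separated, mixture, clean_voice):
    """How much separation improved the voice-to-interference ratio (dB)."""
    return snr_db(separated, clean_voice) - snr_db(mixture, clean_voice)
```

For example, a separator that halves the interference amplitude quarters its power, yielding a gain of 10*log10(4), about 6 dB, regardless of the mixture SNR.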

Unvoiced Singing Separation (Stage 2 of the system overview).

What is an Unvoiced Sound? (Figure: a song excerpt with the unvoiced sounds marked.)

Unvoiced Singing Separation. Goal: identify the T-F units of the unvoiced frames that are dominated by unvoiced sounds. Mel-frequency cepstral coefficients (MFCCs) are used as features, and two GMMs are trained for each frequency channel. Binary masks are then established for the unvoiced part of the singing voice. (Figure: masks before unvoiced T-F unit identification, after identification, and the ideal binary mask.)
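A much-simplified sketch of this per-channel classification: a single diagonal Gaussian per class stands in for the trained GMMs, and the features and data below are synthetic, not MFCCs of real audio. The decision rule is the same: label a T-F unit unvoiced-dominant when its features are more likely under the unvoiced model.

```python
import numpy as np

class DiagGaussian:
    """One diagonal Gaussian per class -- a simplified stand-in for the
    per-channel GMMs in the talk (the real system uses MFCC features and
    mixtures of Gaussians)."""
    def fit(self, X):
        self.mu = X.mean(axis=0)
        self.var = X.var(axis=0) + 1e-6
        return self

    def log_lik(self, X):
        return -0.5 * (np.log(2 * np.pi * self.var)
                       + (X - self.mu) ** 2 / self.var).sum(axis=1)

def classify_units(features, model_unvoiced, model_accomp):
    """Binary mask: 1 where a T-F unit's features look unvoiced-dominant."""
    return (model_unvoiced.log_lik(features)
            > model_accomp.log_lik(features)).astype(int)

# Synthetic, well-separated training data for the two classes.
rng = np.random.default_rng(2)
unvoiced_feats = rng.standard_normal((200, 4)) + 5.0
accomp_feats = rng.standard_normal((200, 4))
m_u = DiagGaussian().fit(unvoiced_feats)
m_a = DiagGaussian().fit(accomp_feats)
```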

Evaluation of Unvoiced Singing Voice Separation (SNR gain). (Figure: (a) overall GNSDR and (b) GNSDR in unvoiced frames of the separated singing voice, at -5 dB, 0 dB, and 5 dB mixture SNR. Methods compared: Ozerov's method, Li and Wang's method, and the proposed method.)
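GNSDR, the metric in the figure above, is commonly defined as the length-weighted average over clips of the normalized SDR (the estimate's SDR minus the mixture's SDR against the clean voice). A simplified sketch, using a direct residual-based SDR and omitting the BSS_EVAL projection step:

```python
import numpy as np

def sdr(est, ref):
    """Simplified source-to-distortion ratio (no BSS_EVAL projection), in dB."""
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((est - ref) ** 2))

def nsdr(est, ref, mixture):
    """Normalized SDR: improvement of the estimate over the raw mixture."""
    return sdr(est, ref) - sdr(mixture, ref)

def gnsdr(clips):
    """Global NSDR: NSDR averaged over (est, ref, mixture) clips,
    weighted by clip length."""
    lengths = np.array([len(ref) for _, ref, _ in clips])
    vals = np.array([nsdr(e, r, m) for e, r, m in clips])
    return float(np.sum(lengths * vals) / np.sum(lengths))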

Sound Demos (1). Lyrics: 人說情歌總是老的好 ("People say the old love songs are the best").

Sound Demos (2). Lyrics: 時間在逃亡 ("Time is on the run").

Sound Demos (3). Lyrics: 然而談的情, 說的愛不夠喔, 說來就來說走就走喔 ("Yet the feelings we shared and the love we spoke of were not enough; it comes when it comes and leaves when it leaves").

Conclusions. We proposed an extended tandem algorithm for singing pitch extraction and singing voice separation from music accompaniment. A trend estimation algorithm was proposed to estimate the pitch range of the singing voice. HPSS was employed to improve the performance of singing voice detection. A post-processing step was proposed to deal with the sequential grouping problem. The first unvoiced singing voice separation method for pitch-based inference was introduced.

Publications (1)
Journal Papers:
1. Chao-Ling Hsu and Jyh-Shing Roger Jang, "On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset", IEEE Trans. Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 310-319, 2010.
2. Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, "A Tandem Algorithm for Singing Pitch Extraction and Voice Separation from Music Accompaniment", submitted to IEEE Trans. Audio, Speech, and Language Processing.
Conference Papers:
1. Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, "A Trend Estimation Algorithm for Singing Pitch Detection in Musical Recordings", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011.
2. Chao-Ling Hsu and Jyh-Shing Roger Jang, "Singing Pitch Extraction by Voice Vibrato/Tremolo Estimation and Instrument Partial Deletion", International Society for Music Information Retrieval (ISMIR), Utrecht, Netherlands, Aug. 2010.
3. Chao-Ling Hsu and Jyh-Shing Roger Jang, "Singing Pitch Extraction at MIREX 2010", extended abstract in International Society for Music Information Retrieval (ISMIR), Utrecht, Netherlands, Aug. 2010. (Ranked 1st for vocal songs in the audio melody extraction competition.)
4. Chao-Ling Hsu, Liang-Yu Chen, Jyh-Shing Roger Jang, and Hsing-Ji Li, "Singing Pitch Extraction from Monaural Polyphonic Songs by Contextual Audio Modeling and Singing Harmonic Enhancement", International Society for Music Information Retrieval (ISMIR), Kobe, Japan, Oct. 2009.
5. Chao-Ling Hsu, Liang-Yu Chen, and Jyh-Shing Roger Jang, "Singing Pitch Extraction at MIREX 2009", extended abstract in International Society for Music Information Retrieval (ISMIR), Kobe, Japan, Oct. 2009.

Publications (2)
6. Chao-Ling Hsu, Jyh-Shing Roger Jang, and Te-Lu Tsai, "Separation of Singing Voice from Music Accompaniment with Unvoiced Sounds Reconstruction for Monaural Recordings", Proceedings of the 125th Audio Engineering Society Convention (AES), San Francisco, USA, Oct. 2008.
7. Jyh-Shing Roger Jang, Nien-Jung Lee, and Chao-Ling Hsu, "Simple But Effective Methods for QBSH at MIREX 2006", extended abstract in International Symposium on Music Information Retrieval (ISMIR), Victoria, Canada, Oct. 2006.
8. Ruo-Han Chen, Chao-Ling Hsu, Jyh-Shing Roger Jang, and Fong-Jhu Luo, "Content-based Music Emotion Recognition", Workshop on Computer Music and Audio Technology, Taipei, Taiwan, March 2006.
9. Jyh-Shing Roger Jang, Chao-Ling Hsu, and Hong-Ru Lee, "Continuous HMM and Its Enhancement for Singing/Humming Query Retrieval", International Symposium on Music Information Retrieval (ISMIR), London, UK, Sept. 2005.
10. Chao-Ling Hsu, Hong-Ru Lee, and Jyh-Shing Roger Jang, "On the Improvement and Error Analysis of Singing/Humming Query Retrieval", Workshop on Computer Music and Audio Technology, Taipei, Taiwan, March 2005.
11. Hong-Ru Lee, Chao-Ling Hsu, Yi-Cin Wang, and Jyh-Shing Roger Jang, "Multi-modal Music Retrieval System", Conference on Digital Archive Technology, Taipei, Taiwan, 2004.