Singing Pitch Extraction and Singing Voice Separation

Size: px
Start display at page:

Download "Singing Pitch Extraction and Singing Voice Separation"

Transcription

1 Singing Pitch Extraction and Singing Voice Separation Advisor: Jyh-Shing Roger Jang Presenter: Chao-Ling Hsu Multimedia Information Retrieval Lab (MIR) Department of Computer Science National Tsing Hua University

2 Outline Introduction Binary Mask based Separation System Overview Proposed Method Voiced Singing Separation Unvoiced Singing Separation Evaluation Conclusions 2011/1/24 MIR, CS, NTHU 2

3 Problems Come with Music Accompaniment Many applications encounter difficulties when music accompaniment is present Music accompaniment acts like noise and interferes with the analysis of the singing voice. Solution: singing voice separation 3

4 Goal of the Dissertation Goal: Separate singing voice from back ground music 2011/1/24 MIR, CS, NTHU 4

5 Comparison of Speech and Singing Voice Separation Singing voice separation is similar to speech separation with similar applications: speech recognition : lyrics recognition speaker identification : singer identification subtitle alignment : lyric alignment Differences: Correlation of background noise and target Noise type Speech Uncorrelated Periodic/aperiodic, Narrow band/broad band Singing Strong correlated Most periodic, Most broad band Target pitch range 80~500Hz Up to 1400 Hz 2011/1/24 MIR, CS, NTHU 5

6 Binary Mask based Separation Cochleagram Waveform Clean Speech Noisy Speech Apply Separated Speech 2011/1/24 MIR, CS, NTHU 6

7 System Overview Polyphonic Song Segmentation Stage Stage 1 Time-Frequency Decomposition A/U/V Detection Voiced Frames Unvoiced Frames Grouping Stage Stage 2 Voiced-dominant T-F unit Identification within Voiced Frames Unvoiced-dominant T-F unit Identification within Unvoiced Frames Resynthesis Stage 3 Resynthesis Separated Singing Voice 2011/1/24 MIR, CS, NTHU 7

8 Time-Frequency Decomposition Polyphonic Song Stage 1 Time-Frequency Decomposition A/U/V Detection Stage 2 Voiced Frames Voiced-dominant T-F unit Identification within Voiced Frames Unvoiced Frames Unvoiced-dominant T-F unit Identification within Unvoiced Frames Stage 3 Resynthesis Separated Singing Voice 2011/1/24 MIR, CS, NTHU 8

9 Time-Frequency Decomposition Input Song Gammatone filterbank Framing Frequency channels T F unit Time frames 2011/1/24 MIR, CS, NTHU 9

10 Time-Frequency Decomposition Example x Input song signal x x x x Frequency decomposition 2011/1/24 2. Time decomposition MIR, CS, NTHU 10

11 A/U/V Detection Polyphonic Song Stage 1 Time-Frequency Decomposition A/U/V Detection Stage 2 Voiced Frames Voiced-dominant T-F unit Identification within Voiced Frames Unvoiced Frames Unvoiced-dominant T-F unit Identification within Unvoiced Frames Stage 3 Resynthesis Separated Singing Voice 2011/1/24 MIR, CS, NTHU 11

12 A/U/V Detection This block performs an Accompaniment/ Unvoiced sound/ Voiced sound detection. An hidden Markov model (HMM) is employed: S ˆ = arg max p( x0 s0 ) t t t t 1) S t { p( x s ) p( s s } s A s U s V p x s ) x s ) ( A p p x s ) ( U ( V Output Likelihood Output likelihood Output likelihood 12

13 Examples of A/U/V Detection A U V U V U V 2011/1/24 MIR, CS, NTHU 13

14 Voiced Singing Separation Polyphonic Song Stage 1 Time-Frequency Decomposition A/U/V Detection Stage 2 Voiced Frames Voiced-dominant T-F unit Identification within Voiced Frames Unvoiced Frames Unvoiced-dominant T-F unit Identification within Unvoiced Frames Stage 3 Resynthesis Separated Singing Voice 2011/1/24 MIR, CS, NTHU 14

15 Voiced Singing Separation Three improvements for singing pitch extraction a) Vocal Component Enhancement b) Trend Estimation c) A tandem algorithm for singing pitch extraction and voice separation 2011/1/24 MIR, CS, NTHU 15

16 The tandem algorithm (1) The tandem algorithm [Hu and Wang 2010] has been shown very effective and robust for separating speech from varied noises. It performs pitch estimation and voice separation jointly and iteratively. However, it does not help when the intrusion type is music. 2011/1/24 MIR, CS, NTHU 16

17 The tandem algorithm (2) N1 white noise, N2 noise bursts, N3 cocktail party noise, N4 rock music, N5 siren, N6 trill telephone, N7 female speech, N8 male speech, and N9 female speech. 2011/1/24 MIR, CS, NTHU 17

18 Schematic Diagram of Voiced Singing Separation Mixture Trend Estimation Singing Voice Detection Separated Singing Singing Pitch Extraction Singing Voice Separation Iterative procedure 2011/1/24 MIR, CS, NTHU 18

19 Trend estimation Objective: Estimates a rough pitch range of the target singing voice. Mixture Estimated Trend Vocal Component Enhancement Pitch Range Estimation 2011/1/24 MIR, CS, NTHU 19

20 Trend estimation-vocal component enhancement (1) We employ HPSS (Harmonic/Percussive Sound Separation) [Tachibana2010] to enhance the singing voice This method uses information of anisotropic smoothness of the sounds Harmonic sound: smooth in temporal direction because they are sustained and periodic for a while Percussive sound: Time smooth in frequency direction because they are instantaneous and aperiodic HPSS exploits anisotropic smoothness of harmonic 2011/1/24 sound and percussive MIR, CS, sound NTHU to separate them. 20 Frequency 1000

21 Trend estimation-vocal component enhancement (2) HPSS is designed as an optimization problem to minimize an objective function as: 2 2 γ γ ( J H, P) = H ( t, ω) dtdω + P( t, ω) dtdω t ω And H ( t, ω) + P( t, ω) = W ( t, ω) H(t, ω) and P(t, ω) are complex spectrograms of harmonic sound and percussive sound to be estimated. W(t, ω) is the spectrogram of the original signal γ is an exponential constant approximately 0.6 (to imitate the auditory systems) 2011/1/24 MIR, CS, NTHU 21

22 Trend estimation-vocal component enhancement (3) How to enhance singing voice from background music? Frequency of singing voice fluctuates more than that of instruments such as guitar and piano. Using long STFT window size: the spectrogram has low temporal resolution and high frequency resolution. 2011/1/24 MIR, CS, NTHU 22

23 Trend estimation-vocal component enhancement (4) window size: 64 ms window size: 256 ms music Frequency Frequency Time Time vocal Frequency Frequency /1/24 MIR, CS, NTHU Time Time

24 Trend estimation-vocal component enhancement (5) H Mixture P 2011/1/24 MIR, CS, NTHU 24

25 Pitch Range Estimation (1) Frequency (Hz) Time (Secs) 2011/1/24 The spectrogram MIR, CS, NTHUafter HPSS 25

26 Pitch Range Estimation (2) 180 Frequency bin (Step by 0.25 Semitone) Time (Secs) MR-FFT 2011/1/24 MIR, CS, NTHU 26

27 Pitch Range Estimation (3) 180 Frequency bin (Step by 0.25 Semitone) Time (Secs) Overtone deletion 2011/1/24 MIR, CS, NTHU 27

28 Pitch Range Estimation (4) /1/24 MIR, CS, NTHU 28

29 Pitch Range Estimation (5) Frequency (Hz) Time (Secs) 2011/1/24 MIR, CS, NTHU 29

30 Pitch Extraction Result of the Tandem Algorithm (1) No HPSS, No Trend Estimation 2011/1/24 MIR, CS, NTHU 30

31 Pitch Extraction Result of the Tandem Algorithm (2) With HPSS, No Trend Estimation 2011/1/24 MIR, CS, NTHU 31

32 Pitch Extraction Result of the Tandem Algorithm (3) With HPSS and Trend Estimation 2011/1/24 MIR, CS, NTHU 32

33 Pitch Extraction Result of the Tandem Algorithm (4) Post Processing 2011/1/24 MIR, CS, NTHU 33

34 Mask Comparison IBM Proposed Li_Wang Mixture (0dB) /1/24 MIR, CS, NTHU

35 Evaluation for Voiced Singing Separation Datasets: MIR-1K 1000 clips of Chinese pop music 4 to 13 seconds for each clip, total length is 133 minutes Each clip contains 2 tracks, one is singing voice and the other one is music accompaniment. Mix the two tracks at -5 db, 0 db, and 5 db SNR for evaluation. 2011/1/24 MIR, CS, NTHU 35

36 Evaluation of Singing voice detection -5 db 0 db 5 db Accuracy (%) Accuracy (%) Accuracy (%) Performance of Singing Voice Detection (a) Precision Recall Overall Accuracy (b) Precision Recall Overall Accuracy (c) Precision Recall Overall Accuracy without HPSS with HPSS 2011/1/24 MIR, CS, NTHU 36

37 Evaluation of Trend Estimation and Singing Pitch Extraction (1) 100 (a) Voiced Only (b) Oveall Results with Vocal Detection 100 (c) Oveall Results without Vocal Detection Percent of correct detection Percent of correct detection Percent of correct detection db 0 db 5 db Mixture SNR 20-5 db 0 db 5 db Mixture SNR -5 db 0 db 5 db Mixture SNR Trend estimation Before post-processing* Proposed Hu-Wang 2010 (with HPSS)* Hu-Wang 2010 (without HPSS)* Li-Wang /1/24 MIR, CS, NTHU 37 20

38 Evaluation of Trend Estimation and Singing pitch extraction (2) Winner of MIREX 2010 Audio Melody Extraction task 100 Combined results of MIREX 2009 and 2010 (Based on clips with vocal only) HJ1 (2010, proposed) TOOS1 (2010) JJY2 (2010) JJY1 (2010) SG1 (2010) cl1 (2009) cl2 (2009) dr1 (2009) dr2 (2009) hjc1 (2009) hjc2 (2009) jjy (2009) kd (2009) mw (2009) pc (2009) rr (2009) toos (2009) ADC2004 (12 MIREX2005 clips) (15 clips)indian08 MIREX09 0dB MIREX09-5dBMIREX09 +5dB Average 2011/1/24 MIR, CS, NTHU 38

39 Evaluation of Singing voice separation (SNR gain) 12 (a) Voiced Only 12 (b) Overall Results with Vocal Detection SNR gain of separated target (db) SNR gain of separated target (db) db 0 db 5 db -5 db 0 db 5 db Mixture SNR Mixture SNR IBM Ideal pitch Proposed Li-Wang 2007 (ideal pitch) 2011/1/24 MIR, CS, NTHU Li-Wang 2007 (estimated pitch) Ozerov

40 Unvoiced Singing Separation Polyphonic Song Stage 1 Time-Frequency Decomposition A/U/V Detection Stage 2 Voiced Frames Voiced-dominant T-F unit Identification within Voiced Frames Unvoiced Frames Unvoiced-dominant T-F unit Identification within Unvoiced Frames Stage 3 Resynthesis Separated Singing Voice 2011/1/24 MIR, CS, NTHU 40

41 What is an Unvoiced Sound? Unvoiced sounds 2011/1/24 MIR, CS, NTHU 41

42 Unvoiced Singing Separation To identify the T-F units of the unvoiced frames that are dominated by the unvoiced sounds Mel-scale Frequency Cepstral Coefficient (MFCC) are used. 2 GMMs are trained for each frequency channel. Establish binary masks for the unvoiced part of the singing voice Before unvoiced T F units identification After unvoiced T F units identification The ideal binary mask 2011/1/24 MIR, CS, NTHU 42

43 Evaluation of Unvoiced Singing voice separation (SNR gain) Overall GNSDR (db) GNSDR of separated singing voice (a) GNSDR in unvoiced frames (db) GNSDR of separated unvoiced singing voice (b) Mixture SNR (db) Mixture SNR (db) Ozerov s method Li and Wang s method Proposed method 2011/1/24 MIR, CS, NTHU 43

44 Sound Demos (1) Lyrics : 人說情歌總是老的好 2011/1/24 MIR, CS, NTHU 44

45 Sound Demos (2) Lyrics : 時間在逃亡 2011/1/24 MIR, CS, NTHU 45

46 Sound Demos (3) Lyrics : 然而談的情, 說的愛不夠喔, 說來就來說走就走喔 2011/1/24 MIR, CS, NTHU 46

47 Conclusions We proposed an extended tandem algorithm for singing pitch extraction and singing voice separation from music accompaniment. A trend estimation algorithm is proposed to estimate the pitch range of singing voice. The HPSS is employed to improve the performance of singing voice detection. A post processing is proposed to deal with the sequential grouping problem. The first unvoiced singing voice separation method for pitch-based inference method is introduced. 2011/1/24 MIR, CS, NTHU 47

48 Publications (1) Journal Papers: 1. Chao-Ling Hsu and Jyh-Shing Roger Jang, On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset, IEEE Trans. Audio, Speech, and Language Processing, volume 18, issue 2, p.p , Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, A Tandem Algorithm for Singing Pitch Extraction and Voice Separation from Music Accompaniment, submitted to IEEE Trans. Audio, Speech, and Language Processing. Conference Papers: 1. Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, A Trend Estimation Algorithm for Singing Pitch Detection in Musical Recordings, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May Chao-Ling Hsu and Jyh-Shing Roger Jang, Singing Pitch Extraction by Voice Vibrato/Tremolo Estimation and Instrument Partial Deletion, International Society for Music Information Retrieval (ISMIR), Utrecht, Netherlands, Aug Chao-Ling Hsu and Jyh-Shing Roger Jang, Singing Pitch Extraction at MIREX 2010, extended abstract in International Society for Music Information Retrieval (ISMIR), Utrecht, Netherlands, Aug (Rank 1st for vocal songs in audio melody extraction competition) 4. Chao-Ling Hsu, Liang-Yu Chen, Jyh-Shing Roger Jang, and Hsing-Ji Li, Singing Pitch Extraction From Monaural Polyphonic Songs By Contextual Audio Modeling and Singing Harmonic Enhancement, International Society for Music Information Retrieval (ISMIR), Kobe, Japan, Oct Chao-Ling Hsu, Liang-Yu Chen, and Jyh-Shing Roger Jang, Singing Pitch Extraction at MIREX 2009, extended abstract in International Society for Music Information Retrieval (ISMIR), Kobe, Japan, Oct /1/24 MIR, CS, NTHU 48

49 Publications (2) 6. Chao-Ling Hsu, Jyh-Shing Roger Jang, and Te-Lu Tsai, "Separation of Singing Voice from Music Accompaniment with Unvoiced Sounds Reconstruction for Monaural Recordings", Proceedings of 125th American Engineering Society Convention (AES), San Francisco, USA, Oct Jyh-Shing Roger Jang Nien-Jung Lee, and Chao-Ling Hsu, "Simple But Effective Methods for QBSH at MIREX 2006", extended abstract in International Symposium on Music Information Retrieval (ISMIR), Victoria, Canada, Oct Ruo-Han Chen, Chao-Ling Hsu, Jyh-Shing Roger Jang, Fong-Jhu Luo, "Content-based Music Emotion Recognition ", Workshop on Computer Music and Audio Technology, Taipei, Taiwan, March Jyh-Shing Roger Jang, Chao-Ling Hsu, Hong-Ru Lee, "Continuous HMM and Its Enhancement for Singing/Humming Query Retrieval", International Symposium on Music Information Retrieval (ISMIR), London, UK, Sept Chao-Ling Hsu, Hong-Ru Lee, Jyh-Shing Roger Jang, "On the Improvement and Error Analysis of Singing/Humming Query Retrieval", Workshop on Computer Music and Audio Technology, Taipei, Taiwan, March Hong-Ru Lee, Chao-Ling Hsu, Yi-Cin Wang, Jyh-Shing Roger Jang, " Multi-modal Music Retrieval System", Conference on Digital Archive Technology, Taipei, Taiwan, /1/24 MIR, CS, NTHU 49

SINGING PITCH EXTRACTION BY VOICE VIBRATO/TREMOLO ESTIMATION AND INSTRUMENT PARTIAL DELETION

SINGING PITCH EXTRACTION BY VOICE VIBRATO/TREMOLO ESTIMATION AND INSTRUMENT PARTIAL DELETION th International Society for Music Information Retrieval Conference (ISMIR ) SINGING PITCH EXTRACTION BY VOICE VIBRATO/TREMOLO ESTIMATION AND INSTRUMENT PARTIAL DELETION Chao-Ling Hsu Jyh-Shing Roger Jang

More information

Voice & Music Pattern Extraction: A Review

Voice & Music Pattern Extraction: A Review Voice & Music Pattern Extraction: A Review 1 Pooja Gautam 1 and B S Kaushik 2 Electronics & Telecommunication Department RCET, Bhilai, Bhilai (C.G.) India pooja0309pari@gmail.com 2 Electrical & Instrumentation

More information

Music Radar: A Web-based Query by Humming System

Music Radar: A Web-based Query by Humming System Music Radar: A Web-based Query by Humming System Lianjie Cao, Peng Hao, Chunmeng Zhou Computer Science Department, Purdue University, 305 N. University Street West Lafayette, IN 47907-2107 {cao62, pengh,

More information

Lecture 9 Source Separation

Lecture 9 Source Separation 10420CS 573100 音樂資訊檢索 Music Information Retrieval Lecture 9 Source Separation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw Music & Audio Computing Lab, Research

More information

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Kazuyoshi Yoshii, Masataka Goto and Hiroshi G. Okuno Department of Intelligence Science and Technology National

More information

OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES

OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES Vishweshwara Rao and Preeti Rao Digital Audio Processing Lab, Electrical Engineering Department, IIT-Bombay, Powai,

More information

Lecture 10 Harmonic/Percussive Separation

Lecture 10 Harmonic/Percussive Separation 10420CS 573100 音樂資訊檢索 Music Information Retrieval Lecture 10 Harmonic/Percussive Separation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw Music & Audio Computing

More information

Efficient Vocal Melody Extraction from Polyphonic Music Signals

Efficient Vocal Melody Extraction from Polyphonic Music Signals http://dx.doi.org/1.5755/j1.eee.19.6.4575 ELEKTRONIKA IR ELEKTROTECHNIKA, ISSN 1392-1215, VOL. 19, NO. 6, 213 Efficient Vocal Melody Extraction from Polyphonic Music Signals G. Yao 1,2, Y. Zheng 1,2, L.

More information

THE importance of music content analysis for musical

THE importance of music content analysis for musical IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 1, JANUARY 2007 333 Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectrogram Templates With

More information

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC Vishweshwara Rao, Sachin Pant, Madhumita Bhaskar and Preeti Rao Department of Electrical Engineering, IIT Bombay {vishu, sachinp,

More information

Singer Traits Identification using Deep Neural Network

Singer Traits Identification using Deep Neural Network Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic

More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION

AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION Halfdan Rump, Shigeki Miyabe, Emiru Tsunoo, Nobukata Ono, Shigeki Sagama The University of Tokyo, Graduate

More information

GENDER IDENTIFICATION AND AGE ESTIMATION OF USERS BASED ON MUSIC METADATA

GENDER IDENTIFICATION AND AGE ESTIMATION OF USERS BASED ON MUSIC METADATA GENDER IDENTIFICATION AND AGE ESTIMATION OF USERS BASED ON MUSIC METADATA Ming-Ju Wu Computer Science Department National Tsing Hua University Hsinchu, Taiwan brian.wu@mirlab.org Jyh-Shing Roger Jang Computer

More information

Effects of acoustic degradations on cover song recognition

Effects of acoustic degradations on cover song recognition Signal Processing in Acoustics: Paper 68 Effects of acoustic degradations on cover song recognition Julien Osmalskyj (a), Jean-Jacques Embrechts (b) (a) University of Liège, Belgium, josmalsky@ulg.ac.be

More information

ON THE USE OF PERCEPTUAL PROPERTIES FOR MELODY ESTIMATION

ON THE USE OF PERCEPTUAL PROPERTIES FOR MELODY ESTIMATION Proc. of the 4 th Int. Conference on Digital Audio Effects (DAFx-), Paris, France, September 9-23, 2 Proc. of the 4th International Conference on Digital Audio Effects (DAFx-), Paris, France, September

More information

Singer Identification

Singer Identification Singer Identification Bertrand SCHERRER McGill University March 15, 2007 Bertrand SCHERRER (McGill University) Singer Identification March 15, 2007 1 / 27 Outline 1 Introduction Applications Challenges

More information

A Music Retrieval System Using Melody and Lyric

A Music Retrieval System Using Melody and Lyric 202 IEEE International Conference on Multimedia and Expo Workshops A Music Retrieval System Using Melody and Lyric Zhiyuan Guo, Qiang Wang, Gang Liu, Jun Guo, Yueming Lu 2 Pattern Recognition and Intelligent

More information

A COMPARISON OF MELODY EXTRACTION METHODS BASED ON SOURCE-FILTER MODELLING

A COMPARISON OF MELODY EXTRACTION METHODS BASED ON SOURCE-FILTER MODELLING A COMPARISON OF MELODY EXTRACTION METHODS BASED ON SOURCE-FILTER MODELLING Juan J. Bosch 1 Rachel M. Bittner 2 Justin Salamon 2 Emilia Gómez 1 1 Music Technology Group, Universitat Pompeu Fabra, Spain

More information

Query By Humming: Finding Songs in a Polyphonic Database

Query By Humming: Finding Songs in a Polyphonic Database Query By Humming: Finding Songs in a Polyphonic Database John Duchi Computer Science Department Stanford University jduchi@stanford.edu Benjamin Phipps Computer Science Department Stanford University bphipps@stanford.edu

More information

MELODY EXTRACTION BASED ON HARMONIC CODED STRUCTURE

MELODY EXTRACTION BASED ON HARMONIC CODED STRUCTURE 12th International Society for Music Information Retrieval Conference (ISMIR 2011) MELODY EXTRACTION BASED ON HARMONIC CODED STRUCTURE Sihyun Joo Sanghun Park Seokhwan Jo Chang D. Yoo Department of Electrical

More information

Automatic Singing Performance Evaluation Using Accompanied Vocals as Reference Bases *

Automatic Singing Performance Evaluation Using Accompanied Vocals as Reference Bases * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 31, 821-838 (2015) Automatic Singing Performance Evaluation Using Accompanied Vocals as Reference Bases * Department of Electronic Engineering National Taipei

More information

SINGING VOICE MELODY TRANSCRIPTION USING DEEP NEURAL NETWORKS

SINGING VOICE MELODY TRANSCRIPTION USING DEEP NEURAL NETWORKS SINGING VOICE MELODY TRANSCRIPTION USING DEEP NEURAL NETWORKS François Rigaud and Mathieu Radenen Audionamix R&D 7 quai de Valmy, 7 Paris, France .@audionamix.com ABSTRACT This paper

More information

Music Information Retrieval with Temporal Features and Timbre

Music Information Retrieval with Temporal Features and Timbre Music Information Retrieval with Temporal Features and Timbre Angelina A. Tzacheva and Keith J. Bell University of South Carolina Upstate, Department of Informatics 800 University Way, Spartanburg, SC

More information

Introductions to Music Information Retrieval

Introductions to Music Information Retrieval Introductions to Music Information Retrieval ECE 272/472 Audio Signal Processing Bochen Li University of Rochester Wish List For music learners/performers While I play the piano, turn the page for me Tell

More information

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES Jun Wu, Yu Kitano, Stanislaw Andrzej Raczynski, Shigeki Miyabe, Takuya Nishimoto, Nobutaka Ono and Shigeki Sagayama The Graduate

More information

Topics in Computer Music Instrument Identification. Ioanna Karydi

Topics in Computer Music Instrument Identification. Ioanna Karydi Topics in Computer Music Instrument Identification Ioanna Karydi Presentation overview What is instrument identification? Sound attributes & Timbre Human performance The ideal algorithm Selected approaches

More information

Transcription of the Singing Melody in Polyphonic Music

Transcription of the Singing Melody in Polyphonic Music Transcription of the Singing Melody in Polyphonic Music Matti Ryynänen and Anssi Klapuri Institute of Signal Processing, Tampere University Of Technology P.O.Box 553, FI-33101 Tampere, Finland {matti.ryynanen,

More information

Subjective Similarity of Music: Data Collection for Individuality Analysis

Subjective Similarity of Music: Data Collection for Individuality Analysis Subjective Similarity of Music: Data Collection for Individuality Analysis Shota Kawabuchi and Chiyomi Miyajima and Norihide Kitaoka and Kazuya Takeda Nagoya University, Nagoya, Japan E-mail: shota.kawabuchi@g.sp.m.is.nagoya-u.ac.jp

More information

SINGING VOICE ANALYSIS AND EDITING BASED ON MUTUALLY DEPENDENT F0 ESTIMATION AND SOURCE SEPARATION

SINGING VOICE ANALYSIS AND EDITING BASED ON MUTUALLY DEPENDENT F0 ESTIMATION AND SOURCE SEPARATION SINGING VOICE ANALYSIS AND EDITING BASED ON MUTUALLY DEPENDENT F0 ESTIMATION AND SOURCE SEPARATION Yukara Ikemiya Kazuyoshi Yoshii Katsutoshi Itoyama Graduate School of Informatics, Kyoto University, Japan

More information

Music Source Separation

Music Source Separation Music Source Separation Hao-Wei Tseng Electrical and Engineering System University of Michigan Ann Arbor, Michigan Email: blakesen@umich.edu Abstract In popular music, a cover version or cover song, or

More information

Week 14 Query-by-Humming and Music Fingerprinting. Roger B. Dannenberg Professor of Computer Science, Art and Music Carnegie Mellon University

Week 14 Query-by-Humming and Music Fingerprinting. Roger B. Dannenberg Professor of Computer Science, Art and Music Carnegie Mellon University Week 14 Query-by-Humming and Music Fingerprinting Roger B. Dannenberg Professor of Computer Science, Art and Music Overview n Melody-Based Retrieval n Audio-Score Alignment n Music Fingerprinting 2 Metadata-based

More information

Gaussian Mixture Model for Singing Voice Separation from Stereophonic Music

Gaussian Mixture Model for Singing Voice Separation from Stereophonic Music Gaussian Mixture Model for Singing Voice Separation from Stereophonic Music Mine Kim, Seungkwon Beack, Keunwoo Choi, and Kyeongok Kang Realistic Acoustics Research Team, Electronics and Telecommunications

More information

Classification of Musical Instruments sounds by Using MFCC and Timbral Audio Descriptors

Classification of Musical Instruments sounds by Using MFCC and Timbral Audio Descriptors Classification of Musical Instruments sounds by Using MFCC and Timbral Audio Descriptors Priyanka S. Jadhav M.E. (Computer Engineering) G. H. Raisoni College of Engg. & Mgmt. Wagholi, Pune, India E-mail:

More information

Automatic Rhythmic Notation from Single Voice Audio Sources

Automatic Rhythmic Notation from Single Voice Audio Sources Automatic Rhythmic Notation from Single Voice Audio Sources Jack O Reilly, Shashwat Udit Introduction In this project we used machine learning technique to make estimations of rhythmic notation of a sung

More information

Topic 10. Multi-pitch Analysis

Topic 10. Multi-pitch Analysis Topic 10 Multi-pitch Analysis What is pitch? Common elements of music are pitch, rhythm, dynamics, and the sonic qualities of timbre and texture. An auditory perceptual attribute in terms of which sounds

More information

Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio

Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio Jana Eggink and Guy J. Brown Department of Computer Science, University of Sheffield Regent Court, 11

More information

Chord Classification of an Audio Signal using Artificial Neural Network

Chord Classification of an Audio Signal using Artificial Neural Network Chord Classification of an Audio Signal using Artificial Neural Network Ronesh Shrestha Student, Department of Electrical and Electronic Engineering, Kathmandu University, Dhulikhel, Nepal ---------------------------------------------------------------------***---------------------------------------------------------------------

More information

Music Information Retrieval for Jazz

Music Information Retrieval for Jazz Music Information Retrieval for Jazz Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA {dpwe,thierry}@ee.columbia.edu http://labrosa.ee.columbia.edu/

More information

REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation

REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 21, NO. 1, JANUARY 2013 73 REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation Zafar Rafii, Student

More information

INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION

INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION ULAŞ BAĞCI AND ENGIN ERZIN arxiv:0907.3220v1 [cs.sd] 18 Jul 2009 ABSTRACT. Music genre classification is an essential tool for

More information

EXPLORING THE USE OF ENF FOR MULTIMEDIA SYNCHRONIZATION

EXPLORING THE USE OF ENF FOR MULTIMEDIA SYNCHRONIZATION EXPLORING THE USE OF ENF FOR MULTIMEDIA SYNCHRONIZATION Hui Su, Adi Hajj-Ahmad, Min Wu, and Douglas W. Oard {hsu, adiha, minwu, oard}@umd.edu University of Maryland, College Park ABSTRACT The electric

More information

COMBINING MODELING OF SINGING VOICE AND BACKGROUND MUSIC FOR AUTOMATIC SEPARATION OF MUSICAL MIXTURES

COMBINING MODELING OF SINGING VOICE AND BACKGROUND MUSIC FOR AUTOMATIC SEPARATION OF MUSICAL MIXTURES COMINING MODELING OF SINGING OICE AND ACKGROUND MUSIC FOR AUTOMATIC SEPARATION OF MUSICAL MIXTURES Zafar Rafii 1, François G. Germain 2, Dennis L. Sun 2,3, and Gautham J. Mysore 4 1 Northwestern University,

More information

Acoustic Scene Classification

Acoustic Scene Classification Acoustic Scene Classification Marc-Christoph Gerasch Seminar Topics in Computer Music - Acoustic Scene Classification 6/24/2015 1 Outline Acoustic Scene Classification - definition History and state of

More information

Audio-Based Video Editing with Two-Channel Microphone

Audio-Based Video Editing with Two-Channel Microphone Audio-Based Video Editing with Two-Channel Microphone Tetsuya Takiguchi Organization of Advanced Science and Technology Kobe University, Japan takigu@kobe-u.ac.jp Yasuo Ariki Organization of Advanced Science

More information

A Study of Synchronization of Audio Data with Symbolic Data. Music254 Project Report Spring 2007 SongHui Chon

A Study of Synchronization of Audio Data with Symbolic Data. Music254 Project Report Spring 2007 SongHui Chon A Study of Synchronization of Audio Data with Symbolic Data Music254 Project Report Spring 2007 SongHui Chon Abstract This paper provides an overview of the problem of audio and symbolic synchronization.

More information

Vocal Melody Extraction from Polyphonic Audio with Pitched Accompaniment

Vocal Melody Extraction from Polyphonic Audio with Pitched Accompaniment Vocal Melody Extraction from Polyphonic Audio with Pitched Accompaniment Vishweshwara Rao (05407001) Ph.D. Defense Guide: Prof. Preeti Rao (June 2011) Department of Electrical Engineering Indian Institute

More information

Recognition and Summarization of Chord Progressions and Their Application to Music Information Retrieval

Recognition and Summarization of Chord Progressions and Their Application to Music Information Retrieval Recognition and Summarization of Chord Progressions and Their Application to Music Information Retrieval Yi Yu, Roger Zimmermann, Ye Wang School of Computing National University of Singapore Singapore

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;

More information

Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn

Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn Introduction Active neurons communicate by action potential firing (spikes), accompanied

More information

A Survey of Audio-Based Music Classification and Annotation

A Survey of Audio-Based Music Classification and Annotation A Survey of Audio-Based Music Classification and Annotation Zhouyu Fu, Guojun Lu, Kai Ming Ting, and Dengsheng Zhang IEEE Trans. on Multimedia, vol. 13, no. 2, April 2011 presenter: Yin-Tzu Lin ( 阿孜孜 ^.^)

More information

GCT535- Sound Technology for Multimedia Timbre Analysis. Graduate School of Culture Technology KAIST Juhan Nam

GCT535- Sound Technology for Multimedia Timbre Analysis. Graduate School of Culture Technology KAIST Juhan Nam GCT535- Sound Technology for Multimedia Timbre Analysis Graduate School of Culture Technology KAIST Juhan Nam 1 Outlines Timbre Analysis Definition of Timbre Timbre Features Zero-crossing rate Spectral

More information

A Survey on: Sound Source Separation Methods

A Survey on: Sound Source Separation Methods Volume 3, Issue 11, November-2016, pp. 580-584 ISSN (O): 2349-7084 International Journal of Computer Engineering In Research Trends Available online at: www.ijcert.org A Survey on: Sound Source Separation

More information

Further Topics in MIR

Further Topics in MIR Tutorial Automatisierte Methoden der Musikverarbeitung 47. Jahrestagung der Gesellschaft für Informatik Further Topics in MIR Meinard Müller, Christof Weiss, Stefan Balke International Audio Laboratories

More information

Lecture 15: Research at LabROSA

Lecture 15: Research at LabROSA ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 15: Research at LabROSA 1. Sources, Mixtures, & Perception 2. Spatial Filtering 3. Time-Frequency Masking 4. Model-Based Separation Dan Ellis Dept. Electrical

More information

Multiple instrument tracking based on reconstruction error, pitch continuity and instrument activity

Multiple instrument tracking based on reconstruction error, pitch continuity and instrument activity Multiple instrument tracking based on reconstruction error, pitch continuity and instrument activity Holger Kirchhoff 1, Simon Dixon 1, and Anssi Klapuri 2 1 Centre for Digital Music, Queen Mary University

More information

Melody Extraction from Generic Audio Clips Thaminda Edirisooriya, Hansohl Kim, Connie Zeng

Melody Extraction from Generic Audio Clips Thaminda Edirisooriya, Hansohl Kim, Connie Zeng Melody Extraction from Generic Audio Clips Thaminda Edirisooriya, Hansohl Kim, Connie Zeng Introduction In this project we were interested in extracting the melody from generic audio files. Due to the

More information

GRADIENT-BASED MUSICAL FEATURE EXTRACTION BASED ON SCALE-INVARIANT FEATURE TRANSFORM

GRADIENT-BASED MUSICAL FEATURE EXTRACTION BASED ON SCALE-INVARIANT FEATURE TRANSFORM 19th European Signal Processing Conference (EUSIPCO 2011) Barcelona, Spain, August 29 - September 2, 2011 GRADIENT-BASED MUSICAL FEATURE EXTRACTION BASED ON SCALE-INVARIANT FEATURE TRANSFORM Tomoko Matsui

More information

The Million Song Dataset

The Million Song Dataset The Million Song Dataset AUDIO FEATURES The Million Song Dataset There is no data like more data Bob Mercer of IBM (1985). T. Bertin-Mahieux, D.P.W. Ellis, B. Whitman, P. Lamere, The Million Song Dataset,

More information

Music Information Retrieval Community

Music Information Retrieval Community Music Information Retrieval Community What: Developing systems that retrieve music When: Late 1990 s to Present Where: ISMIR - conference started in 2000 Why: lots of digital music, lots of music lovers,

More information

Singer Recognition and Modeling Singer Error

Singer Recognition and Modeling Singer Error Singer Recognition and Modeling Singer Error Johan Ismael Stanford University jismael@stanford.edu Nicholas McGee Stanford University ndmcgee@stanford.edu 1. Abstract We propose a system for recognizing

More information

Content-based Music Structure Analysis with Applications to Music Semantics Understanding

Content-based Music Structure Analysis with Applications to Music Semantics Understanding Content-based Music Structure Analysis with Applications to Music Semantics Understanding Namunu C Maddage,, Changsheng Xu, Mohan S Kankanhalli, Xi Shao, Institute for Infocomm Research Heng Mui Keng Terrace

More information

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,

More information

CTP431- Music and Audio Computing Music Information Retrieval. Graduate School of Culture Technology KAIST Juhan Nam

CTP431- Music and Audio Computing Music Information Retrieval. Graduate School of Culture Technology KAIST Juhan Nam CTP431- Music and Audio Computing Music Information Retrieval Graduate School of Culture Technology KAIST Juhan Nam 1 Introduction ü Instrument: Piano ü Genre: Classical ü Composer: Chopin ü Key: E-minor

More information

Video-based Vibrato Detection and Analysis for Polyphonic String Music

Video-based Vibrato Detection and Analysis for Polyphonic String Music Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International

More information

AN APPROACH FOR MELODY EXTRACTION FROM POLYPHONIC AUDIO: USING PERCEPTUAL PRINCIPLES AND MELODIC SMOOTHNESS

AN APPROACH FOR MELODY EXTRACTION FROM POLYPHONIC AUDIO: USING PERCEPTUAL PRINCIPLES AND MELODIC SMOOTHNESS AN APPROACH FOR MELODY EXTRACTION FROM POLYPHONIC AUDIO: USING PERCEPTUAL PRINCIPLES AND MELODIC SMOOTHNESS Rui Pedro Paiva CISUC Centre for Informatics and Systems of the University of Coimbra Department

More information

A CLASSIFICATION APPROACH TO MELODY TRANSCRIPTION

A CLASSIFICATION APPROACH TO MELODY TRANSCRIPTION A CLASSIFICATION APPROACH TO MELODY TRANSCRIPTION Graham E. Poliner and Daniel P.W. Ellis LabROSA, Dept. of Electrical Engineering Columbia University, New York NY 127 USA {graham,dpwe}@ee.columbia.edu

More information

EE391 Special Report (Spring 2005) Automatic Chord Recognition Using A Summary Autocorrelation Function

EE391 Special Report (Spring 2005) Automatic Chord Recognition Using A Summary Autocorrelation Function EE391 Special Report (Spring 25) Automatic Chord Recognition Using A Summary Autocorrelation Function Advisor: Professor Julius Smith Kyogu Lee Center for Computer Research in Music and Acoustics (CCRMA)

More information

Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models

Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models Aric Bartle (abartle@stanford.edu) December 14, 2012 1 Background The field of composer recognition has

More information

/$ IEEE

/$ IEEE 564 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 3, MARCH 2010 Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals Jean-Louis Durrieu,

More information

MODELING OF PHONEME DURATIONS FOR ALIGNMENT BETWEEN POLYPHONIC AUDIO AND LYRICS

MODELING OF PHONEME DURATIONS FOR ALIGNMENT BETWEEN POLYPHONIC AUDIO AND LYRICS MODELING OF PHONEME DURATIONS FOR ALIGNMENT BETWEEN POLYPHONIC AUDIO AND LYRICS Georgi Dzhambazov, Xavier Serra Music Technology Group Universitat Pompeu Fabra, Barcelona, Spain {georgi.dzhambazov,xavier.serra}@upf.edu

More information

NEURAL NETWORKS FOR SUPERVISED PITCH TRACKING IN NOISE. Kun Han and DeLiang Wang

NEURAL NETWORKS FOR SUPERVISED PITCH TRACKING IN NOISE. Kun Han and DeLiang Wang 24 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) NEURAL NETWORKS FOR SUPERVISED PITCH TRACKING IN NOISE Kun Han and DeLiang Wang Department of Computer Science and Engineering

More information

Retrieval of textual song lyrics from sung inputs

Retrieval of textual song lyrics from sung inputs INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Retrieval of textual song lyrics from sung inputs Anna M. Kruspe Fraunhofer IDMT, Ilmenau, Germany kpe@idmt.fraunhofer.de Abstract Retrieving the

More information

Automatic music transcription

Automatic music transcription Music transcription 1 Music transcription 2 Automatic music transcription Sources: * Klapuri, Introduction to music transcription, 2006. www.cs.tut.fi/sgn/arg/klap/amt-intro.pdf * Klapuri, Eronen, Astola:

More information

A. Ideal Ratio Mask If there is no RIR, the IRM for time frame t and frequency f can be expressed as [17]: ( IRM(t, f) =

A. Ideal Ratio Mask If there is no RIR, the IRM for time frame t and frequency f can be expressed as [17]: ( IRM(t, f) = 1 Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks Yang Sun, Student Member, IEEE, Wenwu Wang, Senior Member, IEEE, Jonathon Chambers, Fellow, IEEE, and

More information

arxiv: v1 [cs.sd] 4 Jun 2018

arxiv: v1 [cs.sd] 4 Jun 2018 REVISITING SINGING VOICE DETECTION: A QUANTITATIVE REVIEW AND THE FUTURE OUTLOOK Kyungyun Lee 1 Keunwoo Choi 2 Juhan Nam 3 1 School of Computing, KAIST 2 Spotify Inc., USA 3 Graduate School of Culture

More information

Pitch-Synchronous Spectrogram: Principles and Applications

Pitch-Synchronous Spectrogram: Principles and Applications Pitch-Synchronous Spectrogram: Principles and Applications C. Julian Chen Department of Applied Physics and Applied Mathematics May 24, 2018 Outline The traditional spectrogram Observations with the electroglottograph

More information

MELODY EXTRACTION FROM POLYPHONIC AUDIO OF WESTERN OPERA: A METHOD BASED ON DETECTION OF THE SINGER S FORMANT

MELODY EXTRACTION FROM POLYPHONIC AUDIO OF WESTERN OPERA: A METHOD BASED ON DETECTION OF THE SINGER S FORMANT MELODY EXTRACTION FROM POLYPHONIC AUDIO OF WESTERN OPERA: A METHOD BASED ON DETECTION OF THE SINGER S FORMANT Zheng Tang University of Washington, Department of Electrical Engineering zhtang@uw.edu Dawn

More information

The Intervalgram: An Audio Feature for Large-scale Melody Recognition

The Intervalgram: An Audio Feature for Large-scale Melody Recognition The Intervalgram: An Audio Feature for Large-scale Melody Recognition Thomas C. Walters, David A. Ross, and Richard F. Lyon Google, 1600 Amphitheatre Parkway, Mountain View, CA, 94043, USA tomwalters@google.com

More information

Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting

Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting Dalwon Jang 1, Seungjae Lee 2, Jun Seok Lee 2, Minho Jin 1, Jin S. Seo 2, Sunil Lee 1 and Chang D. Yoo 1 1 Korea Advanced

More information

Data Driven Music Understanding

Data Driven Music Understanding Data Driven Music Understanding Dan Ellis Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Engineering, Columbia University, NY USA http://labrosa.ee.columbia.edu/ 1. Motivation:

More information

A Query-by-singing Technique for Retrieving Polyphonic Objects of Popular Music

A Query-by-singing Technique for Retrieving Polyphonic Objects of Popular Music A Query-by-singing Technique for Retrieving Polyphonic Objects of Popular Music Hung-Ming Yu, Wei-Ho Tsai, and Hsin-Min Wang Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic

More information

Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine. Project: Real-Time Speech Enhancement

Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine. Project: Real-Time Speech Enhancement Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine Project: Real-Time Speech Enhancement Introduction Telephones are increasingly being used in noisy

More information

Classification of Timbre Similarity

Classification of Timbre Similarity Classification of Timbre Similarity Corey Kereliuk McGill University March 15, 2007 1 / 16 1 Definition of Timbre What Timbre is Not What Timbre is A 2-dimensional Timbre Space 2 3 Considerations Common

More information

Music Information Retrieval

Music Information Retrieval Music Information Retrieval When Music Meets Computer Science Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Berlin MIR Meetup 20.03.2017 Meinard Müller

More information

Singing voice synthesis based on deep neural networks

Singing voice synthesis based on deep neural networks INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

More information

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca

More information

Music Perception with Combined Stimulation

Music Perception with Combined Stimulation Music Perception with Combined Stimulation Kate Gfeller 1,2,4, Virginia Driscoll, 4 Jacob Oleson, 3 Christopher Turner, 2,4 Stephanie Kliethermes, 3 Bruce Gantz 4 School of Music, 1 Department of Communication

More information

A System for Acoustic Chord Transcription and Key Extraction from Audio Using Hidden Markov models Trained on Synthesized Audio

A System for Acoustic Chord Transcription and Key Extraction from Audio Using Hidden Markov models Trained on Synthesized Audio Curriculum Vitae Kyogu Lee Advanced Technology Center, Gracenote Inc. 2000 Powell Street, Suite 1380 Emeryville, CA 94608 USA Tel) 1-510-428-7296 Fax) 1-510-547-9681 klee@gracenote.com kglee@ccrma.stanford.edu

More information

TOWARD AN INTELLIGENT EDITOR FOR JAZZ MUSIC

TOWARD AN INTELLIGENT EDITOR FOR JAZZ MUSIC TOWARD AN INTELLIGENT EDITOR FOR JAZZ MUSIC G.TZANETAKIS, N.HU, AND R.B. DANNENBERG Computer Science Department, Carnegie Mellon University 5000 Forbes Avenue, Pittsburgh, PA 15213, USA E-mail: gtzan@cs.cmu.edu

More information

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES PACS: 43.60.Lq Hacihabiboglu, Huseyin 1,2 ; Canagarajah C. Nishan 2 1 Sonic Arts Research Centre (SARC) School of Computer Science Queen s University

More information

Keywords Separation of sound, percussive instruments, non-percussive instruments, flexible audio source separation toolbox

Keywords Separation of sound, percussive instruments, non-percussive instruments, flexible audio source separation toolbox Volume 4, Issue 4, April 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Investigation

More information

Proc. of NCC 2010, Chennai, India A Melody Detection User Interface for Polyphonic Music

Proc. of NCC 2010, Chennai, India A Melody Detection User Interface for Polyphonic Music A Melody Detection User Interface for Polyphonic Music Sachin Pant, Vishweshwara Rao, and Preeti Rao Department of Electrical Engineering Indian Institute of Technology Bombay, Mumbai 400076, India Email:

More information

Automatic Piano Music Transcription

Automatic Piano Music Transcription Automatic Piano Music Transcription Jianyu Fan Qiuhan Wang Xin Li Jianyu.Fan.Gr@dartmouth.edu Qiuhan.Wang.Gr@dartmouth.edu Xi.Li.Gr@dartmouth.edu 1. Introduction Writing down the score while listening

More information

MUSI-6201 Computational Music Analysis

MUSI-6201 Computational Music Analysis MUSI-6201 Computational Music Analysis Part 9.1: Genre Classification alexander lerch November 4, 2015 temporal analysis overview text book Chapter 8: Musical Genre, Similarity, and Mood (pp. 151 155)

More information

CURRENT CHALLENGES IN THE EVALUATION OF PREDOMINANT MELODY EXTRACTION ALGORITHMS

CURRENT CHALLENGES IN THE EVALUATION OF PREDOMINANT MELODY EXTRACTION ALGORITHMS CURRENT CHALLENGES IN THE EVALUATION OF PREDOMINANT MELODY EXTRACTION ALGORITHMS Justin Salamon Music Technology Group Universitat Pompeu Fabra, Barcelona, Spain justin.salamon@upf.edu Julián Urbano Department

More information

MODELS of music begin with a representation of the

MODELS of music begin with a representation of the 602 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 3, MARCH 2010 Modeling Music as a Dynamic Texture Luke Barrington, Student Member, IEEE, Antoni B. Chan, Member, IEEE, and

More information

Music Segmentation Using Markov Chain Methods

Music Segmentation Using Markov Chain Methods Music Segmentation Using Markov Chain Methods Paul Finkelstein March 8, 2011 Abstract This paper will present just how far the use of Markov Chains has spread in the 21 st century. We will explain some

More information

Singing Voice Detection for Karaoke Application

Singing Voice Detection for Karaoke Application Singing Voice Detection for Karaoke Application Arun Shenoy *, Yuansheng Wu, Ye Wang ABSTRACT We present a framework to detect the regions of singing voice in musical audio signals. This work is oriented

More information

Repeating Pattern Discovery and Structure Analysis from Acoustic Music Data

Repeating Pattern Discovery and Structure Analysis from Acoustic Music Data Repeating Pattern Discovery and Structure Analysis from Acoustic Music Data Lie Lu, Muyuan Wang 2, Hong-Jiang Zhang Microsoft Research Asia Beijing, P.R. China, 8 {llu, hjzhang}@microsoft.com 2 Department

More information