AN ON-THE-FLY MANDARIN SINGING VOICE SYNTHESIS SYSTEM

Cheng-Yuan Lin*, J.-S. Roger Jang*, and Shaw-Hwa Hwang**
*Dept. of Computer Science, National Tsing Hua University, Taiwan
**Dept. of Electrical Engineering, National Taipei University, Taiwan

ABSTRACT

An on-the-fly Mandarin singing voice synthesis system, called SINVOIS (singing voice synthesis), is proposed in this paper. The SINVOIS system receives continuous speech of the lyrics of a song and immediately generates the corresponding singing voice based on the music score information (embedded in a MIDI file) of the song. Two sub-systems are designed and embedded into the system: a synthesis unit generator and a pitch-shifting module. In the first, the Viterbi decoding algorithm is applied to the continuous speech to generate the synthesis units for the singing voice. In the second, the PSOLA method is employed to implement the pitch-shifting function; energy, duration, and spectrum modifications of the synthesis units are also implemented in this part. The synthesized singing voice sounds reasonably good. In a subjective listening test, MOS (mean opinion score) values of 4.5 and about 3 were obtained for the original and synthesized singing voices, respectively.

1. INTRODUCTION

Text-to-speech (TTS) systems have been developed over the past few decades, and the most recent TTS systems can produce human-like, natural-sounding speech. The success of TTS systems can be attributed to their wide range of applications as well as to advances in modern computers. On the other hand, research and development in singing voice synthesis is not as mature as in speech synthesis, partly due to its limited application domains. However, as computer-based games and entertainment become popular, interesting applications of singing voice synthesis are emerging, including software for vocal training, synthesized singing voices for virtual singers, and so on.

In a conventional concatenation-based Chinese TTS system, the synthesis units are taken from a set of pre-recorded syllabic clips covering the distinct base syllables in Mandarin Chinese. A concatenation-based singing voice synthesis system works in a similar way, except that the singing voice must be synthesized according to the music score and lyrics of a given song. More specifically, a singing voice synthesis system takes as input the lyrics and melody information of a song. The lyrics are converted into syllables, and the corresponding syllable clips are selected for concatenation. The system then performs pitch/time modification and adds other desirable effects, such as vibrato and echo, to make the synthesized singing voice sound more natural. (Figure: flow chart of a conventional singing voice synthesis system.)

However, such conventional singing voice synthesis systems cannot produce personalized singing unless the user records all the base Mandarin syllables in advance, which is a time-consuming process. Moreover, the co-articulation effects must be re-created, since they are not available in the isolated base-syllable recordings. Considering these disadvantages of conventional systems, we propose the use of speech recognition technology as the front end of our SINVOIS system. In other words, to create a personalized singing voice, the user only needs to read the lyrics, sentence by sentence, to our system.
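The concatenative flow described above can be sketched in a few lines of code. The following is a minimal illustration only, not the SINVOIS implementation: the clip dictionary, the reference pitch, and the naive resampling used as a stand-in for PSOLA-style pitch/time modification are all assumptions made for the example.

```python
# Minimal sketch of a concatenative singing synthesis flow (illustrative only).
import numpy as np

FS = 16000  # sampling rate in Hz (assumed)

def naive_pitch_time(clip: np.ndarray, pitch_ratio: float, duration_s: float) -> np.ndarray:
    """Crude stand-in for PSOLA: resample for pitch, then stretch to the note length."""
    idx = np.arange(0, len(clip) - 1, pitch_ratio)          # resampling changes pitch
    shifted = np.interp(idx, np.arange(len(clip)), clip)
    out_idx = np.linspace(0, len(shifted) - 1, int(duration_s * FS))
    return np.interp(out_idx, np.arange(len(shifted)), shifted)

def synthesize(notes, syllable_clips, ref_midi=60):
    """notes: list of (syllable, midi_pitch, duration_s); syllable_clips: syllable -> waveform."""
    pieces = []
    for syllable, midi_pitch, duration_s in notes:
        ratio = 2.0 ** ((midi_pitch - ref_midi) / 12.0)      # semitone offset -> ratio
        pieces.append(naive_pitch_time(syllable_clips[syllable], ratio, duration_s))
    return np.concatenate(pieces)

if __name__ == "__main__":
    t = np.arange(int(0.3 * FS)) / FS
    clips = {"la": 0.5 * np.sin(2 * np.pi * 260 * t)}        # fake recorded syllable
    voice = synthesize([("la", 60, 0.4), ("la", 64, 0.4), ("la", 67, 0.8)], clips)
    print(voice.shape)
```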

Our system then employs forced alignment via Viterbi decoding to detect the boundary of each character, as well as its consonant and vowel parts. Once these parts are identified, we can use them as synthesis units to synthesize a singing voice of the song, retaining all the timbre and co-articulation characteristics of the user. Other add-on features, such as vibrato and echo, can be imposed in post-processing. (Figure: flow chart of our SINVOIS system.)

2. RELATED WORK

Due to limited computing power, most previous approaches to singing voice synthesis employed acoustic models of human voice production. These include:

1. The SPASM system by Perry Cook [4]
2. The CHANT system by Bennett et al. [1]
3. Rule-based synthesis by Sundberg [14]
4. The frequency modulation method by Chowning [3]

However, the performance of these methods is not acceptable, since the acoustic models cannot produce natural-sounding human voices. Recently, the success of concatenation-based text-to-speech systems has motivated the use of concatenation for singing voice synthesis. For example, the LYRICOS system by Macon et al. [9][10] is a typical concatenation-based singing voice synthesis system. The SMALLTALK system by the OKI company [7] in Japan is another example; it adopts the PSOLA method [6] (introduced in Section 4) to synthesize singing voices. Even though these systems can produce satisfactory results, they cannot produce personalized singing voices on the fly for a specific user.

3. GENERATION OF SYNTHESIS UNITS

The conventional method of synthesis unit generation for speech synthesis relies on a database of base syllables recorded in advance by a specific person with a clear voice. Once the recordings of the base syllables are available, the speech data must be processed in the following steps:

1. End-point detection [15] based on energy and zero-crossing rate is employed to identify the exact position of speech within the recordings.
2. The pitch marks of each syllable must be found; these are the positions on the time axis that indicate the beginning of each pitch period.
3. The consonant part and the vowel part of each syllable are labeled manually.

For best performance, the above three steps are usually carried out manually, which is a rather time-consuming process. In our SINVOIS system, the singing voice must be synthesized on the fly; hence all three steps are performed automatically. Moreover, we also need to identify each syllable boundary via Viterbi decoding.

3.1 Syllable Detection

For a given recording of a lyric sentence, each syllable is detected by forced alignment via Viterbi decoding [12][13]. The process can be divided into the following two steps:

1. Each character in the lyric sentence must be labeled with a base syllable. This task is not as trivial as it seems, since some character-to-syllable mappings are one-to-many. A maximum matching method is used in conjunction with a word dictionary to determine the best character-to-syllable mapping (a small sketch of this step follows this subsection).
2. The syllable sequence of the lyric sentence is then converted into bi-phone models to construct a single-sentence linear lexicon. Viterbi decoding [12][13] is then employed to align the frames of the speech recording to the bi-phone models in this one-sentence linear lexicon, so that the state sequence with maximal probability is found. The resulting optimal state sequence indicates the best alignment of each frame to a state in the lexicon. Therefore we can correctly identify the position of each syllable, including its consonant and vowel parts.
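To illustrate the first step, the following sketch applies forward maximum matching against a small pronunciation dictionary to resolve characters whose syllable depends on context. It is an assumed illustration, not the authors' code, and the dictionary entries and pinyin strings are invented examples.

```python
# Forward maximum matching for character-to-syllable mapping (illustrative sketch).
LEXICON = {                      # hypothetical dictionary: word -> syllables
    "銀行": ["yin2", "hang2"],
    "行走": ["xing2", "zou3"],
    "音樂": ["yin1", "yue4"],
}
CHAR_SYLLABLE = {                # fallback pronunciation for isolated characters
    "銀": "yin2", "行": "xing2", "走": "zou3", "音": "yin1", "樂": "le4",
}

def chars_to_syllables(sentence: str, max_word_len: int = 4) -> list:
    """Greedy forward maximum matching: at each position take the longest
    dictionary word; otherwise fall back to the per-character pronunciation."""
    syllables, i = [], 0
    while i < len(sentence):
        for length in range(min(max_word_len, len(sentence) - i), 1, -1):
            word = sentence[i:i + length]
            if word in LEXICON:
                syllables.extend(LEXICON[word])
                i += length
                break
        else:                    # no multi-character word matched
            syllables.append(CHAR_SYLLABLE.get(sentence[i], "unknown"))
            i += 1
    return syllables

print(chars_to_syllables("銀行音樂"))   # ['yin2', 'hang2', 'yin1', 'yue4']
```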
Of course, before Viterbi decoding can be applied, an acoustic model must be available in advance. The acoustic model used here contains 52 bi-phone models, which were trained on a speech corpus of 7 subjects to achieve speaker independence.

The complete acoustic model ensures the precision of syllable detection. (Figure: waveform showing a typical result of syllable detection.) For simplicity, once a syllable is detected, we can also use the zero-crossing rate directly to distinguish the consonant part from the vowel part.

3.2 Identification of Pitch Marks

Pitch marks are the positions where complete pitch periods start. We need to identify pitch marks for effective time/pitch modification. In our system, pitch mark identification is performed only on the vowel part of each syllable, since the consonant part does not have a clearly defined pitch; that is, the consonant part of a syllable is kept unchanged during pitch shifting. The steps involved in pitch mark identification are listed next (a code sketch at the end of this subsection illustrates them):

1. Use the ACF (autocorrelation function) or AMDF (average magnitude difference function) to compute the average pitch period T of a given syllable recording.
2. Find the global maximum of the syllable waveform and label its time coordinate t_m; this is the position of the first pitch mark.
3. Search for other pitch marks to the right of t_m by finding the maximum in the region [t_m + 0.9T, t_m + 1.1T]. Repeat the same procedure until all pitch marks to the right of the global maximum are found.
4. Search for the pitch marks to the left of t_m in the same way, except that the region becomes [t_m - 1.1T, t_m - 0.9T]. Repeat until all pitch marks to the left of the global maximum are found.

(Figure: waveform with the detected pitch marks denoted as circles.)

Once the pitch marks are found, we can perform the necessary pitch/time modifications according to the music score of the song and add other desirable effects for singing voices. These procedures are introduced in the next section.
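The following sketch implements the four steps above with numpy. It is an assumed illustration rather than the authors' code; it presumes a clean, voiced input, uses the autocorrelation function for the period estimate, and adopts the 0.9T-1.1T search window described above.

```python
# Pitch-mark identification sketch: ACF period estimate, then outward search.
import numpy as np

def average_pitch_period(x: np.ndarray, fs: int, fmin=80.0, fmax=400.0) -> int:
    """Average pitch period in samples, estimated with the autocorrelation function."""
    x = x - np.mean(x)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    return lo + int(np.argmax(acf[lo:hi]))

def find_pitch_marks(x: np.ndarray, fs: int) -> list:
    T = average_pitch_period(x, fs)
    marks = [int(np.argmax(x))]                      # step 2: global maximum
    while marks[-1] + int(1.1 * T) < len(x):         # step 3: walk to the right
        lo, hi = marks[-1] + int(0.9 * T), marks[-1] + int(1.1 * T)
        marks.append(lo + int(np.argmax(x[lo:hi])))
    while marks[0] - int(1.1 * T) >= 0:              # step 4: walk to the left
        lo, hi = marks[0] - int(1.1 * T), marks[0] - int(0.9 * T)
        marks.insert(0, lo + int(np.argmax(x[lo:hi])))
    return marks

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.1 * fs)) / fs
    vowel = np.sin(2 * np.pi * 150 * t) + 0.3 * np.sin(2 * np.pi * 300 * t)
    print(find_pitch_marks(vowel, fs)[:5])
```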

4. PITCH SHIFTING MODULE

In this section we introduce the essential operations of STS, which include pitch/time-scale modification and energy normalization. We then apply further fine tuning, such as the echo effect, pitch vibrato, and co-articulation handling, to make the singing voice more natural.

4.1 Pitch Shifting

Pitch shifting of speech/audio signals is an essential part of speech and music synthesis. There are several well-known approaches to pitch shifting:

1. PSOLA (pitch-synchronous overlap and add) [6]
2. Cross-fading [2]
3. Residual signal with PSOLA [5]
4. Sinusoidal modeling [11]

In our system we adopt the PSOLA method to achieve a balance between quality and efficiency. The basic concept behind PSOLA is to multiply the speech signal by a Hamming window centered at each pitch mark; the windowed signals are then relocated as necessary. More specifically, to shift the pitch up, the distance between neighboring pitch marks is decreased; to shift the pitch down, it is increased. When performing a pitch-up operation, half the size of the Hamming window equals the desired distance between neighboring pitch marks of the pitch-shifted signal. When performing a pitch-down operation, half the size of the Hamming window equals the distance between neighboring pitch marks of the original signal. As a result, some zeros may have to be inserted between two windowed signals if a pitch-down operation to less than 50% of the original pitch frequency is desired. (Figure: pitch-up and pitch-down operations, showing the Hamming windows and the zeros inserted in the pitch-down case.) A sketch of this overlap-and-add procedure is given below.
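The following is a compact TD-PSOLA-style sketch of the idea just described. It is an illustrative assumption rather than the SINVOIS code: the analysis marks are those found in Section 3.2, a ratio above 1 raises the pitch while roughly preserving the duration, and a Hann window is used instead of the Hamming window mentioned above so that the overlapped grains taper to zero.

```python
# Simplified TD-PSOLA pitch shifting: window one period of signal around each
# analysis pitch mark and overlap-add the grains at synthesis marks that are
# spaced old_period / ratio apart (ratio > 1 -> higher pitch).
import numpy as np

def psola_pitch_shift(x: np.ndarray, marks, ratio: float) -> np.ndarray:
    """`marks` must contain at least two increasing pitch-mark sample indices."""
    marks = np.asarray(marks)
    periods = np.diff(marks)                       # local pitch periods (samples)
    y = np.zeros(len(x) + int(periods.max()) + 1)
    syn_pos = float(marks[0])
    while syn_pos < marks[-1]:
        i = int(np.argmin(np.abs(marks - syn_pos)))        # nearest analysis mark
        T = int(periods[min(i, len(periods) - 1)])
        if marks[i] - T >= 0 and marks[i] + T <= len(x):
            grain = x[marks[i] - T: marks[i] + T] * np.hanning(2 * T)
            out_start = int(syn_pos) - T
            if out_start >= 0:
                y[out_start: out_start + 2 * T] += grain
        syn_pos += T / ratio                        # smaller spacing -> higher pitch
    return y[: len(x)]

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.2 * fs)) / fs
    vowel = np.sin(2 * np.pi * 150 * t)
    marks = list(range(100, len(vowel) - 100, int(fs / 150)))   # idealized marks
    higher = psola_pitch_shift(vowel, marks, ratio=1.5)         # up about 7 semitones
    print(higher.shape)
```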

4.2 Time Modification

Time modification is used to increase or decrease the duration of a synthesis unit. We use a simple linear mapping method for time modification in our system: fundamental periods are duplicated or deleted as necessary. (Figure: contraction and extension of a waveform by deleting or duplicating fundamental periods.)

Before performing time modification on a syllable, we must separate the consonant and vowel parts. Usually the consonant part is not changed at all; only the vowel part is shortened or lengthened. Since the speech-based recording already includes natural co-articulation, no special treatment of syllable transitions is needed. (In our previous approach to singing voice synthesis based on isolated base-syllable recordings, we had to smooth the transition between syllables by cross-fading over a small overlapping region.)

4.3 Energy Modification

The concatenated singing voice occasionally sounds unnatural because each synthesis unit has a different energy level (intensity or volume). Therefore, we simply adjust the amplitude of each syllable so that its energy equals the average energy of the whole sentence. The energy normalization procedure is described as follows:

1. Compute the energy of each syllable in the recorded lyric sentence, E_1, E_2, ..., E_N, where N is the number of syllables.
2. Compute the average energy E_ave = (1/N) * sum_{k=1}^{N} E_k.
3. Multiply the waveform of the k-th syllable by the constant E_ave / E_k.

On the other hand, if the recorded lyric sentence already has the desired energy profile, this energy normalization procedure does not need to be applied.

4.4 Other Desirable Effects

The results of the above synthesis procedure often contain some undesirable, artificial-sounding buzzy components. We therefore adopt the following formula to implement an echo effect:

y[n] = x[n] + a * y[n - k],

or, in terms of its z-transform,

H(z) = Y(z) / X(z) = 1 / (1 - a * z^(-k)).

The value of k controls the amount of delay and can be adjusted accordingly; usually the delay is set to 0.7 second. The echo effect essentially masks the undesirable buzzy components. Furthermore, it makes the whole synthesized singing voice sound more genuine and softer, which is also suggested by the fact that almost every karaoke machine has a knob for echo control.

Besides the echo effect, the inclusion of a vibrato effect [9] is an important factor in making the synthesized singing voice more natural. The vibrato effect is implemented according to the following guidelines (a sketch of both effects follows at the end of this subsection):

1. We use a sinusoidal function to model the vibrato. For instance, to make the pitch curve of a syllable follow a sinusoid within the range [a, b] (for instance, [0.8, 1.2]), we simply rescale and shift the basic sinusoid sin(ωt) as sin(ωt) * (b - a) / 2 + (a + b) / 2, where ω is the vibrato angular frequency and t is the frame index.
2. Reassign the positions of the pitch marks based on the above sinusoidal pitch curve with vibrato.
3. Only syllables with duration greater than 0.8 seconds are given the vibrato effect. Moreover, vibrato occurs only in the vowel part.

The following figure shows the synthesized singing voice without the vibrato effect; the first plot is the time-domain waveform and the second is the corresponding pitch curve. (Figure: waveform and pitch curve, in semitones over time, without vibrato.) After adding the vibrato effect, the waveform and pitch curve are shown in the next figure. (Figure: waveform and pitch curve, in semitones over time, with vibrato.)
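A small sketch of the two effects just described is given below. The echo is the feedback comb filter y[n] = x[n] + a * y[n - k]; the vibrato is the rescaled sinusoid applied as a per-frame pitch-scaling curve. The feedback gain, vibrato rate, frame size, and the [0.8, 1.2] range are assumed values for illustration only.

```python
# Echo (feedback comb filter) and vibrato pitch-scaling curve (illustrative sketch).
import numpy as np

def echo(x: np.ndarray, fs: int, delay_s: float = 0.7, a: float = 0.4) -> np.ndarray:
    """Feedback comb filter y[n] = x[n] + a * y[n - k], with k = delay in samples."""
    k = int(delay_s * fs)
    y = x.astype(float).copy()
    for n in range(k, len(y)):
        y[n] += a * y[n - k]
    return y

def vibrato_curve(n_frames: int, rate_hz: float = 5.0, frame_s: float = 0.01,
                  a: float = 0.8, b: float = 1.2) -> np.ndarray:
    """Per-frame pitch-scaling factors oscillating between a and b:
    sin(w*t) * (b - a) / 2 + (a + b) / 2, with t the frame time in seconds."""
    t = np.arange(n_frames) * frame_s
    w = 2.0 * np.pi * rate_hz
    return np.sin(w * t) * (b - a) / 2.0 + (a + b) / 2.0

if __name__ == "__main__":
    fs = 16000
    x = 0.1 * np.random.randn(2 * fs)          # stand-in for a synthesized voice
    print(echo(x, fs).shape, vibrato_curve(10))
```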

5. RESULTS AND ANALYSIS

The performance of our SINVOIS system depends on three factors: the outcome of forced alignment via Viterbi decoding, the result of pitch/time modification, and the special effects applied to the singing voice. We had 5 persons try 5 different Mandarin Chinese pop songs and obtained a 95% recognition rate for syllable detection. The resulting synthesized singing voices all seemed acceptable.

We adopted the MOS (mean opinion score) test [8] to obtain subjective assessments of our system. In the test, ten persons listened to the fifteen synthesized singing voices, and each person gave a score for each song. The score ranges from 1 to 5, with 5 representing the highest grade for naturalness. The following table shows the average MOS for each song. (Table: average MOS score for each song.)

From the above table, it is obvious that the synthesized singing voices are acceptable, but definitely not satisfactory enough to be described as natural sounding. The major reason is that the synthesis units are obtained from recordings of speech instead of singing; therefore the synthesized voices are reminiscent of speech rather than natural singing. However, the effect of on-the-fly synthesis is entertaining, and most people are eager to try the system for fun.

6. CONCLUSIONS AND FUTURE WORK

In this paper we have described the development of a singing voice synthesis system called SINVOIS (singing voice synthesis). The system accepts a user's speech recording of the lyric sentences and generates a synthesized singing voice based on the input recording and the song's music score. The operation of the system is divided into two parts: the synthesis unit generator based on Viterbi decoding, and time/pitch modification together with special effects. To assess the performance of SINVOIS, we designed an MOS experiment for subjective evaluation. The experimental results are acceptable, but not totally satisfactory, due to the way our synthesis units are obtained. However, much of the appeal of the system comes from the personal recording, which enables on-the-fly synthesis that retains personal characteristics through the user's timbre.

This is only a preliminary study and there are many directions for future work. Some of the immediate next steps include:

1. Find a transformation in the frequency domain that can capture and transform the speech recordings into their singing counterparts.
2. Find the most likely pitch contour via methods from system identification, so that the synthesized singing voice has a natural pitch contour.
3. Try other frequency-domain techniques for pitch shifting, such as sinusoidal modeling [11].

7. REFERENCES

[1] Bennett, Gerald, and Rodet, Xavier, "Synthesis of the Singing Voice," in Current Directions in Computer Music Research (M. V. Mathews and J. R. Pierce, eds.), pp. 9-44, MIT Press, 1989.
[2] Chen, S. G., and Lin, G. J., "High Quality and Low Complexity Pitch Modification of Acoustic Signals," Proceedings of the 1995 IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, USA, May 1995.
[3] Chowning, John M., "Frequency Modulation Synthesis of the Singing Voice," in Current Directions in Computer Music Research (M. V. Mathews and J. R. Pierce, eds.), MIT Press, 1989.
[4] Cook, P. R., "SPASM, a Real-Time Vocal Tract Physical Model Controller; and Singer, the Companion Software Synthesis System," Computer Music Journal, vol. 17, pp. 3-43, Spring 1993.
[5] Edgington, M., and Lowry, A., "Residual-Based Speech Modification Algorithms for Text-to-Speech Synthesis," Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), vol. 3, 1996.
[6] Charpentier, F., and Moulines, E., "Pitch-Synchronous Waveform Processing Techniques for Text-to-Speech Synthesis Using Diphones," European Conference on Speech Communication and Technology, pp. 3-9, Paris, 1989.
[7] OKI, the SMALLTALK singing voice synthesis system.
[8] ITU-T, Methods for Subjective Determination of Transmission Quality, International Telecommunication Union, 1996.
[9] Macon, Michael W., Jensen-Link, Leslie, Oliverio, James, Clements, Mark A., and George, E. Bryan, "A Singing Voice Synthesis System Based on Sinusoidal Modeling," Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1997.
[10] Macon, Michael W., Jensen-Link, Leslie, Oliverio, James, Clements, Mark A., and George, E. Bryan, "Concatenation-Based MIDI-to-Singing Voice Synthesis," 103rd Meeting of the Audio Engineering Society, New York, 1997.
[11] Macon, Michael W., Speech Synthesis Based on Sinusoidal Modeling, PhD thesis, Georgia Institute of Technology, October 1996.
[12] Ney, H., and Aubert, X., "Dynamic Programming Search: From Digit Strings to Large Vocabulary Word Graphs," in C.-H. Lee, F. Soong, and K. Paliwal, eds., Automatic Speech and Speaker Recognition, Kluwer, Norwell, Mass., 1996.
[13] Rabiner, L., and Juang, B.-H., Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, N.J., 1993.
[14] Sundberg, Johan, "Synthesis of Singing by Rule," in Current Directions in Computer Music Research (M. V. Mathews and J. R. Pierce, eds.), MIT Press, 1989.
[15] Zhang, Yiying, Zhu, Xiaoyan, Hao, Yu, and Luo, Yupin, "A Robust and Fast Endpoint Detection Algorithm for Isolated Word Recognition," Proceedings of the 1997 IEEE International Conference on Intelligent Processing Systems (ICIPS '97), vol. 2, 1997.
