ELEC 484 Project Pitch Synchronous Overlap-Add


Joshua Patton
University of Victoria, BC, Canada

This report discusses steps towards implementing a real-time audio system based on the Pitch Synchronous Overlap and Add (PSOLA) algorithm. This time-domain algorithm, together with its formant-preserving variant (PSOLAF), is explored as a way to produce the desired pitch manipulation effects. Background information is provided, along with the motivation for using PSOLAF pitch shifting over less complex methods such as Time Stretch and Resample and Delay-Line Modulation. An ideal approach to implementing the system is discussed, along with a timeline to completion and a set of audio test clips for evaluation.

1.0 Introduction

The importance of pitch manipulation in digital audio processing and effects cannot be overstated. Applications of pitch shifting include vocoders (such as those in cell phones), the creation of realistic choir effects from a single singer, high-end audio playback equipment, audio editing and recording software, and voice disguising [SPL00]. The main motivation for this project was to demonstrate a reliable way to modify the pitch of an audio source for any of these applications. PSOLA methods offer some of the best sound reproduction with the fewest drawbacks, and are contrasted briefly with several other ways of modifying pitch.

2.0 Pitch Related Methods

There are several ways to modify a source signal's pitch. The methods below all change pitch, but are not well suited to modern applications for the reasons explained in each subsection.

2.1 Variable Speed Replay

This method of pitch shifting is very straightforward: the original sound is played back at an increased or decreased rate, which shifts the pitch.
For example,

    x_replay(n) = x_in(c n)

where c < 1 gives time expansion and c > 1 gives time compression.

Figure 1: VSR Leading to Time and Spectral Envelope Distortion [DAFX]

Figure 1 shows the detrimental effects on the signal: the duration of the clip is expanded or compressed depending on the pitch shift. This type of shifting also scales the spectral envelope, which makes the signal sound like a chipmunk when compressed (c > 1) and more like a baritone when expanded (c < 1). These effects are undesirable in practice.

2.2 Delay-Line Modulation

This method has been described in several publications and can be implemented in several ways [BB89, DAFX]. The first approach implements a pitch shift using two sawtooth waves, set half a period apart, to control the time-varying delay lines. The resulting output waveforms are multiplied by a cross-fade function and divided into blocks. When the blocks are read faster or slower, the pitch goes up or down accordingly. The downside is a fair amount of distortion, and the output signal becomes more noise prone.

Figure 2: Pitch Shifting by Delay Line Modulation

Alternatively, an overlap-and-add scheme that does not require estimation of the fundamental frequency can be employed, using three in-phase time-varying delay lines. Each line operates on a block that overlaps 2/3 of the next full block length. The result gives the same desired effect [DZ99].

2.3 SOLA Time Stretch and Resample

This method time-stretches the original signal with the SOLA algorithm described below and then applies a linear resample, giving an output signal of the same duration as the input but with a shifted pitch. Resampling is done at the rate alpha*fs, where alpha is the time-stretch constant and fs is the sampling rate.

2.4 Synchronous Overlap Add (SOLA)

This algorithm underlies all further study, and must be understood before the more complex algorithms intended for the real-time system. Synchronous overlap and add is done in several steps [MEJ86, RW85]:

Step 1: Separate the input signal into overlapping segments of fixed length, as shown in Figure 3 below.
Step 2: Shift the overlapping segments by the scaling factor (alpha).
Step 3: Search the overlapping samples for the discrete time lag of maximum similarity. At that lag, weight the samples with a fade-in/fade-out function to avoid transients, then add them together to create the final signal of changed duration.

Figure 3: SOLA Time Manipulation

3.0 Background

The goal of pitch shifting is to move the pitch of an audio signal up or down without losing its information, which is carried by the frequency content and the harmonic ratios. If done correctly, the new audio signal will have the same length and sound like the original, but at the desired pitch.

3.1 Pitch Detection/Marking

Detection and marking of pitches in the input sound are crucial to the next two algorithms.
For input signals of constant pitch, the pitch marks can be placed at the time indices where the signal reaches its maximum amplitude. For more complicated signals involving multiple instruments and vocals, however, this becomes a much more involved task. The main problem then lends itself to finding a way to separate the different pitch periods of the signal in order to accurately determine the pitch marks for each segment.

3.2 Pitch Synchronous Overlap Add (PSOLA)

This method combines the SOLA algorithm and time-domain resampling in a manner similar to Section 2.3. The major difference lies in the resampling, where interpolation between pitch marks is used to create the desired pitch effect, as described by Moulines et al. [HMC89, MC90]. Voice and speech processing are the applications at which this algorithm excels. Based on the assumption that the input can be characterized by a series of pitches, PSOLA is a two-step process. In the first step, known as analysis, the input sound is segmented into its harmonic, non-harmonic and transient parts and then characterized by pitches. In the second step, known as synthesis, various transformations are applied to the signal through a parameter set [SPL00]. The two phases proceed as follows, with illustrations for clarification:

I. Analysis:
1. Determine the pitch period. Divide the signal into small blocks over which the pitch is considered constant, then perform pitch detection on each block in succession.
2. Use a Hanning window centered on the pitch mark to extract a segment two pitch periods long. This provides a smooth fade-in/fade-out transition between blocks [BJ95].

Figure 4: PSOLA Pitch Analysis [DAFX]

II. Synthesis:
1. Choose the analysis segment identified by its corresponding time marking.
2. Use the Overlap and Add algorithm, where the scaling factor (alpha) decides whether the time signal is to be expanded or compressed. A scaling factor less than 1 causes segments to be discarded, resulting in time compression, while a scaling factor greater than 1 causes segments to be repeated, resulting in time expansion.
3. Finally, find the new time index at which to centre the next synthesis segment so that the pitch is preserved.

Figure 5: PSOLA Synthesis (time stretching) [DAFX]

The effect of this process is a shift in pitch. This is accomplished by using linear interpolation on the time-stretched signal to recreate values between the samples, and then resampling to get the desired pitch. This approach is used rather than the simple resampling seen in the SOLA algorithm, and should offer much improved sound quality over the previously discussed methods.

3.3 PSOLA with Formant Preservation (PSOLAF)

Formant preservation is similar to time-domain resampling, with the difference that the frequency resampling is applied to the short-time spectral envelope rather than to the entire signal. The spectral envelope is defined as the line that passes through all the harmonic amplitudes, as seen below in Figure 6.

Figure 6: PSOLA Pitch Shifting: Frequency Re-sampling of Spectral Envelope [DAFX]

All harmonics are scaled by the scaling factor, but their amplitudes are determined by sampling the spectral envelope. For good results during analysis, pitch marks must be placed pitch-synchronously, at the local maxima of each windowed segment [SPL00].

Figure 7: PSOLA Analysis (pitch shifting) [DAFX]

Preserving the formants of the signal effectively preserves the identity of the voice or instrument after synthesis [ML95]. Figure 7 above shows that PSOLA analysis for pitch shifting is identical to the analysis for time stretching. Figure 8 below shows how synthesis differs between the time-stretch-and-resample method and pure pitch shifting.

Figure 8: PSOLA Synthesis (pitch shifting) [DAFX]

During synthesis, rather than purely adding or removing blocks of the signal and thereby stretching it in time, the process adds or removes segments by overlapping Hanning windows, which preserves the duration of the signal while modifying its pitch.

4.0 Discussion and Results

The project's final realization was achieved with some difficulties along the way, which are examined below.

4.1 PSOLA Final Implementation

The bulk of the frustration came from trying to implement pitch scaling using the psola.m file from the DAFX text together with the time-scale-and-resample method shown above. The m-file TimescaleResamplePSOLA.m simply calls the psola function with different alpha values to set the time scaling. However, mismatched matrix dimensions, both in the indexing at the Hanning window and during resampling, caused an outright failure to process some signals, for reasons that were unclear.
The output of the psola algorithm is indeed shifted in pitch, but does not preserve the character of the original signal. This can be heard on the x1.wav clip, where the higher-pitched voice sounds chipmunk-like and the lower-pitched one sounds very baritone. These effects were successfully overcome using formant preservation, as described in the next section.
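
The resample step of the time-scale-and-resample method can be illustrated in isolation. The following is a minimal sketch, not the project code: a synthetic sine stands in for the time-stretched psola output, interp1 is used in place of resample, and the values of fs, f0 and N are illustrative assumptions.

```matlab
fs    = 8000;   % assumed sample rate in Hz
f0    = 200;    % assumed test-tone frequency in Hz
alpha = 1.5;    % time-stretch factor, as in the 3/2 test case
N     = 4000;   % assumed original length in samples

% Stand-in for the psola output: same pitch, alpha times as long
L  = round(alpha*N);
ys = sin(2*pi*f0*(0:L-1)/fs);

% Resample back to N samples by linear interpolation
y = interp1(1:L, ys, linspace(1, L, N));

% Locate the dominant frequency of the resampled signal
Y = abs(fft(y));
[~, k] = max(Y(1:N/2));
f_est = (k-1)*fs/N;   % expected to be close to alpha*f0
```

Because every frequency component is scaled by alpha in this step, the spectral envelope moves together with the harmonics, which is consistent with the chipmunk and baritone effects described above.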

4.2 PSOLA with Formant Preservation Final Implementation

This method was successful overall in producing the desired pitch scaling. The produced sounds are almost identical to the unmodified original, apart from a slight addition of noise or clipping that may be due to the Hanning windows. When scaling either up or down, the integrity of the source is preserved well, so that the result sounds like the source but at a higher or lower pitch depending on the alpha parameter. An alpha value greater than 1 results in a higher pitch, while a fractional alpha less than 1 results in a lower pitch. Changing the gamma parameter, which scales the formant frequencies, offered another range of options that was explored only briefly.

Test files and outputs are available, ranging from very simple, short tones to longer clips including vocals. The parameters used to generate the resulting sounds can be found in the Matlab script PSOLA_Formant.m, available in the appendix and on my website. Also available, in .zip format to save space, are several original .wav files from the DAFX text and their modified versions. The original files used in testing are:

1) la.wav
2) flute2.wav
3) moore_guitar.wav
4) x1.wav

Sound files and m-files can be accessed at: http://www.ece.uvic.ca/~jpatton/yeshua1984/Elec484/Elec484.html

Several sound files that were tested did not work with the algorithm. These included some proposed in the initial report submission as well as extra samples of music from my own library. As before with the PSOLA algorithm, the error message seemed to be related to pitch marks. This conclusion is an educated guess: the pitch marker program that was developed is probably not sophisticated enough to place the marks properly for complex signals with many harmonics.
It could also be that many of these signals, which included multiple instruments and the like, did not have any primary harmonics to work on, and that this led to the errors. Another explanation may be that too many pitch marks were found (erroneously), so that the shifted Hanning window could not operate properly on the signal; this is where the psolaf1.m program failed on the more complex signals, and where psola.m failed on those signals as well as others. Since all the signals that did run through psolaf1.m had fairly distinct pitches, it is reasonable to expect that the algorithm will work for any signal whose pitch marks are determined with very good accuracy.

5.0 Conclusions

Given the limited time available for this project, much further work could be done in this area. Nevertheless, the produced sound files show that the project succeeded in realizing a system that can modify pitch while maintaining the integrity of the original sound.

6.0 Future Considerations

Although this project was intended to be implemented as a real-time system, that proved impossible given the time available and the problems encountered. With further resources and a better understanding of how to transfer programs from a Matlab environment to a real-time system, this PSOLA with formant preservation program would be implemented in Marsyas. More importantly, accurate pitch detection should be considered a high priority, since the better methods that preserve the message quality need input pitch marks on which to centre their windows. Without properly placed marks, this project is not very useful for real-world applications.
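
Since accurate pitch detection is named as the main priority, one commonly used alternative to peak picking is a short-time autocorrelation estimate of the pitch period. The sketch below is an illustration of that technique only, and is not the method used in pitchmarker.m; the test signal, the sample rate and the 80 to 500 Hz search range are all assumptions.

```matlab
fs = 8000;                      % assumed sample rate in Hz
f0 = 200;                       % assumed fundamental of the test tone
n  = 0:799;                     % 100 ms analysis block
x  = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);  % tone with one overtone

minLag = round(fs/500);         % shortest period searched (500 Hz)
maxLag = round(fs/80);          % longest period searched (80 Hz)

% Autocorrelation of the block at each candidate lag
r = zeros(1, maxLag);
for lag = minLag:maxLag
    r(lag) = sum(x(1:end-lag) .* x(1+lag:end));
end

% The lag with the strongest self-similarity is taken as the pitch period
[~, period] = max(r);
f_est = fs/period;
```

Once the period of a block is known, pitch marks could be placed one period apart, which may avoid some of the spurious marks that a purely amplitude-based search produces on complex signals.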

7.0 References

[BB89] K. Bogdanowicz and R. Blecher. Using multiple processors for real-time audio effects. In AES 7th International Conference, pp. 336-342, 1989.
[BJ95] R. Bristow-Johnson. A detailed analysis of a time-domain formant-corrected pitch shifting algorithm. J. Audio Eng. Soc., 43(5):340-353, 1995.
[DAFX] U. Zölzer. Digital Audio Effects. John Wiley and Sons, pp. 202-225, 2005. http://www.dafx.de/
[DZ99] S. Disch and U. Zölzer. Modulation and delay line based digital audio effects. In Proc. DAFX-99 Digital Audio Effects Workshop, pp. 4-8, Trondheim, December 1999.
[HMC89] C. Hamon, E. Moulines and F. Charpentier. A diphone synthesis system based on time-domain prosodic modifications of speech. In Proc. ICASSP, pp. 238-244, 1989.
[MC90] E. Moulines and F. Charpentier. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication, 9(5/6):453-467, 1990.
[MEJ86] J. Makhoul and A. El-Jaroudi. Time-scale modification in medium to low rate speech coding. In Proc. ICASSP, pp. 1705-1708, 1986.
[ML95] E. Moulines and J. Laroche. Non-parametric techniques for pitch-scale and time-scale modification of speech. Speech Communication, 16:175-205, 1995.
[RW85] S. Roucos and A.M. Wilgus. High quality time-scale modification for speech. In Proc. ICASSP, pp. 493-496, 1985.
[SPL00] N. Schnell, G. Peeters, S. Lemouton, P. Manoury and X. Rodet. Synthesizing a choir in real-time using Pitch Synchronous Overlap Add (PSOLA). Ircam Centre Georges-Pompidou, pp. 1-4, 2000.

Appendix: PSOLA_Formant.m

% Pitch Shifting by PSOLA with Formant Preservation
% Josh Patton
% PSOLA_Formant.m
% Files required:
%   psolaf1.m
%   pitchmarker.m
clear all
close all
clc

%% la.wav
[x,fs,nbits]=wavread('la.wav');
gamma=2;
wavwrite(y, fs, 'la_gamma2.wav');
beta=(3/2);
wavwrite(y, fs, 'la_high.wav');
beta=(3/4);
wavwrite(y, fs, 'la_low.wav');

%% flute2.wav
[x,fs,nbits]=wavread('flute2.wav');
gamma=2;
wavwrite(y, fs, 'flute2_gamma2.wav');
beta=(3/2);
wavwrite(y, fs, 'flute2_high.wav');
beta=(3/4);
wavwrite(y, fs, 'flute2_low.wav');

%% moore_guitar.wav
[x,fs,nbits]=wavread('moore_guitar.wav');
wavwrite(y, fs, 'moore_guitar_gamma1.wav');
gamma=2;
wavwrite(y, fs, 'moore_guitar_gamma2.wav');
beta=(3/2);
wavwrite(y, fs, 'moore_guitar_high.wav');
beta=(3/4);
wavwrite(y, fs, 'moore_guitar_low.wav');

%% x1.wav
[x,fs,nbits]=wavread('x1.wav');
wavwrite(y, fs, 'x1_gamma1.wav');
gamma=2;
wavwrite(y, fs, 'x1_gamma2.wav');
beta=(3/2);
wavwrite(y, fs, 'x1_high.wav');
beta=(3/4);
wavwrite(y, fs, 'x1_low.wav');
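
The pitch factors 3/2 and 3/4 used in the script are easier to judge by ear when expressed as musical intervals. The short sketch below converts a frequency ratio to equal-tempered semitones; the variable name semis is introduced here for illustration only.

```matlab
beta  = [3/2 3/4];        % pitch factors used in the test script
semis = 12*log2(beta);    % equal-tempered semitones; negative means downward
```

A ratio of 3/2 is a perfect fifth up (about 7 semitones) and 3/4 is a perfect fourth down (about 5 semitones).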

Appendix: psolaf1.m

% This function file performs pitch shifting by pitch synchronous overlap-add
% with formant preservation, using pitch marks from an external source
% psolaf1.m
% based off of psolaf.m from DAFX
function out=psolaf1(in,m,alpha,beta,gamma)
% ...
% gamma newFormantFreq/oldFormantFreq
% ...
P = diff(m);                         %compute pitch periods
if m(1)<=P(1),                       %remove first pitch mark
   m=m(2:length(m));
   P=P(2:length(P));
end
if m(length(m))+P(length(P))>length(in) %remove last pitch mark
   m=m(1:length(m)-1);
else
   P=[P P(length(P))];
end
Lout=ceil(length(in)*alpha);
out=zeros(1,Lout);                   %output signal
tk = P(1)+1;                         %output pitch mark
while round(tk)<Lout
   [minimum i]=min(abs(alpha*m-tk)); % find analysis segment
   pit=P(i); pitStr=floor(pit/gamma);
   gr=in(m(i)-pit:m(i)+pit).*hanning(2*pit+1);
   gr=interp1(-pit:1:pit,gr,-pitStr*gamma:gamma:pit); % stretch segm.
   iniGr=round(tk)-pitStr; endGr=round(tk)+pitStr;
   if endGr>Lout, break; end
   out(iniGr:endGr)=out(iniGr:endGr)+gr; % overlap new segment
   tk=tk+pit/beta;
end
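
The interp1 line is what distinguishes this function from psola.m: each windowed grain is re-read with a step of gamma samples, so its length changes by a factor of 1/gamma before it is overlapped into the output. The sketch below isolates that step on a synthetic grain. The window is computed by hand as a Hann shape rather than with hanning, and the values of pit and gamma are assumptions chosen for illustration.

```matlab
pit    = 40;                         % assumed pitch period in samples
gamma  = 1.5;                        % assumed formant scaling factor
pitStr = floor(pit/gamma);           % half-length of the stretched grain

n  = -pit:pit;                       % sample positions of the original grain
w  = 0.5*(1 + cos(pi*n/pit));        % Hann-shaped window, two periods long
gr = w .* cos(2*pi*n/10);            % synthetic windowed grain

% Re-read the grain with step gamma, as in the psolaf1 listing
q     = -pitStr*gamma:gamma:pit;
grStr = interp1(n, gr, q);
```

The stretched grain has 2*pitStr+1 samples, which matches the iniGr:endGr output range in the listing, and every query point lies inside [-pit, pit], so interp1 does not return NaN.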

Appendix: TimescaleResamplePSOLA.m

% Pitch Shifting by PSOLA Time Stretching and Resampling
% Josh Patton
% TimescaleResamplePSOLA.m
% Files required:
%   psola.m
%   pitchmarker.m

%% test x1
[x,fs,nbits]=wavread('x1.wav');
alpha=(3/2);
y=psola(x,m,alpha,beta);
y=resample(y,length(x),length(y));
wavwrite(y, fs, 'psola_high_x1.wav');
alpha=(3/4);
y=psola(x,m,alpha,beta);
y=resample(y,length(x),length(y));
wavwrite(y, fs, 'psola_low_x1.wav');

%% test moore_guitar
[x,fs,nbits]=wavread('moore_guitar.wav');
alpha=1.5;
y=psola(x,m,alpha,beta);
y=resample(y,length(x),length(y));
wavwrite(y, fs, 'psola_high_moore_guitar.wav');
alpha=0.75;
y=psola(x,m,alpha,beta);
y=resample(y,length(x),length(y));
wavwrite(y, fs, 'psola_low_moore_guitar.wav');

Appendix: psola.m

%psola.m
%from DAFX
%Josh Patton
function out=psola(in,m,alpha,beta)
% in    input signal
% m     pitch marks (from PitchMarker.m function)
% alpha time stretching factor
% beta  pitch shifting factor
P = diff(m);                         %compute pitch periods
if m(1)<=P(1),                       %remove first pitch mark
   m=m(2:length(m));
   P=P(2:length(P));
end
if m(length(m))+P(length(P))>length(in) %remove last pitch mark
   m=m(1:length(m)-1);
else
   P=[P P(length(P))];
end
Lout=ceil(length(in)*alpha);
out=zeros(1,Lout);                   %output signal
tk = P(1)+1;                         %output pitch mark
while round(tk)<Lout
   [minimum i] = min( abs(alpha*m - tk) ); %find analysis segment
   pit=P(i);
   st=m(i)-pit;
   en=m(i)+pit;
   gr = in(st:en).* hanning(2*pit+1);
   iniGr=round(tk)-pit;
   endGr=round(tk)+pit;
   if endGr>Lout, break; end
   out(iniGr:endGr) = out(iniGr:endGr)+gr'; %overlap new segment
   tk=tk+pit/beta;
end
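
The overlap-add at the end of the loop relies on a property of the window: Hann-shaped windows two pitch periods long, placed one pitch period apart, sum to a constant, so with a constant pitch period and alpha = beta = 1 the grains add back to the input without amplitude ripple. The sketch below checks this for a constant, assumed pitch period. The window is computed by hand with zero end points, which differs slightly from the hanning call in the listing.

```matlab
pit  = 50;                           % assumed constant pitch period in samples
n    = -pit:pit;
w    = 0.5*(1 + cos(pi*n/pit));      % Hann-shaped window, two periods long

Lout = 10*pit;
out  = zeros(1, Lout);               % accumulates the overlapped windows
tk   = pit + 1;                      % first output pitch mark
while tk + pit <= Lout
    idx = tk-pit : tk+pit;
    out(idx) = out(idx) + w;         % overlap-add one window per pitch mark
    tk = tk + pit;                   % advance by one pitch period
end

% Between the first and last window centres the sum is flat
flatPart = out(pit+1 : tk-2*pit);
```

When beta differs from 1 the marks are spaced pit/beta apart and the sum is no longer exactly constant, which may contribute to the slight noise noted in Section 4.2.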

Appendix: pitchmarker.m

% pitchmarker.m
% Josh Patton
% Finds all the pitch marks in the input file and returns the
% markings in a matrix
function [ pitch ] = pitchmarker(blk_section)
%% test from within (comment out the above function line)
%[x,fs,bit]=wavread('moore_guitar.wav');
%blk_section=x;

%% Detection
% initial setup
blk_size=400;
mark=[1:length(blk_section)]*0;
last_pos=1;
place=1;
blk_size=300;
i=1;
while last_pos+floor(blk_size*1.7) < length(blk_section)
    % grabs the next block to examine
    temp=blk_section(last_pos+50:last_pos+floor(blk_size*1.7));
    % finds the high point in the block
    [mag,place]=max(temp);
    % check for a signal in the current block
    if mag < 0.01
        place=length(temp);
        mode = 0;
        mark(place+last_pos+50)=1;
        pitch(i)=place+last_pos+50;
    else
        mode = 1;
    end
    % check for pitch mark before current pitch mark
    while mode == 1
        % find the largest point in block from start to current pitch mark
        [mag2,place2]=max(temp(1:place-50));
        % check if high mark has great enough magnitude to be a pitch mark
        if mag2 > 0.90*mag
            mag=mag2;
            place=place2;
        else
            mode = 0;
            mark(place+last_pos+50)=1;
            pitch(i)=place+last_pos+50;
        end
    end
    % next block to look at is 50 samples after current block
    blk_size=place+50;
    % makes sure next blk_size is of large enough size
    if blk_size < 150
        blk_size=150;
    end
    last_pos=place+last_pos+50;
    i=i+1;
end

%% Plotting if needed
% figure(1)
% hold on
% plot(mark)
% plot(blk_section,'r')
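
Section 4.2 suggests that erroneously dense pitch marks may have caused the failures on complex signals. One possible guard, which is not part of the project code, is to discard any mark that follows the previously accepted mark too closely before passing the marks to psola or psolaf1. In the sketch below the mark positions and the minPeriod threshold are hypothetical values chosen for illustration.

```matlab
m = [60 100 103 140 181 183 220];   % hypothetical marks, two of them spurious
minPeriod = 20;                     % assumed shortest plausible pitch period

keep = true(size(m));
last = m(1);                        % most recently accepted mark
for k = 2:numel(m)
    if m(k) - last < minPeriod
        keep(k) = false;            % too close to the accepted mark: drop it
    else
        last = m(k);                % accept and use as the new reference
    end
end
mClean = m(keep);
```

A suitable threshold depends on the highest pitch expected in the material; too large a value would remove genuine marks from high-pitched sounds.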