SPANA: Development of Multimedia Tool for Learning Speech Analysis


Project Title: SPANA: Development of Multimedia Tool for Learning Speech Analysis
Supervisor: Dr. M.W. Mak
Student Name: Sit Chin Hung
Student ID: D
Period: Aug 2003 to Apr 2004

Abstract

Digital speech processing is widely used in modern applications such as mobile phone communication, voice recognition and voice verification systems. Speech analysis is fundamental to all of these applications. To help students learn the abstract concepts in speech analysis, a software tool, SPANA, was developed. In this version of SPANA, six functions were added: Plotting of Pitch Contour, Plotting of LPCC-based spectral envelope, Plotting of spectral envelope by FFT-based cepstral liftering, Zooming of the speech signal in time domain, Interactive Fast Display and Interactive Spectral Plot.

For the Plotting of Pitch Contour, the AMDF (average magnitude difference function) method was used for pitch detection, and a probabilistic approach was applied to reduce pitch detection errors. Plotting of the LPCC-based spectral envelope and of the spectral envelope by FFT-based cepstral liftering was integrated into the PlotSpectralEnvelope( ) function, which was already responsible for plotting the frequency spectrum and the LP spectral envelope. As a result, PlotSpectralEnvelope( ) can plot all of these envelopes on the same screen.

The Zoom function was implemented by creating an event handler, OnMouseMove( ), for the mouse move event, so that the time-domain waveform of the speech signal is blown up as the mouse pointer moves across it. Interactive Fast Display was implemented by adding a KeyDown event handler, OnKeyDown( ), which responds to keys pressed on the keyboard. Interactive Spectral Plot was implemented by updating the LPCC-based spectral envelope in response to the user's changes to the poles, zeros and sensitivity of the LP parameters.

SPANA was developed in the MS Visual C++ environment with MFC. It is believed that the addition of these functions makes SPANA more user-friendly and helps students learn speech analysis more easily.

Acknowledgments

I would like to offer my special thanks to my supervisor, Dr M. W. Mak, for his valuable advice and useful materials on handling the project. I was impressed by his willingness to give his time so generously to guide me towards the solution instead of giving me the solution directly. I would also like to extend my thanks to the technicians of the laboratory of the EIE department for their help in providing the resources used in the development of the project.

Table of Contents

Chapter 1 Introduction: Background; Objectives; Organization

Chapter 2 Project Specification: Plotting the LPCC-based spectral envelope; Plotting the spectral envelope by FFT-based cepstral liftering; Plotting of Pitch Contour; Adding a Zoom function to view the speech signal in time domain; Interactive Fast Display; Interactive Spectral Plot

Chapter 3 Theories of Speech Analysis: Pitch Estimation; Smoothing the LP-based spectral envelopes by cepstral processing; FFT-based cepstral liftering

Chapter 4 Window Programming: Introduction; The Windows Programming Model; Microsoft Foundation Class (MFC); The Document/View Architecture; Guide to create a simple windows program using MFC; Create a Single Document Interface (SDI) Application; Add a menu entry to Menu; Adding function to the Menu; Drawing a line; GDI, Memory DC and Bitmap

Chapter 5 Methodology

Plotting the LPCC-based spectral envelope: Introduction; Program Flowchart; Getting the frame index; Windowing the samples of the selected frame; Computing the LPC coefficients; Computing the Makhoul's a; Computing the LPC gain; Computing the cepstral coefficients; Appending zero quefrency; Computing the smooth spectrum from LP-derived cepstrum

5.2 Plotting the spectral envelope by FFT-based cepstral liftering: Introduction; Program flowchart; Computing the frame index and then the windowed signals of a frame; Computing the Short-Term Real Cepstrum (strc); Perform liftering, cut-off time at porder; Computing spectral envelope based on strc

Flow chart of the plotting function; The plotting function; Starting to plot the two spectral envelopes

Plotting the Pitch Contour: Introduction; Program Flowchart; Computation of the Mean and Standard Deviation; Lowpass filtering the entire speech signal; Hanning windowing the speech samples; Computing the zero crossing rate; Computing the pitch period estimates using AMDF; Mean and standard deviation of the pitch period estimates; Computing all candidate pitch periods for the selected frame; Computing the zero crossing rate; Filtering out the markers with the constraints; Finding the constraints; Weighting the markers with the normal distribution

Adding a zoom function to view the speech signal in time domain: Introduction; Program flowchart; Creating and showing the Zoom scale dialog; The OnMouseMove( ) handler; Getting the zoom factor to compute the new frame size; Calculating the Start Play Sample index; Windowing the frame samples; Appending zeros for FFT; Computation of frequency spectrum and the spectral envelopes; Computing the frequency spectrum; Computing the LPC envelope; Computing the LPCC-based spectral envelope; Computing the spectral envelope by FFT-based cepstral liftering; The Zoom Scale Dialog

Interactive Fast Display: 5.6.1 Introduction; Program Flowchart; Getting the current frame index and next frame index; Setting the red vertical line to new position

Interactive Spectral Plot: Introduction; Program flowchart; Computing a new set of LP coefficients; Computing the new LPCC-based spectral envelope; Plotting the new LPCC-based spectral envelope

Chapter 6 Results and Discussion: Plotting of the LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering; Plotting of Pitch Contour; Zooming the speech signals in time domain; Interactive Fast Display; Interactive Spectral Plot

Chapter 7 Conclusion and Recommendations: Conclusion; Recommendations for further work

References

Chapter 1 Introduction

To develop applications that use digital speech processing technology, such as mobile phone communication, speech synthesis and speech recognition, we have to understand the characteristics of speech signals. Speech analysis refers to the analysis and extraction of these characteristics. A speech analysis learning tool, SPANA, was therefore developed to help students learn speech analysis. The project began in August 2003 and was completed in April 2004.

1.1 Background

SPANA was first developed a few years ago and has been enhanced continuously since then. It is developed in the Visual C++ environment using MFC and runs as a 32-bit Windows application.

1.2 Objectives

The objective of this project is to enhance SPANA by integrating new features. The new features are Plotting of Pitch Contour, Plotting of LPCC-based spectral envelope, Plotting of spectral envelope by FFT-based cepstral liftering, Zooming of the speech signal in time domain, Interactive Fast Display and Interactive Spectral Plot. Past versions of SPANA already included features such as Spectrogram Display, line spectrum pair analysis, and average energy and zero crossing measurements.

This report covers the theories of both speech analysis and Windows programming. A fundamental knowledge of both is therefore recommended for a better understanding of this project.

1.3 Organization

This chapter has introduced the background and the objectives of the project. The rest of the dissertation is organized as follows. Chapter 2 presents the specifications of the project. Chapter 3 describes the speech analysis theories involved in this project. Chapter 4 provides information about Windows programming. Since MFC was used in this project, the chapter includes a brief introduction to MFC, covering how to create a Single Document Interface application, how to add a function to a menu and how to paint using a Device Context. Chapter 5 describes the methodologies used in the development of this project, including the flowcharts of the algorithms, the implementation procedures and the program code. Chapter 6 presents the results and discussion. Finally, conclusions are presented in Chapter 7 together with recommendations for further work.

Chapter 2 Project Specification

The specifications of this project are as follows:

1. Plotting the LPCC-based spectral envelope
2. Plotting the spectral envelope by FFT-based cepstral liftering
3. Plotting of Pitch Contour
4. Adding a zoom function to view the speech signal in time domain
5. Interactive Fast Display
6. Interactive Spectral Plot

2.1 Plotting the LPCC-based spectral envelope

The LPCC-based spectral envelope is the envelope obtained by smoothing the LP-based spectral envelope by cepstral processing. Its advantage is that it provides a more consistent representation of a speaker's vocal tract characteristics. The envelope created from LP-derived cepstral coefficients (LPCCs) tracks the peaks of the speech spectrum, and hence it can be used as a feature for speaker recognition.

2.2 Plotting the spectral envelope by FFT-based cepstral liftering

The spectral envelope by FFT-based cepstral liftering is obtained by applying cepstral liftering and then an FFT to the Short-Term Real Cepstrum (strc). Cepstral liftering is analogous to filtering in the usual frequency domain. The spectral envelope can be applied in formant estimation and pitch detection.

The LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering are painted on the same screen so that their relationship can be seen.

2.3 Plotting of Pitch Contour

The pitch contour of a speech signal shows the pitch period of every frame of voiced sound.

2.4 Adding a Zoom function to view the speech signal in time domain

The Zoom function zooms the speech signal in the time domain. The scale of the waveform blow-up is adjustable: users tune it with a scale slider that offers four scales. When users move the mouse pointer, the zoom window shifts with it. The zoom window contains two parts. The upper part shows the zoomed speech signal in the time domain, while the lower part displays the spectrum of the signal within the zoom window together with its LPC envelope, LPCC-based spectral envelope and spectral envelope by FFT-based cepstral liftering.

2.5 Interactive Fast Display

In this version of SPANA, users can interact with the Fast Display dialog by using the keyboard. The red vertical line (indicating the frame currently displayed in the Fast Display dialog) shifts in response to the right and left arrow keys: the right arrow key shifts the line to the right, while the left arrow key shifts it to the left. For example, if the frame currently displayed in the Fast Display dialog is 10 and the user presses the right arrow key once, the red vertical line shifts to the right and the Fast Display dialog displays the next frame.
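The keyboard behaviour specified in Section 2.5 can be sketched in plain C++, independent of MFC. This is only an illustration: the names `ArrowKey` and `stepFrame` are hypothetical, the use of the left and right arrow keys is an assumption, and clamping at the first and last frame is an assumption rather than something the specification states.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two keys that move the red vertical line.
enum class ArrowKey { Left, Right };

// Returns the 1-based frame index shown after one key press.
// Assumption: the index is clamped to the range [1, numFrames].
int stepFrame(int currentFrame, int numFrames, ArrowKey key) {
    assert(numFrames >= 1);
    assert(currentFrame >= 1 && currentFrame <= numFrames);
    int next = currentFrame + (key == ArrowKey::Right ? 1 : -1);
    if (next < 1) next = 1;
    if (next > numFrames) next = numFrames;
    return next;
}
```

In an MFC view, a function of this kind would typically be called from the OnKeyDown( ) handler before the red vertical line is redrawn.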


2.6 Interactive Spectral Plot

During the display of the spectral envelopes, users can change the LPCC-based spectral envelope interactively by adjusting the following parameters:

i. Poles in the LP Pole Control dialog (Figure 2.1 A)
ii. Zeros in the LSP Control dialog (Figure 2.1 B)
iii. Sensitivity of LP parameters in the Sensitivity of LP Parameters dialog (Figure 2.1 C)

Figure 2.1 (A) LP Pole Control dialog, (B) LSP Control dialog, (C) Sensitivity of LP Parameters dialog

Chapter 3 Theories of Speech Analysis

The following theories of speech analysis were applied in this project.

1. Pitch Estimation
2. Smoothing the LP-based spectral envelopes by cepstral processing
3. Cepstral liftering

3.1 Pitch Estimation

Basic algorithm for AMDF

For each frame n, the short-term average magnitude difference function (AMDF) is defined as follows:

$$ \mathrm{AMDF}_n(j) = \frac{1}{N} \sum_{i=1}^{N} \left| x_n(i) - x_n(i+j) \right|, \qquad 1 \le j \le \mathrm{MAXLAG} \qquad (3.1) $$

where MAXLAG is the maximum number of AMDF values generated in each frame. The difference function has a local minimum when the lag j is equal to or very close to the fundamental period. Thus, for each frame, the lag at which the AMDF reaches its global minimum is a strong candidate for the pitch period of that frame [9].

Problem with this algorithm

The disadvantage of this algorithm is that the minimum in each frame is strongly affected by the intensity variation and the background noise of the speech signal. To reduce the resulting errors, the pitch detection system requires a global error correction routine that locates the incorrect estimates and corrects them [9].
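As an illustration of Eq. 3.1, the following is a minimal sketch of the AMDF and of picking the lag with the smallest value. It is not the SPANA implementation: the function names `amdf` and `pitchLag` are hypothetical, indices are 0-based rather than 1-based, and the global error correction described above is not included.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// d[j] = (1/N) * sum_{i=0}^{N-1} |x[i] - x[i+j]| for j = 1..maxLag.
// d[0] is unused. x must hold at least N + maxLag samples.
std::vector<double> amdf(const std::vector<double>& x, std::size_t N, std::size_t maxLag) {
    assert(N > 0 && x.size() >= N + maxLag);
    std::vector<double> d(maxLag + 1, 0.0);
    for (std::size_t j = 1; j <= maxLag; ++j) {
        double sum = 0.0;
        for (std::size_t i = 0; i < N; ++i) {
            sum += std::fabs(x[i] - x[i + j]);
        }
        d[j] = sum / static_cast<double>(N);
    }
    return d;
}

// Returns the lag in [minLag, maxLag] at which the AMDF is smallest.
std::size_t pitchLag(const std::vector<double>& d, std::size_t minLag) {
    assert(minLag >= 1 && minLag < d.size());
    std::size_t best = minLag;
    for (std::size_t j = minLag + 1; j < d.size(); ++j) {
        if (d[j] < d[best]) best = j;
    }
    return best;
}
```

For a signal that repeats exactly every 8 samples, the AMDF is zero at lag 8, so `pitchLag` returns 8 as long as MAXLAG is below 16.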

3.2 Smoothing the LP-based spectral envelopes by cepstral processing

Linear prediction (LP) analysis is based on the assumption that the current sample of the speech signal s(n) can be predicted from the past P speech samples:

$$ s(n) \approx \tilde{s}(n) = -\sum_{k=1}^{P} a_k\, s(n-k) \qquad (2.2) $$

where $\{a_k\}_{k=1}^{P}$ are called the LP coefficients. Another assumption is that the excitation source Gu(n), where G is the gain and u(n) is the normalized excitation, can be separated from the vocal tract. Under these two assumptions, the vocal tract can be represented by an IIR filter of the form:

$$ H(z) = \frac{S(z)}{G\,U(z)} = \frac{1}{1 + \sum_{k=1}^{P} a_k z^{-k}} \qquad (2.3) $$

The time-domain representation of the output s(n) of this IIR filter is a linear regression on its past output values and the present input Gu(n):

$$ s(n) = -\sum_{k=1}^{P} a_k\, s(n-k) + G\,u(n) \qquad (2.4) $$

LP analysis aims at computing a set of LP coefficients $a = \{a_1, \ldots, a_P\}$ for each frame of speech such that the frequency response of Eq. 2.3 is as close as possible to the frequency spectrum of the speech signal. The vocal tract of a speaker can therefore be modeled by the LP coefficients [1].

However, although the LP coefficients represent the spectral envelope of the speech signal, it was found that a more consistent representation of a speaker's vocal tract characteristics can be obtained by smoothing the LP-based spectral envelope by cepstral processing. The cepstral coefficients $c_n$ can be computed from the LP coefficients $a_k$ as follows:

$$ c_0 = \ln G \qquad (2.5) $$

$$ c_n = -a_n - \sum_{k=1}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad 1 \le n \le P \qquad (2.6) $$

$$ c_n = -\sum_{k=1}^{n-1} \frac{k}{n}\, c_k\, a_{n-k}, \qquad n > P \qquad (2.7) $$

where G is the estimated model gain, P is the prediction order, and $a_m = 0$ for $m > P$. Figure 3.1 shows the process of computing the LP-based cepstral parameters. Since the parameters are derived from LP analysis, they are called LP-derived cepstral coefficients (LPCCs) [1].

[Figure 3.1: Computation of LPCCs from speech signals. Block diagram with the blocks Frame Blocking, Windowing and Pre-emphasis, LP Analysis and Cepstral Transformation; the input is the speech signal, the intermediate signal is the LPC vector and the output is the cepstral vector.]

Since the envelope created by the LPCCs tracks the peaks of the speech spectrum, LPCCs can be used as a feature for speaker recognition.

3.3 FFT-based cepstral liftering

The spectral envelope by FFT-based cepstral liftering is obtained by applying cepstral liftering and then an FFT (Fast Fourier Transform) to the Short-Term Real Cepstrum (strc). Figure 3.2 shows the computation of the spectral envelope by FFT-based cepstral liftering.
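The recursion of Eqs. 2.5 to 2.7 can be illustrated with a short sketch. It assumes the sign convention of Eq. 2.3, that is H(z) = G / (1 + a_1 z^-1 + ... + a_P z^-P), so that the minus signs apply as shown in the code. The function name `lpcToCepstrum` and the convention that `a[0]` is unused are choices made for this example, not part of SPANA.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// a[1..P] are the LP coefficients (a[0] is ignored), gain is G.
// Returns c[0..numCep], where c[0] = ln G.
std::vector<double> lpcToCepstrum(const std::vector<double>& a, double gain,
                                  std::size_t numCep) {
    assert(!a.empty() && gain > 0.0);
    const std::size_t P = a.size() - 1;
    std::vector<double> c(numCep + 1, 0.0);
    c[0] = std::log(gain);
    for (std::size_t n = 1; n <= numCep; ++n) {
        double sum = 0.0;
        for (std::size_t k = 1; k < n; ++k) {
            // a_{n-k} is zero for n-k > P, so those terms are skipped.
            if (n - k <= P) {
                sum += static_cast<double>(k) * c[k] * a[n - k];
            }
        }
        c[n] = -sum / static_cast<double>(n);
        if (n <= P) c[n] -= a[n];  // the -a_n term exists only for n <= P
    }
    return c;
}
```

A quick sanity check uses a first-order model with a_1 = -0.5 and G = 1. Then ln H(z) = sum over n of (0.5^n / n) z^-n, so the recursion should give c_1 = 0.5, c_2 = 0.125 and c_3 = 0.125/3.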

[Figure 3.2: Computation of the spectral envelope by FFT-based cepstral liftering. Processing chain: s(n) -> zero padding -> FFT -> Log( ) -> IFFT -> strc -> low-time lifter -> FFT -> spectral envelope.]

Liftering

Liftering refers to linear filtering in the quefrency domain. A low-time lifter is therefore analogous to a lowpass filter in the usual frequency domain.

[Figure 3.3: Liftering. The strc is multiplied by the lifter l(n), which equals 1 up to the cut-off L.]

The low-time lifter is defined as

$$ l(n) = \begin{cases} 1, & n = 0, 1, \ldots, L \\ 0, & \text{otherwise} \end{cases} $$
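The low-time lifter defined above amounts to keeping the first L+1 cepstral values and zeroing the rest. The sketch below follows that definition literally; the function name `lowTimeLifter` is hypothetical and the code is not taken from SPANA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies l(n) = 1 for n = 0..L and l(n) = 0 otherwise to a cepstrum.
std::vector<double> lowTimeLifter(const std::vector<double>& cepstrum, std::size_t L) {
    assert(!cepstrum.empty());
    std::vector<double> out(cepstrum.size(), 0.0);
    for (std::size_t n = 0; n < cepstrum.size() && n <= L; ++n) {
        out[n] = cepstrum[n];
    }
    return out;
}
```

For example, liftering the sequence {1, 2, 3, 4, 5} with L = 2 yields {1, 2, 3, 0, 0}.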


Chapter 4 Window Programming

4.1 Introduction

4.1.1 The Windows Programming Model

DOS applications use a procedural programming model, while Windows programming is based on an event-driven model. In the Windows programming model, a message queue stores the events to be handled later (Fig 4.1). An event can be a mouse move, a mouse click, minimizing a window frame, and so on. When an event occurs, for instance when the mouse pointer is moved over the window frame, the corresponding message, WM_MOUSEMOVE, is generated. After the message enters the message queue, it is retrieved by the message loop and dispatched to the corresponding message handler, whose procedures are then run. The message handler for WM_MOUSEMOVE is OnMouseMove().

Figure 4.1 Windows programming model
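The queue-and-dispatch idea can be illustrated without any Windows headers. The sketch below is a toy model only: `Message` and `MessagePump` are hypothetical names, the two enumerators merely stand in for WM_MOUSEMOVE and WM_KEYDOWN, and real Windows message handling carries parameters and a window handle that are omitted here.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <utility>

// Stand-ins for WM_MOUSEMOVE and WM_KEYDOWN.
enum class Message { MouseMove, KeyDown };

class MessagePump {
public:
    // Registers the handler that runs when message m is dispatched.
    void onMessage(Message m, std::function<void()> handler) {
        assert(static_cast<bool>(handler));
        handlers_[m] = std::move(handler);
    }

    // Events are queued first and handled later by run().
    void post(Message m) { queue_.push(m); }

    // The message loop: takes messages off the queue in order and
    // dispatches each one to its handler. Returns the number handled.
    int run() {
        int dispatched = 0;
        while (!queue_.empty()) {
            const Message m = queue_.front();
            queue_.pop();
            const auto it = handlers_.find(m);
            if (it != handlers_.end()) {
                it->second();
                ++dispatched;
            }
        }
        return dispatched;
    }

private:
    std::queue<Message> queue_;
    std::map<Message, std::function<void()>> handlers_;
};
```

Posting a MouseMove message after registering a handler for it causes that handler to run on the next call to `run()`; messages without a registered handler are simply dropped in this sketch.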

4.1.2 Microsoft Foundation Class (MFC)

To create software with a Graphical User Interface (GUI), we can use the Windows Application Programming Interface (API). However, the Windows API contains a very large number of functions, and it is quite time consuming to develop software by calling the Windows API directly.

MFC is a library that provides multiple levels of support to developers. At one level, it provides a C++ class library that encapsulates the Windows API. Many of these classes encapsulate intrinsic Windows objects and their associated functions, allowing developers to work at a somewhat higher level of abstraction than with the raw API. For example, to create a simple window in MFC, you declare an instance of the CWnd class and call its Create() function. All of the steps that are required to create a window (such as defining a WndProc and registering a window class) are provided by the CWnd class implementation. MFC is used by including the header file Afxwin.h in the application.

4.1.3 The Document/View Architecture

The document/view architecture was introduced in MFC 2.0. With this architecture, when you create a Single Document Interface (SDI) application, four specific classes are created to make up the application:

- The CWinApp-derived class
- The CFrameWnd-derived class
- The CDocument-derived class
- The CView-derived class

The CWinApp class receives all the event messages and then passes the messages to the CFrameWnd and CView classes.


The CFrameWnd class is the window frame. It is responsible for holding the menu, toolbar, scrollbars and any other visible objects attached to the frame. It also determines how much of the document is visible at any time.

The CDocument class houses your document. It is responsible for the storage and manipulation of the data that makes up the document. The class receives input from the CView class and passes display information to the CView class. Retrieving the document data from files is also done by this class.

The CView class displays the visual representation of the document for the user. It is responsible for passing input information to the CDocument class and receiving display information from the CDocument class.

It should be noted that only one document can be open at a time in an SDI application. A multiple document interface (MDI) application, on the other hand, allows multiple documents with multiple views of each document, and the frame window object hosts those views.

[Fig 4.2: Data flow of the Document/View Architecture. The application object (CWinApp) passes messages to the frame window and view; information flows both ways between the document object (CDocument) and the view objects.]

Fig 4.2 shows a simple data flow in the document/view architecture. The application object contains a message loop that retrieves the event-driven messages. The application object (CWinApp) receives all the event messages and then passes them to the view object. The view object requests data from the document object, and the document object responds by providing the data needed to render the output in the view object.

There are many advantages of using the document/view architecture. The data source is centralized, so the same data can be shown in multiple views, for example one in the form of a table and another in the form of a chart. Moreover, when the data is modified in any one of the views, the other views can easily be synchronized by calling the UpdateAllViews() function.

Another important feature of MFC is command routing [5]. The command routing mechanism enables a command message to be handled almost anywhere in the application.

4.2 Guide to create a simple windows program using MFC

4.2.1 Create a Single Document Interface (SDI) Application

The following steps show how to create a new SDI application. Start a new project by selecting File > New.

1.) In the Project Tab, select MFC AppWizard (exe).
2.) Type the project name and project location. Click OK.
3.) Select single document and check the Document/View Architecture Support. Click Next.
4.) Select None for no database support and then click Next.
5.) Select None for no compound document support and then click Next.
6.) Select the desired user interface features and then click Next.

7.) Click Next again and then click Finish. A workspace is created and we can now develop our application through this framework.

4.2.2 Add a menu entry to Menu

The following steps show how to add a menu entry to the Menu.

1. Select the Resource View tab in the workspace pane.
2. Select the project resources folder at the top of the tree.
3. Click the + of the Menu folder.
4. Double-click IDR_MAINFRAME, as shown in Figure 4.3.

Figure 4.3 The Insert Resource dialog

6. Click the last rectangular box (the red circle) and input Test, then press Enter.
7. A rectangular box appears below Test. Click to highlight it and input Draw Line, as shown in Figure 4.4.


Figure 4.4 Enter menu entry

8. Right-click Draw Line and select Properties in the pop-up menu.
9. Input all the parameters, as shown in Figure 4.5, and press Enter.

Figure 4.5 The Menu Item Properties dialog

10. The menu entry has been created, as shown in Figure 4.6.


Figure 4.6 The new menu entry

4.2.3 Adding function to the Menu

A Windows program is event-driven. When we select a menu item, a message for this event is generated and sent to the message queue to invoke an operation. The operation performed for the event depends on the code in the message handler. Thus, we have to add the necessary code to the handler.

1.) Select View > ClassWizard.
2.) Select ID_MENUDrawLine in the Object ID column and Command in the Messages column.
3.) Click Add Function to add the handler for the selected ID and click OK.
4.) Click the Edit Code button to add the required program code. See Figure 4.7.

Figure 4.7 MFC ClassWizard dialog for adding function

5.) Now you can add the code to the handler, OnMenuDrawLine( ), as shown in Figure 4.8.

Figure 4.8 Adding code to the handler

4.2.4 Drawing a line

In Windows programming, graphics are drawn through the device context (DC). In Visual C++, the MFC device context provides numerous drawing functions for drawing circles, squares, lines, curves, and so on. The operating system uses the device context to learn in which context a graphic is being drawn, how much of the area is visible, and where on the screen it is currently located. In MFC, the drawing functions are wrapped by the CDC class. To draw a line, we use the MoveTo() function to move to a starting point and then the LineTo() function to draw a line to the destination point, as in the following code.

Listing 4.1 Draw a line from the point (20, 20) to the point (120, 120)

    void CHelloView::OnPaint()
    {
        CPaintDC dc(this); // Device Context
        dc.MoveTo(20, 20);
        dc.LineTo(120, 120);
    }

4.3 GDI, Memory DC and Bitmap

GDI stands for "Graphics Device Interface" and DC for "Device Context". The designers of Windows decided that it would be nice to have a single way of drawing to all "things". GDI was therefore developed to provide a universal set of routines that can be used to draw onto a screen, printer, plotter or bitmap image in memory.

Associated with a Device Context are a number of tools that act on the associated drawing surface: pens, brushes, fonts, etc. A number of preset pens are provided, and more can be created as needed.

A Device Context is a handle to a drawing surface on some device. It can typically be obtained for display devices as well as for printers and plotters. The most commonly used are the window DC, a display DC that represents the area of a single window, and the memory DC, which represents a bitmap as a device.

A bitmap is the in-memory representation of a drawing surface. By selecting a bitmap into a memory DC, the DC represents that bitmap as a drawing surface, and all the normal GDI operations can be performed on the bitmap. GDI also has a number of functions that copy areas from the drawing surface of one DC to another, so bitmaps are a useful way to store images in memory that will later be copied to the display (or other devices). The bitmap and memory DC can be used to remove the flicker effect when updating the screen, based on the double-buffering technique.

A bitmap object is an instance of the CBitmap class. It is not exactly the traditional bitmap graphic (BMP). Instead, a CBitmap object is a GDI object: an array of bits in which one or more bits correspond to each display pixel. We can load a bitmap graphic from a file into a CBitmap object, or we can construct the bitmap data of the CBitmap object ourselves.

The following code creates a CBitmap object. The third statement defines the attributes of the object, such as the resolution and color depth. In this case, the attributes are the same as those of the screen device context dcscreen, with both width and height equal to 100.

Listing 4.2 Create a CBitmap object

    CClientDC dcscreen(this); // Device Context of the Client Window
    CBitmap bitmap;
    bitmap.CreateCompatibleBitmap(&dcscreen, 100, 100);

A memory DC is then created with the attributes of the screen DC. To direct the GDI output functions to the bitmap, the CBitmap object is selected into the memory DC. In the example below, the GDI output function is FillRect( ), which draws a solid blue rectangle.

Listing 4.3 Use of Memory DC

    CDC dcmem;
    // Create a Memory DC with attributes the same as the dcscreen
    dcmem.CreateCompatibleDC(&dcscreen);
    CBrush brush(RGB(0, 0, 255));
    CBitmap* poldbitmap = dcmem.SelectObject(&bitmap);
    dcmem.FillRect(CRect(0, 0, 100, 100), &brush);
    dcmem.SelectObject(poldbitmap);

With the use of CBitmap and a memory DC, an image can be pasted onto the screen in one operation instead of pixel by pixel.

Listing 4.5 Paste the image from memory DC to the screen DC

    dcscreen.BitBlt(0, 0, 100, 100, &dcmem, 0, 0, SRCCOPY);

Chapter 5 Methodology

5.1 Plotting the LPCC-based spectral envelope

5.1.1 Introduction

The LPCC-based spectral envelope was obtained by smoothing the LP-based spectral envelope by cepstral processing. The function for plotting the envelope was added under the Plot Spectral Envelope menu, and the envelope was plotted on the same screen as the LPC envelope.

5.1.2 Program Flowchart

[Figure: Flowchart of computation of the LPCC-based spectral envelope. Steps in order: start of calculation function; get the frame index; window the samples of the selected frame; compute the LPC coefficients; compute the Makhoul's a by using the LPC; compute the LPC gain; compute the cepstral coefficients of the Makhoul's a; append zero quefrency to the cepstral coefficients; compute the LPCC-based spectral envelope; end of calculation function.]

5.1.3 Getting the frame index

Before getting the frame index, it is necessary to know the total number of frames.

[Figure: Frame overlapping. The speech signal is divided into overlapping frames separated by the offset; the incomplete frame at the end of the signal is discarded.]

Assume the following parameters:

Overlapping = 50%
Number of samples = 95
windowsize = 20

Then offset = windowsize * overlapping = 20 * 50% = 10, and

Number_of_frames = floor(number_of_samples / offset - 1) = floor(95 / (20 * 50%) - 1) = floor(8.5) = 8
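The arithmetic of this worked example can be checked with a small sketch. The function names `frameOffset` and `numberOfFrames` are hypothetical; the expressions mirror the ones used in this section (offset rounded to the nearest integer, then the frame count truncated and incremented), and the input checks are assumptions added for the example.

```cpp
#include <cassert>

// Offset between frame starts, rounded to the nearest integer.
int frameOffset(int windowSize, double overlap) {
    assert(windowSize > 0 && overlap > 0.0 && overlap <= 1.0);
    return static_cast<int>(windowSize * overlap + 0.5);
}

// Number of complete frames; the incomplete frame at the end is discarded.
int numberOfFrames(int numSamples, int windowSize, int offset) {
    assert(offset > 0 && numSamples >= windowSize);
    const double frames = static_cast<double>(numSamples) / offset
                        - static_cast<double>(windowSize) / offset;
    return static_cast<int>(frames) + 1;  // truncation acts as floor here
}
```

With windowsize = 20 and 50% overlap the offset is 10, and 95 samples give (int)(9.5 - 2) + 1 = 8 frames, which agrees with the value obtained above.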


Listing 5.1.1 Getting the total number of frames (SpanaView.cpp)

    void CSpanaView::allocate(speech_parameter *sp)
    {
        ...
        sp->offset = (__int16)((sp->windowsize) * sp->window_overlap + 0.5);
        sp->number_of_frames = (__int16)((float)sp->number_of_samples / (float)sp->offset
                               - (float)(sp->windowsize) / sp->offset) + 1;
        ...
    }

When the user left-clicks on the waveform of the speech signal, a red vertical line appears at the point of the mouse click and a Fast Display dialog is shown. The frame index can be evaluated from the x-coordinate of the red vertical line.

[Figure: Finding the frame index. The client rectangle rect1 has its origin (0, 0) at the top left and width m_dfMaxX; the waveform is drawn with a horizontal margin of xoffs on each side, and the red vertical line is at point.x.]

$$ \text{Index} = \frac{\text{point.x} - \text{xoffs}}{\text{m\_dfMaxX} - 2 \cdot \text{xoffs}} \times \text{number of frames} $$

Listing: Finding the frame index (SpanaView.cpp)

    void CSpanaView::OnLButtonDown(UINT nFlags, CPoint point)
    {
        CRect rect1;
        this->GetClientRect(&rect1);
        m_dfMaxX = rect1.right;
        index = (short)((sp.number_of_frames) / (m_dfMaxX - 2 * xoffs) * (point.x - xoffs) + 1);
    }

5.1.4 Windowing the samples of the selected frame

The frame index is then used to get the windowed signal of the selected frame. Every frame of the speech signal is windowed as soon as the speech file is loaded. The windowed signal for the entire speech file can be referenced through the following pointer.

    float **w; // pointer to matrix containing windowed data
               // range: w[0 .. sp->number_of_frames-1][0 .. sp->windowsize-1]

    sp.w[0]      w[0][0]      w[0][1]      ...  w[0][windowsize-1]
    sp.w[1]      w[1][0]      w[1][1]      ...  w[1][windowsize-1]
    sp.w[2]      w[2][0]      w[2][1]      ...  w[2][windowsize-1]
    ...
    sp.w[num-1]  w[num-1][0]  w[num-1][1]  ...  w[num-1][windowsize-1]

where num = number_of_frames.

Figure: Structure of windowed data for the selected frame

After getting the frame index, the windowed frame signal can be referenced by the following code.

    sp.w[index-1]; // index = 1, 2, 3, ..., number_of_frames
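The mapping from the mouse x-coordinate to a 1-based frame index can be sketched as a stand-alone function. The name `frameIndexFromX` and its parameters are hypothetical, and the final clamping to the range 1 to numFrames is an assumption added here so that clicks in the margins still yield a valid index for `sp.w[index-1]`; the listing above does not show such a check.

```cpp
#include <cassert>

// x: mouse x-coordinate, clientWidth: width of the client rectangle,
// xOffset: horizontal margin on each side of the waveform.
int frameIndexFromX(int x, int clientWidth, int xOffset, int numFrames) {
    assert(numFrames >= 1);
    assert(clientWidth > 2 * xOffset);
    const double plotWidth = clientWidth - 2.0 * xOffset;
    // Same proportion as in the listing, with the multiplication done first.
    int index = static_cast<int>(numFrames * static_cast<double>(x - xOffset) / plotWidth + 1.0);
    if (index < 1) index = 1;                  // assumption: clamp left margin
    if (index > numFrames) index = numFrames;  // assumption: clamp right margin
    return index;
}
```

For a client width of 1000 pixels, a margin of 100 pixels and 8 frames, a click at x = 500 lies halfway across the plotted waveform and maps to frame 5.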

32 5.1.5 Computing the LPC coefficients lpc = [1 a(1) a(2) a(3) a(order)], where a is the LPC coefficients and porder is prediction order autocc, order calc_lpc(order,autocc,lpc,k) lpc, K Where autocc = autocorrelation of x (x = sp.w[index-1] ) order = prediction order K = reflection coefficients lpc = 1 a(1) a(2) a(3) a(porder) porder+1 Figure Structure of lpc Computing the Makhoul s a lpc cal_a_markhoul(porder,lpc,windowsize) Makhoul s Makhoul s a = a(1) a(2) a(porder) porder windowsize Figure Structure of Makhoul s a Listing Computing the Makhoul s a (SpanaView.cpp) float * CSpanaView::cal_a_Markhoul( int16 porder, float *lpc, int16 windowsize) float * a=new float[porder+windowsize]; 32

33 //append the lpc to a_makhoul first for(int i=0;i<porder;i++) a[i]=lpc[i+1]; //then append zeros to a_marhoul for(i=0;i<windowsize;i++) a[i+porder]=0; return a; Computing the LPC gain porder, x, framesize, a_makhoul LPCGain(x, a_makhoul,porder,framesize) gain Where a_makhoul = makhoul s a x = sp.w[index-1] Listing Computing the LPC gain (SpanaView.cpp) float CSpanaView::LPCGain(float *x, float *a_makhoul, int16 porder, int16 framesize) // R0=dot(x,x); float temp=0; float *R=new float[porder]; float R0=0; float energy; float gain; //cal the dot product of a for(int i=0;i<framesize;i++) R0+=x[i]*x[i]; // for j=1:porder, for (int j=1;j<=porder;j++) 33

            temp = 0;
            for (int m = 0; m < framesize-j; m++)
                temp = temp + x[m]*x[m+j];
            R[j-1] = temp;
        }
        temp = 0;
        for (int k = 0; k < porder; k++)
            temp = temp + a_makhoul[k]*R[k];
        energy = R0 + temp;
        gain = float(pow((double)energy, 0.5));
        delete[] R;   // R was allocated with new[]
        return gain;
    }

Computing the cepstral coefficients

    a_makhoul, porder  ->  lpc2cep(a_makhoul, porder)  ->  tempc

    tempc = | c(0) | c(1) | c(2) | ... | c(2*porder-1) |      length: 2*porder

Figure: Structure of cepstral coefficients, tempc

Listing: Computing the cepstral coefficients (SpanaView.cpp)

    float * CSpanaView::lpc2cep(float *a_makhoul, int16 porder)
    {
        // Convert to c(1) to c(porder)
        float temp = 0;

        int16 n, m;
        float *c = new float[2*porder];
        for (n = 1; n <= porder; n++)
        {
            temp = 0;
            for (m = 1; m <= (n-1); m++)
                temp = temp + m*c[m-1]*a_makhoul[n-m-1]/n;
            c[n-1] = a_makhoul[n-1] - temp;
        }
        // Convert to c(porder+1) to c(porder*2)
        for (n = porder+1; n <= 2*porder; n++)
        {
            temp = 0;
            for (m = 1; m <= (n-1); m++)
                temp = temp + m*c[m-1]*a_makhoul[n-m-1]/n;
            c[n-1] = -temp;
        }
        // Convert to cepstral coefficients of H(z)
        for (int i = 0; i < 2*porder; i++)
            c[i] = -1*c[i];
        return c;
    }

Appending zero quefrency

    c = | Log(gain) | tempc(0) ... tempc(2*porder-1) | 0 ... 0 |
          1           2*porder                         N-2*porder-1      total length: N

    where N = windowsize

Figure: Structure of c

Listing: Appending zero quefrency (SpanaView.cpp)

    void CSpanaView::PlotLPCSpectral()
    {
        tempc = lpc2cep(a, sp.order);
        // Append zero quefrency
        c[0] = (float)log(gain);
        for (i = 0; i < 2*sp.order; i++)
            c[i+1] = tempc[i];
        // append (N - 2*sp.order - 1) zeros to c
        for (i = 2*sp.order+1; i < sp.windowsize; i++)
            c[i] = 0;
    }

Computing the smooth spectrum from LP-derived cepstrum

    c  ->  FFT(c, N)  ->  Y  ->  Real(Y)  ->  c_fft  ->  exp(c_fft)  ->  lpcepspecenv_buffer

Y is the complex result of FFT(c, N), where N is the windowsize. FFT(c, N) and Real(Y) together form the function FFT_complex( ), which writes the real part of FFT(c, N) to c_fft.

    c_fft = | c_fft(0) | c_fft(1) | ... | c_fft(N/2) |      length: N/2+1

    lpcepspecenv_buffer = | e^{c_fft(0)} | e^{c_fft(1)} | ... | e^{c_fft(N/2)} |      length: N/2+1

Figure: Structure of the smooth spectrum

Listing: Computing the smooth spectrum (SpanaView.cpp)

    float * CSpanaView::lpc2cep(float *a_makhoul, int16 porder)
    {
        // Compute smooth spectrum from LP-derived cepstrum
        // lpcepspecenv = exp(real(fft(c,n)));
        float *c_fft = new float[sp.windowsize];
        float *lpcepspecenv = new float[sp.windowsize];
        FFT_complex(c, c_fft, sp.windowsize);
        for (i = 0; i <= sp.windowsize/2; i++)
            lpcepspecenv_buffer[i] = (float)exp(c_fft[i]);
    }
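The step exp(real(fft(c, N))) can be illustrated without the project's FFT_complex( ) routine. The sketch below is an assumption-laden stand-in: it uses a direct O(N^2) DFT instead of an FFT, and the name envelopeFromCepstrum is hypothetical. Because c is real, the real part of its DFT is a sum of cosines.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// envelope[k] = exp(Re(DFT(c))[k]) for k = 0..N/2, using a direct DFT.
std::vector<double> envelopeFromCepstrum(const std::vector<double>& c)
{
    const std::size_t N = c.size();
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> env(N / 2 + 1);
    for (std::size_t k = 0; k <= N / 2; ++k) {
        double re = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            re += c[n] * std::cos(twoPi * k * n / N);   // real part of the DFT
        env[k] = std::exp(re);
    }
    return env;
}
```

A cepstrum that is zero everywhere except c[0] = log(gain) gives a flat envelope equal to the gain at every frequency bin, which is a quick sanity check on the zero-quefrency term.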

5.2 Plotting the spectral envelope by FFT-based cepstral liftering

Introduction

The spectral envelope by FFT-based cepstral liftering is obtained by liftering the Short-Term Real Cepstrum (strc), taking the FFT of the liftered cepstrum, and exponentiating the real part of the result. As with the LPCC-based envelope, the function for plotting this envelope was added under the Plot Spectral Envelope menu. The envelope was plotted on the same screen as the LPC envelope.

Program flowchart

    1. Calculation function starts
    2. Get the frame index
    3. Use the frame index to get the windowed signal
    4. Compute the short-time real cepstrum
    5. Perform liftering
    6. Compute the spectral envelope
    7. Calculation function ends

Figure: Flowchart of plotting the spectral envelope by FFT-based cepstral liftering

5.2.2 Computing the frame index and then the windowed signals of a frame

The computation of the frame index and of the windowed signal for the selected frame was discussed earlier, under finding the frame index and windowing the samples of the selected frame respectively.

Computing the Short-Term Real Cepstrum (strc)

    sp.w[index-1], N  ->  FFT(sp.w, N)  ->  Y  ->  Abs(Y)  ->  sp.winbuffer1  ->  Log(sp.winbuffer1)  ->  x_fft  ->  IFFT  ->  strc

Figure: Flowchart of computing the short-time real cepstrum

Here Y is the complex result returned from FFT( ). The FFT( ) function used in the listing integrates both FFT(sp.w, N) and Abs(Y), i.e. it returns sp.winbuffer1 directly.

Listing: Computing the short-time real cepstrum (SpanaView.cpp)

    void CSpanaView::OnLButtonDown(UINT nFlags, CPoint point)
    {
        FFT(sp.w[index-1], sp.winbuffer1, sp.windowsize);
    }

    void CSpanaView::PlotLPCSpectral()
    {
        float *strc = new float[sp.windowsize];
        float *x_fft = new float[sp.windowsize];
        float *x_ifft = new float[sp.windowsize];
        // get log(abs(fft(x,n)))
        for (i = 0; i <= sp.windowsize/2; i++)
            x_fft[i] = (float)log(sp.winbuffer1[i]);
        // make x_fft[i] symmetrical for IFFT

        for (i = 1; i < sp.windowsize/2; i++)
            x_fft[sp.windowsize/2+i] = x_fft[sp.windowsize/2-i];
        // perform ifft(log(abs(fft(x,n))))
        IFFT(x_fft, x_ifft, sp.windowsize);
        strc = x_ifft;
    }

    Y             = | Re(Y(0)) | Im(Y(0)) | Re(Y(1)) | Im(Y(1)) | ... | Re(Y(N/2)) | Im(Y(N/2)) |      length: N+2
    sp.winbuffer1 = | |Y(0)| | |Y(1)| | ... | |Y(N/2)| |                                              length: N/2+1
    x_fft         = | Log(|Y(0)|) | Log(|Y(1)|) | ... | Log(|Y(N/2)|) |                               length: N/2+1
    strc          = | strc(0) | strc(1) | ... | strc(N-1) |                                           length: N

Figure: Structures of the Short-Term Real Cepstrum, strc
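The chain FFT, Abs, Log, IFFT shown above can be sketched as a self-contained function. This is an illustration only: it uses a direct O(N^2) DFT rather than the project's FFT( ) and IFFT( ) routines, the name realCepstrum is hypothetical, and the small floor applied before the logarithm is an assumption added to avoid log(0).

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Short-time real cepstrum: IDFT(log|DFT(x)|), direct O(N^2) form.
std::vector<double> realCepstrum(const std::vector<double>& x)
{
    const std::size_t N = x.size();
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> logMag(N);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> X(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n)
            X += x[n] * std::polar(1.0, -twoPi * k * n / N);
        logMag[k] = std::log(std::max(std::abs(X), 1e-12));   // assumed floor
    }
    std::vector<double> cep(N);
    for (std::size_t n = 0; n < N; ++n) {
        double acc = 0.0;
        for (std::size_t k = 0; k < N; ++k)
            acc += logMag[k] * std::cos(twoPi * k * n / N);   // logMag is even, so the IDFT is real
        cep[n] = acc / N;
    }
    return cep;
}
```

For a scaled unit impulse, x = {2, 0, 0, ...}, the magnitude spectrum is flat at 2, so the cepstrum is log(2) at quefrency zero and zero elsewhere. Computing the log-magnitude for all N bins is equivalent to the listing's approach of filling bins 0..N/2 and mirroring them.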

5.2.4 Perform liftering, cut-off time at porder

After liftering was performed, strc became:

    strc = | strc(0) ... strc(porder-1) | 0 ... 0 | strc(N-porder) ... strc(N-1) |
             porder                       N-2*porder  porder                          total length: N

Figure: Structure of strc after liftering

Listing: Perform liftering (SpanaView.cpp)

    void CSpanaView::PlotLPCSpectral()
    {
        // Perform liftering, cut-off time at porder
        for (i = sp.order; i < (sp.windowsize-sp.order); i++)
            strc[i] = 0;
    }

Computing spectral envelope based on strc

    strc, N  ->  FFT(strc, N)  ->  Y  ->  Real(Y)  ->  strc_fft  ->  exp(strc_fft)  ->  cepspecenv

Figure: Flowchart of computing spectral envelope based on strc
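Returning to the liftering step of Section 5.2.4, the operation keeps the low-quefrency samples at both ends of the symmetric cepstrum and zeroes everything in between. A minimal sketch follows, assuming a std::vector container and a hypothetical function name lifter; the guard for a cut-off that covers the whole buffer is an addition for illustration.

```cpp
#include <cstddef>
#include <vector>

// Keep the first `cutoff` and last `cutoff` cepstral samples, zero the rest.
void lifter(std::vector<double>& cep, std::size_t cutoff)
{
    if (2 * cutoff >= cep.size()) return;            // nothing to remove
    for (std::size_t i = cutoff; i < cep.size() - cutoff; ++i)
        cep[i] = 0.0;
}
```

With N = 16 and a cut-off of 3, samples 0..2 and 13..15 survive and samples 3..12 are set to zero, matching the porder, N-2*porder, porder layout of the figure.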

    Y          = | Re(Y(0)) | Im(Y(0)) | Re(Y(1)) | Im(Y(1)) | ... | Re(Y(N/2)) | Im(Y(N/2)) |      length: N+2
    strc_fft   = | Re(Y(0)) | Re(Y(1)) | ... | Re(Y(N/2)) |                                       length: N/2+1
    cepspecenv = | e^{Re(Y(0))} | e^{Re(Y(1))} | ... | e^{Re(Y(N/2))} |                           length: N/2+1

Figure: Structure of the spectral envelope, cepspecenv

Listing: Computing the spectral envelope, cepspecenv (SpanaView.cpp)

    void CSpanaView::PlotLPCSpectral()
    {
        // Compute spectral envelope based on strc
        // cepspecenv = exp(real(fft(strc,n)));
        // float *cepspecenv = new float[sp.windowsize];
        float *strc_fft = new float[sp.windowsize];
        FFT_complex(strc, strc_fft, sp.windowsize);
        for (i = 0; i <= sp.windowsize/2; i++)
            CepSpecEnv[i] = (float)exp(strc_fft[i]);
    }

FFT_complex( ) integrates both FFT( ) and Real( ), i.e. it returns the real part only.

5.2.6 Flow chart of the plotting function

    1. Start of plotting function
    2. Declare a Memory DC, a Screen DC and a CBitmap object
    3. Select the CBitmap into the Memory DC
    4. Declare a CRect object to be the Virtual Screen in the Memory DC
    5. Plot the speech signal in time domain in the upper part of the Virtual Screen
    6. Plot the x-axis, y-axis and other general information
    7. Plot the LPCC-based spectral envelope in the lower part of the Screen in the Memory DC
    8. Plot the spectral envelope by FFT-based cepstral liftering
    9. Plot the x-axis, y-axis and other general information
    10. End of plotting function

Figure: Flow chart of the plotting function for the spectral envelopes

5.2.7 The plotting function

Plotting of the spectral envelope by FFT-based cepstral liftering (variable name: CepSpecEnv) and of the LPCC-based spectral envelope (variable name: lpcepspecenv_buffer) was done by the PlotSpectralEnvelope( ) function.

Listing: Plotting the two spectral envelopes (SpanaView.cpp)

    void CSpanaView::PlotSpectralEnvelope()
    {
        // line 4 (lpcepspecenv_buffer) starts here
        CPen pen_lpcepspecenv(PS_SOLID, 1, RGB(0,0,255));
        poldpen = dc.SelectObject(&pen_lpcepspecenv);
        dc.MoveTo(int(xoffs), int((rect.bottom)+yoffs-((float)lpcepspecenv_buffer[0]-miny)*stepy));
        x_coor = 0;
        for (i = 0; i < number_of_values; x_coor += stepx, i++)
            dc.LineTo((int)x_coor+xoffs, int((rect.bottom)-((float)lpcepspecenv_buffer[i]-miny)*stepy));

        // line 5 (CepSpecEnv) starts here
        CPen pen_cepspecenv(PS_SOLID, 1, RGB(255,0,255));
        poldpen = dc.SelectObject(&pen_cepspecenv);
        dc.MoveTo(int(xoffs), int((rect.bottom)+yoffs-((float)CepSpecEnv[0]-miny)*stepy));
        x_coor = 0;
        for (i = 0; i < number_of_values; x_coor += stepx, i++)
            dc.LineTo((int)x_coor+xoffs, int((rect.bottom)-((float)CepSpecEnv[i]-miny)*stepy));
        // line 5 ends here
    }

Starting to plot the two spectral envelopes

When the user left-clicks on the waveform of the speech signal, the FastDisplay dialog pops up, as shown in Figure 5.2.7.


Figure 5.2.7: The Fast Display dialog

The FastDisplay dialog gives the user a fast display of the speech signal in time domain together with its corresponding frequency spectrum and spectral envelope. In this version of Spana, the spectral envelope by FFT-based cepstral liftering and the LPCC-based spectral envelope could also be plotted on the Fast Display dialog. For simplicity, the data computed above were reused for plotting the two envelopes. In other words, plotting the two envelopes on the Fast Display dialog and plotting them in the main window use the same data source. After the two envelopes were calculated, their data were assigned to the variables in the class of the FastDisplay dialog. The following code assigned the data in CepSpecEnv and lpcepspecenv_buffer to CepSpecEnv and lpcepspecenv respectively, which belong to the class of the FastDisplay dialog.

Listing: Assigning data to the Fast Display dialog (SpanaView.cpp)

    void CSpanaView::OnLButtonDown(UINT nFlags, CPoint point)
    {
        m_fastdisplaydlg.cepspecenv = new float[sp.windowsize];

        m_fastdisplaydlg.cepspecenv = CepSpecEnv;
        m_fastdisplaydlg.lpcepspecenv = new float[sp.windowsize];
        m_fastdisplaydlg.lpcepspecenv = lpcepspecenv_buffer;
    }

Listing: Plotting of the two spectral envelopes on the Fast Display dialog (SpEnGraphDlg.cpp)

    void CSpEnGraphDlg::OnPaint()
    {
        // plot CepSpecEnv
        CPen pen_cepspecenv(PS_SOLID, 1, RGB(0,255,0));
        poldpen = dc.SelectObject(&pen_cepspecenv);
        dc.MoveTo(int(xoffs), int((rect.bottom)+yoffs-((float)cepspecenv[0]-miny)*stepy));
        x_coor = 0;
        for (i = 0; i < number_of_values; x_coor += stepx, i++)
            dc.LineTo((int)x_coor+xoffs, int((rect.bottom)-((float)cepspecenv[i]-miny)*stepy));
        // CepSpecEnv ends here

        // plot lpcepspecenv
        CPen pen_lpcepspecenv(PS_SOLID, 1, RGB(0,0,0));
        poldpen = dc.SelectObject(&pen_lpcepspecenv);
        dc.MoveTo(int(xoffs), int((rect.bottom)+yoffs-((float)lpcepspecenv[0]-miny)*stepy));
        x_coor = 0;
        for (i = 0; i < number_of_values; x_coor += stepx, i++)
            dc.LineTo((int)x_coor+xoffs, int((rect.bottom)-((float)lpcepspecenv[i]-miny)*stepy));
        // lpcepspecenv ends here
    }

5.3 Plotting the Pitch Contour

Introduction

Every voiced frame of a speech signal has a pitch period. The line joining the pitch periods of all frames of an utterance is the pitch contour. In this project, I used the AMDF (Average Magnitude Difference Function) to compute the pitch period, together with a probabilistic approach to correct errors made during the computation of the pitch period.

Program Flowchart

    1. Start of Pitch Detection
    2. Compute the mean and standard deviation
    3. For each frame:
       a. Compute all candidate pitch periods of the selected frame
       b. Compute the zero crossing rate
       c. If the frame is voiced, filter out the markers with the constraints and weight the markers with the normal distribution to obtain the pitch period marker
       d. Store the pitch period marker in m_pitch[f]
    4. If the end of the file has not been reached, continue with the next frame
    5. End of Pitch Detection

Figure: Flowchart of the Pitch Detection algorithm

5.3.3 Computation of the Mean and Standard Deviation

The mean and standard deviation of the pitch period estimates for the whole speech were computed by the FindMean_Std( ) function.

    1. Start of FindMean_Std( )
    2. Low pass filter
    3. For each frame:
       a. HANNING windowing
       b. Compute the zero crossing rate
       c. If the frame is voiced, find the pitch period estimate
       d. Store the pitch period
    4. If the end of the file has not been reached, continue with the next frame
    5. Compute the mean of the array
    6. Compute the standard deviation
    7. End of FindMean_Std( )

Figure: The program flow of FindMean_Std( )


5.3.3.1 Lowpass filtering the entire speech signal

In order to eliminate the effects of intensity variation and background noise, the speech samples were passed through a lowpass filter with 3 dB attenuation at 600 Hz and 40 dB attenuation at 900 Hz. The required filter was designed with the help of FDATool (Filter Design & Analysis Tool) in MATLAB.

Design Procedure

A. Designing the filter

MATLAB was started and fdatool was typed at the command line. The FDATool window then appeared, as shown in Figure 1.3. All the filter parameters were entered and the Filter Design button was clicked to start the design process.

Figure: The FDATool


When the filter design had finished, the frequency response of the filter could be inspected, as shown in the figure below.

Figure: Frequency response of the required filter

B. Obtaining the filter coefficients

From the File menu, Export was chosen, then Export To Text-file was selected and OK was clicked. The coefficients of the filter were thereby exported to a text file.

Figure: Exporting the filter coefficients to a text file

The text file was then opened and all the coefficients were copied into the lowpassfilter( ) function. The lowpassfilter( ) function convolved the input speech samples with the filter coefficients.

Listing: Lowpass filtering the speech samples (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        // filter the speech signal
        lowpassfilter(sp.spcdata, spcdata_filtered, sp.number_of_samples);
    }

    void CSpanaView::lowpassfilter(int16 *spcdata, int16 *spcdata_filtered, long num_spc_samples)
    {
        // filter coefficients
        const double B[59] = { ... };
        // convolution
        for (int n = 0; n < num_spc_samples; n++)
        {
            spcdata_filtered[n] = 0;
            for (int m = 0; m < 59; m++)
            {
                if ((n-m) >= 0)
                    spcdata_filtered[n] = int16(B[m]*spcdata[n-m]+spcdata_filtered[n]);
            }
        }
    }
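The listing above casts the running sum back to a 16-bit integer after every filter tap, so the fractional part is discarded up to 59 times per output sample. One possible alternative, shown here only as a sketch, accumulates in double and rounds once per sample. The name firFilter, the std::vector interface and the saturation to the 16-bit range are assumptions for illustration; the coefficient values are whatever FDATool exported.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// y[n] = sum_m b[m] * x[n-m], with samples before the start taken as zero.
// The sum is accumulated in double and rounded once per output sample.
std::vector<std::int16_t> firFilter(const std::vector<std::int16_t>& x,
                                    const std::vector<double>& b)
{
    std::vector<std::int16_t> y(x.size());
    for (std::size_t n = 0; n < x.size(); ++n) {
        double acc = 0.0;
        for (std::size_t m = 0; m < b.size() && m <= n; ++m)
            acc += b[m] * x[n - m];
        if (acc > 32767.0) acc = 32767.0;      // saturate instead of wrapping
        if (acc < -32768.0) acc = -32768.0;
        y[n] = static_cast<std::int16_t>(std::lround(acc));
    }
    return y;
}
```

For example, a two-tap averaging filter b = {0.5, 0.5} applied to {100, 200, 300, 400} yields {50, 150, 250, 350}.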

HANNING windowing the speech samples

To increase the accuracy of the AMDF, each frame of samples was preprocessed with a HANNING window. The windowing was performed by calling the windowing function directly and passing HANNING as a parameter. The computation steps of the windowing were discussed in the Theory chapter.

Listing: HANNING windowing a frame of samples (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        windowing(x, framedata, win_size, HANNING, sp.norm_factor, sp.prem_factor);
    }

Computing the zero crossing rate

The zero crossing rate was computed to determine whether a frame of samples was voiced or unvoiced. Voiced sounds are dominated by lower frequencies than unvoiced sounds, so a voiced frame crosses zero less often.

Listing: Computing the zero crossing rate (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        for (int f = 0; f < num_frames; f++)
        {
            zerox = 0;
            for (int m = 1; m < m_win_size; m++)

                zerox += abs(sgn(framedata[m])-sgn(framedata[m-1]))/2;
            zerox /= m_win_size;
        }
    }

Computing the pitch period estimates using AMDF

$$AMDF_n(j) = \frac{1}{N}\sum_{i=1}^{N} \left| x_n(i) - x_n(i+j) \right|, \qquad 1 \le j \le MAXLAG$$

If the frame of samples was voiced, the AMDF was computed. For each frame, the AMDF was evaluated for every lag j, and its magnitude was stored in an array, Delta[ ], which was then searched for the pitch period estimate. The pitch period estimate was the lag at which the AMDF reached its global minimum in the selected frame. The distribution of the pitch period estimates was approximated by a normal distribution. If, on the other hand, the frame was unvoiced, its pitch period was set to zero, and it was not counted in the distribution of pitch period estimates.

Listing: Setting the pitch period to zero for an unvoiced frame and computation of AMDF (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        // if unvoiced, set the pitch period to zero
        if ((zerox > 0.3 && wh.bps == 16) || zerox == 0)
            m_pitch[f] = 0;
        else if ((zerox > 0.6 && wh.bps == 8) || zerox == 0)
            m_pitch[f] = 0;
        else
        {

            // AMDF
            for (int i = 1; i <= m_win_size; i++)
            {
                N = 0;
                Delta[i-1] = 0;
                for (int j = 0; j < m_win_size; j++)
                {
                    Delta[i-1] += (float)fabs(framedata[m_win_size+j-i]-framedata[m_win_size+j]);
                    N++;
                }
                Delta[i-1] /= N;
            }
        }
    }

Mean and standard deviation of the pitch period estimates

After all the pitch period estimates for the whole speech had been evaluated, their mean and standard deviation were computed.

Listing: Mean of the pitch period estimates (SpanaView.cpp)

    float CSpanaView::average(int num_frame, int *global_min_location)
    {
        int sum = 0;      // sum of all the lags
        float mean = 0;   // mean of the speech
        for (int i = 0; i < num_frame; i++)
            sum += global_min_location[i];
        mean = sum/(float)num_frame;
        return mean;
    }

Listing: Standard deviation of the pitch period estimates (SpanaView.cpp)

    float CSpanaView::STD(float mean, int *global_min_location, int num_frame)
    {
        float std = 0;   // standard deviation
        float var = 0;   // variance
        for (int i = 0; i < num_frame; i++)

            var += (global_min_location[i]-mean)*(global_min_location[i]-mean)/(num_frame);
        std = (float)pow(var, 0.5);
        return std;
    }

Computing all candidate pitch periods for the selected frame

The candidate pitch periods of a frame are the lags at which the AMDF has a local minimum in that frame. The search for the local minima was carried out by the Findlocal_min( ) function.

    Delta[ ]  ->  Findlocal_min( )  ->  num_min             (number of minima)
                                        local_min_location  (locations of minima)
                                        local_min           (values of minima)

Figure: Input and output of the Findlocal_min( ) function
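The AMDF and the local-minimum search that yields the candidate pitch periods can be sketched together as follows. This is a simplified illustration, not the SPANA implementation: the names amdf and localMinLags are hypothetical, the frame is indexed from its first sample rather than from m_win_size as in the listing, and interior minima only are reported.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// AMDF for lags 1..maxLag over the first N samples of x.
// Requires x.size() >= N + maxLag. Element j-1 of the result holds lag j.
std::vector<double> amdf(const std::vector<double>& x, std::size_t N, std::size_t maxLag)
{
    std::vector<double> d(maxLag, 0.0);
    for (std::size_t j = 1; j <= maxLag; ++j) {
        double sum = 0.0;
        for (std::size_t i = 0; i < N; ++i)
            sum += std::fabs(x[i] - x[i + j]);
        d[j - 1] = sum / N;
    }
    return d;
}

// Lags whose AMDF value is strictly below both neighbours (candidate pitch periods).
std::vector<std::size_t> localMinLags(const std::vector<double>& d)
{
    std::vector<std::size_t> lags;
    for (std::size_t k = 1; k + 1 < d.size(); ++k)
        if (d[k] < d[k - 1] && d[k] < d[k + 1])
            lags.push_back(k + 1);               // convert array index to lag
    return lags;
}
```

For a signal that repeats exactly every 8 samples, the AMDF is zero at lag 8 and at its multiples, so with a maximum lag of 20 the candidates are lags 8 and 16. This also shows why the global minimum alone can be ambiguous: a multiple of the true pitch period scores just as well.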

Program flow of Findlocal_min( ):

    1. Start of Findlocal_min( ); set i = 1 and counter = 0
    2. If Delta[i] is smaller than both Delta[i-1] and Delta[i+1] (for the last sample, smaller than Delta[i-1] only), store Delta[i] into local_min and increment counter
    3. Increment i and repeat step 2 until the end of Delta[ ]
    4. Assign counter to num_min
    5. End of Findlocal_min( )

Figure: Program flow of computing the candidate pitch periods

Listing: Finding all the candidate pitch periods in a frame (SpanaView.cpp)

    int *CSpanaView::Findlocal_min(float *Delta, int MaxLag, int *counter, float *MinData)
    {
        for (int j = 1; j < MaxLag; j++)
        {
            if (j == MaxLag-1)   // the last sample
            {
                if (Delta[j] < Delta[j-1])
                {
                    *counter += 1;
                    pdata[*counter-1] = j+1;
                    MinData[*counter-1] = Delta[j];
                }
            }

            else if ((Delta[j] < Delta[j-1]) && (Delta[j+1] > Delta[j]))
            {
                tempDelta = Delta[j];
                tempIndex = j;
                *counter += 1;
                pdata[*counter-1] = j+1;
                MinData[*counter-1] = Delta[j];
            }
        }
        return pdata;
    }

Computing the zero crossing rate

Again, the zero crossing rate had to be computed in order to determine whether the frame was voiced or unvoiced. The computation of the zero crossing rate was discussed earlier in Section 5.3.3; please refer to that discussion for details.

Filtering out the markers with the constraints

In most AMDF-based PDAs (Pitch Detection Algorithms), the lag at which the difference function reaches its global minimum is chosen as the pitch period estimate for the frame. In this AMDF PDA, not only was the lag of the global minimum computed, but a set of candidates for the pitch period of the frame was also selected (see the Theory chapter). To become a marker, a candidate pitch period had to satisfy the AMDF pattern constraints stated in the Theory chapter. The computation of the markers was implemented by the FindMarker( ) function.

Listing: Finding the marker_location (lag) and marker_height (magnitude of AMDF for the lag) (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        FindMarker(marker_height, marker_location, m_win_size, Delta, num_marker);
    }

Program flow of FindMarker( ):

    1. Start of FindMarker( )
    2. Find all the candidate pitch periods
    3. Find the constraints for each candidate
    4. If the constraints are satisfied, store the candidate as a marker
    5. If any candidates remain, repeat from step 4 with the next candidate
    6. End of FindMarker( )

Figure: Flowchart of filtering out the markers

Finding the constraints

a. global_max

The global_max was found by the Findglobal_max( ) function.

    Delta[ ]  ->  Findglobal_max( )  ->  global_max, global_max_location

Program flow of Findglobal_max( ):

    1. Start of Findglobal_max( ); set i = 0
    2. Compare the value held in buffer with Delta[i+1] and store the larger of the two in buffer
    3. Increment i and repeat step 2 until the end of Delta[ ]
    4. Set global_max = buffer
    5. End of Findglobal_max( )

Figure: Program flow of finding the global_max

Listing: Finding the global_max (SpanaView.cpp)

    int CSpanaView::Findglobal_max(float *Delta, int MaxLag, float *MaxDelta)
    {
        for (int j = 0; j < MaxLag-1; j++)
        {
            if (Delta[tempIndex] > Delta[j+1])
                tempDelta = Delta[tempIndex];
            else
            {
                tempDelta = Delta[j+1];
                tempIndex = j+1;
            }
        }
        *MaxDelta = tempDelta;
        return tempIndex+1;
    }

b. height_i

$height_i = \min(left\_height_i, right\_height_i)$, which was computed by the FindHeight( ) function.

    Local_max, num_local_max  ->  FindHeight( )  ->  height_i (array of height_i)

Figure: Input and output of the FindHeight( ) function

Note that Local_max was found before running Findlocal_min( ).

Listing: Finding height_i (SpanaView.cpp)

    void CSpanaView::FindHeight(float *local_max, int num_min, float *height_i, float *local_min)
    {
        for (int i = 0; i < num_min; i++)
        {

            if (local_max[i] < local_max[i+1])
                height_i[i] = local_max[i]-local_min[i];
            else
                height_i[i] = local_max[i+1]-local_min[i];
        }
    }

c. peak_ratio

The peak_ratio was computed by the Findpeak_ratio( ) function using the formula peak_ratio = local_maximum / global_max.

    Local_max, global_max  ->  Findpeak_ratio( )  ->  peak_ratio

Figure: Input and output of the Findpeak_ratio( ) function

Listing: Finding peak_ratio (SpanaView.cpp)

    void CSpanaView::Findpeak_ratio(int num_min, float *local_max, float *global_max, float *peak_ratio)
    {
        for (int i = 0; i < num_min; i++)
            peak_ratio[i] = local_max[i]/(*global_max);
    }

d. lobe_width_i

The FindLobe_width( ) function found the lobe_width using the formula lobe_width = distance between the right and left local maxima.

    local_max_location  ->  FindLobe_width( )  ->  lobe_width_i

Figure: Input and output of the FindLobe_width( ) function

Listing: Finding lobe_width_i

    void CSpanaView::FindLobe_width(int *local_max_location, int num_min, int *lobe_width_i)
    {
        for (int i = 0; i < num_min; i++)
            lobe_width_i[i] = local_max_location[i+1]-local_max_location[i];
    }

Listing: Finding the four constraints

    void CSpanaView::FindMarker(float *marker_height, int *marker_location, int m_win_size, float *Delta, int *num_marker)
    {
        // Find the min(left_height, right_height)
        FindHeight(local_max, *num_min, height_i, local_min);
        // Find the difference of heights between two consecutive maxima
        FindDiff_i(local_max, *num_min, diff_i);
        // Find the lobe width between two consecutive maxima
        FindLobe_width(local_max_location, *num_min, lobe_width);
        // Get the peak ratio
        Findpeak_ratio(*num_min, local_max, global_max, peak_ratio);
    }

To be a marker, the i-th candidate needed to satisfy:

    1. $peak\_ratio_i \ge 0.7$
    2. $height_i \ge 0.3 \times global\_max$
    3. $diff_i \le 0.1 \times global\_max$
    4. $lobe\_width_i \le 100$ lags

Listing: Filtering the candidates with the constraints (SpanaView.cpp)

    void CSpanaView::FindMarker(float *marker_height, int *marker_location, int m_win_size, float *Delta, int *num_marker)
    {
        if ((peak_ratio[i] >= 0.7) && (height_i[i] >= 0.3*(*global_max)) && (diff_i[i] <= 0.1*(*global_max)) && (lobe_width[i] <= 100))
        {
            num += 1;
            marker_location[num-1] = local_min_location[i];
            marker_height[num-1] = height_i[i];
        }
    }

Weighting the markers with the normal distribution

The probability density function of the normal distribution with mean µ and standard deviation σ is an example of a Gaussian function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Figure: The graph of the normal distribution

After all the markers of a frame had been computed, the next step was to weight the markers with a normal distribution. Figure 1.8 showed the markers of a frame.


Figure 5.3.15: AMDF and markers for a voiced frame (AMDF plotted against lag)

Each marker was substituted into the Gaussian function in order to weight it with the normal distribution. The marker with the greatest height after weighting was regarded as the pitch period of the frame.

Figure: (A) Distribution approximation of the initial pitch period estimates, f(x) against lag. (B) AMDF for a voiced frame against lag. The dashed line shows the normal distribution approximation.

Without weighting, marker 2 would have been selected as the pitch period of the frame. After the markers were weighted with the normal distribution, however, marker 1 became the better candidate for the pitch period of the frame.

Listing: Weighting the markers with the normal distribution of the initial pitch period estimates (SpanaView.cpp)

    void CSpanaView::pitch_detection(int *m_pitch_array_size)
    {
        markers_weighted[i] = weight(marker_location[i], *std, *mean);
        int temp_pitch_index = 0;
        for (i = 0; i < *num_marker-1; i++)
        {

            // the marker with the highest weighted height is selected
            // as the pitch period of the frame
            if (markers_weighted[temp_pitch_index] > markers_weighted[i+1])
                temp_pitch_index = temp_pitch_index;
            else
                temp_pitch_index = i+1;
        }
        m_pitch[f] = marker_location[temp_pitch_index];
    }

    // find the probability of the normal distribution of a given x
    // (PI denotes the constant pi)
    float CSpanaView::weight(int x, float std, float mean)
    {
        float expvalue = 0;
        expvalue = (float)((1/pow(2*PI*pow(std,2),0.5))*exp(-1*(x-mean)*(x-mean)/(2*std*std)));
        return expvalue;
    }
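The selection step can be summarised in a few lines. The sketch below follows the listing in weighting each marker by the Gaussian density at its lag and choosing the largest weight; the names gaussianPdf and selectMarker and the std::vector interface are assumptions for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normal probability density with the given mean and standard deviation.
double gaussianPdf(double x, double mean, double stddev)
{
    const double pi = std::acos(-1.0);
    const double z = (x - mean) / stddev;
    return std::exp(-0.5 * z * z) / (stddev * std::sqrt(2.0 * pi));
}

// Index of the marker whose lag has the highest Gaussian weight, or -1 if there are no markers.
int selectMarker(const std::vector<int>& markerLags, double mean, double stddev)
{
    int best = -1;
    double bestWeight = -1.0;
    for (std::size_t i = 0; i < markerLags.size(); ++i) {
        const double w = gaussianPdf(markerLags[i], mean, stddev);
        if (w > bestWeight) {
            bestWeight = w;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

If the initial estimates have a mean of 42 lags and a standard deviation of 5, then of two markers at lags 40 and 80 the first is chosen, because lag 80 (a pitch-doubling error) lies far out in the tail of the distribution.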

5.4 Adding a zoom function to view the speech signal in time domain

Introduction

The zoom function magnifies the speech signal in time domain, and the corresponding spectral envelopes can also be seen in the zooming window. The zoom function was added under the Plot Zoom menu.

Program flowchart

(A) OnZoom( ) handler:

    1. Start of OnZoom( ) handler
    2. Create and show the Zoom Scale dialog
    3. Set the Zoom Indicator to TRUE
    4. End of OnZoom( ) handler

(B) Computation of the frequency spectrum and the three spectral envelopes:

    1. Start of computation function
    2. Compute the LPC and autocorrelation
    3. Compute the frequency spectrum
    4. Compute the LPCC-based spectral envelope
    5. Compute the spectral envelope by FFT-based cepstral liftering
    6. Convert from linear to dB
    7. End of computation function

Figure: (A) Flowchart of the OnZoom( ) handler. (B) Flowchart of the calculation of the frequency spectrum and spectral envelopes.

    1. Start of OnMouseMove( ) handler
    2. If the Zoom Indicator is not TRUE, go to step 10
    3. If the mouse's coordinate is not within the boundary, go to step 10
    4. Get the zoom factor to compute the new frame size
    5. Compute the start play sample index
    6. Windowing
    7. Append zeros for FFT
    8. Start the computation of the frequency spectrum and the three spectral envelopes
    9. Start of the plotting function
    10. End of OnMouseMove( )

Figure: Flowchart of the OnMouseMove( ) handler

5.4.3 Creating and showing the Zoom Scale dialog

Choosing Plot Zoom in the menu bar invoked the handler OnZoom( ). The OnZoom( ) function created and showed the Zoom Scale dialog and set the Zoom Indicator to TRUE, which served as an indicator for the mouse-move handler, OnMouseMove( ).

Figure: Zooming the speech signal (showing the Zoom Scale dialog and the Zoom window)

Listing: Creating and showing the Zoom Scale dialog (SpanaView.cpp)

    void CSpanaView::OnZoom()
    {
        m_FastZoomDlg.Create(IDD_DISPLAY_ZOOM, this);
        m_FastZoomDlg.ShowWindow(SW_SHOW);
        m_zoomindicator = true;
        m_FastZoomDlg.setindicator(m_zoomindicator);
        m_FastZoomDlg.Invalidate();
    }

5.4.3 The OnMouseMove( ) handler

When the mouse pointer moved across the document, the OnMouseMove( ) handler was invoked, which in turn ran the PlotZoom( ) function.

Listing: PlotZoom( ) ran when the mouse pointer moved (SpanaView.cpp)

    void CSpanaView::OnMouseMove(UINT nFlags, CPoint point)
    {
        PlotZoom(point);
    }

Getting the zoom factor to compute the new frame size

When the PlotZoom( ) function ran, its first step was to get the zoom factor. If the Zoom Indicator was TRUE and the x-coordinate of the mouse pointer was within the painting area, the value returned from the slider in the Zoom Scale dialog was assigned to the zoom factor.

Figure: The Zoom Scale dialog (the slider scales the zoom factor)

Listing: Getting the zoom factor (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        // the Zoom Indicator
        m_zoomindicator = m_FastZoomDlg.GetZoomIndicator();
        if (m_zoomindicator == true)
        {
            // check if the mouse's coordinate is out of the window
            if ((rect.right > point.x) && (point.x > xoffs))
            {
                SliderIndicator = m_FastZoomDlg.GetSliderIndicator();
                if (SliderIndicator == FALSE)
                    factor = 1;
                else
                    factor = m_FastZoomDlg.getfactorvalue();
                // find the new window size
                Zoomwindowsize = sp.windowsize;
                Zoomwindowsize = int(Zoomwindowsize*factor);
            }
        }
    }


5.4.5 Calculating the Start Play Sample index

The plotting area starts at x = xoffs and ends at x = m_dfmaxx - xoffs, with the origin (0, 0) at the top-left corner of the window and point.x the x-coordinate of the mouse pointer.

$$\text{Start Play Sample Index} = \frac{point.x - xoffs}{m\_dfmaxx - 2 \times xoffs} \times \text{Num of samples}$$

Figure 5.4.5: Calculating the Start Play Sample Index

Listing: Computing the Start Play Sample Number (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        // Calculate the Start Play Sample Number
        m_bplayindex = (unsigned long)((sp.number_of_samples)/(m_dfmaxx-2*xoffs)*(point.x-xoffs)+0.5);
    }

5.4.6 Windowing the frame samples

    signal_timedomain (Zoomwindowsize samples)  ->  windowing  ->  framedata (sp.windowsize samples: Zoomwindowsize windowed samples followed by unknown values)

    where signal_timedomain = speech signal
          framedata         = the speech signal after windowing

Figure: Windowing the speech samples

Listing: HAMMING windowing the speech samples (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        float *framedata = new float[sp.windowsize];
        signal_timedomain = (int16 *)sp.spcdata+(long)m_bplayindex;
        windowing(signal_timedomain, framedata, Zoomwindowsize, HAMMING, sp.norm_factor, sp.prem_factor);
    }

Appending zeros for FFT

To fit framedata into the FFT( ) function, the length of framedata had to be a power of 2. sp.windowsize was a power of 2, whereas Zoomwindowsize was not, so framedata was given a length equal to sp.windowsize. This, however, introduced unknown values into framedata, because the memory locations beyond Zoomwindowsize had never been assigned.

As the figure below shows, the data beyond Zoomwindowsize were unknown and would therefore have corrupted the spectrum. To remove them, zeros were written to the memory locations beyond Zoomwindowsize.

    before:  framedata = | Zoomwindowsize windowed samples | unknown |      length: sp.windowsize
    after:   framedata = | Zoomwindowsize windowed samples | all zeros |    length: sp.windowsize

Figure: Appending zeros to the memory locations beyond Zoomwindowsize

Listing: Appending zeros to fit the FFT( ) (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        // append zeros for fft
        // since there are Zoomwindowsize data, we need to append
        // (sp.windowsize-Zoomwindowsize) zeros to framedata
        for (int i = 0; i < sp.windowsize-Zoomwindowsize; i++)
            framedata[i+Zoomwindowsize] = 0;
    }

Computation of frequency spectrum and the spectral envelopes

Computing the frequency spectrum

The frequency spectrum was obtained by transforming the windowed speech samples into the frequency domain.
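For the zero-appending step described above, an alternative is to start from a buffer that is already filled with zeros, so that no location is ever left unassigned. The following is only a sketch under that assumption; the name zeroPad and the std::vector interface are illustrative and not part of SPANA.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Copy the first zoomSize samples of frame and pad with zeros up to fftSize.
std::vector<float> zeroPad(const std::vector<float>& frame,
                           std::size_t zoomSize, std::size_t fftSize)
{
    std::vector<float> out(fftSize, 0.0f);                  // every slot starts at zero
    const std::size_t n = std::min(std::min(zoomSize, fftSize), frame.size());
    std::copy(frame.begin(), frame.begin() + static_cast<std::ptrdiff_t>(n), out.begin());
    return out;
}
```

With a frame of five samples, a zoom size of 3 and an FFT size of 8, the result holds the first three samples followed by five zeros.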

    framedata  ->  FFT  ->  sp.winbuffer1

Figure: Transforming the speech samples into the frequency domain

Listing: Transforming the windowed speech samples into the frequency domain (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        FFT(framedata, sp.winbuffer1, sp.windowsize);
    }

Computing the LPC envelope

Before the LPC envelope could be computed, the autocorrelation of the windowed signal had to be computed. The second step was to use the result of the autocorrelation, tempautocc, to calculate the LPC coefficients, templpc. Finally, templpc was passed to spectral_envelope( ) to compute the LPC envelope, which was stored in sp.winbuffer2.

    framedata, sp.order  ->  autocorrelation  ->  tempautocc  ->  calc_lpc( )  ->  templpc  ->  spectral_envelope( )  ->  sp.winbuffer2

Figure: Flowchart of computing the LPC envelope

Listing: Computation of the LPC envelope (SpanaView.cpp)

    void CSpanaView::PlotZoom(CPoint point)
    {
        td_autoc(Zoomwindowsize, sp.order, framedata, tempautocc);
        calc_lpc(sp.order, tempautocc, templpc, sp.k[index-1]);
        spectral_envelope(sp.order, templpc, sp.windowsize, sp.winbuffer2, gain, sp.dspflag);
    }

Listing: Computation steps of autocorrelation (SpanaView.cpp)

    void CSpanaView::td_autoc(int16 win_size, int16 order, float *indata, float *autocc)
    {
        for (k = 0; k <= order; k++)
        {
            sum = 0.0;
            for (m = 0; m < win_size-k; m++)
                sum += indata[m] * indata[m+k];
            autocc[k] = sum;
        }
    }

Computing the LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering

Both envelopes were computed by the function PlotLPCSpectralZoom( ). Since the computation steps in PlotLPCSpectralZoom( ) were the same as those in PlotLPCSpectral( ), the implementation of PlotLPCSpectralZoom( ) is not discussed here. Please refer to Sections 5.1 and 5.2 for the details.

After the spectral envelopes had been computed, plotting could start. Before plotting, however, the data of the frequency spectrum, the LPC envelope, the LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering had to be converted from linear scale to dB.
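The linear-to-dB conversion can be sketched as below. The body of the project's linear_to_log10( ) is not shown in this report, so this is an assumption-based illustration: it applies 10*log10(value/ref) as indicated in the conversion figure, and the name linearToDb and the small floor guarding against log10(0) are hypothetical additions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place 10*log10(value/ref) for the first n entries of buf.
void linearToDb(std::vector<double>& buf, std::size_t n, double ref)
{
    const double floorVal = 1e-12;              // assumed floor to avoid log10(0)
    for (std::size_t i = 0; i < n && i < buf.size(); ++i)
        buf[i] = 10.0 * std::log10(std::max(buf[i] / ref, floorVal));
}
```

With ref = 1.0, the values 1, 10 and 100 map to 0 dB, 10 dB and 20 dB; entries beyond the first n are left untouched.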

Figure: Converting data from linear to dB (sp.winbuffer1, sp.winbuffer2, lpcepspecenv_zoom, CepSpecEnv_Zoom -> 10*log( ) -> sp.winbuffer1, sp.winbuffer2, lpcepspecenv_zoom, CepSpecEnv_Zoom)

Where
sp.winbuffer1 = frequency spectrum
sp.winbuffer2 = LPC envelope
lpcepspecenv_zoom = LPCC-based spectral envelope
CepSpecEnv_Zoom = spectral envelope by FFT-based cepstral liftering

Listing: Convert the linear data to dB (SpanaView.cpp)

void CSpanaView::PlotZoom(CPoint point)
{
    // ...
    PlotLPCSpectralZoom(Zoomwindowsize,tempLPC,frameData);
    linear_to_log10(sp.winbuffer1, Zoomwindowsize/2, 1.0); // convert linear data to dB
    linear_to_log10(sp.winbuffer2, Zoomwindowsize/2, 1.0); // convert linear data to dB
    linear_to_log10(lpcepspecenv_zoom,zoomwindowsize/2,1.0); // convert linear data to dB
    linear_to_log10(cepspecenv_zoom,zoomwindowsize/2,1.0); // convert linear data to dB
    // ...
}

The Zoom Scale Dialog

The Zoom Scale dialog contained a slider that set the zoom factor, which in turn scaled the size of the zoom window. The slider had four positions.

Figure: Four scales of zoom were supported (zoom increases along the slider)
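The linear-to-dB step can be illustrated with a small stand-alone function. This is a sketch, not the SPANA linear_to_log10( ) routine: the name linearToDb, the reference parameter and the floor value that guards against log10(0) are all assumptions. The 10*log10 form follows the figure and suits power-like quantities; whether the SPANA buffers hold power or magnitude values is not stated in the source.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// In-place conversion of linear values to decibels: 10 * log10(value / reference).
// floorValue (an assumption) keeps zero or negative inputs from producing -inf or NaN.
void linearToDb(std::vector<double>& values,
                double reference = 1.0,
                double floorValue = 1e-12) {
    for (double& v : values)
        v = 10.0 * std::log10(std::max(v, floorValue) / reference);
}
```

With the default reference of 1.0, an input of 100 maps to 20 dB and an input of 0 is clamped to the floor, giving -120 dB.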


Figure 5.4.11 The four scales of zoom

Listing 5.4.11 Set the four scales (SpanaView.cpp)

void CFastZoomDlg::OnHScroll(UINT nSBCode, UINT nPos, CScrollBar* pScrollBar)
{
    if (GetZoomIndicator()==TRUE)
    {
        m_sliderzoom.SetRange(1,4,TRUE);   // four slider positions
        m_sliderzoom.SetPageSize(1);
        m_sliderzoom.SetTicFreq(1);
        switch(m_sliderzoom.GetPos())
        {
        case 1: ZoomFactor=1; break;
        case 2: ZoomFactor=(float)0.8; break;
        case 3: ZoomFactor=(float)0.6; break;
        case 4: ZoomFactor=(float)0.4; break;
        }
    }
}
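The mapping from slider position to zoom factor, and one possible use of that factor, can be sketched without MFC. The factor values are taken from the listing; everything else is an assumption: the function names are hypothetical, the default branch is added for safety, and scaling a base window length by the factor is only one plausible reading of "scale the window size of the zooming".

```cpp
#include <cmath>

// Zoom factor for slider positions 1..4 (values from the listing).
float zoomFactorForPosition(int pos) {
    switch (pos) {
        case 1: return 1.0f;
        case 2: return 0.8f;
        case 3: return 0.6f;
        case 4: return 0.4f;
        default: return 1.0f;  // assumption: unknown positions mean no extra zoom
    }
}

// Assumed use of the factor: shrink the number of samples shown in the zoom
// window, so a smaller factor shows fewer samples at a larger magnification.
int scaledWindowSize(int baseWindowSize, float zoomFactor) {
    return static_cast<int>(std::lround(baseWindowSize * zoomFactor));
}
```

Under this reading, a base window of 512 samples at slider position 4 would show about 205 samples.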


5.6 Interactive Fast Display

5.6.1 Introduction

The frame of samples shown in the Fast Display dialog is determined by the location of the red vertical line in the main window, as shown in the figure below. The next frame of samples can therefore be displayed by moving the red vertical line, which requires knowing its current location. This functionality was added by creating a handler, OnKeyDown( ), for the KeyDown event. The main purpose of OnKeyDown( ) was to obtain the frame index so that the Fast Display dialog could show the required frame of samples.

Figure: The red vertical line

5.6.2 Program Flowchart

Figure: Flowchart of the OnKeyDown( ) function. The flow is:

1. Start of OnKeyDown( ).
2. Current frame index selected? If NO, go to the end.
3. key = VK_RIGHT? If YES, the frame index increments. Otherwise, key = VK_LEFT? If YES, the frame index decrements; if NO, go to the end.
4. Set the new location for the red line.
5. Start the computation and painting process.
6. End of OnKeyDown( ).

5.6.3 Getting the current frame index and next frame index

Current frame index: To select the next frame of samples for display in the Fast Display dialog, the current frame index had to be known, because any increment or decrement is relative to it. Without it, there is no way to tell which frame of samples the user wants to display. The current frame index is selected by clicking the left mouse button on the waveform of the speech signal. Please refer to the corresponding section for the details. An indicator, m_bcheckmouse, was used to record whether the current frame index had been selected; it was set to TRUE once a frame was selected. While the indicator was FALSE, the left and right arrow keys had no effect.

New frame index: Only the left and right arrow keys produced a response; all other keys were ignored. The frame index was incremented for the right arrow key and decremented for the left arrow key. The new frame index was then passed to the class of the Fast Display dialog for further computation.

Setting the red vertical line to new position

The red line moved to the right when the right arrow key was pressed and to the left when the left arrow key was pressed. Its position was determined by the variable x_indicator, which is the x-coordinate of the red vertical line in the main window.


Figure: Calculation of the new location of the red vertical line (labels: original position, new position, xoffx, m_dfmaxx, x_movement_step)

x_movement_step = (m_dfmaxx - 2 * xoffx) / (number of frames)
new x_indicator = x_indicator + x_movement_step

Listing: Getting the new frame index and setting the red vertical line to new position (SpanaView.cpp)

void CSpanaView::OnKeyDown(UINT nChar, UINT nRepCnt, UINT nFlags)
{
    // ...
    if(nChar==VK_RIGHT || nChar==VK_LEFT)
    {
        if(nChar==VK_RIGHT)
        {
            // move the red line one frame to the right
            x_indicator=x_indicator+x_movement_step;
            m_fastdisplaydlg.win_index = m_fastdisplaydlg.win_index +1;
        }
        else
        {
            // move the red line one frame to the left
            x_indicator=x_indicator-x_movement_step;
            m_fastdisplaydlg.win_index = m_fastdisplaydlg.win_index -1;
        }
        Invalidate();   // trigger repainting
    }
}

5.6 Interactive Spectral Plot

Introduction

When the poles and zeros in the Z-Plane or the sensitivity of LP parameters are adjusted, the LPCC-based spectral envelope changes, because adjusting these parameters changes the values of the LP coefficients. My task was therefore to obtain the new LP coefficients and use them to compute the new LPCC-based spectral envelope.

Program flowchart

Figure: Program flow for updating the LPCC-based spectral envelope in response to a change in poles, zeros or the sensitivity of LP parameters. The flow is:

1. Any change in zeros, poles or sensitivity of LP parameters.
2. Compute a new set of LP coefficients.
3. Compute the new LPCC-based spectral envelope.
4. Plot the new LPCC-based spectral envelope.
5. End.

5.6.3 Computing a new set of LP coefficients

The previous version of Spana already contained event handlers for changes in the poles, the zeros and the sensitivity of LP parameters, so I did not need to create any new handler for these events. Since these handlers already computed a new set of LP coefficients, I did not need to add code to them either. All that was required was to obtain the new set of LP coefficients. After a new set of LP coefficients had been computed, the array sp.lpc[index-1] was updated with them. The new set of LP coefficients could therefore be referenced by the following code:

sp.lpc[index-1]; // LP coefficients

Computing the new LPCC-based spectral envelope

After the array sp.lpc[index-1] had been updated with the new set of LP coefficients, the new LPCC-based spectral envelope could be computed. The computation was done in the PlotSpectralEnvelope( ) function.

Listing: Computation of the new LPCC-based spectral envelope (SpanaView.cpp)

void CSpanaView::PlotSpectralenvelope()
{
    // ...
    gain=calc_gain(sp.autocc[index-1], sp.lpc[index-1], sp.order);
    FFT(sp.w[index-1], sp.winbuffer1, sp.windowsize);
    spectral_envelope(sp.order, sp.lpc[index-1], sp.windowsize, sp.winbuffer2, gain, sp.dspflag);
    PlotLPCSpectral();
    linear_to_log10(sp.winbuffer1, sp.windowsize/2, 1.0); // convert linear data to dB
    linear_to_log10(sp.winbuffer2, sp.windowsize/2, 1.0); // convert linear data to dB
    linear_to_log10(lpcepspecenv_buffer,sp.windowsize/2,1.0); // convert linear data to dB
    linear_to_log10(cepspecenv,sp.windowsize/2,1.0); // convert linear data to dB
    // ...
}

Plotting the new LPCC-based spectral envelope

Plotting of the new LPCC-based spectral envelope was also done in the PlotSpectralEnvelope( ) function.

Listing: Plotting the new LPCC-based spectral envelope (SpanaView.cpp)

void CSpanaView::PlotSpectralenvelope()
{
    // ...
    // new (lpcepspecenv_buffer) start here
    CPen pen_lpcepspecenv(PS_SOLID, 1, RGB(0,0,255));   // blue pen
    poldpen = dc.SelectObject(&pen_lpcepspecenv);
    dc.MoveTo(int(xoffs),int((rect.bottom)+yoffs-((float)lpcepspecenv_buffer[0]-miny)*stepy));
    x_coor=0;
    for(i=0;i<number_of_values;x_coor+=stepx,i++)
        dc.LineTo((int)x_coor+xoffs,int((rect.bottom)-((float)lpcepspecenv_buffer[i]-miny)*stepy));
    // ...
}

Whenever the poles, the zeros or the sensitivity of LP parameters change again, the above process is repeated.
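The step from LP coefficients to an LPCC-based envelope can be illustrated with the standard LPC-to-cepstrum recursion. This is a sketch under assumptions, not the code of PlotLPCSpectral( ), which is not shown in this section: the names lpcToCepstrum and cepstrumToEnvelope are hypothetical, the sign convention H(z) = G / (1 - sum_k a[k] z^-k) is assumed, and the envelope is evaluated by a direct cosine sum instead of an FFT.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// LPC -> cepstrum recursion for H(z) = G / (1 - sum_{k=1..p} a[k] z^-k).
// a[0] is unused. Returns c[0..nCep] with c[0] = ln(G).
std::vector<double> lpcToCepstrum(const std::vector<double>& a, double gain, int nCep) {
    const int p = static_cast<int>(a.size()) - 1;
    std::vector<double> c(nCep + 1, 0.0);
    c[0] = std::log(gain);
    for (int n = 1; n <= nCep; ++n) {
        double acc = (n <= p) ? a[n] : 0.0;
        for (int k = std::max(1, n - p); k < n; ++k)
            acc += (static_cast<double>(k) / n) * c[k] * a[n - k];
        c[n] = acc;
    }
    return c;
}

// Linear-magnitude envelope from the cepstrum at w = pi * b / nBins:
// |H(e^{jw})| = exp(c[0] + sum_{n>=1} c[n] * cos(n * w)).
std::vector<double> cepstrumToEnvelope(const std::vector<double>& c, int nBins) {
    const double pi = std::acos(-1.0);
    std::vector<double> env(nBins);
    for (int b = 0; b < nBins; ++b) {
        const double w = pi * b / nBins;
        double logMag = c[0];
        for (std::size_t n = 1; n < c.size(); ++n)
            logMag += c[n] * std::cos(w * static_cast<double>(n));
        env[b] = std::exp(logMag);
    }
    return env;
}
```

Because the cepstrum is truncated to nCep coefficients, the result is a smoothed version of the LPC envelope; the more coefficients are kept, the closer the two curves become.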

Chapter 6 Results and Discussion

6.1 Plotting of the LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering

Results:

Figure: The spectral envelopes, including the LPCC-based spectral envelope (blue), the spectral envelope by FFT-based cepstral liftering (pink) and the LPC spectral envelope (red)

Discussion

The figure shows that the three spectral envelopes are very close to each other. With this function, students can examine the relationship between these spectral envelopes. They can also verify that the LPCC-based spectral envelope and the spectral envelope by FFT-based cepstral liftering both model the frequency spectrum. This makes it easier to show them that the two spectral envelopes help in locating the formants.


6.2 Plotting of Pitch Contour

Results:

Figure 6.2.1 The pitch contours for the speech "seven" from (A) WaveSurfer 1.6.0, (B) Spana (current version), (C) Spana (previous version). Axis label: period in ms.

Figure 6.2.2 Pitch contours for the speech "welcome" from (A) WaveSurfer 1.6.0, (B) Spana (current version), (C) Spana (previous version)

Discussion

Figures 6.2.1 and 6.2.2 show that the pitch contours from the current version of Spana are closer to those from WaveSurfer than the pitch contours from the previous version of Spana. It can therefore be concluded that the plotting of the pitch contour has been improved in the current version.

6.3 Zooming the speech signals in time domain

Results:

Figure 6.3.1 The zooming window

Figure 6.3.2 Zooming in greater scale

Discussion

Figure 6.3.1 shows that the zooming function makes a blow-up of the waveform possible. Figure 6.3.2 shows that the zooming scale can be changed. Another important feature of the zooming function is that the view in the zooming window follows the mouse pointer. This is convenient because users do not need to select a portion of the waveform and then press a zoom button in order to zoom the speech signal.


6.4 Interactive Fast Display

Results:

Figure 6.4.1 Original view in the Fast Display dialog (frame 45)

Figure 6.4.2 The next view in the Fast Display dialog (frame 46) when the right arrow key was pressed once

Discussion

In the previous version of Spana, a user who wanted to view frame 41 in the Fast Display dialog had to locate frame 41 with the mouse pointer and then click the left mouse button at that location. To shift the view to the next frame, she had to repeat this process. Viewing an entire utterance of 100 frames therefore required repeating the process 100 times, which is inconvenient. The Interactive Fast Display feature provides a more convenient way to shift the view to the next or previous frame: pressing the right or left arrow key on the keyboard.


6.5 Interactive Spectral Plot

Figure 6.5.1 Changing the LPC spectral envelope and LPCC-based spectral envelope by moving the poles on Z-Plane

Figure 6.5.2 Changing the LPC spectral envelope and LPCC-based spectral envelope by adjusting the zeros on Z-Plane


Figure 6.5.3 Changing the LPC spectral envelope and LPCC-based spectral envelope by adjusting the sensitivity of LP parameters

Discussion

Figures 6.5.1, 6.5.2 and 6.5.3 show that the LPC spectral envelope and the LPCC-based spectral envelope change when the poles and zeros on the Z-Plane or the sensitivity of LP parameters are adjusted. This interactive function helps students understand how the zeros, the poles and the sensitivity of LP parameters affect the LPC spectral envelope and the LPCC-based spectral envelope.


2. AN INTROSPECTION OF THE MORPHING PROCESS 1. INTRODUCTION Voice morphing means the transition of one speech signal into another. Like image morphing, speech morphing aims to preserve the shared characteristics of the starting and final signals,