Speech and Speaker Recognition for the Command of an Industrial Robot


CLAUDIA MOISA*, HELGA SILAGHI*, ANDREI SILAGHI**
*Dept. of Electric Drives and Automation, University of Oradea, University Street, nr. 1, Oradea, ROMANIA
**Politehnica University of Timisoara, Vasile Parvan Bv, nr. 2, Timisoara, ROMANIA

Abstract - This paper presents a recognition method for commanding an industrial robot with isolated words, together with a speaker recognition method for cases in which a command may be given only by a particular person. Finally, some simulation results are presented.

Key-Words: Voice processing, Speech recognition, Speaker recognition, Industrial robot

1 Introduction

Automatic speech recognition has been studied for many years, with man-computer dialog as its goal. Nowadays, communication with a computer can be carried out using a speech recognition and voice synthesis system.

Speech recognition is the automatic extraction and description of the linguistic information contained in the voice signal, and it can be performed by computer. This linguistic information is called phonetic information. Recognition can also be extended to the person speaking, which gives speaker recognition.

Speech recognition is performed by comparing the input voice signal with signals stored in a previously built library. Various parameters are extracted from the voice signal, and the comparison is based on various mathematical methods.

There are two main types of recognition: isolated word recognition and continuous speech recognition. In addition, the design of an advanced recognition system depends on whether it serves several speakers or always the same person. Speaker-independent recognition systems are more complex because they must work adaptively, changing their parameters when the speaker changes.

This paper presents, from both a technical and a theoretical point of view, a working example of an application for real-time recording, processing and examination of voice. It thus gives a series of theoretical and practical views on how audio signal processing applications can be developed using digital acquisition and processing systems.

2 Speech/silence algorithm

The acoustic signal is composed of a sequence of sounds generated by the human vocal tract under the control of the brain. The voice signal is represented by a structure called "wave", which contains the following fields: the signal itself (file), the length of the analysis window, i.e. of a frame (lc), the number of samples (n) and the total number of frames of the sound signal (nc).

    wave.file = sound;                  % imported voice signal
    wave.lc = 256;                      % analysis window length / length of a frame
    wave.n = length(wave.file);         % number of samples of the sound signal
    wave.nc = floor(wave.n / wave.lc);  % number of frames

A very important aspect of speech analysis, synthesis, recognition and coding is the correct detection of the periods of silence and speech in the voice signal. The characteristics of the voice signal are useful in this regard. Voiced segments (vowels) are characterized by high energy and a strong correlation between adjacent samples, while unvoiced segments (consonants) resemble noise, with low energy and poor correlation between samples. In real time, the voiced/unvoiced decision can be taken with two parameters: energy and zero crossing rate (ZCR). The decision criteria are summarized in the following table:

SIGNAL                  ENERGY           ZCR
Voiced (vowels)         Big              Small
Unvoiced (consonants)   Medium / Small   Big
Noise                   Small            Big
Quiet                   ~0               ~0

Table 1.

The silence/speech detection algorithm is presented below. It detects both the periods of silence and the periods of speech. Compared with ordinary detection algorithms it is significantly improved: applied to isolated words, it gave better results than the classical algorithm.

Steps:
1. Import the file containing the voice signal.
2. Set i = 1.
3. For i < Framework_no:
       Silence_speech(i) = 0
4. For i < Framework_no:
       If En(i) > En_min
           If En(i) > En_max
               Silence_speech(i) = 1
           Else if (NTZ(i) > NTZ_min) & (NTZ(i) < NTZ_max)
               Silence_speech(i) = 1
5. Display Silence_speech.

Here Framework_no is the number of frames, En(i) is the energy of frame i and NTZ(i) is its number of zero crossings.

The algorithm was implemented in MATLAB.

3 Speech recognition

The main difficulty faced by speech recognition programs is that the voices of two people may be similar, while the voice of the same person may vary from one situation to another. This matters especially in industrial robot control, where the robot acts on the voice command given by the user.
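Returning to the speech/silence decision of Section 2, step 4 of the algorithm can be sketched in vectorized MATLAB as follows. This is a minimal sketch, not the authors' implementation: the per-frame values in En and NTZ are made-up illustration data, and the four limits are the ones quoted in the Results section.

```matlab
% Sketch of the speech/silence decision (step 4), not the authors' code.
% En and NTZ hold hypothetical per-frame energies and zero-crossing counts.
En  = [0.2 0.8 4.0 12.0 15.0 3.0 0.5];
NTZ = [ 90  70  40   20   15  95  85];

% Limits quoted in the Results section of the paper
En_min  = 1;   En_max  = 10;
NTZ_min = 10;  NTZ_max = 80;

% A frame is marked as speech if its energy is high, or if its energy is
% moderate and its zero-crossing count lies inside the allowed band.
high_energy = En > En_max;
mid_energy  = (En > En_min) & ~high_energy;
zcr_in_band = (NTZ > NTZ_min) & (NTZ < NTZ_max);

Silence_speech = double(high_energy | (mid_energy & zcr_in_band));
disp(Silence_speech)
```

With these illustration values, frames 3 to 5 are marked as speech; frame 6 has moderate energy but too many zero crossings, so it stays marked as silence.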
3.1 Frequency analysis of the voice signal

Frequency-domain analysis of the voice signal provides a more useful set of parameters for processing than time-domain analysis. The excitation and the vocal tract can be separated easily in the spectral domain. Different utterances of the same sentence differ in time, while they are similar in frequency. Also, the human auditory system is more sensitive to the spectral amplitude of the voice signal than to its phase. Therefore, spectral analysis is used to extract most of the voice signal parameters.

3.2 Linear Prediction Analysis

A common method for analyzing the voice signal is linear prediction analysis (Linear Predictive Coding), also known as LPC analysis or AR (autoregressive) modeling. It is a simple, fast and very efficient way of calculating the parameters of the voice signal.

To determine the LPC coefficients, we analyzed the voice signal frame by frame. The LPC coefficients were obtained with the MATLAB function lpc(). The imported voice signal was sampled at a sampling frequency fe = 16 kHz. We used an LPC prediction order of 18 and a frame length of 256 samples.

The vocal tract is modeled by a digital filter with time-varying coefficients and a gain. The voice signal parameters are:
- the voiced/unvoiced decision
- the fundamental frequency F0 for voiced segments
- the gain of the digital filter
- the filter coefficients

Note that the number of LPC coefficients is always one more than the prediction order, in this case 19. The first coefficient is always 1 and corresponds to the current sample, unmodified.
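To make the relation between prediction order and coefficient count concrete, the sketch below computes the LPC coefficients of one frame with the autocorrelation method and the Levinson-Durbin recursion, using base MATLAB only. It is an illustration under stated assumptions, not the authors' code (which calls lpc()): the test frame is synthetic, and all variable names except fe are chosen here for the example.

```matlab
% Sketch: LPC coefficients of one frame (autocorrelation method).
fe = 16000;          % sampling frequency from the paper (Hz)
lc = 256;            % frame length from the paper (samples)
p  = 18;             % prediction order from the paper

% Hypothetical test frame: two sinusoids plus a little noise
n = (0:lc-1)';
x = sin(2*pi*500*n/fe) + 0.5*sin(2*pi*1500*n/fe) + 0.01*randn(lc, 1);

% Autocorrelation for lags 0..p (r(1) is lag 0)
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(x(1:lc-k) .* x(1+k:lc));
end

% Levinson-Durbin recursion
a = zeros(p+1, 1);
a(1) = 1;            % first coefficient is always 1
E = r(1);            % prediction error energy
for i = 1:p
    acc = r(i+1);
    for j = 2:i
        acc = acc + a(j) * r(i-j+2);
    end
    k_i = -acc / E;                  % reflection coefficient
    a_prev = a;
    for j = 2:i
        a(j) = a_prev(j) + k_i * a_prev(i-j+2);
    end
    a(i+1) = k_i;
    E = E * (1 - k_i^2);
end

fprintf('%d coefficients, a(1) = %g\n', numel(a), a(1));
```

The vector a has p + 1 = 19 entries and a(1) equals 1, which matches the note at the end of Section 3.2.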

3.3 Training

Recognition of commands is based on a dictionary of words. Creating the dictionary is the training operation: it determines the specifics of each word, e.g. the way a word is said, and saves them for later recognition. Training is done by repeating the words and updating the dictionary after each pronounced command. Each word is saved in the dictionary as follows:
- a first line containing the string delimiter "$$$", which marks that a new word follows
- a row that contains the command name
- a row that contains an integer N, the number of frames on which feature extraction was performed
- a matrix of N rows and 19 columns (as the LPC prediction order is 18), each row holding LPC coefficients of the corresponding word/command

It is very important that the system is trained in exactly the conditions in which it will be used. It is also recommended to use a sensitive microphone that picks up the voice command and reduces the noise. To obtain the desired performance, speakers must be fluent during training and speak in a normal voice, neither too slowly nor too quickly.

The system is designed to adapt to the user's voice. Since the pronunciation of different words can be similar, the system tolerates some small errors and adapts to how the user speaks. However, during training you should try to make as few mistakes as you can.

Training is implemented as follows. The voice signal for training is recorded. The next operation is feature extraction from the recorded sound: energy, zero crossing rate and LPC coefficients. Note that the LPC coefficients are extracted only where the speech/silence algorithm detects speech. The parameters thus obtained are saved in a text file that represents the dictionary.

4 Speaker recognition

Speaker recognition is basically divided into two parts: verification and identification. It is a way to automatically identify who is speaking on the basis of individual information contained in the speech. The main goal of this project is to identify the speaker from a list of reference speaker models. The algorithm has to compare a voice signal from an unknown speaker with a database of known speakers. A system that has previously been trained with a number of speakers can then recognize the unknown speaker. The figure below presents the fundamental process of speaker identification. In most applications, the voice is used to confirm the identity of a speaker.

Figure 1. The fundamental process of speaker identification

The diagram in Figure 1 was implemented in MATLAB code.

4.1 The Mel cepstral analysis

To determine the correct speaker, we used Mel cepstral analysis. It uses the Mel scale and yields the MFCC (Mel Frequency Cepstral Coefficients). After the power spectrum is computed with the Discrete Fourier Transform, it is passed through a bank of triangular filters spaced on the Mel frequency scale. The logarithm of the filter outputs is then taken, and finally the MFCC are obtained by applying the Discrete Cosine Transform to the new spectrum. The justification for this last step is that the coefficients resulting from the power spectrum calculation are strongly correlated with one another, whereas the cepstral coefficients produced by the Discrete Cosine Transform on the Mel scale are known to be largely uncorrelated.

4.2 The determination of distance in the acoustic environment

Recognition of the voice signal, which involves comparing models made up of various parameters, is performed by calculating distances. Phonetically, spectral changes that lead to different sounds must correspond to large distances, while sounds that are perceptually the same must be associated with smaller distances. Thus, to recognize a word, the distance between its parameters and the parameters of each word in the dictionary is calculated. The word at the minimum distance is considered to be the recognized word. If the distance is greater than a maximum allowable threshold, the algorithm considers the word unknown. Repeated experiments and statistical processing of the results show that acceptable results are obtained by setting the threshold value to [...].

5 Graphical User Interface

A GUI (Graphical User Interface) has been implemented to give the user an application that is as easy to use as possible. It consists of graphical widgets such as windows, menus, radio buttons (round buttons like those old radios used to have), check boxes and graphics windows for displaying results. In addition to the keyboard, the graphical interface uses a pointing device (mouse).

The GUI consists of three parts for command and control, with two windows used to display results, as follows:
- The training part, where the user can train the program with the different industrial robot commands that the program will have to recognize. The user can also record and listen to the given/trained commands.
- The recognition part, which performs speech and speaker recognition on the user's commands and displays the results in the recognized words panel.
- The results part, which is divided into two graphics panels and displays the waveforms and a selected feature parameter. In the upper display panel, the user can choose to see the waveform of either the trained voice signal or the one introduced for recognition. Depending on the option selected, the bottom graphics panel shows one of several speech parameters:
  - zero crossing rate
  - energy
  - autocorrelation
  - average magnitude difference function
  - speech/silence detection algorithm
  - spectrogram
  - LPC spectrum

Also, the "autocorrelation" and "average magnitude difference function" options calculate and display the fundamental frequency.
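As an illustration of how the "autocorrelation" option can yield the fundamental frequency, the sketch below estimates F0 of one frame from the lag of the autocorrelation maximum. It is a minimal sketch under stated assumptions, not the program's actual code: the frame is a synthetic 200 Hz sinusoid, and the 80 to 400 Hz search range is an assumption made for the example.

```matlab
% Sketch: fundamental frequency of one frame by autocorrelation.
fe = 16000;                        % sampling frequency from the paper (Hz)
lc = 256;                          % frame length from the paper (samples)

n = (0:lc-1)';
x = sin(2*pi*200*n/fe);            % hypothetical voiced frame, F0 = 200 Hz
x = x - mean(x);                   % remove the continuous (DC) component

% Search only lags that correspond to plausible pitch values
lag_min = floor(fe / 400);         % assumed upper F0 limit: 400 Hz
lag_max = ceil(fe / 80);           % assumed lower F0 limit: 80 Hz
lags = lag_min:lag_max;

% Autocorrelation normalized by the number of overlapping samples
r = zeros(size(lags));
for idx = 1:numel(lags)
    k = lags(idx);
    r(idx) = sum(x(1:lc-k) .* x(1+k:lc)) / (lc - k);
end

% The lag of the maximum is the estimated pitch period in samples
[~, best] = max(r);
F0 = fe / lags(best);
fprintf('Estimated F0: %.1f Hz\n', F0);
```

Because the lag is an integer number of samples, the estimate is quantized; near 200 Hz at 16 kHz one lag step corresponds to about 2.5 Hz.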
The program was implemented in the MATLAB programming environment.

Figure 2. The implemented program GUI

6 Results

The first results we consider are those derived from the basic operations of the tools module. We chose the waveform corresponding to the voice command "start".

Figure 3. The waveform corresponding to the utterance of the voice command "start"

The following processing is applied to the signal:
1. Segmentation into frames of 256 samples
2. Applying a rectangular window
3. Removing the continuous (DC) component
4. Voice signal detection

Figure 4. The graph representing the zero crossing rate (ZCR)
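The first three processing steps, together with the per-frame energy and zero-crossing count that feed the detection step, can be sketched as follows. This is an illustrative sketch, not the authors' code: the input signal is synthetic, removing the DC component frame by frame is an assumption, and the zero-crossing measure is taken here as the number of sign changes per frame.

```matlab
% Sketch: segmentation, DC removal, per-frame energy and zero crossings.
fe = 16000;                               % sampling frequency (Hz)
lc = 256;                                 % frame length (samples)

% Hypothetical signal: silence, a 200 Hz tone with a DC offset, silence
t = (0:1023)';
voiced = sin(2*pi*200*t/fe) + 0.3;
signal = [zeros(512, 1); voiced; zeros(512, 1)];

% 1-2. Segmentation into frames; a rectangular window leaves the
%      samples of each frame unweighted. Each column is one frame.
nc = floor(length(signal) / lc);
frames = reshape(signal(1:nc*lc), lc, nc);

% 3. Remove the continuous (DC) component of every frame
frames = frames - repmat(mean(frames, 1), lc, 1);

% Per-frame energy
En = sum(frames.^2, 1);

% Per-frame number of zero crossings (sign changes between samples)
positive = double(frames >= 0);
NTZ = sum(abs(diff(positive, 1, 1)), 1);

disp([En; NTZ]);
```

The silent frames end up with zero energy and no zero crossings, while the tone frames have high energy and only a few crossings, in line with the "Voiced" and "Quiet" rows of Table 1.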

Figure 5. The graph representing the signal energy

To obtain the best results from the speech/silence detection algorithm, we determined the minimum and maximum zero crossing rate and energy with the already implemented functions. First, we analyzed the parameters of signals containing Gaussian noise (normal distribution). Then we performed the same analysis on signals containing traffic noise, recorded in the street and on the bus. Next, we analyzed the parameters of speech signals acquired in the presence of environmental noise. The results are presented in the following tables:

Table 2. Limits of energy and NTZ for the voice commands given to the industrial robot

After also analyzing the noise signal, we extracted the following:
- Environmental noise parameters: ZCR = 70, Energy = 1-2
- Consonant parameters: ZCR > 100

Using the results obtained, we chose the following statistical limits:
- Minimum energy = 1
- Maximum energy = 10
- ZCR minimum = 10
- ZCR maximum = 80

Figure 6. The graph representing the result of the speech/silence detection algorithm

The recognition rate of the speech recognition system is around 90%. Better results were obtained with the speaker recognition algorithm. For the program to work properly, it is important that commands are uttered in a similar way in all cases. The algorithm steps used are as follows:
1. Applying a rectangular window to the signal
2. Removing the continuous (DC) component
3. Speech localization using the speech/silence detection algorithm
4. Calculation of the Mel scale cepstral coefficients
5. Calculating the distance between the user's voice command and the voice commands in the dictionary

The recognition rate under these conditions is around 95%.

The user can also view the voice signal spectrum and the LPC spectrum graphically, as follows:

Figure 7. The graph representing the spectrum of the voice command

Figure 8. The graph representing the LPC spectrum of the voice command

The user also has the option to display the "autocorrelation" and "average magnitude difference function" options, which also calculate and display the fundamental frequency.

Figure 9. The graph representing the "autocorrelation" function of the voice command; it also calculates the fundamental frequency value

Figure 10. The graph representing the "average magnitude difference function" of the voice command; it also calculates the fundamental frequency value

The results obtained from experimental verification of the developed program, testing speech recognition with male and female speakers, are presented in two tables:

Table 3.

7 Conclusion

The implemented algorithm is intended as a tool for controlling industrial robots by voice command. Besides the usefulness of the tools already implemented in the application, it demonstrates reliability and ease of future development: users can easily add their own tools to any of the processing modules. The experimental results show that the proposed algorithm is functional and can be used for voice command control of industrial robots. The percentage of correctly recognized commands is sufficiently high, and the computational resources used (CPU frequency, RAM) are lower than for other algorithms.


An Effective Filtering Algorithm to Mitigate Transient Decaying DC Offset An Effective Filtering Algorithm to Mitigate Transient Decaying DC Offset By: Abouzar Rahmati Authors: Abouzar Rahmati IS-International Services LLC Reza Adhami University of Alabama in Huntsville April

More information

Project Summary EPRI Program 1: Power Quality

Project Summary EPRI Program 1: Power Quality Project Summary EPRI Program 1: Power Quality April 2015 PQ Monitoring Evolving from Single-Site Investigations. to Wide-Area PQ Monitoring Applications DME w/pq 2 Equating to large amounts of PQ data

More information

The Measurement Tools and What They Do

The Measurement Tools and What They Do 2 The Measurement Tools The Measurement Tools and What They Do JITTERWIZARD The JitterWizard is a unique capability of the JitterPro package that performs the requisite scope setup chores while simplifying

More information

AN ARTISTIC TECHNIQUE FOR AUDIO-TO-VIDEO TRANSLATION ON A MUSIC PERCEPTION STUDY

AN ARTISTIC TECHNIQUE FOR AUDIO-TO-VIDEO TRANSLATION ON A MUSIC PERCEPTION STUDY AN ARTISTIC TECHNIQUE FOR AUDIO-TO-VIDEO TRANSLATION ON A MUSIC PERCEPTION STUDY Eugene Mikyung Kim Department of Music Technology, Korea National University of Arts eugene@u.northwestern.edu ABSTRACT

More information

PRODUCTION MACHINERY UTILIZATION MONITORING BASED ON ACOUSTIC AND VIBRATION SIGNAL ANALYSIS

PRODUCTION MACHINERY UTILIZATION MONITORING BASED ON ACOUSTIC AND VIBRATION SIGNAL ANALYSIS 8th International DAAAM Baltic Conference "INDUSTRIAL ENGINEERING" 19-21 April 2012, Tallinn, Estonia PRODUCTION MACHINERY UTILIZATION MONITORING BASED ON ACOUSTIC AND VIBRATION SIGNAL ANALYSIS Astapov,

More information

Supervised Learning in Genre Classification

Supervised Learning in Genre Classification Supervised Learning in Genre Classification Introduction & Motivation Mohit Rajani and Luke Ekkizogloy {i.mohit,luke.ekkizogloy}@gmail.com Stanford University, CS229: Machine Learning, 2009 Now that music

More information

Speech Recognition Combining MFCCs and Image Features

Speech Recognition Combining MFCCs and Image Features Speech Recognition Combining MFCCs and Image Featres S. Karlos from Department of Mathematics N. Fazakis from Department of Electrical and Compter Engineering K. Karanikola from Department of Mathematics

More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng

The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng S. Zhu, P. Ji, W. Kuang and J. Yang Institute of Acoustics, CAS, O.21, Bei-Si-huan-Xi Road, 100190 Beijing,

More information

Getting Started with the LabVIEW Sound and Vibration Toolkit

Getting Started with the LabVIEW Sound and Vibration Toolkit 1 Getting Started with the LabVIEW Sound and Vibration Toolkit This tutorial is designed to introduce you to some of the sound and vibration analysis capabilities in the industry-leading software tool

More information

Study of White Gaussian Noise with Varying Signal to Noise Ratio in Speech Signal using Wavelet

Study of White Gaussian Noise with Varying Signal to Noise Ratio in Speech Signal using Wavelet American International Journal of Research in Science, Technology, Engineering & Mathematics Available online at http://www.iasir.net ISSN (Print): 2328-3491, ISSN (Online): 2328-3580, ISSN (CD-ROM): 2328-3629

More information

Analyzing Modulated Signals with the V93000 Signal Analyzer Tool. Joe Kelly, Verigy, Inc.

Analyzing Modulated Signals with the V93000 Signal Analyzer Tool. Joe Kelly, Verigy, Inc. Analyzing Modulated Signals with the V93000 Signal Analyzer Tool Joe Kelly, Verigy, Inc. Abstract The Signal Analyzer Tool contained within the SmarTest software on the V93000 is a versatile graphical

More information

Detection and demodulation of non-cooperative burst signal Feng Yue 1, Wu Guangzhi 1, Tao Min 1

Detection and demodulation of non-cooperative burst signal Feng Yue 1, Wu Guangzhi 1, Tao Min 1 International Conference on Applied Science and Engineering Innovation (ASEI 2015) Detection and demodulation of non-cooperative burst signal Feng Yue 1, Wu Guangzhi 1, Tao Min 1 1 China Satellite Maritime

More information

Audio-Based Video Editing with Two-Channel Microphone

Audio-Based Video Editing with Two-Channel Microphone Audio-Based Video Editing with Two-Channel Microphone Tetsuya Takiguchi Organization of Advanced Science and Technology Kobe University, Japan takigu@kobe-u.ac.jp Yasuo Ariki Organization of Advanced Science

More information

Comparison Parameters and Speaker Similarity Coincidence Criteria:

Comparison Parameters and Speaker Similarity Coincidence Criteria: Comparison Parameters and Speaker Similarity Coincidence Criteria: The Easy Voice system uses two interrelating parameters of comparison (first and second error types). False Rejection, FR is a probability

More information

INSPECTION EQUIPMENT FOR SUPPLIED CASES WITH EMPTY BOTTLES

INSPECTION EQUIPMENT FOR SUPPLIED CASES WITH EMPTY BOTTLES INSPECTION EQUIPMENT FOR SUPPLIED CASES WITH EMPTY BOTTLES Georgy Slavchev Mihov 1, Stanimir Damyanov Mollov 1, Ratcho Marinov Ivanov 1, Stoyan Nikolov Jilov 2 1 Faculty of Electronic Engineering and Technologies,

More information

Topic 4. Single Pitch Detection

Topic 4. Single Pitch Detection Topic 4 Single Pitch Detection What is pitch? A perceptual attribute, so subjective Only defined for (quasi) harmonic sounds Harmonic sounds are periodic, and the period is 1/F0. Can be reliably matched

More information

Digital music synthesis using DSP

Digital music synthesis using DSP Digital music synthesis using DSP Rahul Bhat (124074002), Sandeep Bhagwat (123074011), Gaurang Naik (123079009), Shrikant Venkataramani (123079042) DSP Application Assignment, Group No. 4 Department of

More information

Features for Audio and Music Classification

Features for Audio and Music Classification Features for Audio and Music Classification Martin F. McKinney and Jeroen Breebaart Auditory and Multisensory Perception, Digital Signal Processing Group Philips Research Laboratories Eindhoven, The Netherlands

More information

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Kazuyoshi Yoshii, Masataka Goto and Hiroshi G. Okuno Department of Intelligence Science and Technology National

More information

Chapter 1. Introduction to Digital Signal Processing

Chapter 1. Introduction to Digital Signal Processing Chapter 1 Introduction to Digital Signal Processing 1. Introduction Signal processing is a discipline concerned with the acquisition, representation, manipulation, and transformation of signals required

More information

Tempo and Beat Analysis

Tempo and Beat Analysis Advanced Course Computer Science Music Processing Summer Term 2010 Meinard Müller, Peter Grosche Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Tempo and Beat Analysis Musical Properties:

More information

CS229 Project Report Polyphonic Piano Transcription

CS229 Project Report Polyphonic Piano Transcription CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project

More information

ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION

ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION Travis M. Doll Ray V. Migneco Youngmoo E. Kim Drexel University, Electrical & Computer Engineering {tmd47,rm443,ykim}@drexel.edu

More information

A METHOD OF MORPHING SPECTRAL ENVELOPES OF THE SINGING VOICE FOR USE WITH BACKING VOCALS

A METHOD OF MORPHING SPECTRAL ENVELOPES OF THE SINGING VOICE FOR USE WITH BACKING VOCALS A METHOD OF MORPHING SPECTRAL ENVELOPES OF THE SINGING VOICE FOR USE WITH BACKING VOCALS Matthew Roddy Dept. of Computer Science and Information Systems, University of Limerick, Ireland Jacqueline Walker

More information

MIE 402: WORKSHOP ON DATA ACQUISITION AND SIGNAL PROCESSING Spring 2003

MIE 402: WORKSHOP ON DATA ACQUISITION AND SIGNAL PROCESSING Spring 2003 MIE 402: WORKSHOP ON DATA ACQUISITION AND SIGNAL PROCESSING Spring 2003 OBJECTIVE To become familiar with state-of-the-art digital data acquisition hardware and software. To explore common data acquisition

More information

Removal of Decaying DC Component in Current Signal Using a ovel Estimation Algorithm

Removal of Decaying DC Component in Current Signal Using a ovel Estimation Algorithm Removal of Decaying DC Component in Current Signal Using a ovel Estimation Algorithm Majid Aghasi*, and Alireza Jalilian** *Department of Electrical Engineering, Iran University of Science and Technology,

More information

The Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC

The Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC INTRODUCTION The Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC The Time Series Forecasting System (TSFS) is a component of SAS/ETS that provides a menu-based

More information

Swept-tuned spectrum analyzer. Gianfranco Miele, Ph.D

Swept-tuned spectrum analyzer. Gianfranco Miele, Ph.D Swept-tuned spectrum analyzer Gianfranco Miele, Ph.D www.eng.docente.unicas.it/gianfranco_miele g.miele@unicas.it Video section Up until the mid-1970s, spectrum analyzers were purely analog. The displayed

More information

MAutoPitch. Presets button. Left arrow button. Right arrow button. Randomize button. Save button. Panic button. Settings button

MAutoPitch. Presets button. Left arrow button. Right arrow button. Randomize button. Save button. Panic button. Settings button MAutoPitch Presets button Presets button shows a window with all available presets. A preset can be loaded from the preset window by double-clicking on it, using the arrow buttons or by using a combination

More information

APP USE USER MANUAL 2017 VERSION BASED ON WAVE TRACKING TECHNIQUE

APP USE USER MANUAL 2017 VERSION BASED ON WAVE TRACKING TECHNIQUE APP USE USER MANUAL 2017 VERSION BASED ON WAVE TRACKING TECHNIQUE All rights reserved All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in

More information

Introduction to QScan

Introduction to QScan Introduction to QScan Shourov K. Chatterji SciMon Camp LIGO Livingston Observatory 2006 August 18 QScan web page Much of this talk is taken from the QScan web page http://www.ligo.caltech.edu/~shourov/q/qscan/

More information

Phone-based Plosive Detection

Phone-based Plosive Detection Phone-based Plosive Detection 1 Andreas Madsack, Grzegorz Dogil, Stefan Uhlich, Yugu Zeng and Bin Yang Abstract We compare two segmentation approaches to plosive detection: One aproach is using a uniform

More information

SIDRA INTERSECTION 8.0 UPDATE HISTORY

SIDRA INTERSECTION 8.0 UPDATE HISTORY Akcelik & Associates Pty Ltd PO Box 1075G, Greythorn, Vic 3104 AUSTRALIA ABN 79 088 889 687 For all technical support, sales support and general enquiries: support.sidrasolutions.com SIDRA INTERSECTION

More information

Hidden Markov Model based dance recognition

Hidden Markov Model based dance recognition Hidden Markov Model based dance recognition Dragutin Hrenek, Nenad Mikša, Robert Perica, Pavle Prentašić and Boris Trubić University of Zagreb, Faculty of Electrical Engineering and Computing Unska 3,

More information

Acoustic Echo Canceling: Echo Equality Index

Acoustic Echo Canceling: Echo Equality Index Acoustic Echo Canceling: Echo Equality Index Mengran Du, University of Maryalnd Dr. Bogdan Kosanovic, Texas Instruments Industry Sponsored Projects In Research and Engineering (INSPIRE) Maryland Engineering

More information

Lab experience 1: Introduction to LabView

Lab experience 1: Introduction to LabView Lab experience 1: Introduction to LabView LabView is software for the real-time acquisition, processing and visualization of measured data. A LabView program is called a Virtual Instrument (VI) because

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;

More information

Automatic Classification of Instrumental Music & Human Voice Using Formant Analysis

Automatic Classification of Instrumental Music & Human Voice Using Formant Analysis Automatic Classification of Instrumental Music & Human Voice Using Formant Analysis I Diksha Raina, II Sangita Chakraborty, III M.R Velankar I,II Dept. of Information Technology, Cummins College of Engineering,

More information

The following exercises illustrate the execution of collaborative simulations in J-DSP. The exercises namely a

The following exercises illustrate the execution of collaborative simulations in J-DSP. The exercises namely a Exercises: The following exercises illustrate the execution of collaborative simulations in J-DSP. The exercises namely a Pole-zero cancellation simulation and a Peak-picking analysis and synthesis simulation

More information

Upgrading E-learning of basic measurement algorithms based on DSP and MATLAB Web Server. Milos Sedlacek 1, Ondrej Tomiska 2

Upgrading E-learning of basic measurement algorithms based on DSP and MATLAB Web Server. Milos Sedlacek 1, Ondrej Tomiska 2 Upgrading E-learning of basic measurement algorithms based on DSP and MATLAB Web Server Milos Sedlacek 1, Ondrej Tomiska 2 1 Czech Technical University in Prague, Faculty of Electrical Engineeiring, Technicka

More information

Singing voice synthesis based on deep neural networks

Singing voice synthesis based on deep neural networks INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

More information

Getting Started. Connect green audio output of SpikerBox/SpikerShield using green cable to your headphones input on iphone/ipad.

Getting Started. Connect green audio output of SpikerBox/SpikerShield using green cable to your headphones input on iphone/ipad. Getting Started First thing you should do is to connect your iphone or ipad to SpikerBox with a green smartphone cable. Green cable comes with designators on each end of the cable ( Smartphone and SpikerBox

More information

A COMPUTER VISION SYSTEM TO READ METER DISPLAYS

A COMPUTER VISION SYSTEM TO READ METER DISPLAYS A COMPUTER VISION SYSTEM TO READ METER DISPLAYS Danilo Alves de Lima 1, Guilherme Augusto Silva Pereira 2, Flávio Henrique de Vasconcelos 3 Department of Electric Engineering, School of Engineering, Av.

More information

ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer

ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer by: Matt Mazzola 12222670 Abstract The design of a spectrum analyzer on an embedded device is presented. The device achieves minimum

More information

Laboratory Assignment 3. Digital Music Synthesis: Beethoven s Fifth Symphony Using MATLAB

Laboratory Assignment 3. Digital Music Synthesis: Beethoven s Fifth Symphony Using MATLAB Laboratory Assignment 3 Digital Music Synthesis: Beethoven s Fifth Symphony Using MATLAB PURPOSE In this laboratory assignment, you will use MATLAB to synthesize the audio tones that make up a well-known

More information

System Identification

System Identification System Identification Arun K. Tangirala Department of Chemical Engineering IIT Madras July 26, 2013 Module 9 Lecture 2 Arun K. Tangirala System Identification July 26, 2013 16 Contents of Lecture 2 In

More information

Joseph Wakooli. Designing an Analysis Tool for Digital Signal Processing

Joseph Wakooli. Designing an Analysis Tool for Digital Signal Processing Joseph Wakooli Designing an Analysis Tool for Digital Signal Processing Helsinki Metropolia University of Applied Sciences Bachelor of Engineering Information Technology Thesis 30 May 2012 Abstract Author(s)

More information

Smart Traffic Control System Using Image Processing

Smart Traffic Control System Using Image Processing Smart Traffic Control System Using Image Processing Prashant Jadhav 1, Pratiksha Kelkar 2, Kunal Patil 3, Snehal Thorat 4 1234Bachelor of IT, Department of IT, Theem College Of Engineering, Maharashtra,

More information

Virtual Vibration Analyzer

Virtual Vibration Analyzer Virtual Vibration Analyzer Vibration/industrial systems LabVIEW DAQ by Ricardo Jaramillo, Manager, Ricardo Jaramillo y Cía; Daniel Jaramillo, Engineering Assistant, Ricardo Jaramillo y Cía The Challenge:

More information

Pitch-Synchronous Spectrogram: Principles and Applications

Pitch-Synchronous Spectrogram: Principles and Applications Pitch-Synchronous Spectrogram: Principles and Applications C. Julian Chen Department of Applied Physics and Applied Mathematics May 24, 2018 Outline The traditional spectrogram Observations with the electroglottograph

More information

Single Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics

Single Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics Master Thesis Signal Processing Thesis no December 2011 Single Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics Md Zameari Islam GM Sabil Sajjad This thesis is presented

More information

Singer Traits Identification using Deep Neural Network

Singer Traits Identification using Deep Neural Network Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic

More information