Speech and Speaker Recognition for the Command of an Industrial Robot
CLAUDIA MOISA*, HELGA SILAGHI*, ANDREI SILAGHI**
*Dept. of Electric Drives and Automation, University of Oradea, University Street, nr. 1, Oradea, ROMANIA
**Politehnica University of Timisoara, Vasile Parvan Bv., nr. 2, Timisoara, ROMANIA

Abstract - This paper presents a recognition method for commanding an industrial robot using isolated words, together with a speaker recognition method for the case in which a command may be issued only by a particular person. Finally, some simulation results are presented.

Key-Words: Voice processing, Speech recognition, Speaker recognition, Industrial robot

1 Introduction

Automatic speech recognition has been studied for many years, with man-computer dialogue as its goal. Nowadays, communication with the computer can be achieved using a speech recognition and voice synthesis system. Speech recognition is the process of automatically extracting and describing the linguistic information contained in the voice signal, and it can be performed by computer. This linguistic information is called phonetic information, and speech recognition can be extended to the person, yielding speaker recognition.

Speech recognition is performed by comparing the input voice signal with signals stored in a previously built library. Various parameters are extracted from the voice signal, and the comparison is carried out using various mathematical methods. There are two main types of recognition: isolated word recognition and continuous speech recognition. To design an advanced recognition system, one also needs to know whether there are several speakers or the speaker is always the same person. Speaker-independent recognition systems are more complex because they must work adaptively, changing their parameters when the speaker changes.

This paper aims to address, technically and theoretically, an effective example of an application for real-time voice recording, processing and examination.
Thus, it gives a series of theoretical and practical views on how audio signal processing, with a description of possible applications, can be developed using digital acquisition and processing systems.

2 Speech/silence algorithm

The acoustic signal is composed of a sequence of sounds generated by the human vocal tract under the control of the brain. The voice signal is represented by a structure called "wave", which contains the following fields: the length of the analysis window, i.e. of one frame ("lc"), the total number of samples of the sound signal ("n"), and the number of frames ("nc").
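The "wave" bookkeeping described above can be sketched in Python (the paper's implementation is in MATLAB); the field names mirror the paper's structure, and the signal below is a stand-in list of samples.

```python
# Sketch of the paper's "wave" structure: frame length, sample count,
# frame count, plus the actual list of analysis frames.

def make_wave(samples, lc=256):
    """Split a sampled signal into fixed-length analysis frames."""
    n = len(samples)          # total number of samples ("wave.n")
    nc = n // lc              # number of whole frames ("wave.nc")
    frames = [samples[i * lc:(i + 1) * lc] for i in range(nc)]
    return {"file": samples, "lc": lc, "n": n, "nc": nc, "frames": frames}

if __name__ == "__main__":
    signal = [0.0] * 1000     # hypothetical 1000-sample recording
    wave = make_wave(signal)
    print(wave["n"], wave["nc"])   # -> 1000 3 (three whole 256-sample frames)
```

Samples that do not fill a whole frame are discarded, matching the floor() in the paper's frame count.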
In MATLAB, the structure is built as follows:

    wave.file = sound;
    wave.lc = 256;                       % analysis window length / length of one frame
    wave.n = length(wave.file);          % number of samples of the voice signal
    wave.nc = floor(wave.n / wave.lc);   % number of frames

A very important aspect in the analysis, synthesis, recognition and coding of speech is the correct detection of the periods of silence and speech in the voice signal. The characteristics of the voice signal are useful in this regard: voiced segments (vowels) are characterized by high energy and a strong correlation between adjacent samples, while unvoiced segments (consonants) are very similar to noise, with low energy and poor correlation between samples. In real time, the sound/no-sound decision can be taken using two parameters: energy and zero-crossing rate (ZCR). The decision criteria are summarized in the following table:

    Signal                  Energy          ZCR
    Sound (vowel)           High            Low
    No sound (consonant)    Medium/Low      High
    Noise                   Low             High
    Silence                 ~0              ~0

    Table 1. Decision criteria for the speech/silence algorithm

Below is the silence/speech detection algorithm. It detects both the periods of silence and those of speech. It is significantly improved in comparison with ordinary speech recognition algorithms: it recognizes isolated words, and its results proved better than those of the classical algorithm.

Steps:
1. Import the file containing the voice signal.
2. Set i = 1.
3. For i < Frame_no: Silence_speech(i) = 0.
4. For i < Frame_no:
       If En(i) > En_min
           If En(i) > En_max
               Silence_speech(i) = 1
           Else if (NTZ(i) > NTZ_min) & (NTZ(i) < NTZ_max)
               Silence_speech(i) = 1
5. Display Silence_speech.

The algorithm was implemented in MATLAB.

3 Speech recognition

The main difficulty faced by speech recognition programs is that the voices of two people may be somewhat similar, while the voice of the same person may vary in certain situations. This matters especially when recognition is used for industrial robot control, where the robot must act on a voice command given by a user.
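The decision rule above can be sketched in Python (the paper's version is MATLAB). The threshold values are illustrative, loosely based on the limits reported in the Results section.

```python
# Hedged sketch of the energy/ZCR silence-speech decision: a frame is
# speech if its energy is high, or if its energy is moderate and its
# zero-crossing rate falls in a speech-like range.

def frame_energy(frame):
    return sum(x * x for x in frame)

def zero_crossings(frame):
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))

def silence_speech(frames, en_min=1.0, en_max=10.0, ntz_min=10, ntz_max=80):
    """Return a 0/1 flag per frame: 1 = speech, 0 = silence/noise."""
    flags = []
    for frame in frames:
        en, ntz = frame_energy(frame), zero_crossings(frame)
        if en > en_max:                          # loud frame: clearly voiced
            flags.append(1)
        elif en > en_min and ntz_min < ntz < ntz_max:
            flags.append(1)                      # moderate energy, speech-like ZCR
        else:
            flags.append(0)                      # quiet or noise-like
    return flags

if __name__ == "__main__":
    import math
    loud = [5.0 * math.sin(0.2 * i) for i in range(256)]   # high-energy frame
    quiet = [0.0] * 256                                     # silent frame
    print(silence_speech([loud, quiet]))   # -> [1, 0]
```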
3.1 Frequency analysis of the voice signal

Frequency analysis of the voice signal provides a more useful set of parameters for processing than time-domain analysis. Thus, the excitation and the vocal tract can be easily separated in the spectral domain. Different utterances of the same sentence differ in the time domain while being similar in the frequency domain. Also, the human auditory system is more sensitive to the spectral content of the voice signal than to its phase. Therefore, spectral analysis is used to extract the majority of voice signal parameters.

3.2 Linear Prediction Analysis

A common method for analyzing the voice signal is linear predictive analysis (Linear Predictive Coding), also known as LPC analysis or AR (autoregressive) modeling. This is a simple, fast and, at the same time, very efficient method for calculating the parameters of the voice signal. To determine the LPC coefficients, the voice signal is analyzed frame by frame. The LPC coefficients were obtained using the MATLAB function lpc(). The imported voice signal was examined and sampled with a sampling frequency fs = 16 kHz. We used an LPC prediction order equal to 18 and chose a frame length of 256 samples.

The vocal tract is modeled by a digital filter with time-varying coefficients and a gain. The voice signal parameters are:
- the sound/no-sound decision
- the fundamental frequency F0 for voiced segments
- the digital filter gain
- the filter coefficients

Note that the number of LPC coefficients is always one more than the prediction order, in this case 19. Also, the first coefficient is always 1 and corresponds to the current, unmodified sample.
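For readers without MATLAB's lpc(), the computation can be sketched in pure Python using the autocorrelation method and the Levinson-Durbin recursion; the conventions match the text (the returned vector starts with 1 and has order + 1 entries, 19 for an order-18 predictor). This is an illustrative sketch, not the paper's code.

```python
# LPC analysis of one frame: short-time autocorrelation followed by the
# Levinson-Durbin recursion. Sign convention matches MATLAB's lpc():
# the predictor is x_hat(n) = -a[1]*x(n-1) - ... - a[p]*x(n-p).

def lpc(frame, order):
    """Return (coefficients, residual_energy) for one analysis frame."""
    n = len(frame)
    # Autocorrelation r[0..order]
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= (1.0 - k * k)
    return a, err

if __name__ == "__main__":
    # A chaotic map stands in for a broadband voice frame.
    x, frame = 0.3, []
    for _ in range(256):
        x = 3.9 * x * (1.0 - x)
        frame.append(x - 0.5)
    coeffs, _ = lpc(frame, 18)
    print(len(coeffs), coeffs[0])   # -> 19 1.0
```

For a first-order AR signal x(n) = 0.5 x(n-1), the recursion recovers a[1] = -0.5, as expected under the MATLAB sign convention.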
3.3 Training

Recognition of commands is based on a dictionary of words. The creation of the dictionary represents the training operation: determining the specifics of each word, i.e. the way a word is said, and saving it for later recognition. Training is done by repeating the words and updating the dictionary after each pronounced command. The words are saved in the dictionary as follows:
- a line containing the string delimiter "$$$", marking that the next word is a new one
- a line containing the command name
- a line containing an integer N, the number of frames for which feature extraction was performed
- a matrix of N rows and 19 columns (as the LPC prediction order is 18), each row holding the LPC coefficients of one frame of the word/command

It is very important that during training the system is trained in exactly the conditions in which it will be used. It is also recommended to use a sensitive microphone that receives the voice command while reducing noise. To obtain the desired performance during training, the speakers must be fluent and speak with a normal voice, neither too slowly nor too quickly. The system is designed to adapt to the user's voice. Obviously, the pronunciation of different words can be similar, so the system tolerates some small errors and adapts to how the user speaks. However, during training, one should try to make as few mistakes as possible.

Training is implemented as follows: the voice signal for training is recorded, and the features of the recorded sound are then extracted: energy, zero-crossing rate and LPC coefficients. Note that the LPC coefficients are extracted only where the speech/silence algorithm detects speech. The parameters thus obtained are saved in a text file that represents the dictionary.

4 Speaker recognition

Speaker recognition is basically divided into two parts: identification and verification.
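The dictionary layout listed above can be sketched as follows (Python; the paper works in MATLAB). The file path and helper names are illustrative, not from the paper.

```python
# Sketch of the dictionary text format: each entry is a "$$$" delimiter
# line, the command name, the frame count N, then N rows of 19 LPC
# coefficients separated by spaces.

def save_entry(path, name, coeff_matrix):
    """Append one trained command to the dictionary file."""
    with open(path, "a", encoding="ascii") as f:
        f.write("$$$\n")
        f.write(name + "\n")
        f.write(str(len(coeff_matrix)) + "\n")
        for row in coeff_matrix:           # one line of 19 LPC coefficients
            f.write(" ".join("%.6f" % c for c in row) + "\n")

def load_entries(path):
    """Read the dictionary back as {command: [[coeffs...], ...]}."""
    entries = {}
    lines = open(path, encoding="ascii").read().splitlines()
    i = 0
    while i < len(lines):
        assert lines[i] == "$$$"           # each entry starts with the delimiter
        name, n = lines[i + 1], int(lines[i + 2])
        rows = [[float(x) for x in lines[i + 3 + j].split()] for j in range(n)]
        entries[name] = rows
        i += 3 + n
    return entries

if __name__ == "__main__":
    import os, tempfile
    path = os.path.join(tempfile.gettempdir(), "robot_dictionary.txt")
    open(path, "w").close()                # start with an empty dictionary
    save_entry(path, "start", [[1.0] + [0.1] * 18, [1.0] + [0.2] * 18])
    print(sorted(load_entries(path)), len(load_entries(path)["start"]))
```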
Speaker identification is a way to automatically determine who is speaking on the basis of individual information included in the speech. The main goal of this project is to identify the speaker from a list of reference speaker models: the algorithm compares a voice signal from an unknown speaker with a database of known speakers. A system that has previously been trained with a number of speakers can then recognize an unknown speaker. The fundamental process of speaker identification is presented in the figure below. In most applications, the voice is used to confirm the identity of a speaker.

Figure 1. The fundamental process of speaker identification

The diagram in Figure 1 was implemented in MATLAB code.

4.1 The Mel cepstral analysis

In order to determine the correct speaker, we used Mel cepstral analysis. It uses the Mel scale and produces the MFCC (Mel Frequency Cepstral Coefficients). After the power spectrum is obtained using the Discrete Fourier Transform, the signal is mapped onto the Mel frequency scale by passing it through a triangular filter bank. The logarithm is then taken, and finally the MFCC coefficients are obtained by applying the Discrete Cosine Transform to the new spectrum. The justification for this last step is that the coefficients resulting from the power spectrum calculation are strongly correlated among themselves; the Discrete Cosine Transform decorrelates the resulting cepstral coefficients while allowing the switch to the Mel scale.

4.2 The determination of distance in the acoustic environment

Voice signal recognition, which involves comparing models consisting of various parameters, was performed by calculating distances. From a phonetic point of view, spectral changes that lead to different sounds must correspond to large distances, while sounds that are perceptually the same must be
associated with smaller distances. Thus, to recognize a word, the distance between its parameters and those of each word in the dictionary is calculated. The word at the minimum distance is considered to be the recognized word. If the distance is greater than a maximum allowed threshold, the algorithm considers the word unknown. Repeated experiments and statistical processing of the results show that acceptable results are obtained with an appropriately chosen threshold value.

5 Graphical User Interface

A GUI (Graphical User Interface) has been implemented in order to give the user an application that is easy to use. It consists of graphical widgets such as windows, menus, radio buttons, check boxes and graphics windows for displaying results. In addition to the keyboard, the graphical interface uses a pointing device (mouse). The GUI consists of three command-and-control parts and two windows used to display results, as follows:
- The training part, where the user can train the program with the different industrial robot commands that it will have to recognize. The user can also record and listen to the given/trained commands.
- The recognition part performs the speech and speaker recognition of user commands and displays the results in the recognized-words panel.
- The results part is divided into two graphics panels and displays the waveforms and a selected feature parameter. In the upper display panel the user can choose between the graph of the signal waveform and that of the trained voice command. Based on the previously selected option, in the bottom graphics panel one can choose between several speech parameters:
  - zero-crossing rate
  - energy
  - autocorrelation
  - average magnitude difference function
  - speech/silence detection algorithm
  - spectrogram
  - LPC spectrum

The "autocorrelation" and "average magnitude difference function" options also calculate and display the fundamental frequency.
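The minimum-distance decision of Section 4.2 can be sketched as follows (Python; the paper's implementation is MATLAB). The threshold value and the frame alignment (simple truncation to the shorter utterance) are illustrative assumptions, since the paper does not specify them.

```python
# Sketch of distance-based word recognition: the spoken command is
# compared with every dictionary entry; the nearest one wins unless its
# distance exceeds a threshold, in which case "unknown" is reported.

def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def frame_distance(feat_a, feat_b):
    """Mean frame-to-frame distance between two feature matrices,
    truncated to the shorter utterance (an assumed alignment)."""
    n = min(len(feat_a), len(feat_b))
    return sum(euclidean(feat_a[i], feat_b[i]) for i in range(n)) / n

def recognize(features, dictionary, threshold=5.0):
    """Return the nearest dictionary word, or "unknown" if too far."""
    best_word, best_dist = None, float("inf")
    for word, ref in dictionary.items():
        d = frame_distance(features, ref)
        if d < best_dist:
            best_word, best_dist = word, d
    return best_word if best_dist <= threshold else "unknown"

if __name__ == "__main__":
    dictionary = {"start": [[0.0, 0.0]], "stop": [[4.0, 4.0]]}
    print(recognize([[0.1, 0.0]], dictionary))     # -> start
    print(recognize([[40.0, 40.0]], dictionary))   # -> unknown
```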
The program was implemented in the MATLAB programming environment.

Figure 2. The implemented program GUI

6 Results

The first results at which we will stop are those derived from the basic operations of the tools module. We chose the waveform corresponding to the voice command "start".

Figure 3. The waveform corresponding to the utterance of the voice command "start"

On the signal we apply the following processing:
1. Segmentation into frames of 256 samples
2. Applying a rectangular window
3. Removing the continuous (DC) component
4. Voice signal detection

Figure 4. The graph representing the zero-crossing rate (ZCR)
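The processing steps listed above can be sketched as follows (Python; the paper works in MATLAB). A rectangular window leaves the samples unchanged, so it appears only for completeness; step 4 is the speech/silence detection algorithm of Section 2.

```python
# Sketch of the preprocessing chain: segmentation into 256-sample
# frames, rectangular windowing, and removal of the DC component.

def rectangular_window(frame):
    """Step 2: a rectangular window multiplies every sample by 1."""
    return [x * 1.0 for x in frame]

def remove_dc(frame):
    """Step 3: subtract the mean (continuous component) of the frame."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]

def preprocess(samples, lc=256):
    """Steps 1-3: segment, window, and remove DC, frame by frame."""
    frames = [samples[i:i + lc] for i in range(0, len(samples) - lc + 1, lc)]
    return [remove_dc(rectangular_window(f)) for f in frames]

if __name__ == "__main__":
    signal = [1.0 + 0.5 * (-1) ** i for i in range(512)]  # DC offset of 1.0
    frames = preprocess(signal)
    print(len(frames), round(sum(frames[0]) / 256, 6))    # -> 2 0.0
```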
Figure 5. The graph representing the signal energy

In order to obtain the best results from the speech/silence detection algorithm, we determined the minimum and maximum zero-crossing rate and energy with already implemented functions. First, we analyzed the parameters of signals with Gaussian noise with normal distribution. Then we performed the same analysis on signals containing traffic noise, from the street and the bus. Next, we analyzed the parameters of speech signals acquired in the presence of environmental noise. The results are presented in the following tables:

Table 2. Limits of energy and ZCR for the voice commands given to the industrial robot

After also analyzing the noise signal, we extracted the following:
- environmental noise parameters: ZCR = 70, Energy = 1-2
- consonant parameters: ZCR > 100

Using the obtained results, we chose the following statistical limits:
- minimum energy = 1
- maximum energy = 10
- minimum ZCR = 10
- maximum ZCR = 80

Figure 6. The graph representing the result of the speech/silence detection algorithm

The recognition percentage of the speech recognition system is around 90%. Better results were obtained with the speaker recognition algorithm. For the proper functioning of the program it is important that the utterance of the commands is similar in all cases. The algorithm steps are as follows:
1. Applying a rectangular window to the acquired signal
2. Removing the continuous (DC) component
3. Speech localization using the speech/silence detection algorithm
4. Mel-scale cepstral coefficient calculation
5. Calculating the distance between the user's voice command and the dictionary voice commands

The recognition percentage of the system under these conditions is around 95%. The user can also view the voice signal spectrum and the LPC spectrum graphically, as follows:

Figure 7. The graph representing the spectrum of the voice command

Figure 8.
The graph representing the LPC spectrum of the voice command

The user also has the option to display the "autocorrelation" and "average magnitude difference
function", which also calculate and display the fundamental frequency.

Figure 9. The graph representing the "autocorrelation" function of the voice command; it also shows the calculated fundamental frequency value

Figure 10. The graph representing the "average magnitude difference function" of the voice command; it also shows the calculated fundamental frequency value

The results obtained from the experimental verification of the developed program, testing speech recognition with male and female speakers, are presented in two tables.

7 Conclusion

The implemented algorithm is intended to be a tool for controlling industrial robots by voice command. Besides the usefulness of the tools already implemented in the application, it demonstrates reliability and ease of future development: it allows users to easily add their own tools to any of the processing modules. The obtained experimental results demonstrate that the proposed algorithm is indeed functional and can be used in the voice command control of industrial robots. The percentage of correctly recognized commands is high, while the computational resources used (CPU frequency, RAM) are lower compared to other algorithms.
More informationDetection and demodulation of non-cooperative burst signal Feng Yue 1, Wu Guangzhi 1, Tao Min 1
International Conference on Applied Science and Engineering Innovation (ASEI 2015) Detection and demodulation of non-cooperative burst signal Feng Yue 1, Wu Guangzhi 1, Tao Min 1 1 China Satellite Maritime
More informationAudio-Based Video Editing with Two-Channel Microphone
Audio-Based Video Editing with Two-Channel Microphone Tetsuya Takiguchi Organization of Advanced Science and Technology Kobe University, Japan takigu@kobe-u.ac.jp Yasuo Ariki Organization of Advanced Science
More informationComparison Parameters and Speaker Similarity Coincidence Criteria:
Comparison Parameters and Speaker Similarity Coincidence Criteria: The Easy Voice system uses two interrelating parameters of comparison (first and second error types). False Rejection, FR is a probability
More informationINSPECTION EQUIPMENT FOR SUPPLIED CASES WITH EMPTY BOTTLES
INSPECTION EQUIPMENT FOR SUPPLIED CASES WITH EMPTY BOTTLES Georgy Slavchev Mihov 1, Stanimir Damyanov Mollov 1, Ratcho Marinov Ivanov 1, Stoyan Nikolov Jilov 2 1 Faculty of Electronic Engineering and Technologies,
More informationTopic 4. Single Pitch Detection
Topic 4 Single Pitch Detection What is pitch? A perceptual attribute, so subjective Only defined for (quasi) harmonic sounds Harmonic sounds are periodic, and the period is 1/F0. Can be reliably matched
More informationDigital music synthesis using DSP
Digital music synthesis using DSP Rahul Bhat (124074002), Sandeep Bhagwat (123074011), Gaurang Naik (123079009), Shrikant Venkataramani (123079042) DSP Application Assignment, Group No. 4 Department of
More informationFeatures for Audio and Music Classification
Features for Audio and Music Classification Martin F. McKinney and Jeroen Breebaart Auditory and Multisensory Perception, Digital Signal Processing Group Philips Research Laboratories Eindhoven, The Netherlands
More informationDrum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods
Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Kazuyoshi Yoshii, Masataka Goto and Hiroshi G. Okuno Department of Intelligence Science and Technology National
More informationChapter 1. Introduction to Digital Signal Processing
Chapter 1 Introduction to Digital Signal Processing 1. Introduction Signal processing is a discipline concerned with the acquisition, representation, manipulation, and transformation of signals required
More informationTempo and Beat Analysis
Advanced Course Computer Science Music Processing Summer Term 2010 Meinard Müller, Peter Grosche Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Tempo and Beat Analysis Musical Properties:
More informationCS229 Project Report Polyphonic Piano Transcription
CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project
More informationONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION
ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION Travis M. Doll Ray V. Migneco Youngmoo E. Kim Drexel University, Electrical & Computer Engineering {tmd47,rm443,ykim}@drexel.edu
More informationA METHOD OF MORPHING SPECTRAL ENVELOPES OF THE SINGING VOICE FOR USE WITH BACKING VOCALS
A METHOD OF MORPHING SPECTRAL ENVELOPES OF THE SINGING VOICE FOR USE WITH BACKING VOCALS Matthew Roddy Dept. of Computer Science and Information Systems, University of Limerick, Ireland Jacqueline Walker
More informationMIE 402: WORKSHOP ON DATA ACQUISITION AND SIGNAL PROCESSING Spring 2003
MIE 402: WORKSHOP ON DATA ACQUISITION AND SIGNAL PROCESSING Spring 2003 OBJECTIVE To become familiar with state-of-the-art digital data acquisition hardware and software. To explore common data acquisition
More informationRemoval of Decaying DC Component in Current Signal Using a ovel Estimation Algorithm
Removal of Decaying DC Component in Current Signal Using a ovel Estimation Algorithm Majid Aghasi*, and Alireza Jalilian** *Department of Electrical Engineering, Iran University of Science and Technology,
More informationThe Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC
INTRODUCTION The Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC The Time Series Forecasting System (TSFS) is a component of SAS/ETS that provides a menu-based
More informationSwept-tuned spectrum analyzer. Gianfranco Miele, Ph.D
Swept-tuned spectrum analyzer Gianfranco Miele, Ph.D www.eng.docente.unicas.it/gianfranco_miele g.miele@unicas.it Video section Up until the mid-1970s, spectrum analyzers were purely analog. The displayed
More informationMAutoPitch. Presets button. Left arrow button. Right arrow button. Randomize button. Save button. Panic button. Settings button
MAutoPitch Presets button Presets button shows a window with all available presets. A preset can be loaded from the preset window by double-clicking on it, using the arrow buttons or by using a combination
More informationAPP USE USER MANUAL 2017 VERSION BASED ON WAVE TRACKING TECHNIQUE
APP USE USER MANUAL 2017 VERSION BASED ON WAVE TRACKING TECHNIQUE All rights reserved All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
More informationIntroduction to QScan
Introduction to QScan Shourov K. Chatterji SciMon Camp LIGO Livingston Observatory 2006 August 18 QScan web page Much of this talk is taken from the QScan web page http://www.ligo.caltech.edu/~shourov/q/qscan/
More informationPhone-based Plosive Detection
Phone-based Plosive Detection 1 Andreas Madsack, Grzegorz Dogil, Stefan Uhlich, Yugu Zeng and Bin Yang Abstract We compare two segmentation approaches to plosive detection: One aproach is using a uniform
More informationSIDRA INTERSECTION 8.0 UPDATE HISTORY
Akcelik & Associates Pty Ltd PO Box 1075G, Greythorn, Vic 3104 AUSTRALIA ABN 79 088 889 687 For all technical support, sales support and general enquiries: support.sidrasolutions.com SIDRA INTERSECTION
More informationHidden Markov Model based dance recognition
Hidden Markov Model based dance recognition Dragutin Hrenek, Nenad Mikša, Robert Perica, Pavle Prentašić and Boris Trubić University of Zagreb, Faculty of Electrical Engineering and Computing Unska 3,
More informationAcoustic Echo Canceling: Echo Equality Index
Acoustic Echo Canceling: Echo Equality Index Mengran Du, University of Maryalnd Dr. Bogdan Kosanovic, Texas Instruments Industry Sponsored Projects In Research and Engineering (INSPIRE) Maryland Engineering
More informationLab experience 1: Introduction to LabView
Lab experience 1: Introduction to LabView LabView is software for the real-time acquisition, processing and visualization of measured data. A LabView program is called a Virtual Instrument (VI) because
More information19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;
More informationAutomatic Classification of Instrumental Music & Human Voice Using Formant Analysis
Automatic Classification of Instrumental Music & Human Voice Using Formant Analysis I Diksha Raina, II Sangita Chakraborty, III M.R Velankar I,II Dept. of Information Technology, Cummins College of Engineering,
More informationThe following exercises illustrate the execution of collaborative simulations in J-DSP. The exercises namely a
Exercises: The following exercises illustrate the execution of collaborative simulations in J-DSP. The exercises namely a Pole-zero cancellation simulation and a Peak-picking analysis and synthesis simulation
More informationUpgrading E-learning of basic measurement algorithms based on DSP and MATLAB Web Server. Milos Sedlacek 1, Ondrej Tomiska 2
Upgrading E-learning of basic measurement algorithms based on DSP and MATLAB Web Server Milos Sedlacek 1, Ondrej Tomiska 2 1 Czech Technical University in Prague, Faculty of Electrical Engineeiring, Technicka
More informationSinging voice synthesis based on deep neural networks
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
More informationGetting Started. Connect green audio output of SpikerBox/SpikerShield using green cable to your headphones input on iphone/ipad.
Getting Started First thing you should do is to connect your iphone or ipad to SpikerBox with a green smartphone cable. Green cable comes with designators on each end of the cable ( Smartphone and SpikerBox
More informationA COMPUTER VISION SYSTEM TO READ METER DISPLAYS
A COMPUTER VISION SYSTEM TO READ METER DISPLAYS Danilo Alves de Lima 1, Guilherme Augusto Silva Pereira 2, Flávio Henrique de Vasconcelos 3 Department of Electric Engineering, School of Engineering, Av.
More informationECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer
ECE 4220 Real Time Embedded Systems Final Project Spectrum Analyzer by: Matt Mazzola 12222670 Abstract The design of a spectrum analyzer on an embedded device is presented. The device achieves minimum
More informationLaboratory Assignment 3. Digital Music Synthesis: Beethoven s Fifth Symphony Using MATLAB
Laboratory Assignment 3 Digital Music Synthesis: Beethoven s Fifth Symphony Using MATLAB PURPOSE In this laboratory assignment, you will use MATLAB to synthesize the audio tones that make up a well-known
More informationSystem Identification
System Identification Arun K. Tangirala Department of Chemical Engineering IIT Madras July 26, 2013 Module 9 Lecture 2 Arun K. Tangirala System Identification July 26, 2013 16 Contents of Lecture 2 In
More informationJoseph Wakooli. Designing an Analysis Tool for Digital Signal Processing
Joseph Wakooli Designing an Analysis Tool for Digital Signal Processing Helsinki Metropolia University of Applied Sciences Bachelor of Engineering Information Technology Thesis 30 May 2012 Abstract Author(s)
More informationSmart Traffic Control System Using Image Processing
Smart Traffic Control System Using Image Processing Prashant Jadhav 1, Pratiksha Kelkar 2, Kunal Patil 3, Snehal Thorat 4 1234Bachelor of IT, Department of IT, Theem College Of Engineering, Maharashtra,
More informationVirtual Vibration Analyzer
Virtual Vibration Analyzer Vibration/industrial systems LabVIEW DAQ by Ricardo Jaramillo, Manager, Ricardo Jaramillo y Cía; Daniel Jaramillo, Engineering Assistant, Ricardo Jaramillo y Cía The Challenge:
More informationPitch-Synchronous Spectrogram: Principles and Applications
Pitch-Synchronous Spectrogram: Principles and Applications C. Julian Chen Department of Applied Physics and Applied Mathematics May 24, 2018 Outline The traditional spectrogram Observations with the electroglottograph
More informationSingle Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics
Master Thesis Signal Processing Thesis no December 2011 Single Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics Md Zameari Islam GM Sabil Sajjad This thesis is presented
More informationSinger Traits Identification using Deep Neural Network
Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic
More information