Blind Identification of Source Mobile Devices Using VoIP Calls


Mehdi Jahanirad, Ainuddin Wahid Abdul Wahab, Nor Badrul Anuar, Mohd Yamani Idna Idris, and Mohamad Nizam Ayub
Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Security Research Group (SECReg)
mehdijahanirad@siswa.um.edu.my

Abstract

Audio sources such as speakers and recording environments produce signal variations that interfere with the variations generated by the communication devices themselves. Despite these convolutions, the signal variations produced by different mobile devices leave intrinsic fingerprints on recorded calls, thus allowing the models and brands of the engaged mobile devices to be tracked. This study investigates the use of recorded Voice over Internet Protocol (VoIP) calls in the blind identification of source mobile devices. The proposed scheme employs a combination of entropy and mel-frequency cepstrum coefficients to extract the intrinsic features of mobile devices and analyzes these features with a multi-class support vector machine classifier. The experimental results lead to an accurate identification of 10 source mobile devices with an average accuracy of 99.72%.

Index Terms: Pattern recognition, mel-frequency cepstrum coefficients, entropy, device-based detection technique

I. INTRODUCTION

Audio forensics has attracted increasing attention in recent years because of its application in situations that require trust in the authenticity and integrity of audio signals [1]. An example of such an application is the forensic acquisition, analysis, and evaluation of admissible audio recordings as evidence in court cases [2]. Current audio authenticity approaches are categorized according to the artifacts extracted from the signal itself: (a) environment-based techniques, in which the frequency spectra are shaped by the recording environment; (b) device-based techniques, in which the frequency spectra are produced by the recording device; and (c) ENF-based techniques, in which the frequency spectra are generated by the power source of the recording device [3]. Although advanced research has been conducted on ENF-based techniques [4], [5] and environment-based techniques [6], [7], few studies have explored the application of device-based techniques in real-time forensics [1], [8].

Device-based techniques build on blind source camera identification in image forensics [9], [10], [11]. However, the adaptation of this approach to audio forensics is challenging because audio evidence is produced by a combination of audio sources such as speakers and environments. The first practical evaluation of source microphone authentication was developed by Kraetzer et al. [8] through statistical pattern recognition techniques. Their method utilizes features from the detection of hidden communication and identifies the origin of audio streams. Buchholz et al. [12] focused on microphone classification by using Fourier coefficient histograms as audio features. Their method eliminates speech convolution by computing the coefficients from near-silent frames. Although microphone forensics allows the identification of source recording devices, it cannot provide sufficient evidence to identify source communication devices. Garcia-Romero and Espy-Wilson [13] proposed an automatic acquisition device identification method using speech recordings from the Lincoln-Labs Handset Database, covering both microphone and landline telephone handsets.
This method eliminates the effects of signal variations caused by speech signals through a frequency response characterization of the device that is contextualized by the speech content. The method applies Gaussian mixture models to 23 mel-frequency cepstrum coefficients (MFCCs), 38 linear-frequency cepstrum coefficients, and their combination with the first-order derivative (delta) of both feature sets to determine the Gaussian supervector (GSV) associated with each device. A linear support vector machine (SVM) classifier builds the training model by using this vector as an intrinsic fingerprint of each individual acquisition device. Hanilçi et al. [14] proposed a method based on advanced speaker recognition research [15], [16], [17] to extract MFCCs as features from recorded speech. This method collects recorded speech samples from different mobile devices to identify their source brands and models. However, Hanilçi et al. [14] treated mobile devices as ordinary tape recorders to avoid the complications of transmitting and receiving signals. Building on the acquisition device identification methods in [13], [14], Panagakis and Kotropoulos [18] proposed a telephone handset identification method that uses random spectral features (RSFs) and labeled spectral features (LSFs). This method extracts the RSFs and LSFs from the mean spectrogram of speech signals. It also uses sparse representation-based classification (SRC), as well as neural network (NN) and SVM classifiers, to assess its performance in classifying a dataset obtained from eight telephone handsets.

Previous device-based techniques focused on authentication based on recording devices. In the present work, we propose a new approach to the identification of source mobile devices that are engaged in VoIP calls. The mobile devices used in this study are equipped with built-in circuits and electrical components, and the digital signal processing in these devices produces signal variations. Thus, calls recorded using these devices contain intrinsic artifacts that are captured using a combination of entropy and MFCC features. Furthermore, this method uses the near-silent segments of signals for feature extraction to eliminate the interference resulting from the variation in speakers. Finally, the combined feature set is analyzed with a multi-class SVM classifier to identify 10 source mobile devices.

II. ALGORITHM OVERVIEW

The proposed algorithm includes two main stages: feature extraction and feature analysis. Feature extraction determines meaningful information from a collection of call recordings to distinguish the mobile devices from one another. Feature analysis utilizes these features in building the model for each mobile device and tests the model to evaluate its performance in detecting all mobile devices from the same class. The class represents the brand and model of the mobile devices.

A. Feature extraction

The combination of entropy and MFCC has been used in speech recognition to improve its performance and robustness in the presence of additive noise [19], [20], [21]. The present study explores the entropy of the mel-frequency cepstrum spectrum as a feature for the blind identification of source mobile devices. Fig. 1 illustrates the computation of MFCCs.

Fig. 1. Computation of MFCCs.

The feature extraction algorithm generates blocks by splitting the audio frames and then extracts the MFCC features from all blocks generated from the sample data. The algorithm splits each block into approximately 23 ms frames, with each frame windowed with the Hamming window in the time domain. Subsequently, it determines the FFT magnitude spectrum from the windowed frame and filters the spectrum using 27 triangular-shaped filters in the mel domain. The mel domain is linear for frequencies below 1000 Hz and logarithmic for frequencies above 1000 Hz. Equation (1) computes the mel-domain central frequencies using logarithms of base 10:

$$f_{\text{mel}} = 1000 \cdot \frac{\log_{10}(1 + f/1000)}{\log_{10} 2} \qquad (1)$$

The logarithm of the filterbank outputs is used to determine the spectral envelope in decibels, and the discrete cosine transform of these envelopes determines the MFCCs. During this process, 12 coefficients are computed by the MFCC algorithm, so each row in the mel cepstrum output represents the 12 coefficients computed for one frame (Fig. 2).

Fig. 2. Entropy-MFCC feature extraction.

At this point, the feature extraction algorithm uses entropy to capture the peakiness of the distribution over all frames in the mel cepstrum output $M_{i,j}$, where $M$ is an array of size $\{N \times 12\}$, $N$ is the total number of frames, $i = \{1, 2, 3, \dots, N\}$, and $j = \{1, 2, 3, \dots, 12\}$. The algorithm computes the entropy for the 12 coefficients in two stages. First, it normalizes the spectrum into a probability mass function (PMF) through (2):

$$x_i = \frac{X_i}{\sum_{i=1}^{N} X_i} \quad \text{for } i = 1 \text{ to } N, \qquad (2)$$

where $X_i$ is the energy of the $i$th frequency component and $x_i$ is the PMF of the signal. Second, it computes the entropy $H(x)$ as

$$H(x) = -\sum_{x_i \in X} x_i \log_2 x_i. \qquad (3)$$

Moreover, the algorithm generates a total of 12 entropy-MFCC features through MATLAB functions.
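The following Python sketch mirrors this pipeline end to end. It is an illustration rather than the authors' MATLAB implementation; the FFT size, the placement of the triangular filter edges, the non-overlapping framing, and the use of squared coefficients as the "energy" $X_i$ in (2) are assumptions not fixed by the paper.

import numpy as np
from scipy.fftpack import dct

def mel_filterbank(n_filters, n_fft, fs):
    # Triangular filters spaced on the paper's mel scale, Eq. (1).
    to_mel = lambda f: 1000.0 * np.log10(1.0 + f / 1000.0) / np.log10(2.0)
    from_mel = lambda m: 1000.0 * (2.0 ** (m / 1000.0) - 1.0)
    edges = from_mel(np.linspace(0.0, to_mel(fs / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for k in range(1, n_filters + 1):
        l, c, r = bins[k - 1], bins[k], bins[k + 1]
        fbank[k - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fbank[k - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fbank

def entropy_mfcc(block, fs, n_coeff=12, n_filters=27, frame_ms=23):
    # 12 entropy-MFCC features for one block (Sec. II-A).
    flen = int(fs * frame_ms / 1000)
    n_fft = int(2 ** np.ceil(np.log2(flen)))
    window = np.hamming(flen)
    fbank = mel_filterbank(n_filters, n_fft, fs)
    mfccs = []
    for start in range(0, len(block) - flen + 1, flen):  # assumption: no frame overlap
        frame = block[start:start + flen] * window       # Hamming-windowed frame
        spec = np.abs(np.fft.rfft(frame, n_fft))         # FFT magnitude spectrum
        env = 20.0 * np.log10(fbank @ spec + 1e-12)      # spectral envelope in dB
        mfccs.append(dct(env, norm='ortho')[:n_coeff])   # DCT -> first 12 MFCCs
    M = np.asarray(mfccs)                                # mel cepstrum output, {N x 12}
    X = M ** 2                                           # assumption: "energy" = squared value
    pmf = X / X.sum(axis=0)                              # Eq. (2): PMF per coefficient
    return -(pmf * np.log2(pmf + 1e-12)).sum(axis=0)     # Eq. (3): entropy per coefficient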
B. Feature analysis

Feature analysis investigates the extracted features by using classification techniques. This study applies SVM in classification because of its satisfactory performance in pattern recognition approaches [22]. SVMs are initially designed for one-to-one classification. The multi-class SVM classifier generates N(N-1)/2 binary SVM classifiers, where each classifier is trained to separate one pair of classes and N represents the number of classes. The binary classifiers are combined through the classical voting scheme, in which the class with the maximal number of votes is selected. The classification technique involves two stages: training and testing.
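A sketch of this one-vs-one voting scheme in Python with scikit-learn follows; the RBF kernel is an assumption, since the paper does not state which kernel is used.

from itertools import combinations
import numpy as np
from sklearn.svm import SVC

class OneVsOneSVM:
    # N(N-1)/2 binary SVMs combined by majority voting (Sec. II-B).
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = [SVC(kernel='rbf').fit(X[(y == a) | (y == b)],
                                              y[(y == a) | (y == b)])
                        for a, b in combinations(self.classes_, 2)]
        return self

    def predict(self, X):
        votes = np.zeros((len(X), len(self.classes_)), dtype=int)
        index = {c: i for i, c in enumerate(self.classes_)}
        for clf in self.models_:
            for row, label in enumerate(clf.predict(X)):
                votes[row, index[label]] += 1            # classical voting scheme
        return self.classes_[votes.argmax(axis=1)]       # class with most votes

In practice, scikit-learn's SVC applies the same one-vs-one decomposition internally, so SVC().fit(X_train, y_train) behaves equivalently; the class above only makes the voting explicit.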

Feature analysis organizes the features into 10 classes with respect to the 10 mobile devices in Table I. For each class, the extracted features produce a data subset represented by the same label. The method randomly selects 70% of the data subset for training and uses the remaining 30% for testing. The classifier builds the training model using the training data subset. Then, the classifier predicts the labels corresponding to the testing data subset based on the training model and without considering its true labels. For evaluation, the classifier compares the actual classes against the predicted labels to determine the number of correct matches. The method computes the identification accuracy as the fraction of correct matches over the total number of testing data. The procedure is repeated 10 times; in each repetition, different training and testing data subsets are randomly selected, and the average identification accuracy is then determined for each mobile device.

III. EXPERIMENTAL SETUP

The proposed setup involves the collection of call recordings, as shown in Fig. 3. We record a total of 25 Skype calls for each device in a truly silent environment. The devices are listed in Table I. The silent session eliminates the possible convolutions caused by speech signals generated by different speakers. The MP3 Skype call recorder v.3.1 freeware application [23] records the signals in .mp3 format. The method converts the recorded files to .wav format and then digitizes them into sample data. It then enhances the sample data to remove the noise generated by environmental reverberations. The histograms in Fig. 4 indicate the distinctiveness of the spectra of the recording signals obtained from mobile devices of the same model; the clean signals are more distinct than the noisy signals. The color indicates the noise level in decibels with respect to the color bar on the right side of each histogram. Evidently, the colors vary because high-level noise is reduced by the enhancement process. The enhancement process uses g = 1 as the exponent of the subtraction domain and e = 1 as the gain exponent for magnitude-domain spectral subtraction [24].

TABLE I. MOBILE DEVICES, MODELS, AND CLASS NAMES USED IN THE EXPERIMENTS

Mobile Device        Model      Operating System   Class Name
Galaxy Note 10.1-A   GT-N8000   Android            GNA
Galaxy Note 10.1-B   GT-N8000   Android            GNB
Galaxy Note          GT-N7000   Android            GN
Galaxy Note II-A     GT-N7100   Android            GNIIA
Galaxy Note II-B     GT-N7100   Android            GNIIB
Galaxy Tab 10.1      GT-P7500   Android 3.1        GT
Apple iPad           MC775ZP    Apple iOS          ipadA
Apple iPad New       MD366ZP    Apple iOS          ipadB
Asus Nexus 7         -          Android            Asus
HTC Sensation XE     -          Android            HTC

The proposed algorithm segments the clean signals into overlapping frames with a length of 40 samples. The shortened frame signals consist of an array of size {N_f × n}, where N_f is the number of frames and n is the frame length. The proposed frame-shortening method segments each recorded signal with a length of 8 s into blocks of approximately 200 ms. As a result, we generate a total of 1000 blocks from the recorded calls of each mobile device. The 12 entropy-MFCC features are computed from the generated blocks to obtain a data subset of length 1000 for each mobile device. The method randomly selects 700 blocks for training and uses the remaining 300 blocks for testing. We thus obtain 7000 training and 3000 testing data points from the 10 mobile devices. The experiment is repeated 10 times, and the average accuracy is computed.

Fig. 3. Proposed setup for recording conversations.
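A minimal sketch of this evaluation protocol, assuming scikit-learn and a stratified 70/30 split (which reproduces the 700/300 blocks per device):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix

def evaluate(features, labels, repetitions=10):
    # features: (10 devices x 1000 blocks, 12) entropy-MFCC matrix;
    # labels: class names from Table I, one per block.
    accuracies, matrices = [], []
    for rep in range(repetitions):
        X_tr, X_te, y_tr, y_te = train_test_split(
            features, labels, test_size=0.3, stratify=labels, random_state=rep)
        predicted = SVC(kernel='rbf').fit(X_tr, y_tr).predict(X_te)
        accuracies.append(accuracy_score(y_te, predicted))
        matrices.append(confusion_matrix(y_te, predicted, normalize='true'))
    # Average accuracy and confusion matrix over the 10 random splits.
    return np.mean(accuracies), np.mean(matrices, axis=0)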
IV. RESULTS

Table II shows the average confusion matrix generated by running 10 experiments using the 10-class SVM classifier. The diagonal values of the matrix represent the respective classification accuracies of the 10 mobile devices, whereas the off-diagonal values indicate the misclassifications among the mobile devices. A high average classification accuracy of 99.72% is achieved over all mobile devices, and the percentage of misclassification among the mobile devices is negligible (less than 0.27%). The mobile devices of the same model (Galaxy Note 10.1-A, B and Galaxy Note II-A, B) have high average accuracy rates of 99.74% and 99.76%, respectively.

In an alternative approach, Fig. 5 visualizes the classification results using the Euclidean distance similarity method adopted from [25]. This method determines the similarity distances between the feature values with an N × N Euclidean distance matrix and then reduces its dimension to 2 to determine the X and Y components (Fig. 5). Each color represents the class label of the dataset associated with one mobile device. The unfilled markers represent data instances from the training data subset, and the filled markers represent data instances from the testing data subset. As shown in Fig. 5, the Euclidean distance method clusters both the training and testing data subsets into 10 groups. This observation confirms the results obtained by the 10-class SVM classifier. We can therefore infer that the proposed entropy-MFCC features are effective in the blind identification of source mobile devices using recorded VoIP calls.
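Reference [25] produces this kind of 2-D view with multidimensional scaling; the sketch below reproduces the idea with metric MDS on a precomputed Euclidean distance matrix, though the exact dimensionality reduction used for Fig. 5 is an assumption:

import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

def project_2d(features):
    # N x N Euclidean distance matrix between feature vectors ...
    D = squareform(pdist(features, metric='euclidean'))
    # ... reduced to 2 dimensions to obtain the X and Y components.
    mds = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
    return mds.fit_transform(D)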

Fig. 4. Histogram comparison of the recording signals from mobile devices of the same model (clean vs. noisy signal): (a) Galaxy Note 10.1-A, (b) Galaxy Note 10.1-B, (c) Galaxy Note II-A, (d) Galaxy Note II-B.

TABLE II. CONFUSION MATRIX FOR IDENTIFYING SOURCE MOBILE DEVICES BASED ON CALL RECORDINGS (TOTAL AVERAGE ACCURACY RATE: 99.72%)
Rows give the actual classes and columns the predicted classes (GNA, GNB, GN, GNIIA, GNIIB, GT, ipadA, ipadB, Asus, HTC). Note: cells marked with an asterisk indicate a value of less than 0.1%.

Fig. 5. Clustering of training (unfilled markers) and testing (filled markers) data subsets into 10 groups using the Euclidean distance method.

V. CONCLUSION

In this work, we presented an approach to the identification of source mobile devices using recorded VoIP calls. We adopted MFCC and entropy features from speech recognition studies to develop a framework for identifying the distinguishing patterns of different mobile devices. Given the use of a silent Skype session in the investigation, the difference between the samples is caused only by the different mobile devices. An average accuracy of 99.72% was achieved for all 10 devices. Most notably, this study is the first to investigate the distinguishing features of source mobile devices using VoIP calls. The results suggest that the proposed approach should next be tested on further types of mobile devices, using conversations recorded during communication via any type of service provider, such as cellular and PSTN networks.

ACKNOWLEDGMENT

We would like to thank the UM/MoHE High Impact Research Grant Allocation (UM.C/HIR/MOHE/FCSIT/17) for funding this research, and all members of the Security Research Group (SECReg) of the Department of Computer System and Technology of the University of Malaya for sharing their knowledge and experience. They led us through many helpful discussions and have been a constant source of motivation, guidance, encouragement, and trust.

REFERENCES

[1] C. Kraetzer, K. Qian, and J. Dittmann, "Extending a context model for microphone forensics," in Proc. SPIE Conf. on Media Watermarking, Security, and Forensics, Burlingame, CA, USA, 2012.
[2] R. Maher, "Audio forensic examination," IEEE Signal Process. Mag., vol. 26, no. 2, Mar. 2009.
[3] S. Gupta, S. Cho, and C.-C. Kuo, "Current developments and future trends in audio authentication," IEEE MultiMedia, vol. 19, no. 1, Jan. 2012.
[4] A. J. Cooper, "Further considerations for the analysis of ENF data for forensic audio and video applications," International Journal of Speech, Language and the Law, vol. 18, no. 1, 2011.
[5] O. Ojowu, J. Karlsson, and Y. Liu, "ENF extraction from digital recordings using adaptive techniques and frequency tracking," IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, Aug. 2012.
[6] A. Rabaoui, M. Davy, S. Rossignol, and N. Ellouze, "Using one-class SVMs and wavelets for audio surveillance," IEEE Trans. Inf. Forensics Security, vol. 3, no. 4, Dec. 2008.
[7] G. Muhammad and K. Alghathbar, "Environment recognition for digital audio forensics using MPEG-7 and mel cepstral features," Journal of Electrical Engineering, vol. 62, no. 4, Aug. 2011.
[8] C. Kraetzer, A. Oermann, J. Dittmann, and A. Lang, "Digital audio forensics: a first practical evaluation on microphone and environment classification," in Proc. Workshop on Multimedia & Security, Dallas, TX, USA, 2007.
[9] M. Kharrazi, H. Sencar, and N. Memon, "Blind source camera identification," in Proc. International Conference on Image Processing (ICIP'04), vol. 1, Oct. 2004.
[10] O. Celiktutan, B. Sankur, and I. Avcibas, "Blind identification of source cell-phone model," IEEE Trans. Inf. Forensics Security, vol. 3, no. 3, Sep. 2008.
[11] A. Swaminathan, M. Wu, and K. Liu, "Nonintrusive component forensics of visual sensors using output images," IEEE Trans. Inf. Forensics Security, vol. 2, no. 1, Mar. 2007.
[12] R. Buchholz, C. Kraetzer, and J. Dittmann, "Microphone classification using Fourier coefficients," in Information Hiding, LNCS vol. 5806, Springer, 2009.
[13] D. Garcia-Romero and C. Y. Espy-Wilson, "Automatic acquisition device identification from speech recordings," in Proc. 2010 IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Dallas, TX, USA, 2010.
[14] C. Hanilçi, F. Ertaş, T. Ertaş, and Ö. Eskidere, "Recognition of brand and models of cell-phones from recorded speech signals," IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, Apr. 2012.
[15] F. Bimbot et al., "A tutorial on text-independent speaker verification," EURASIP Journal on Advances in Signal Processing, vol. 2004, no. 4, 2004.
[16] W. Campbell, "Generalized linear discriminant sequence kernels for speaker recognition," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2002.
[17] W. M. Campbell and K. T. Assaleh, "Speaker recognition with polynomial classifiers," IEEE Trans. Speech Audio Process., vol. 10, no. 4, May 2002.
[18] Y. Panagakis and C. Kotropoulos, "Telephone handset identification by feature selection and sparse representations," in Proc. 2012 IEEE Int. Workshop on Information Forensics and Security, Tenerife, Spain, 2012.
[19] H. Misra, S. Ikbal, H. Bourlard, and H. Hermansky, "Spectral entropy based feature for robust ASR," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04), vol. 1, 2004.
[20] H. Yeganeh, S. Ahadi, S. Mirrezaie, and A. Ziaei, "Weighting of mel sub-bands based on SNR/entropy for robust ASR," in Proc. IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2008), 2008.
[21] Y. H. Lee and H. K. Kim, "Entropy coding of compressed feature parameters for distributed speech recognition," Speech Communication, vol. 52, no. 5, 2010.
[22] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques. Burlington, MA, USA: Elsevier, 2011.
[23] MP3 Skype Recorder v.3.1. [Online].
[24] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic noise," in Proc. IEEE ICASSP, 1979.
[25] T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications. London, UK: O'Reilly Media, Inc., 2007.
