Blind Identification of Source Mobile Devices Using VoIP Calls

Mehdi Jahanirad, Ainuddin Wahid Abdul Wahab, Nor Badrul Anuar, Mohd Yamani Idna Idris, and Mohamad Nizam Ayub
Security Research Group (SECReg), Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
mehdijahanirad@siswa.um.edu.my

Abstract: Audio sources such as speakers and recording environments produce signal variations that interfere with the variations introduced by the communication devices themselves. Despite these convolutions, the signal variations produced by different mobile devices leave intrinsic fingerprints on recorded calls, allowing the brands and models of the engaged mobile devices to be tracked. This study investigates the use of recorded Voice over Internet Protocol (VoIP) calls for the blind identification of source mobile devices. The proposed scheme employs a combination of entropy and mel-frequency cepstrum coefficients to extract the intrinsic features of mobile devices and analyzes these features with a multi-class support vector machine classifier. The experimental results show accurate identification of 10 source mobile devices with an average accuracy of 99.72%.

Index Terms: Pattern recognition, mel-frequency cepstrum coefficients, entropy, device-based detection technique

I. INTRODUCTION

Audio forensics has attracted increasing attention in recent years because of its application in situations that require trust in the authenticity and integrity of audio signals [1]. An example of such an application is the forensic acquisition, analysis, and evaluation of admissible audio recordings as evidence in court cases [2]. Current audio authenticity approaches are categorized according to the artifacts extracted from the signal itself: (a) environment-based techniques, in which the frequency spectra are shaped by the recording environment; (b) device-based techniques, in which the frequency spectra are produced by the recording device; and (c) ENF-based techniques, in which the frequency spectra are generated by the power source of the recording device [3]. Although advanced research has been conducted on ENF-based techniques [4], [5] and environment-based techniques [6], [7], few studies have explored the application of device-based techniques in real-time forensics [1], [8].

Device-based techniques build on blind source camera identification in image forensics [9], [10], [11]. However, the adaptation of this approach to audio forensics is challenging because audio evidence is produced by a combination of audio sources, such as speakers and environments. The first practical evaluation of source microphone authentication was developed by Kraetzer et al. [8] through statistical pattern recognition techniques; their method utilizes features from the detection of hidden communication to identify the origin of audio streams. Buchholz et al. [12] focused on microphone classification by using Fourier coefficient histograms as audio features; their method eliminates speech convolution by computing the coefficients from near-silent frames. Although microphone forensics allows the identification of source recording devices, it cannot provide sufficient evidence to identify source communication devices. Garcia-Romero and Espy-Wilson [13] proposed an automatic acquisition device identification method using speech recordings from the Lincoln-Labs Handset Database, covering both microphones and landline telephone handsets.
Their method eliminates the effects of signal variations caused by speech signals through a frequency response characterization of the device contextualized by the speech content. The method fits Gaussian mixture models to 23 mel-frequency cepstrum coefficients (MFCCs), 38 linear-frequency cepstrum coefficients, and the combination of both feature sets with their first-order derivatives (deltas) to determine the Gaussian supervector (GSV) associated with each device. A linear support vector machine (SVM) classifier builds the training model by using this vector as an intrinsic fingerprint of each acquisition device. Hanilçi et al. [14] proposed a method based on advanced speaker recognition research [15], [16], [17] that extracts MFCCs as features from recorded speech. This method collects recorded speech samples from different mobile devices to identify their source brands and models. However, Hanilçi et al. [14] treated the mobile devices as ordinary tape recorders, thereby avoiding the complications of signal transmission and reception. Building on the acquisition device identification methods in [13], [14], Panagakis and Kotropoulos [18] proposed a telephone handset identification method that uses random spectral features (RSFs) and labeled spectral features (LSFs), both extracted from the mean spectrogram of the speech signals. The method uses sparse representation-based classification (SRC), as well as neural network (NN) and SVM classifiers, to assess its performance in classifying a dataset obtained from eight telephone handsets.
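As a concrete illustration of the GSV idea in [13], the sketch below adapts a small Gaussian mixture model to the per-frame cepstral features of each recording and stacks the component means into one fixed-length vector for a linear SVM. This is a minimal reading of the approach, not the authors' implementation; the component count, the diagonal covariances, and the placeholder data are assumptions.

```python
# Minimal GSV sketch: fit a GMM per recording, concatenate its means,
# and classify the resulting supervectors with a linear SVM.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def gaussian_supervector(frames: np.ndarray, n_components: int = 8) -> np.ndarray:
    """frames: (n_frames, n_coeffs) cepstral features of one recording."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=0).fit(frames)
    # Stack the component means into a single fixed-length vector.
    return gmm.means_.ravel()

# Placeholder data: one supervector per recording, one label per device.
recordings = [np.random.randn(500, 23) for _ in range(20)]  # e.g. 23 MFCCs/frame
labels = [i % 2 for i in range(20)]
X = np.vstack([gaussian_supervector(r) for r in recordings])
clf = SVC(kernel="linear").fit(X, labels)
```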

Previous device-based techniques focused on authentication based on recording devices. In the present work, we propose a new approach to the identification of source mobile devices that are engaged in VoIP calls. The mobile devices used in this study are equipped with built-in circuits and electrical components, and the digital signal processing in these devices produces signal variations. Thus, calls recorded with these devices contain intrinsic artifacts that can be captured with a combination of entropy and MFCC features. Furthermore, the method extracts features from the near-silent segments of the signals to eliminate the interference resulting from the variation in speakers. Finally, the combined feature set is analyzed with a multi-class SVM classifier to identify 10 source mobile devices.

II. ALGORITHM OVERVIEW

The proposed algorithm includes two main stages: feature extraction and feature analysis. Feature extraction determines meaningful information from a collection of call recordings to distinguish the mobile devices from one another. Feature analysis utilizes these features to build a model for each mobile device and tests the model to evaluate its performance in detecting all mobile devices of the same class. The class represents the brand and model of the mobile devices.

A. Feature extraction

The combination of entropy and MFCC has been used in speech recognition to improve its performance and robustness in the presence of additive noise [19], [20], [21]. The present study explores the entropy of the mel-frequency cepstrum as a feature for the blind identification of source mobile devices. Fig. 1 illustrates the computation of MFCCs.

Fig. 1. Computation of MFCCs.

The feature extraction algorithm generates blocks by splitting the audio frames and then extracts the MFCC features from all blocks generated from the sample data. The algorithm splits each block into approximately 23 ms frames, and each frame is windowed with the Hamming window in the time domain. Subsequently, it determines the FFT magnitude spectrum of the windowed frame and filters the spectrum using 27 triangular-shaped filters in the mel domain. The mel domain is linear for frequencies below 1000 Hz and logarithmic for frequencies above 1000 Hz. Equation (1) computes the mel-domain central frequencies with logarithms of base 10:

$$f_{\text{mel}} = 1000 \cdot \frac{\log_{10}(1 + f/1000)}{\log_{10} 2} \qquad (1)$$

The logarithm of the filterbank outputs is used to determine the spectral envelope in decibels. Eventually, the discrete cosine transform of these envelopes determines the MFCCs. During this process, 12 coefficients are computed by the MFCC algorithm. Each row in the mel cepstrum output represents the 12 coefficients computed for one frame (Fig. 2).

Fig. 2. Entropy-MFCC feature extraction.

At this point, the feature extraction algorithm uses the entropy to capture the peakiness of the distribution among all frames in the mel cepstrum output. For the mel cepstrum output $M_{i,j}$, $M$ is the array with a size of $\{N \times 12\}$, where $N$ is the total number of frames, $i = \{1, 2, 3, \dots, N\}$, and $j = \{1, 2, 3, \dots, 12\}$. The algorithm computes the entropy for the 12 coefficients in two stages. First, it normalizes the spectrum into the probability mass function (PMF) through

$$x_i = \frac{X_i}{\sum_{i=1}^{N} X_i}, \quad i = 1, \dots, N, \qquad (2)$$

where $X_i$ is the energy of the $i$-th frequency component, and $x_i$ is the PMF of the signal. Second, it computes the entropy $H(x)$ as

$$H(x) = -\sum_{i=1}^{N} x_i \log_2 x_i. \qquad (3)$$
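The following is a minimal sketch of this entropy-MFCC extraction in Python (the paper itself uses MATLAB functions). The ~23 ms frames, Hamming window, 27 triangular filters, base-10 mel formula of Eq. (1), 12 coefficients, and the per-coefficient entropy of Eqs. (2)-(3) follow the text; the 8 kHz sampling rate, the filter-edge placement, and the use of absolute values to form a valid PMF are assumptions.

```python
import numpy as np
from scipy.fft import rfft, dct
from scipy.signal.windows import hamming

def mel(f):
    # Eq. (1): base-10 logarithms; linear below 1 kHz, logarithmic above.
    return 1000.0 * np.log10(1.0 + f / 1000.0) / np.log10(2.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced uniformly on the mel scale (assumed design).
    inv = lambda m: 1000.0 * (2.0 ** (m / 1000.0) - 1.0)   # inverse of mel()
    edges = inv(np.linspace(0.0, mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising slope
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling slope
    return fb

def entropy_mfcc(block, sr=8000, n_filters=27, n_coeffs=12):
    frame_len = int(0.023 * sr)                 # ~23 ms frames
    n_frames = len(block) // frame_len
    win = hamming(frame_len)
    fb = mel_filterbank(n_filters, frame_len, sr)
    M = np.empty((n_frames, n_coeffs))          # mel cepstrum output, {N x 12}
    for i in range(n_frames):
        frame = block[i * frame_len:(i + 1) * frame_len] * win
        spec = np.abs(rfft(frame))               # FFT magnitude spectrum
        env = 20.0 * np.log10(fb @ spec + 1e-12) # log filterbank -> dB envelope
        M[i] = dct(env, type=2, norm="ortho")[:n_coeffs]   # DCT -> 12 MFCCs
    # Eq. (2): normalize each coefficient column into a PMF (absolute values
    # assumed so entries are non-negative), then Eq. (3): entropy per column.
    p = np.abs(M) / np.abs(M).sum(axis=0)
    return -(p * np.log2(p + 1e-12)).sum(axis=0)  # 12 entropy-MFCC features
```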
Moreover, the algorithm generates a total of 12 entropy-MFCC features through MATLAB functions.

B. Feature analysis

Feature analysis investigates the extracted features by using classification techniques. This study applies SVM for classification because of its satisfactory performance in pattern recognition [22]. SVMs were initially designed for binary (one-to-one) classification. The multi-class SVM classifier generates N(N-1)/2 binary SVM classifiers, where each classifier is trained to separate one pair of classes, and N represents the number of classes. The binary classifiers are combined through the classical voting scheme, whereby the class with the maximal number of votes is selected. The classification technique involves two stages: training and testing. Feature analysis organizes the features into 10 classes with respect to the 10 mobile devices in Table I.

For each class, the extracted features produce a data subset in which all entries share the same label. The method randomly selects 70% of the data subset for training and uses the remaining 30% for testing. The classifier builds the training model from the training data subset. Then, the classifier predicts the labels of the testing data subset based on the training model and without considering the true labels. For evaluation, the classifier compares the actual classes against the predicted labels to determine the number of correct matches. The method computes the identification accuracy as the fraction of correct matches over the total number of testing data. The procedure is repeated 10 times; in each repetition, different training and testing data subsets are randomly selected, and the average identification accuracy is then determined for each mobile device.

III. EXPERIMENTAL SETUP

The proposed setup involves the collection of call recordings, as shown in Fig. 3. We record a total of 25 Skype calls for each device in a truly silent environment. The devices are listed in Table I. The silent session eliminates the possible convolutions caused by speech signals from different speakers. The MP3 Skype call recorder v.3.1 freeware application [23] records the signals in .mp3 format. The method converts the recorded files to .wav format and then digitizes them into sample data. It then enhances the sample data to remove the noise generated by environmental reverberations. The histograms in Fig. 4 indicate the distinctiveness of the spectra of the recording signals obtained from mobile devices of the same model; the clean signals are more distinct than the noisy signals. The color indicates the noise level in decibels, as shown by the color bar on the right side of each histogram. The color differences reflect the reduction of high-level noise by the enhancement process. The enhancement process uses g = 1 as the subtraction-domain exponent and e = 1 as the gain exponent, i.e., magnitude-domain spectral subtraction [24].

TABLE I. MOBILE DEVICES, MODELS, AND CLASS NAMES USED IN THE EXPERIMENTS

Mobile Device      | Model    | Operating System | Class Name
Galaxy Note 10.1-A | GT-N8000 | Android 4.1.2    | GNA
Galaxy Note 10.1-B | GT-N8000 | Android 4.1.2    | GNB
Galaxy Note        | GT-N7000 | Android 2.3.6    | GN
Galaxy Note II-A   | GT-N7100 | Android 4.1.2    | GNIIA
Galaxy Note II-B   | GT-N7100 | Android 4.1.2    | GNIIB
Galaxy Tab 10.1    | GT-P7500 | Android 3.1      | GT
Apple iPad         | MC775ZP  | Apple iOS 5.1.1  | iPadA
Apple iPad New     | MD366ZP  | Apple iOS 5.1.1  | iPadB
Asus Nexus 7       | -        | Android 4.2.2    | Asus
HTC Sensation XE   | -        | Android 4.0.3    | HTC

The proposed algorithm segments the clean signals into overlapping frames with a length of 40 samples. The shortened frame signals form an array with a size of {N_f x n}, where N_f is the number of frames and n is the frame length. The proposed frame-shortening method segments each recorded signal of length 8 s into blocks of approximately 200 ms. As a result, a total of 1000 blocks are generated from the recorded calls of each mobile device. The 12 entropy-MFCC features are computed from the generated blocks to obtain a data subset of length 1000 for each mobile device. The method randomly selects 700 blocks for training and uses the remaining 300 blocks for testing. We thus obtain 7000 training and 3000 testing data points from the 10 mobile devices.
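A sketch of this evaluation protocol with scikit-learn is shown below, including the 10 random 70/30 splits described next. The feature values are placeholders, the RBF kernel and the per-class (stratified) split are assumptions; SVC trains the N(N-1)/2 one-vs-one binary classifiers internally, matching the voting scheme of Section II-B.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 12))    # placeholder: 1000 blocks x 10 devices
y = np.repeat(np.arange(10), 1000)   # class labels GNA..HTC encoded as 0..9

accuracies = []
for rep in range(10):                # 10 repetitions with fresh random splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=0.7, stratify=y, random_state=rep)
    # SVC builds 10*9/2 = 45 one-vs-one binary SVMs combined by voting.
    clf = SVC(decision_function_shape="ovo").fit(X_tr, y_tr)
    accuracies.append((clf.predict(X_te) == y_te).mean())

print(f"average identification accuracy: {np.mean(accuracies):.4f}")
```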
The experiment is repeated 10 times, and the average accuracy is computed.

Fig. 3. Proposed setup for recording conversations.

IV. RESULTS

Table II shows the average confusion matrix generated by running 10 experiments with the 10-class SVM classifier. The diagonal values of the matrix represent the classification accuracies of the 10 mobile devices, whereas the off-diagonal values indicate the misclassifications among them. A high average classification accuracy of 99.72% is achieved over all mobile devices, and the misclassification rates are negligible (less than 0.27%). The mobile devices of the same model (Galaxy Note 10.1-A/B and Galaxy Note II-A/B) have high average accuracy rates of 99.74% and 99.76%, respectively. In an alternative approach, Fig. 5 visualizes the classification results by using the Euclidean distance similarity method adopted from [25]. This method determines the similarity distances between the feature values with an N x N Euclidean distance matrix and then reduces its dimension to 2 to determine the X and Y components (Fig. 5). Each color represents the class label of the dataset associated with one mobile device. The unfilled markers represent data instances from the training data subset, and the filled markers represent data instances from the testing data subset. As shown in Fig. 5, the Euclidean distance method clusters both the training and testing data subsets into 10 groups. This observation confirms the results obtained by the 10-class SVM classifier. We can therefore infer that the proposed entropy-MFCC features are effective in the blind identification of source mobile devices using recorded VoIP calls.
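The clustering view of Fig. 5 could be reproduced along the following lines. Classical multidimensional scaling (MDS) is assumed for the reduction of the N x N Euclidean distance matrix to two dimensions, since [25] describes the general approach without pinning down a specific reduction.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

def plot_clusters(X: np.ndarray, y: np.ndarray) -> None:
    """X: (N, 12) entropy-MFCC features; y: (N,) device class labels."""
    D = squareform(pdist(X, metric="euclidean"))     # N x N distance matrix
    xy = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)        # reduce to X, Y components
    for label in np.unique(y):                       # one color per device class
        m = y == label
        plt.scatter(xy[m, 0], xy[m, 1], label=str(label), s=10)
    plt.legend()
    plt.show()
```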

Fig. 4. Histogram comparison of the recording signals from mobile devices of the same model (clean vs. noisy signals): (a) Galaxy Note 10.1-A, (b) Galaxy Note 10.1-B, (c) Galaxy Note II-A, (d) Galaxy Note II-B.

TABLE II. CONFUSION MATRIX FOR IDENTIFYING SOURCE MOBILE DEVICES BASED ON CALL RECORDINGS (TOTAL AVERAGE ACCURACY RATE: 99.72%)

Actual class | Correct (%) | Misclassified as other devices (%)
GNA          | 99.67       | 0.10, 0.13
GNB          | 99.80       | 0.10
GN           | 99.83       | *
GNIIA        | 99.83       | 0.10
GNIIB        | 99.70       | 0.20, 0.10
GT           | 99.77       | *
iPadA        | 99.60       | 0.10, 0.17
iPadB        | 99.73       | 0.23
Asus         | 99.63       | 0.27, 0.10
HTC          | 99.63       | 0.17, 0.13

Note: Cells marked with an asterisk indicate values of less than 0.1%.

Fig. 5. Clustering of training (unfilled markers) and testing (filled markers) data subsets into 10 groups using the Euclidean distance method.

V. CONCLUSION

In this work, we present an approach to the identification of source mobile devices using recorded VoIP calls. We adopted MFCC and entropy features from speech recognition studies to develop a framework for identifying the distinguishing patterns of different mobile devices. Given the use of a silent Skype session in the investigation, the differences between the samples are caused only by the different mobile devices. An average accuracy of 99.72% was achieved for all 10 devices. Most notably, this study is the first to investigate the distinguishing features of source mobile devices using VoIP calls. The results suggest that the proposed approach should next be tested on different types of mobile devices using conversations recorded during communication over any type of service provider, such as cellular and PSTN networks.

ACKNOWLEDGMENT

We would like to thank the UM/MoHE High Impact Research Grant Allocation (UM.C/HIR/MOHE/FCSIT/17) for funding this research, and all members of the Security Research Group (SECReg) of the Department of Computer System and Technology of the University of Malaya for sharing their knowledge and experience. They led us through many helpful discussions and have been a constant source of motivation, guidance, encouragement, and trust.

REFERENCES

[1] C. Kraetzer, K. Qian, and J. Dittmann, "Extending a context model for microphone forensics," in Proc. Conf. Media Watermarking, Security, and Forensics, Burlingame, CA, 2012.
[2] R. Maher, "Audio forensic examination," IEEE Signal Process. Mag., vol. 26, no. 2, pp. 84-94, Mar. 2009.
[3] S. Gupta, S. Cho, and C.-C. Kuo, "Current developments and future trends in audio authentication," IEEE Multimedia, vol. 19, no. 1, pp. 50-59, Jan. 2012.
[4] A. J. Cooper, "Further considerations for the analysis of ENF data for forensic audio and video applications," International Journal of Speech, Language and the Law, vol. 18, no. 1, pp. 99-120, 2011.
[5] O. Ojowu, J. Karlsson, J. Li, and Y. Liu, "ENF extraction from digital recordings using adaptive techniques and frequency tracking," IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, pp. 1330-1338, Aug. 2012.
[6] A. Rabaoui, M. Davy, S. Rossignol, and N. Ellouze, "Using one-class SVMs and wavelets for audio surveillance," IEEE Trans. Inf. Forensics Security, vol. 3, no. 4, pp. 763-775, Dec. 2008.
[7] G. Muhammad and K. Alghathbar, "Environment recognition for digital audio forensics using MPEG-7 and mel cepstral features," Journal of Electrical Engineering, vol. 62, no. 4, pp. 199-205, Aug. 2011.
[8] C. Kraetzer, A. Oermann, J. Dittmann, and A. Lang, "Digital audio forensics: a first practical evaluation on microphone and environment classification," in Proc. Workshop on Multimedia & Security, Dallas, TX, USA, 2007, pp. 63-74.
[9] M. Kharrazi, H. Sencar, and N. Memon, "Blind source camera identification," in Proc. Int. Conf. Image Processing (ICIP '04), vol. 1, Oct. 2004, pp. 709-712.
[10] O. Celiktutan, B. Sankur, and I. Avcibas, "Blind identification of source cell-phone model," IEEE Trans. Inf. Forensics Security, vol. 3, no. 3, pp. 553-566, Sep. 2008.
[11] A. Swaminathan, M. Wu, and K. Liu, "Nonintrusive component forensics of visual sensors using output images," IEEE Trans. Inf. Forensics Security, vol. 2, no. 1, pp. 91-106, Mar. 2007.
[12] R. Buchholz, C. Kraetzer, and J. Dittmann, "Microphone classification using Fourier coefficients," in Information Hiding, LNCS, vol. 5806, 2009, pp. 235-246.
[13] D. Garcia-Romero and C. Y. Espy-Wilson, "Automatic acquisition device identification from speech recordings," in Proc. 2010 IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Dallas, TX, USA, 2010, pp. 1806-1809.
[14] C. Hanilçi, F. Ertaş, T. Ertaş, and Ö. Eskidere, "Recognition of brand and models of cell-phones from recorded speech signals," IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, pp. 625-634, Apr. 2012.
[15] F. Bimbot et al., "A tutorial on text-independent speaker verification," EURASIP Journal on Advances in Signal Processing, vol. 2004, no. 4, 2004.
[16] W. Campbell, "Generalized linear discriminant sequence kernels for speaker recognition," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2002, pp. 161-164.
[17] W. M. Campbell and K. T. Assaleh, "Speaker recognition with polynomial classifiers," IEEE Trans. Speech Audio Process., vol. 10, no. 4, pp. 205-212, May 2002.
[18] Y. Panagakis and C. Kotropoulos, "Telephone handset identification by feature selection and sparse representations," in Proc. 2012 IEEE Int. Workshop on Information Forensics and Security, Tenerife, Spain, 2012, pp. 73-78.
[19] H. Misra, S. Ikbal, H. Bourlard, and H. Hermansky, "Spectral entropy based feature for robust ASR," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP '04), vol. 1, 2004, pp. I-193-I-196.
[20] H. Yeganeh, S. Ahadi, S. Mirrezaie, and A. Ziaei, "Weighting of mel sub-bands based on SNR/entropy for robust ASR," in Proc. IEEE Int. Symposium on Signal Processing and Information Technology (ISSPIT 2008), 2008, pp. 292-296.
[21] Y. H. Lee and H. K. Kim, "Entropy coding of compressed feature parameters for distributed speech recognition," Speech Communication, vol. 52, no. 5, pp. 405-412, 2010.
[22] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques. Burlington, MA, USA: Elsevier, 2011.
[23] MP3 Skype Recorder v.3.1. [Online]. Available: http://voipcallrecording.com
[24] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic noise," in Proc. IEEE ICASSP, 1979, pp. 208-211.
[25] T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications. O'Reilly Media, Inc., 2007.