Blind Identification of Source Mobile Devices Using VoIP Calls
Mehdi Jahanirad, Ainuddin Wahid Abdul Wahab, Nor Badrul Anuar, Mohd Yamani Idna Idris, and Mohamad Nizam Ayub
Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Security Research Group (SECReg)
mehdijahanirad@siswa.um.edu.my

Abstract: Audio sources such as speakers and recording environments produce signal variations that act as interference across different communication devices. Despite these convolutions, the signal variations produced by different mobile devices leave intrinsic fingerprints on recorded calls, thus allowing the models and brands of the engaged mobile devices to be tracked. This study investigates the use of recorded Voice over Internet Protocol (VoIP) calls in the blind identification of source mobile devices. The proposed scheme employs a combination of entropy and mel-frequency cepstrum coefficients (MFCCs) to extract the intrinsic features of mobile devices and analyzes these features with a multi-class support vector machine (SVM) classifier. The experimental results lead to an accurate identification of 10 source mobile devices with an average accuracy of 99.72%.

Index Terms: Pattern recognition, mel-frequency cepstrum coefficients, entropy, device-based detection technique

I. INTRODUCTION

Audio forensics has attracted increasing attention in recent years because of its application in situations that require trust in the authenticity and integrity of audio signals [1]. An example of such an application is the forensic acquisition, analysis, and evaluation of admissible audio recordings as evidence in court cases [2]. Current audio authenticity approaches are categorized according to the artifacts extracted from the signal itself.
These approaches include: (a) environment-based techniques, in which the frequency spectra are shaped by the recording environment; (b) device-based techniques, in which the frequency spectra are produced by the recording device; and (c) ENF-based techniques, in which the frequency spectra are generated by the power source of the recording device [3]. Although advanced research has been conducted on ENF-based techniques [4], [5] and environment-based techniques [6], [7], few studies have explored the application of device-based techniques in real-time forensics [1], [8]. Device-based techniques are modeled on blind source camera identification in image forensics [9], [10], [11]. However, adapting this approach to audio forensics is challenging because audio evidence is produced by a combination of audio sources, such as speakers and environments. The first practical evaluation of source microphone authentication was developed by Kraetzer et al. [8] using statistical pattern recognition techniques. Their method utilizes features from the detection of hidden communication and identifies the origin of audio streams. Buchholz et al. [12] focused on microphone classification using Fourier coefficient histograms as audio features. Their method eliminates speech convolution by computing the coefficients from near-silent frames. Although microphone forensics allows the identification of source recording devices, it cannot provide sufficient evidence to identify source communication devices. Garcia-Romero and Espy-Wilson [13] proposed an automatic acquisition-device identification method using speech recordings from the Lincoln-Labs Handset Database, which covers both microphone and landline telephone handsets. Their method eliminates the effects of signal variations caused by speech signals through frequency-response characterization of the device contextualized by the speech content.
The method implements Gaussian mixture models on 23 mel-frequency cepstrum coefficients (MFCCs), 38 linear-frequency cepstrum coefficients, and their combination with the first-order derivative (delta) of both feature sets to determine the Gaussian supervector (GSV) associated with each device. A linear support vector machine (SVM) classifier builds the training model using this vector as an intrinsic fingerprint of individual acquisition devices. Hanilçi et al. [14] proposed a method based on advanced speaker recognition research [15], [16], [17] that extracts MFCCs as features from recorded speech. This method collects recorded speech samples from different mobile devices to identify their source brands and models. However, Hanilçi et al. [14] treated mobile devices as ordinary tape recorders to eliminate the complications of transmitting and receiving signals. Building on the acquisition-device identification methods in [13], [14], Panagakis and Kotropoulos [18] proposed a telephone handset identification method that uses random spectral features (RSFs) and labeled spectral features (LSFs), both extracted from the mean spectrogram of speech signals. The method uses sparse representation-based classification (SRC), as well as neural network (NN) and SVM classifiers, to assess its performance in classifying a dataset obtained from eight telephone handsets.
Previous device-based techniques focused on authentication based on recording devices. In the present work, we propose a new approach to the identification of source mobile devices that are engaged in VoIP calls. The mobile devices used in this study are equipped with built-in circuits and electrical components, and the digital signal processing in these devices produces signal variations. Thus, calls recorded using these devices contain intrinsic artifacts that are captured using a combination of entropy and MFCC features. Furthermore, this method uses the near-silent segments of signals for feature extraction to eliminate the interference resulting from the variation in speakers. Finally, the combined feature set is analyzed with a multi-class SVM classifier to identify 10 source mobile devices.

II. ALGORITHM OVERVIEW

The proposed algorithm includes two main stages: feature extraction and feature analysis. Feature extraction determines meaningful information from a collection of call recordings to distinguish the mobile devices from one another. Feature analysis utilizes these features in building the model for each mobile device and tests the model to evaluate its performance in detecting all mobile devices from the same class. The class represents the brand and model of the mobile devices.

A. Feature extraction

The combination of entropy and MFCC has been used in speech recognition to improve its performance and robustness in the presence of additive noise [19], [20], [21]. The present study explores the entropy of the mel-frequency cepstrum spectrum as a feature for the blind identification of source mobile devices. Fig. 1 illustrates the computation of MFCCs.

The feature extraction algorithm generates blocks by splitting the audio frames and then extracts the MFCC features from all blocks that are generated from the sample data. The algorithm splits each block into approximately 23 ms frames, with each frame windowed with the Hamming window in the time domain. Subsequently, it determines the FFT magnitude spectrum from the windowed frame and filters the spectrum using 27 triangular-shaped filters in the mel domain. The mel domain is linear for frequencies below 1000 Hz and logarithmic for frequencies above 1000 Hz. Equation (1) computes the mel-domain central frequencies using base-10 logarithms:

    f_mel = 1000 * log10(1 + f/1000) / log10(2).  (1)

The logarithm of the filterbank outputs is used to determine the spectral envelope in decibels. Eventually, the discrete cosine transform of these envelopes determines the MFCCs. During this process, 12 coefficients are computed by the MFCC algorithm. Each row in the mel cepstrum output represents the 12 coefficients computed for each frame (Fig. 2).

Fig. 1. Computation of MFCCs.
Fig. 2. Entropy-MFCC feature extraction.

At this point, the feature extraction algorithm uses the entropy to capture the peakiness of the distribution among all frames in the mel cepstrum output. For the mel cepstrum output M_{i,j}, M is an array with a size of {N x 12}, where N is the total number of frames, i = {1, 2, 3, ..., N}, and j = {1, 2, 3, ..., 12}. The algorithm computes the entropy for the 12 coefficients in two stages. First, it normalizes the spectrum into a probability mass function (PMF) through (2):

    x_i = X_i / (sum_{k=1}^{N} X_k),  for i = 1 to N,  (2)

where X_i is the energy of the i-th frequency component and x_i is the PMF of the signal. Second, it computes the entropy H(x) as

    H(x) = - sum_{i=1}^{N} x_i * log2(x_i).  (3)

Moreover, the algorithm generates a total of 12 entropy-MFCC features through MATLAB functions.

B. Feature analysis

Feature analysis investigates the extracted features by using classification techniques. This study applies SVM in classification because of its satisfactory performance in pattern recognition approaches [22]. SVMs are initially designed for binary (two-class) classification.
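Putting the feature extraction pipeline above together, a minimal NumPy sketch follows. This is an illustration, not the authors' MATLAB code: the exact frame length, the placement of the 27 triangular filters, and the use of squared cepstral coefficients as the "energy" X_i in Eq. (2) are assumptions.

```python
import numpy as np

def mel_from_hz(f):
    # Eq. (1): mel mapping with base-10 logarithms
    return 1000.0 * np.log10(1.0 + f / 1000.0) / np.log10(2.0)

def hz_from_mel(m):
    # inverse of Eq. (1)
    return 1000.0 * (10.0 ** (m * np.log10(2.0) / 1000.0) - 1.0)

def mel_filterbank(n_filters, frame_len, fs):
    # triangular filters equally spaced in the mel domain
    mel_pts = np.linspace(0.0, mel_from_hz(fs / 2.0), n_filters + 2)
    bins = np.floor((frame_len + 1) * hz_from_mel(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, frame_len // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, c):
            fb[i - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fb[i - 1, k] = (hi - k) / max(hi - c, 1)
    return fb

def entropy_mfcc(signal, fs, n_filters=27, n_ceps=12):
    # ~23 ms Hamming-windowed frames, as described in the paper
    frame_len = int(round(0.023 * fs))
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    frames = frames * np.hamming(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))          # FFT magnitude spectrum
    fb = mel_filterbank(n_filters, frame_len, fs)
    env_db = 20.0 * np.log10(mag @ fb.T + 1e-12)       # spectral envelope in dB
    # DCT-II of the envelopes gives the 12 MFCCs per frame
    j = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), j + 0.5) / n_filters)
    mfcc = env_db @ basis.T                            # shape (n_frames, 12)
    # Eq. (2): normalize each coefficient across frames into a PMF
    energy = mfcc ** 2
    pmf = energy / (energy.sum(axis=0, keepdims=True) + 1e-12)
    # Eq. (3): entropy of each coefficient's distribution over the frames
    return -(pmf * np.log2(pmf + 1e-12)).sum(axis=0)   # 12 entropy-MFCC features
```

Each recording block thus yields one 12-dimensional entropy-MFCC feature vector.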
The multi-class SVM classifier generates N(N-1)/2 binary SVM classifiers, where each classifier is trained to separate one pair of classes and N represents the number of classes. The binary classifiers are combined through a classical voting scheme, in which the class with the maximal number of votes is selected. The classification technique involves two stages: training and testing. Feature analysis organizes the features into 10 classes
with respect to the 10 mobile devices in Table I. For each class, the extracted features produce a total data subset represented by the same label. The method randomly selects 70% of the data subset for training and uses the remaining 30% for testing. The classifier builds the training model using the training data subset. Then, the classifier predicts the labels corresponding to the testing data subset based on the training model, without considering the true labels. For evaluation, the classifier compares the actual classes against the predicted labels to determine the number of correct matches. The identification accuracy is computed as the fraction of correct matches over the total number of testing data. The method is repeated 10 times; in each repetition, different training and testing data subsets are randomly selected, and the average identification accuracy is then determined for each mobile device.

III. EXPERIMENTAL SETUP

The proposed setup involves the collection of call recordings, as shown in Fig. 3. We record a total of 25 Skype calls for each device in a truly silent environment. The devices are listed in Table I. The silent session eliminates possible convolutions caused by speech signals generated by different speakers. The MP3 Skype call recorder v.3.1 freeware application [23] records the signals in .mp3 format. The method converts the recorded files to .wav format and then digitizes them into sample data. It then enhances the sample data to remove the noise generated by environmental reverberations. The histograms illustrated in Fig. 4 indicate the distinctiveness of the spectra of the recording signals obtained from mobile devices of the same model. However, the clean signals are more distinct than the noisy signals. The color indicates the noise level in decibels with respect to the color bar on the right side of each histogram.
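The enhancement step referenced above is spectral subtraction in the magnitude domain [24]. A minimal sketch follows; the noise profile estimated from the leading frames and the over-subtraction (alpha) and spectral-floor (beta) settings are illustrative assumptions, not the paper's exact g = 1, e = 1 configuration:

```python
import numpy as np

def enhance_spectral_subtraction(noisy, fs, frame_ms=23, noise_frames=10,
                                 alpha=1.0, beta=0.02):
    """Magnitude-domain spectral subtraction (in the spirit of Berouti et al. [24]).
    The noise magnitude profile is averaged over the first `noise_frames` frames;
    `alpha` over-subtracts it and `beta` sets a spectral floor."""
    frame_len = int(round(frame_ms * fs / 1000.0))
    n_frames = len(noisy) // frame_len
    frames = noisy[: n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)
    noise_mag = mag[:noise_frames].mean(axis=0)                   # noise profile
    clean_mag = np.maximum(mag - alpha * noise_mag, beta * mag)   # subtract, floor
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return clean.reshape(-1)
```

The noisy phase is reused unchanged, which is the usual choice for this family of enhancers.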
Evidently, the colors vary because high-level noise is reduced by the enhancement process. The enhancement process uses g = 1 as the subtraction-domain exponent and e = 1 as the gain exponent for magnitude-domain spectral subtraction [24].

TABLE I
MOBILE DEVICES, MODELS, AND CLASS NAMES USED IN THE EXPERIMENTS

Mobile Device       | Model    | Operating System | Class Name
Galaxy Note 10.1-A  | GT-N8000 | Android          | GNA
Galaxy Note 10.1-B  | GT-N8000 | Android          | GNB
Galaxy Note         | GT-N7000 | Android          | GN
Galaxy Note II-A    | GT-N7100 | Android          | GNIIA
Galaxy Note II-B    | GT-N7100 | Android          | GNIIB
Galaxy Tab 10.1     | GT-P7500 | Android 3.1      | GT
Apple iPad          | MC775ZP  | Apple iOS        | iPadA
Apple iPad New      | MD366ZP  | Apple iOS        | iPadB
Asus Nexus 7        | -        | Android          | Asus
HTC Sensation XE    | -        | Android          | HTC

The proposed algorithm segments the clean signals into overlapping frames with a length of 40 samples. The shortened frame signals consist of an array with a size of {N_f x n}, where N_f is the number of frames and n is the frame length. The proposed frame-shortening method segments each recorded signal, with a length of 8 s, into blocks of approximately 200 ms. As a result, we generate a total of 1000 blocks from the recorded calls of each mobile device. The 12 entropy-MFCC features are computed from the generated blocks to obtain a data subset of length 1000 for each mobile device. The method randomly selects 700 blocks for training and uses the remaining 300 blocks for testing. We thus obtain 7000 training and 3000 testing data from the 10 mobile devices. The experiment is repeated 10 times, and the average accuracy is computed.

Fig. 3. Proposed setup for recording conversations.

IV. RESULTS

Table II shows the average confusion matrix generated by running 10 experiments using the 10-class SVM classifier. The diagonal values of the matrix represent the respective classification accuracies of the 10 mobile devices, whereas the off-diagonal values indicate the misclassification among the mobile devices.
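The evaluation protocol used here (one-vs-one voting over pairwise binary classifiers, with a random 70/30 split repeated 10 times) can be sketched as follows. This is a schematic, not the paper's implementation: a nearest-centroid rule stands in for each binary SVM so the example stays self-contained.

```python
import numpy as np
from itertools import combinations

def pairwise_vote_predict(Xtr, ytr, Xte):
    """One-vs-one scheme: N(N-1)/2 pairwise binary classifiers combined by
    majority vote. A nearest-centroid rule stands in for each binary SVM."""
    classes = np.unique(ytr)
    cent = {c: Xtr[ytr == c].mean(axis=0) for c in classes}
    idx = {c: i for i, c in enumerate(classes)}
    votes = np.zeros((len(Xte), len(classes)), dtype=int)
    for a, b in combinations(classes, 2):
        da = np.linalg.norm(Xte - cent[a], axis=1)
        db = np.linalg.norm(Xte - cent[b], axis=1)
        winner = np.where(da < db, a, b)            # each pair casts one vote
        votes[np.arange(len(Xte)), [idx[w] for w in winner]] += 1
    return classes[votes.argmax(axis=1)]            # class with maximal votes

def repeated_split_accuracy(X, y, n_repeats=10, train_frac=0.7, seed=0):
    """Random 70/30 split, repeated n_repeats times; returns average accuracy."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_repeats):
        order = rng.permutation(len(X))
        cut = int(train_frac * len(X))
        tr, te = order[:cut], order[cut:]
        pred = pairwise_vote_predict(X[tr], y[tr], X[te])
        accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs))
```

With N = 10 device classes, this scheme trains 45 pairwise classifiers per repetition.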
A high average classification accuracy of 99.72% is achieved for all mobile devices. The percentage of misclassification among the mobile devices is negligible (less than 0.27%). The mobile devices of the same model (Galaxy Note 10.1-A/B and Galaxy Note II-A/B) have high average accuracy rates of 99.74% and 99.76%, respectively.

Fig. 4. Histogram comparison of the recording signals from mobile devices of the same model (clean vs. noisy signals): (a) Galaxy Note 10.1-A, (b) Galaxy Note 10.1-B, (c) Galaxy Note II-A, (d) Galaxy Note II-B.

TABLE II
CONFUSION MATRIX FOR IDENTIFYING SOURCE MOBILE DEVICES BASED ON CALL RECORDINGS
(Total average accuracy rate: 99.72%. Rows give the actual classes and columns the predicted classes GNA, GNB, GN, GNIIA, GNIIB, GT, iPadA, iPadB, Asus, HTC. Cells marked with an asterisk indicate values of less than 0.1%.)

In an alternative approach, Fig. 5 visualizes the classification results by using the Euclidean distance similarity method adopted from [25]. This method determines the similarity distances between the feature values with an N x N Euclidean distance matrix and then reduces its dimension to 2 to determine the X and Y components (Fig. 5). Each color represents the class label of the dataset associated with each mobile device. The unfilled markers represent data instances from the training data subset, and the filled markers represent data instances from the testing data subset.

Fig. 5. Clustering of training (unfilled markers) and testing (filled markers) data subsets into 10 groups using the Euclidean distance method.

As shown in Fig. 5, the Euclidean distance method clusters both the training and testing data subsets into 10 groups. This observation confirms the results obtained by the 10-class SVM classifier. We can therefore infer that the proposed entropy-MFCC features are effective in the blind identification of source mobile devices using recorded VoIP calls.

V. CONCLUSION

In this work, we present an approach to the identification of source mobile devices using recorded VoIP calls. We adopted MFCC and entropy features from speech recognition studies to develop a framework for identifying the distinguishing patterns of different mobile devices. Given the use of a silent Skype session in the investigation, the difference between the samples is caused only by the different mobile devices. An average accuracy of 99.72% was achieved for all 10 devices. Most notably, this study is the first to investigate the distinguishing features of source mobile devices using VoIP calls. Our results suggest that the proposed approach should next be tested on conversations recorded during communication via other types of service providers, such as cellular and PSTN networks.

ACKNOWLEDGMENT

We would like to thank the UM/MoHE High Impact Research Grant Allocation (UM.C/HIR/MOHE/FCSIT/17) for
funding this research, and all members of the Security Research Group (SECReg) of the Department of Computer System and Technology of the University of Malaya for sharing their knowledge and experience. They led us through many helpful discussions and have been a constant source of motivation, guidance, encouragement, and trust.

REFERENCES

[1] C. Kraetzer, K. Qian, and J. Dittmann, "Extending a context model for microphone forensics," in Proc. Conference on Media Watermarking, Security, and Forensics, Burlingame, CA.
[2] R. Maher, "Audio forensic examination," IEEE Signal Process. Mag., vol. 26, no. 2, Mar.
[3] S. Gupta, S. Cho, and C.-C. Kuo, "Current developments and future trends in audio authentication," IEEE Multimedia, vol. 19, no. 1, Jan.
[4] A. J. Cooper, "Further considerations for the analysis of ENF data for forensic audio and video applications," International Journal of Speech Language and the Law, vol. 18, no. 1.
[5] O. Ojowu, J. Karlsson, and Y. Liu, "ENF extraction from digital recordings using adaptive techniques and frequency tracking," IEEE Trans. Inf. Forensics Security, vol. 7, no. 4, Aug.
[6] A. Rabaoui, M. Davy, S. Rossignol, and N. Ellouze, "Using one-class SVMs and wavelets for audio surveillance," IEEE Trans. Inf. Forensics Security, vol. 3, no. 4, Dec.
[7] G. Muhammad and K. Alghathbar, "Environment recognition for digital audio forensics using MPEG-7 and mel cepstral features," Journal of Electrical Engineering, vol. 62, no. 4, Aug.
[8] C. Kraetzer, A. Oermann, J. Dittmann, and A. Lang, "Digital audio forensics: a first practical evaluation on microphone and environment classification," in Proc. Workshop on Multimedia & Security, Dallas, Texas, USA, 2007.
[9] M. Kharrazi, H. Sencar, and N. Memon, "Blind source camera identification," in Proc. International Conference on Image Processing (ICIP 04), vol. 1, Oct. 2004.
[10] O. Celiktutan, B. Sankur, and I. Avcibas, "Blind identification of source cell-phone model," IEEE Trans. Inf. Forensics Security, vol. 3, no. 3, Sep.
[11] A. Swaminathan, M. Wu, and K. Liu, "Nonintrusive component forensics of visual sensors using output images," IEEE Trans. Inf. Forensics Security, vol. 2, no. 1, Mar.
[12] R. Buchholz, C. Kraetzer, and J. Dittmann, "Microphone classification using Fourier coefficients," in Information Hiding, LNCS, vol. 5806.
[13] D. Garcia-Romero and C. Y. Espy-Wilson, "Automatic acquisition device identification from speech recordings," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Dallas, Texas, USA, 2010.
[14] C. Hanilçi, F. Ertaş, T. Ertaş, and Ö. Eskidere, "Recognition of brand and models of cell-phones from recorded speech signals," IEEE Trans. Inf. Forensics Security, vol. 7, no. 2, Apr.
[15] F. Bimbot et al., "A tutorial on text-independent speaker verification," EURASIP Journal on Advances in Signal Processing, vol. 2004, no. 4.
[16] W. Campbell, "Generalized linear discriminant sequence kernels for speaker recognition," in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, 2002.
[17] W. M. Campbell and K. T. Assaleh, "Speaker recognition with polynomial classifiers," IEEE Trans. Speech Audio Process., vol. 2, no. 4, May.
[18] Y. Panagakis and C. Kotropoulos, "Telephone handset identification by feature selection and sparse representations," in Proc. Workshop on Information Forensics & Security, Tenerife, Spain, 2012.
[19] H. Misra, S. Ikbal, H. Bourlard, and H. Hermansky, "Spectral entropy based feature for robust ASR," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 04), vol. 1, 2004.
[20] H. Yeganeh, S. Ahadi, S. Mirrezaie, and A. Ziaei, "Weighting of mel sub-bands based on SNR/entropy for robust ASR," in Proc. IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2008), 2008.
[21] Y. H. Lee and H. K. Kim, "Entropy coding of compressed feature parameters for distributed speech recognition," Speech Communication, vol. 52, no. 5.
[22] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques. Burlington, MA, USA: Elsevier Inc.
[23] MP3 Skype Recorder v.3.1. [Online]. Available:
[24] M. Berouti, R. Schwartz, and J. Makhoul, "Enhancement of speech corrupted by acoustic noise," in Proc. IEEE ICASSP, no. 4, 1979.
[25] T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications. London, UK: O'Reilly Media, Inc.
More informationCS229 Project Report Polyphonic Piano Transcription
CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project
More informationDigital Signal Processing. Prof. Dietrich Klakow Rahil Mahdian
Digital Signal Processing Prof. Dietrich Klakow Rahil Mahdian Language Teaching: English Questions: English (or German) Slides: English Tutorials: one English and one German group Exercise sheets: most
More informationNormalized Cumulative Spectral Distribution in Music
Normalized Cumulative Spectral Distribution in Music Young-Hwan Song, Hyung-Jun Kwon, and Myung-Jin Bae Abstract As the remedy used music becomes active and meditation effect through the music is verified,
More informationSinger Traits Identification using Deep Neural Network
Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic
More informationMusic Emotion Recognition. Jaesung Lee. Chung-Ang University
Music Emotion Recognition Jaesung Lee Chung-Ang University Introduction Searching Music in Music Information Retrieval Some information about target music is available Query by Text: Title, Artist, or
More informationSpeech Recognition Combining MFCCs and Image Features
Speech Recognition Combining MFCCs and Image Featres S. Karlos from Department of Mathematics N. Fazakis from Department of Electrical and Compter Engineering K. Karanikola from Department of Mathematics
More informationUC San Diego UC San Diego Previously Published Works
UC San Diego UC San Diego Previously Published Works Title Classification of MPEG-2 Transport Stream Packet Loss Visibility Permalink https://escholarship.org/uc/item/9wk791h Authors Shin, J Cosman, P
More informationAutomatic Piano Music Transcription
Automatic Piano Music Transcription Jianyu Fan Qiuhan Wang Xin Li Jianyu.Fan.Gr@dartmouth.edu Qiuhan.Wang.Gr@dartmouth.edu Xi.Li.Gr@dartmouth.edu 1. Introduction Writing down the score while listening
More informationDepartment of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine. Project: Real-Time Speech Enhancement
Department of Electrical & Electronic Engineering Imperial College of Science, Technology and Medicine Project: Real-Time Speech Enhancement Introduction Telephones are increasingly being used in noisy
More informationMUSICAL INSTRUMENT RECOGNITION USING BIOLOGICALLY INSPIRED FILTERING OF TEMPORAL DICTIONARY ATOMS
MUSICAL INSTRUMENT RECOGNITION USING BIOLOGICALLY INSPIRED FILTERING OF TEMPORAL DICTIONARY ATOMS Steven K. Tjoa and K. J. Ray Liu Signals and Information Group, Department of Electrical and Computer Engineering
More informationVoice & Music Pattern Extraction: A Review
Voice & Music Pattern Extraction: A Review 1 Pooja Gautam 1 and B S Kaushik 2 Electronics & Telecommunication Department RCET, Bhilai, Bhilai (C.G.) India pooja0309pari@gmail.com 2 Electrical & Instrumentation
More informationSingle Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics
Master Thesis Signal Processing Thesis no December 2011 Single Channel Speech Enhancement Using Spectral Subtraction Based on Minimum Statistics Md Zameari Islam GM Sabil Sajjad This thesis is presented
More informationExperiments on musical instrument separation using multiplecause
Experiments on musical instrument separation using multiplecause models J Klingseisen and M D Plumbley* Department of Electronic Engineering King's College London * - Corresponding Author - mark.plumbley@kcl.ac.uk
More informationLine-Adaptive Color Transforms for Lossless Frame Memory Compression
Line-Adaptive Color Transforms for Lossless Frame Memory Compression Joungeun Bae 1 and Hoon Yoo 2 * 1 Department of Computer Science, SangMyung University, Jongno-gu, Seoul, South Korea. 2 Full Professor,
More informationA Novel Approach towards Video Compression for Mobile Internet using Transform Domain Technique
A Novel Approach towards Video Compression for Mobile Internet using Transform Domain Technique Dhaval R. Bhojani Research Scholar, Shri JJT University, Jhunjunu, Rajasthan, India Ved Vyas Dwivedi, PhD.
More informationAutomatic Construction of Synthetic Musical Instruments and Performers
Ph.D. Thesis Proposal Automatic Construction of Synthetic Musical Instruments and Performers Ning Hu Carnegie Mellon University Thesis Committee Roger B. Dannenberg, Chair Michael S. Lewicki Richard M.
More informationRegion Adaptive Unsharp Masking based DCT Interpolation for Efficient Video Intra Frame Up-sampling
International Conference on Electronic Design and Signal Processing (ICEDSP) 0 Region Adaptive Unsharp Masking based DCT Interpolation for Efficient Video Intra Frame Up-sampling Aditya Acharya Dept. of
More informationApplication Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio
Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio Jana Eggink and Guy J. Brown Department of Computer Science, University of Sheffield Regent Court, 11
More informationGCT535- Sound Technology for Multimedia Timbre Analysis. Graduate School of Culture Technology KAIST Juhan Nam
GCT535- Sound Technology for Multimedia Timbre Analysis Graduate School of Culture Technology KAIST Juhan Nam 1 Outlines Timbre Analysis Definition of Timbre Timbre Features Zero-crossing rate Spectral
More informationSubjective Similarity of Music: Data Collection for Individuality Analysis
Subjective Similarity of Music: Data Collection for Individuality Analysis Shota Kawabuchi and Chiyomi Miyajima and Norihide Kitaoka and Kazuya Takeda Nagoya University, Nagoya, Japan E-mail: shota.kawabuchi@g.sp.m.is.nagoya-u.ac.jp
More informationA Survey on: Sound Source Separation Methods
Volume 3, Issue 11, November-2016, pp. 580-584 ISSN (O): 2349-7084 International Journal of Computer Engineering In Research Trends Available online at: www.ijcert.org A Survey on: Sound Source Separation
More informationA Parametric Autoregressive Model for the Extraction of Electric Network Frequency Fluctuations in Audio Forensic Authentication
Journal of Energy and Power Engineering 10 (2016) 504-512 doi: 10.17265/1934-8975/2016.08.007 D DAVID PUBLISHING A Parametric Autoregressive Model for the Extraction of Electric Network Frequency Fluctuations
More informationA PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES
12th International Society for Music Information Retrieval Conference (ISMIR 2011) A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES Erdem Unal 1 Elaine Chew 2 Panayiotis Georgiou
More informationMusic Information Retrieval with Temporal Features and Timbre
Music Information Retrieval with Temporal Features and Timbre Angelina A. Tzacheva and Keith J. Bell University of South Carolina Upstate, Department of Informatics 800 University Way, Spartanburg, SC
More informationAn Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions
1128 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 10, OCTOBER 2001 An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions Kwok-Wai Wong, Kin-Man Lam,
More informationSinging voice synthesis based on deep neural networks
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
More informationMUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES
MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES PACS: 43.60.Lq Hacihabiboglu, Huseyin 1,2 ; Canagarajah C. Nishan 2 1 Sonic Arts Research Centre (SARC) School of Computer Science Queen s University
More informationMusic Segmentation Using Markov Chain Methods
Music Segmentation Using Markov Chain Methods Paul Finkelstein March 8, 2011 Abstract This paper will present just how far the use of Markov Chains has spread in the 21 st century. We will explain some
More informationA prototype system for rule-based expressive modifications of audio recordings
International Symposium on Performance Science ISBN 0-00-000000-0 / 000-0-00-000000-0 The Author 2007, Published by the AEC All rights reserved A prototype system for rule-based expressive modifications
More information19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;
More informationPiya Pal. California Institute of Technology, Pasadena, CA GPA: 4.2/4.0 Advisor: Prof. P. P. Vaidyanathan
Piya Pal 1200 E. California Blvd MC 136-93 Pasadena, CA 91125 Tel: 626-379-0118 E-mail: piyapal@caltech.edu http://www.systems.caltech.edu/~piyapal/ Education Ph.D. in Electrical Engineering Sep. 2007
More informationDetecting Musical Key with Supervised Learning
Detecting Musical Key with Supervised Learning Robert Mahieu Department of Electrical Engineering Stanford University rmahieu@stanford.edu Abstract This paper proposes and tests performance of two different
More informationMUSIC TONALITY FEATURES FOR SPEECH/MUSIC DISCRIMINATION. Gregory Sell and Pascal Clark
214 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) MUSIC TONALITY FEATURES FOR SPEECH/MUSIC DISCRIMINATION Gregory Sell and Pascal Clark Human Language Technology Center
More informationClassification of Timbre Similarity
Classification of Timbre Similarity Corey Kereliuk McGill University March 15, 2007 1 / 16 1 Definition of Timbre What Timbre is Not What Timbre is A 2-dimensional Timbre Space 2 3 Considerations Common
More informationComparison Parameters and Speaker Similarity Coincidence Criteria:
Comparison Parameters and Speaker Similarity Coincidence Criteria: The Easy Voice system uses two interrelating parameters of comparison (first and second error types). False Rejection, FR is a probability
More informationISSN (Print) Original Research Article. Coimbatore, Tamil Nadu, India
Scholars Journal of Engineering and Technology (SJET) Sch. J. Eng. Tech., 016; 4(1):1-5 Scholars Academic and Scientific Publisher (An International Publisher for Academic and Scientific Resources) www.saspublisher.com
More informationDrum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods
Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Kazuyoshi Yoshii, Masataka Goto and Hiroshi G. Okuno Department of Intelligence Science and Technology National
More informationResearch Article. ISSN (Print) *Corresponding author Shireen Fathima
Scholars Journal of Engineering and Technology (SJET) Sch. J. Eng. Tech., 2014; 2(4C):613-620 Scholars Academic and Scientific Publisher (An International Publisher for Academic and Scientific Resources)
More informationSinger Recognition and Modeling Singer Error
Singer Recognition and Modeling Singer Error Johan Ismael Stanford University jismael@stanford.edu Nicholas McGee Stanford University ndmcgee@stanford.edu 1. Abstract We propose a system for recognizing
More informationTopic 10. Multi-pitch Analysis
Topic 10 Multi-pitch Analysis What is pitch? Common elements of music are pitch, rhythm, dynamics, and the sonic qualities of timbre and texture. An auditory perceptual attribute in terms of which sounds
More informationSmart Traffic Control System Using Image Processing
Smart Traffic Control System Using Image Processing Prashant Jadhav 1, Pratiksha Kelkar 2, Kunal Patil 3, Snehal Thorat 4 1234Bachelor of IT, Department of IT, Theem College Of Engineering, Maharashtra,
More informationTERRESTRIAL broadcasting of digital television (DTV)
IEEE TRANSACTIONS ON BROADCASTING, VOL 51, NO 1, MARCH 2005 133 Fast Initialization of Equalizers for VSB-Based DTV Transceivers in Multipath Channel Jong-Moon Kim and Yong-Hwan Lee Abstract This paper
More information... A Pseudo-Statistical Approach to Commercial Boundary Detection. Prasanna V Rangarajan Dept of Electrical Engineering Columbia University
A Pseudo-Statistical Approach to Commercial Boundary Detection........ Prasanna V Rangarajan Dept of Electrical Engineering Columbia University pvr2001@columbia.edu 1. Introduction Searching and browsing
More informationMusic Recommendation from Song Sets
Music Recommendation from Song Sets Beth Logan Cambridge Research Laboratory HP Laboratories Cambridge HPL-2004-148 August 30, 2004* E-mail: Beth.Logan@hp.com music analysis, information retrieval, multimedia
More informationMUSICAL NOTE AND INSTRUMENT CLASSIFICATION WITH LIKELIHOOD-FREQUENCY-TIME ANALYSIS AND SUPPORT VECTOR MACHINES
MUSICAL NOTE AND INSTRUMENT CLASSIFICATION WITH LIKELIHOOD-FREQUENCY-TIME ANALYSIS AND SUPPORT VECTOR MACHINES Mehmet Erdal Özbek 1, Claude Delpha 2, and Pierre Duhamel 2 1 Dept. of Electrical and Electronics
More informationDATA hiding technologies have been widely studied in
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL 18, NO 6, JUNE 2008 769 A Novel Look-Up Table Design Method for Data Hiding With Reduced Distortion Xiao-Ping Zhang, Senior Member, IEEE,
More informationTempo and Beat Analysis
Advanced Course Computer Science Music Processing Summer Term 2010 Meinard Müller, Peter Grosche Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Tempo and Beat Analysis Musical Properties:
More informationComputational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)
Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,
More informationGaussian Mixture Model for Singing Voice Separation from Stereophonic Music
Gaussian Mixture Model for Singing Voice Separation from Stereophonic Music Mine Kim, Seungkwon Beack, Keunwoo Choi, and Kyeongok Kang Realistic Acoustics Research Team, Electronics and Telecommunications
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005.
Wang, D., Canagarajah, CN., & Bull, DR. (2005). S frame design for multiple description video coding. In IEEE International Symposium on Circuits and Systems (ISCAS) Kobe, Japan (Vol. 3, pp. 19 - ). Institute
More informationA. Ideal Ratio Mask If there is no RIR, the IRM for time frame t and frequency f can be expressed as [17]: ( IRM(t, f) =
1 Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks Yang Sun, Student Member, IEEE, Wenwu Wang, Senior Member, IEEE, Jonathon Chambers, Fellow, IEEE, and
More informationStudy of White Gaussian Noise with Varying Signal to Noise Ratio in Speech Signal using Wavelet
American International Journal of Research in Science, Technology, Engineering & Mathematics Available online at http://www.iasir.net ISSN (Print): 2328-3491, ISSN (Online): 2328-3580, ISSN (CD-ROM): 2328-3629
More informationOutline. Why do we classify? Audio Classification
Outline Introduction Music Information Retrieval Classification Process Steps Pitch Histograms Multiple Pitch Detection Algorithm Musical Genre Classification Implementation Future Work Why do we classify
More informationA Parametric Autoregressive Model for the Extraction of Electric Network Frequency Fluctuations in Audio Forensic Authentication
Proceedings of the 3 rd International Conference on Control, Dynamic Systems, and Robotics (CDSR 16) Ottawa, Canada May 9 10, 2016 Paper No. 110 DOI: 10.11159/cdsr16.110 A Parametric Autoregressive Model
More informationA Survey of Audio-Based Music Classification and Annotation
A Survey of Audio-Based Music Classification and Annotation Zhouyu Fu, Guojun Lu, Kai Ming Ting, and Dengsheng Zhang IEEE Trans. on Multimedia, vol. 13, no. 2, April 2011 presenter: Yin-Tzu Lin ( 阿孜孜 ^.^)
More information