A Music Retrieval System Using Melody and Lyric
2012 IEEE International Conference on Multimedia and Expo Workshops

A Music Retrieval System Using Melody and Lyric

Zhiyuan Guo, Qiang Wang, Gang Liu, Jun Guo, Yueming Lu
Pattern Recognition and Intelligent System Laboratory, Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, China
guozhiyuan.cathie@gmail.com

Abstract: Using melody and/or lyric to query a music retrieval system is convenient for users but challenging for developers. This paper proposes efficient schemes for realizing the key algorithms in such a system. Specifically, we characterize our system, which adds lyric information to the query, as follows: a Support Vector Machine (SVM) is employed to distinguish humming queries from singing queries; for a singing query, the lyrics of candidates pre-selected by a commonly used melody matching method are used to dynamically build the recognition network; and a novel fusion strategy based on classification confidence is proposed to combine the lyric and melody scores. Experimental results show that the proposed method achieves error reduction rates of as much as 22.9%, 25.0%, 28.7% and 33.5% in mean reciprocal rank (MRR) for four existing query-by-singing/humming (QBSH) systems, respectively.

Keywords: QBSH; SVM; isolated-word recognition; music retrieval

I. INTRODUCTION

Query-by-singing/humming (QBSH) systems, which help users find a desired song from a singing or humming query, provide an intuitive and practical way to retrieve music. In the past decades, substantial research has been devoted to QBSH systems [1-10], and various effective matching methods have been proposed, such as dynamic time warping (DTW) [2], linear scaling (LS) [3], recursive alignment (RA) [4], and earth mover's distance (EMD) [5]. However, most QBSH systems use only melody features, while lyric information is ignored.
Moreover, many researchers believe that singing queries are more difficult to handle than humming queries in QBSH systems, because the speech in the singing audio reduces the accuracy of melody feature extraction, which is closely tied to the quality of a QBSH system. To deal with this problem, Haus et al. [6] applied signal processing techniques to singing queries to extract melody features more accurately. In fact, lyric information is very useful for song identification: Guo et al. [11] developed a music retrieval system using spoken lyric queries. Since most users are non-professional singers, the input singing/humming queries are likely to contain errors and biases. In this case, QBSH systems based only on melody may fail to retrieve the correct song. Clearly, lyrics provide complementary information for song identification when the input query is sung.

Using both melody and lyric in QBSH systems is intuitive for users but challenging for researchers. Firstly, because only singing queries contain lyric information, there is a risk of extracting false lyrics from humming queries, in which no lyrics actually exist; this leads to a serious deterioration of performance. Secondly, it is difficult to extract lyric features from singing queries, because the speech in singing is deformed.

There have been a few research studies devoted to melody- and lyric-based QBSH systems. Suzuki et al. [7] proposed a QBSH method based on both lyric and melody information, but it cannot handle humming queries, which contain no lyric information. To solve this issue, Wang et al. [8] used a singing/humming discriminator (SHD) to distinguish humming queries from singing queries. The query was first converted into a phone string by a phone-level continuous speech recognizer, and the number of distinct phones in the string was counted; since a singing query usually has more distinct phones than a humming query, the input query could then be classified as humming or singing.
A singing query was then converted into a syllable string, and each candidate obtained a lyric score from a syllable-level recognizer. This method provided a slight improvement, but the processing time increased greatly because two recognition procedures were added. Moreover, as the classification accuracy depended heavily on the phone recognition results, which unfortunately were not accurate enough, the improvement in retrieval accuracy was insignificant.

This paper proposes a novel QBSH method using both melody and lyric information. Unlike Wang's method, we use a well-trained SVM to identify singing queries, and a dynamically constructed isolated-word recognizer to recognize the lyric of a singing query. Moreover, a robust fusion method based on classification confidence is used to combine the lyric and melody scores. Experimental results show that our classifier significantly outperforms the classifier proposed by Wang. Error reduction rates of 22.9%, 25.0%, 28.7% and 33.5% in mean reciprocal rank (MRR) are achieved by the proposed method for four existing QBSH systems, respectively.

The remainder of this paper is organized as follows: Section II gives an overview of the proposed QBSH system. The proposed method is introduced in Section III.
In Section IV, the experimental results are demonstrated. The conclusion follows in Section V.

II. OVERVIEW OF THE PROPOSED QBSH SYSTEM

Fig. 1 gives an overview of the proposed QBSH system, which operates as follows:

Step 1: A melody matching method sorts the music in the database, and the clips with the top K highest melody scores are selected as candidates. Four different methods were used in our experiments, viz. DTW [2], LS [3], RA [4] and EMD [5], all of which have proved effective in QBSH systems.

Step 2: A well-trained SVM classifies the input query as humming or singing.

Step 3: If the query is classified as humming, the candidates ranked by melody score are returned to the user.

Step 4: If the query is classified as singing, a dynamically constructed isolated-word recognizer assigns lyric scores to all candidates, the lyric and melody scores are fused, and the results ranked by the combined scores are returned to the user.

III. THE PROPOSED QBSH METHOD

A. Melody Retrieval

Melody retrieval aims at finding the candidate clips most similar to the query with respect to melody. Many melody matching methods can achieve this goal. Most of them, such as LS, DTW and RA, calculate the distance between two pitch sequences; others, such as EMD, calculate the distance between two note sequences. A clip with a smaller distance to the query is considered more similar to it and obtains a higher melody score. Let Q and P represent the query and a clip in the database respectively; note that both have been converted into pitch or note sequences according to the adopted melody matching method. Let D(Q, P) represent the distance between the query Q and the clip P. Then MS(P), the melody score of P, can be calculated by (1).
Clips with the top K largest melody scores are selected as candidates.

MS(P) = 1 / D(Q, P)    (1)

We now give a brief description of the four commonly used melody matching methods.

1) DTW: Dynamic time warping (DTW) [2] is a pitch-based matching method. The distance between two pitch sequences S1 = (p1, p2, ..., pn) and S2 = (q1, q2, ..., qm) can be iteratively calculated by (2). Here, n and m are the lengths of S1 and S2 respectively, pi is the i-th pitch of S1, and qj is the j-th pitch of S2. d(i, j) is the cost associated with pi and qj, defined as d(i, j) = |pi - qj - c|, where |.| is the absolute value operation and c is a constant. D(i, j) represents the minimum distance from the start point to the lattice point (i, j); thus D(n, m) is the distance between S1 and S2.

D(i, j) = d(i, j) + min{ D(i-2, j-1), D(i-1, j-1), D(i-1, j-2) }    (2)

Figure 1. The framework of the proposed QBSH system.

2) LS: Linear scaling (LS) [3] is a simple but effective pitch-based melody matching method. Its main idea is to rescale the input audio, based on the observation that the length of the input audio is not always equal to that of the corresponding part in the MIDI data. LS tries different rescaling factors to stretch or compress the pitch contour of the input audio so that it matches the correct part in the MIDI file more accurately; the most appropriate rescaling factor yields the minimum distance between the input audio and the music clip in the database.

3) RA: Recursive alignment (RA) [4] is another pitch-based melody matching method. Since linear scaling cannot solve the problem of nonlinear alignment, RA addresses it in a top-down fashion that is more capable of capturing long-distance information in human singing. This method differs from DTW in that it starts optimization from a global view.
RA utilizes LS as a subroutine and tries to tune the local matching recursively in order to optimize the alignment. Further details may be found in [4].

4) EMD: Earth mover's distance (EMD) [5] measures the minimal cost that must be paid to transform one distribution into another. Melody matching can be naturally cast as a transportation problem by defining one clip as the supplier and the other as the consumer.
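As a concrete illustration, two of the matching distances of Section III-A can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: pitch sequences are plain lists of pitch values (e.g. semitones), note sets are lists of (note, duration) pairs, and the key-offset constant c and all function names are illustrative.

```python
def dtw_distance(p, q, c=0.0):
    """Pitch-based melody distance via the DTW recurrence of Eq. (2):
        D(i, j) = d(i, j) + min{D(i-2, j-1), D(i-1, j-1), D(i-1, j-2)}
    with local cost d(i, j) = |p_i - q_j - c|; c is a constant
    (assumed here to model a key offset between query and clip)."""
    n, m = len(p), len(q)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # start point of the lattice

    def at(i, j):  # out-of-range lattice points cost infinity
        return D[i][j] if i >= 0 and j >= 0 else INF

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(p[i - 1] - q[j - 1] - c)
            D[i][j] = d + min(at(i - 2, j - 1),
                              at(i - 1, j - 1),
                              at(i - 1, j - 2))
    return D[n][m]  # distance between the two full sequences


def emd_distance(supply, demand):
    """Note-based melody distance: 1-D earth mover's distance between
    two weighted note sets, each a list of (note, duration) pairs.
    Durations are normalized so both sides carry unit mass; in one
    dimension the EMD equals the area between the two CDFs."""
    def normalized(pairs):
        total = float(sum(w for _, w in pairs))
        return sorted((x, w / total) for x, w in pairs)

    sup, dem = normalized(supply), normalized(demand)
    xs = sorted({x for x, _ in sup} | {x for x, _ in dem})
    f1 = f2 = emd = 0.0
    i = j = 0
    prev_x = xs[0]
    for x in xs:
        emd += abs(f1 - f2) * (x - prev_x)  # mass moved * distance
        while i < len(sup) and sup[i][0] <= x:
            f1 += sup[i][1]
            i += 1
        while j < len(dem) and dem[j][0] <= x:
            f2 += dem[j][1]
            j += 1
        prev_x = x
    return emd


def melody_score(distance):
    """Melody score as the reciprocal of the distance, Eq. (1)."""
    return 1.0 / distance
```

With these definitions, a clip closer to the query in pitch (or note) space gets a smaller distance and hence a larger melody score; the top-K scoring clips become the candidates passed on to lyric recognition.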
To obtain the EMD between the input query and a candidate clip, we need to convert each clip into a set of notes with weights. Let P = {(p1, wp1), (p2, wp2), ..., (pn, wpn)} be the note set of a candidate clip acting as supplier, where pi is a note occurring in the candidate and wpi is the duration of pi. Similarly, let Q = {(q1, wq1), (q2, wq2), ..., (qm, wqm)} represent the query as consumer. The EMD, which here represents the melody distance between two clips, can be quickly calculated by many algorithms.

B. SVM-based Singing/Humming Classification

The SVM [12] has attracted many researchers due to its excellent performance on a wide range of classification problems. It is reported that SVMs can achieve equal or better performance than other classifiers while requiring significantly less training data. In our system, we use an SVM to distinguish humming clips from singing clips. The SVM is trained on 30 humming clips and 30 singing clips. All the training data and input audio are segmented into 0.25-second frames with 50% overlap, and 32-dimensional features are extracted for each frame.

1) Features: The 32-dimensional feature vector of each frame consists of one-dimensional zero crossing rate, one-dimensional spectral energy, one-dimensional spectral centroid, one-dimensional spectral bandwidth, eight-dimensional spectral band energy, eight-dimensional sub-band spectral flux, and twelve-dimensional mel-frequency cepstral coefficients (MFCC).

2) Singing/humming classification using SVM: In the classification process, we first segment an input audio clip into frames and classify each frame as singing or humming using the trained SVM. To mitigate the impact of inevitable classification errors, a median filter is used to smooth the classification result contour. Fig. 2 shows the results for the first 90 frames of a query before and after smoothing.
The width of the filter window is set to 3 in the example. It can be seen that two jitters are removed. Let N_s denote the number of singing frames of an input query, counted from the smoothed SVM classification results, and N_h the number of humming frames. N_s/(N_s+N_h) is the proportion of singing frames in the input query; a larger value indicates a higher probability that the input query is sung. It is important to note, however, that different misclassifications have different costs. The cost of a humming query being misclassified as singing is larger than the cost of a singing query being misclassified as humming: if a humming query is misclassified as singing, lyric information that does not actually exist will be extracted, which degrades performance; in the opposite situation, a singing query can still find its corresponding song using only melody information. We should therefore favor the classification accuracy of the singing category. A threshold T_s (0.5 <= T_s < 1) is used for this purpose: the input query is classified as singing when N_s/(N_s+N_h) >= T_s. A larger T_s leads to a higher classification accuracy for singing; that is, the singing decisions become more reliable.

C. Lyric Recognition

If the query clip is classified as singing, a lyric recognizer assigns a lyric score to each candidate clip selected by the melody matching method. Since melody matching has located each candidate clip within its corresponding song, it is easy to obtain the candidates' lyrics. By treating each lyric as a word, an isolated-word recognition network can be easily constructed. Fig. 3 shows the structure of the recognition network, which has K paths representing the K candidate lyrics. K is usually between 20 and 100.
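A minimal sketch of the singing/humming decision of Section III-B, together with the confidence-weighted fusion it later feeds into (Eq. (4) in Section III-D). Assumptions: per-frame SVM outputs are encoded as in Fig. 2 (1 for humming, -1 for singing); the function names, the edge padding of the median filter, and the example parameter values are illustrative, not the paper's exact implementation.

```python
def classify_query(frame_labels, t_s=0.55, window=3):
    """Singing/humming decision from per-frame SVM labels
    (1 = humming, -1 = singing). A median filter of odd width
    `window` removes isolated jitters; the query is called singing
    only if the singing proportion N_s/(N_s+N_h) reaches the
    threshold T_s (0.5 <= T_s < 1), biasing decisions toward the
    cheaper error of treating a sung query as hummed."""
    half = window // 2
    padded = ([frame_labels[0]] * half + list(frame_labels)
              + [frame_labels[-1]] * half)
    smoothed = [sorted(padded[i:i + window])[half]
                for i in range(len(frame_labels))]
    n_s = sum(1 for x in smoothed if x == -1)  # singing frames
    n_h = len(smoothed) - n_s                  # humming frames
    confidence = n_s / float(n_s + n_h)
    return ("singing" if confidence >= t_s else "humming"), confidence


def fused_score(melody_score, lyric_score, p, confidence):
    """Confidence-weighted score fusion, Eq. (4):
        CS(c_j) = p*MS(c_j) + (1-p) * (N_s/(N_s+N_h)) * LS(c_j)
    With confidence == 1 this reduces to the plain Weighted SUM rule
    of Eq. (3); a doubtful singing decision attenuates the lyric
    score, limiting the damage when a humming query is misclassified
    as singing."""
    return p * melody_score + (1.0 - p) * confidence * lyric_score
```

For a query classified as singing, each candidate's melody and lyric scores are combined using the confidence value returned above; for a humming query, the melody-only ranking is returned unchanged.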
The isolated-word recognizer uses continuous-density hidden Markov models with cross-word, context-dependent, tied-state triphones. 39-dimensional MFCC features are extracted from each frame for recognition. When the recognition process finishes, each word obtains a posterior probability; the lyric score of a candidate is the posterior probability of its corresponding lyric.

Figure 2. An example of smoothing results. On the vertical axis, 1 represents humming and -1 represents singing. The left panel shows the initial classification results of the first 90 frames of the input query, and the right panel shows the smoothed results.

Figure 3. The lyric recognition network.

An isolated-word recognizer performs better than a continuous speech recognizer in this system, since the
lyric of the input singing query is one of the K candidate lyrics. Due to the simplicity of the recognition network, lyric recognition is fast and accurate.

D. Combination of Melody and Lyric Scores

A score-level fusion strategy is proposed to combine the lyric and melody scores of the candidate clips for a singing query. Various methods have been proposed for score-level fusion [9], such as the MIN, MAX, SUM, PRODUCT, and Weighted SUM rules: MIN selects the minimum of the scores, MAX the maximum, PRODUCT their product, SUM their sum, and Weighted SUM their weighted sum. In the proposed QBSH system, we use the Weighted SUM rule, which we verified achieves the best performance among these rules. The final score of a candidate clip is calculated as follows:

CS(c_j) = p * MS(c_j) + (1 - p) * LS(c_j)    (3)

where c_j is the j-th candidate, MS(c_j) is the melody score of c_j, LS(c_j) is its lyric score (as mentioned in Section III-C, the posterior probability of its corresponding lyric), CS(c_j) is the fused score, and p is a weight coefficient determined empirically. Furthermore, the QBSH system deteriorates when a humming query is wrongly classified as a singing query. We therefore weight the lyric score by the classification confidence of singing. The improved score-level fusion method is:

CS(c_j) = p * MS(c_j) + (1 - p) * (N_s / (N_s + N_h)) * LS(c_j)    (4)

where N_s is the number of frames classified as singing for the query and N_h the number classified as humming. N_s/(N_s+N_h) represents the confidence that the input query is sung. The improved fusion method is more robust against classification errors.

IV. EXPERIMENTS
A. Experimental Data and Setup

The MIREX (Music Information Retrieval Evaluation eXchange) QBSH corpus released by Jang [13] is used to evaluate the proposed method. The corpus includes 48 MIDI files and 4431 singing or humming queries, all sung or hummed from the beginning of the song. We add 1000 MIDI files to the MIREX corpus to compose the MIDI database, and the lyrics database consists of the lyrics of all songs in the MIDI database. Since our lyric recognizer is for Mandarin, we selected the 878 queries belonging to Chinese songs in the corpus. 60 queries, including 30 humming clips and 30 singing clips, were randomly selected to train the SVM; the remaining 818 clips, including 417 singing queries and 401 humming queries, compose the test set. The acoustic model (AM) of the lyric recognizer was trained on the Chinese speech recognition corpus of the 863 Program [14], a database provided by the Chinese National High Technology Project 863 for Chinese LVCSR system development; all audio in the corpus is normal speech. All experiments were conducted on a PC platform in C++.

B. Evaluation Metrics

The evaluation measures are the top-M hit rate and the mean reciprocal rank (MRR). Let r_i denote the rank of the correct song for the i-th query; the top-M hit rate is the proportion of queries for which r_i <= M. MRR is the average of the reciprocal ranks across all queries and is calculated as (5), where n is the number of queries and rank_i is the rank of the correct song for the i-th query.

MRR = (1/n) * sum_{i=1}^{n} (1/rank_i)    (5)

C. Singing/Humming Classification Results Using SVM

Table I shows the singing/humming classification accuracies of Wang's method and the proposed SVM-based method for different values of T_s, as described in Section III-B. The second column of the table gives the results of Wang's method [8]. Here, the classification accuracy of singing/humming is defined as the proportion of correctly classified clips among the clips classified as singing/humming.
An overall classification accuracy of 89.27% is achieved when the threshold T_s is set to 0.5. It can be seen that the proposed SVM-based classifier significantly outperforms Wang's method, not only in overall classification accuracy but also in the accuracy of the singing category, which is the more important one in a QBSH system according to the analysis in Section III-B. Moreover, our method can easily control the classification accuracy of the singing category by setting different values of T_s: this accuracy increases with T_s.

D. Performance of Different Lyric Recognition Methods

All 417 singing clips are used to test the performance of our recognition method (with different values of K) against Wang's method. Fig. 4 shows the experimental results. SCSR is short for syllable-based continuous speech recognition, which is used by Wang et al. [8]. The other four curves represent our lyric recognition method, where K is the number of candidates selected by RA [4]. In these experiments, only the lyric scores are used to rank the candidates.
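For reference, the two evaluation measures of Section IV-B can be computed directly from the per-query ranks of the correct songs. A minimal sketch with illustrative function names:

```python
def top_m_hit_rate(ranks, m):
    """Proportion of queries whose correct song is ranked within the
    top M (i.e. rank r_i <= M)."""
    return sum(1 for r in ranks if r <= m) / float(len(ranks))


def mean_reciprocal_rank(ranks):
    """MRR, Eq. (5): the mean of 1/rank_i over all n queries."""
    return sum(1.0 / r for r in ranks) / float(len(ranks))
```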
TABLE I. THE HUMMING AND SINGING CLASSIFICATION ACCURACIES.

Category   SHD [8]   T_s=0.50   T_s=0.55   T_s=0.60   T_s=0.65
Humming    54.80%    70.36%     68.6%      65.54%     63.05%
Singing    95.43%    96.57%     97.2%      97.30%     97.64%

Figure 4. The retrieval performances using only lyric scores derived from different recognition methods. The vertical axis shows the hit rate, while the horizontal axis shows the top-T candidates and MRR.

As can be seen, the proposed isolated-word lyric recognizer is much more effective than the continuous speech recognizer used in Wang's method [8]. In addition, recognition is approximately three times faster than with SCSR. It should be noted that the top-20 hit rate drops when K increases from 10 to 20. A smaller K means fewer competing paths in the network, which helps recognition; but if K is too small, the correct lyric may not be included among the K candidates at all, which necessarily causes recognition errors.

Figure 5. The performance of RA, DTW and their combinations with lyric recognition.

E. Retrieval Accuracy of the Proposed QBSH Systems

The melody retrieval part of our system can adopt any existing melody matching method. We realize four systems, namely Melody&Lyric RA, Melody&Lyric DTW, Melody&Lyric LS and Melody&Lyric EMD, using RA, DTW, LS and EMD as the melody matching method respectively. Fig. 5 and Fig. 6 show the performance of these four systems (K=50 and T_s=0.55 for all four). The axes in Fig. 5 and Fig. 6 have the same meaning as in Fig. 4. As can be seen, Melody&Lyric RA, which uses both lyric and melody information, performs better than Melody RA, which is based only on melody information; the same conclusion holds for the other three systems. From Fig. 5 and Fig. 6, we can also see that among the melody-only systems RA performs best, DTW second, LS third, and EMD worst.
After adding lyric information, however, the corresponding four systems achieve error reduction rates of 22.9%, 25.0%, 28.7% and 33.5% respectively. This indicates that the worse the melody-only system performs, the greater the improvement from adding lyrics.

Figure 6. The performance of LS, EMD and their combinations with lyric recognition.

V. CONCLUSION

In this paper, we proposed a novel QBSH method that adds lyric information. An SVM classifier was used to identify singing queries, an isolated-word recognizer was used to recognize lyrics, and a confidence-based fusion method was proposed to combine melody and lyric scores. Our experiments demonstrate that this method achieves promising results on the test data. Our current lyric recognizer is designed for Mandarin and cannot handle English songs; in future work we will try to develop a Mandarin-English bilingual recognizer.

ACKNOWLEDGMENT

This work was partially supported by the project under Grant No. B08004, a key project of the Ministry of Science and Technology of China under Grant No. 2012ZX, the Innovation Fund of the Information and Communication Engineering School of BUPT in 2011, the Development Program (863) of China under Grant No. 2011AA01A205, and the Next-Generation Broadband
Wireless Mobile Communications Network Technology Key Project under Grant No. 2011ZX.

REFERENCES

[1] A. Ghias, J. Logan, D. Chamberlin, B.C. Smith, Query by humming: Musical information retrieval in an audio database, Proc. ACM Multimedia, 1995.
[2] J.S.R. Jang, M.Y. Gao, A query-by-singing system based on dynamic programming, Proc. International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), Hsinchu, Taiwan, Dec. 2000.
[3] J.S.R. Jang, H. Lee, M. Kao, Content-based music retrieval using linear scaling and branch-and-bound tree search, Proc. ICME, 2001.
[4] X. Wu, M. Li, J. Liu, J. Yang, Y. Yan, A top-down approach to melody match in pitch contour for query by humming, Proc. International Conference of Chinese Spoken Language Processing.
[5] S. Huang, L. Wang, S. Hu, H. Jiang, B. Xu, Query by humming via multiscale transportation distance in random query occurrence context, Proc. ICME.
[6] G. Haus, E. Pollastri, An audio front end for query-by-humming systems, Proc. ISMIR, 2001.
[7] M. Suzuki, T. Hosoya, A. Ito, S. Makino, Music information retrieval from a singing voice using lyrics and melody information, EURASIP Journal on Advances in Signal Processing, vol. 2007, 2007.
[8] C.C. Wang, J.S.R. Jang, W. Wang, An improved query by singing/humming system using melody and lyrics information, Proc. ISMIR, 2010.
[9] G.P. Nam, T.T.T. Luong, H.H. Nam, Intelligent query by humming system based on score level fusion of multiple classifiers, EURASIP Journal on Advances in Signal Processing, vol. 2011, 2011.
[10] Q. Wang, Z. Guo, G. Liu, J. Guo, Y. Lu, Query by humming by using locality sensitive hashing based on combination of pitch and note, Proc. International Conference on Multimedia & Expo Workshops (ICMEW), 2012.
[11] Z. Guo, Q. Wang, G. Liu, J. Guo, A music retrieval system based on spoken lyric queries, International Journal of Advancements in Computing Technology, in press.
[12] B.E. Boser, I. Guyon, V. Vapnik, A training algorithm for optimal margin classifiers, Proc. COLT, 1992.
[13]
[14] Y. Qian, S. Lin, Y. Zhang, Y. Liu, H. Liu, Q. Liu, An introduction to corpora resources of 863 program for Chinese language processing and human machine interaction, Proc. ALR2004, affiliated to IJCNLP.
More informationSinger Identification
Singer Identification Bertrand SCHERRER McGill University March 15, 2007 Bertrand SCHERRER (McGill University) Singer Identification March 15, 2007 1 / 27 Outline 1 Introduction Applications Challenges
More informationAUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION
AUTOREGRESSIVE MFCC MODELS FOR GENRE CLASSIFICATION IMPROVED BY HARMONIC-PERCUSSION SEPARATION Halfdan Rump, Shigeki Miyabe, Emiru Tsunoo, Nobukata Ono, Shigeki Sagama The University of Tokyo, Graduate
More informationThe Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng
The Research of Controlling Loudness in the Timbre Subjective Perception Experiment of Sheng S. Zhu, P. Ji, W. Kuang and J. Yang Institute of Acoustics, CAS, O.21, Bei-Si-huan-Xi Road, 100190 Beijing,
More informationMusic Database Retrieval Based on Spectral Similarity
Music Database Retrieval Based on Spectral Similarity Cheng Yang Department of Computer Science Stanford University yangc@cs.stanford.edu Abstract We present an efficient algorithm to retrieve similar
More information19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;
More informationInternational Journal of Advance Engineering and Research Development MUSICAL INSTRUMENT IDENTIFICATION AND STATUS FINDING WITH MFCC
Scientific Journal of Impact Factor (SJIF): 5.71 International Journal of Advance Engineering and Research Development Volume 5, Issue 04, April -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 MUSICAL
More informationAudio-Based Video Editing with Two-Channel Microphone
Audio-Based Video Editing with Two-Channel Microphone Tetsuya Takiguchi Organization of Advanced Science and Technology Kobe University, Japan takigu@kobe-u.ac.jp Yasuo Ariki Organization of Advanced Science
More informationDetecting Musical Key with Supervised Learning
Detecting Musical Key with Supervised Learning Robert Mahieu Department of Electrical Engineering Stanford University rmahieu@stanford.edu Abstract This paper proposes and tests performance of two different
More informationChord Classification of an Audio Signal using Artificial Neural Network
Chord Classification of an Audio Signal using Artificial Neural Network Ronesh Shrestha Student, Department of Electrical and Electronic Engineering, Kathmandu University, Dhulikhel, Nepal ---------------------------------------------------------------------***---------------------------------------------------------------------
More informationDAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval
DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca
More informationWHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG?
WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? NICHOLAS BORG AND GEORGE HOKKANEN Abstract. The possibility of a hit song prediction algorithm is both academically interesting and industry motivated.
More informationStatistical Modeling and Retrieval of Polyphonic Music
Statistical Modeling and Retrieval of Polyphonic Music Erdem Unal Panayiotis G. Georgiou and Shrikanth S. Narayanan Speech Analysis and Interpretation Laboratory University of Southern California Los Angeles,
More informationA Study of Synchronization of Audio Data with Symbolic Data. Music254 Project Report Spring 2007 SongHui Chon
A Study of Synchronization of Audio Data with Symbolic Data Music254 Project Report Spring 2007 SongHui Chon Abstract This paper provides an overview of the problem of audio and symbolic synchronization.
More informationHUMMING METHOD FOR CONTENT-BASED MUSIC INFORMATION RETRIEVAL
12th International Society for Music Information Retrieval Conference (ISMIR 211) HUMMING METHOD FOR CONTENT-BASED MUSIC INFORMATION RETRIEVAL Cristina de la Bandera, Ana M. Barbancho, Lorenzo J. Tardón,
More informationA Query-by-singing Technique for Retrieving Polyphonic Objects of Popular Music
A Query-by-singing Technique for Retrieving Polyphonic Objects of Popular Music Hung-Ming Yu, Wei-Ho Tsai, and Hsin-Min Wang Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic
More informationError Resilience for Compressed Sensing with Multiple-Channel Transmission
Journal of Information Hiding and Multimedia Signal Processing c 2015 ISSN 2073-4212 Ubiquitous International Volume 6, Number 5, September 2015 Error Resilience for Compressed Sensing with Multiple-Channel
More informationMUSI-6201 Computational Music Analysis
MUSI-6201 Computational Music Analysis Part 9.1: Genre Classification alexander lerch November 4, 2015 temporal analysis overview text book Chapter 8: Musical Genre, Similarity, and Mood (pp. 151 155)
More informationSinging voice synthesis based on deep neural networks
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
More informationPOST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS
POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS Andrew N. Robertson, Mark D. Plumbley Centre for Digital Music
More information2. AN INTROSPECTION OF THE MORPHING PROCESS
1. INTRODUCTION Voice morphing means the transition of one speech signal into another. Like image morphing, speech morphing aims to preserve the shared characteristics of the starting and final signals,
More informationWAKE-UP-WORD SPOTTING FOR MOBILE SYSTEMS. A. Zehetner, M. Hagmüller, and F. Pernkopf
WAKE-UP-WORD SPOTTING FOR MOBILE SYSTEMS A. Zehetner, M. Hagmüller, and F. Pernkopf Graz University of Technology Signal Processing and Speech Communication Laboratory, Austria ABSTRACT Wake-up-word (WUW)
More informationNormalized Cumulative Spectral Distribution in Music
Normalized Cumulative Spectral Distribution in Music Young-Hwan Song, Hyung-Jun Kwon, and Myung-Jin Bae Abstract As the remedy used music becomes active and meditation effect through the music is verified,
More informationComputational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)
Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,
More informationComparison of Dictionary-Based Approaches to Automatic Repeating Melody Extraction
Comparison of Dictionary-Based Approaches to Automatic Repeating Melody Extraction Hsuan-Huei Shih, Shrikanth S. Narayanan and C.-C. Jay Kuo Integrated Media Systems Center and Department of Electrical
More informationToward Evaluation Techniques for Music Similarity
Toward Evaluation Techniques for Music Similarity Beth Logan, Daniel P.W. Ellis 1, Adam Berenzweig 1 Cambridge Research Laboratory HP Laboratories Cambridge HPL-2003-159 July 29 th, 2003* E-mail: Beth.Logan@hp.com,
More informationResearch on sampling of vibration signals based on compressed sensing
Research on sampling of vibration signals based on compressed sensing Hongchun Sun 1, Zhiyuan Wang 2, Yong Xu 3 School of Mechanical Engineering and Automation, Northeastern University, Shenyang, China
More informationCreating Data Resources for Designing User-centric Frontends for Query by Humming Systems
Creating Data Resources for Designing User-centric Frontends for Query by Humming Systems Erdem Unal S. S. Narayanan H.-H. Shih Elaine Chew C.-C. Jay Kuo Speech Analysis and Interpretation Laboratory,
More informationCharacteristics of Polyphonic Music Style and Markov Model of Pitch-Class Intervals
Characteristics of Polyphonic Music Style and Markov Model of Pitch-Class Intervals Eita Nakamura and Shinji Takaki National Institute of Informatics, Tokyo 101-8430, Japan eita.nakamura@gmail.com, takaki@nii.ac.jp
More informationA Pattern Recognition Approach for Melody Track Selection in MIDI Files
A Pattern Recognition Approach for Melody Track Selection in MIDI Files David Rizo, Pedro J. Ponce de León, Carlos Pérez-Sancho, Antonio Pertusa, José M. Iñesta Departamento de Lenguajes y Sistemas Informáticos
More informationTANSEN: A QUERY-BY-HUMMING BASED MUSIC RETRIEVAL SYSTEM. M. Anand Raju, Bharat Sundaram* and Preeti Rao
TANSEN: A QUERY-BY-HUMMING BASE MUSIC RETRIEVAL SYSTEM M. Anand Raju, Bharat Sundaram* and Preeti Rao epartment of Electrical Engineering, Indian Institute of Technology, Bombay Powai, Mumbai 400076 {maji,prao}@ee.iitb.ac.in
More informationAutomatic Piano Music Transcription
Automatic Piano Music Transcription Jianyu Fan Qiuhan Wang Xin Li Jianyu.Fan.Gr@dartmouth.edu Qiuhan.Wang.Gr@dartmouth.edu Xi.Li.Gr@dartmouth.edu 1. Introduction Writing down the score while listening
More informationSinging Pitch Extraction and Singing Voice Separation
Singing Pitch Extraction and Singing Voice Separation Advisor: Jyh-Shing Roger Jang Presenter: Chao-Ling Hsu Multimedia Information Retrieval Lab (MIR) Department of Computer Science National Tsing Hua
More informationOptimized Color Based Compression
Optimized Color Based Compression 1 K.P.SONIA FENCY, 2 C.FELSY 1 PG Student, Department Of Computer Science Ponjesly College Of Engineering Nagercoil,Tamilnadu, India 2 Asst. Professor, Department Of Computer
More informationEfficient Vocal Melody Extraction from Polyphonic Music Signals
http://dx.doi.org/1.5755/j1.eee.19.6.4575 ELEKTRONIKA IR ELEKTROTECHNIKA, ISSN 1392-1215, VOL. 19, NO. 6, 213 Efficient Vocal Melody Extraction from Polyphonic Music Signals G. Yao 1,2, Y. Zheng 1,2, L.
More informationDETECTION OF SLOW-MOTION REPLAY SEGMENTS IN SPORTS VIDEO FOR HIGHLIGHTS GENERATION
DETECTION OF SLOW-MOTION REPLAY SEGMENTS IN SPORTS VIDEO FOR HIGHLIGHTS GENERATION H. Pan P. van Beek M. I. Sezan Electrical & Computer Engineering University of Illinois Urbana, IL 6182 Sharp Laboratories
More informationA Categorical Approach for Recognizing Emotional Effects of Music
A Categorical Approach for Recognizing Emotional Effects of Music Mohsen Sahraei Ardakani 1 and Ehsan Arbabi School of Electrical and Computer Engineering, College of Engineering, University of Tehran,
More informationToward Multi-Modal Music Emotion Classification
Toward Multi-Modal Music Emotion Classification Yi-Hsuan Yang 1, Yu-Ching Lin 1, Heng-Tze Cheng 1, I-Bin Liao 2, Yeh-Chin Ho 2, and Homer H. Chen 1 1 National Taiwan University 2 Telecommunication Laboratories,
More informationA Fast Alignment Scheme for Automatic OCR Evaluation of Books
A Fast Alignment Scheme for Automatic OCR Evaluation of Books Ismet Zeki Yalniz, R. Manmatha Multimedia Indexing and Retrieval Group Dept. of Computer Science, University of Massachusetts Amherst, MA,
More informationMUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES
MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES PACS: 43.60.Lq Hacihabiboglu, Huseyin 1,2 ; Canagarajah C. Nishan 2 1 Sonic Arts Research Centre (SARC) School of Computer Science Queen s University
More informationAUTOMATIC MAPPING OF SCANNED SHEET MUSIC TO AUDIO RECORDINGS
AUTOMATIC MAPPING OF SCANNED SHEET MUSIC TO AUDIO RECORDINGS Christian Fremerey, Meinard Müller,Frank Kurth, Michael Clausen Computer Science III University of Bonn Bonn, Germany Max-Planck-Institut (MPI)
More informationA Framework for Segmentation of Interview Videos
A Framework for Segmentation of Interview Videos Omar Javed, Sohaib Khan, Zeeshan Rasheed, Mubarak Shah Computer Vision Lab School of Electrical Engineering and Computer Science University of Central Florida
More informationTranscription of the Singing Melody in Polyphonic Music
Transcription of the Singing Melody in Polyphonic Music Matti Ryynänen and Anssi Klapuri Institute of Signal Processing, Tampere University Of Technology P.O.Box 553, FI-33101 Tampere, Finland {matti.ryynanen,
More informationSelective Intra Prediction Mode Decision for H.264/AVC Encoders
Selective Intra Prediction Mode Decision for H.264/AVC Encoders Jun Sung Park, and Hyo Jung Song Abstract H.264/AVC offers a considerably higher improvement in coding efficiency compared to other compression
More informationMood Tracking of Radio Station Broadcasts
Mood Tracking of Radio Station Broadcasts Jacek Grekow Faculty of Computer Science, Bialystok University of Technology, Wiejska 45A, Bialystok 15-351, Poland j.grekow@pb.edu.pl Abstract. This paper presents
More informationRobert Alexandru Dobre, Cristian Negrescu
ECAI 2016 - International Conference 8th Edition Electronics, Computers and Artificial Intelligence 30 June -02 July, 2016, Ploiesti, ROMÂNIA Automatic Music Transcription Software Based on Constant Q
More informationMusic Mood Classification - an SVM based approach. Sebastian Napiorkowski
Music Mood Classification - an SVM based approach Sebastian Napiorkowski Topics on Computer Music (Seminar Report) HPAC - RWTH - SS2015 Contents 1. Motivation 2. Quantification and Definition of Mood 3.
More informationOutline. Why do we classify? Audio Classification
Outline Introduction Music Information Retrieval Classification Process Steps Pitch Histograms Multiple Pitch Detection Algorithm Musical Genre Classification Implementation Future Work Why do we classify
More informationVISUAL CONTENT BASED SEGMENTATION OF TALK & GAME SHOWS. O. Javed, S. Khan, Z. Rasheed, M.Shah. {ojaved, khan, zrasheed,
VISUAL CONTENT BASED SEGMENTATION OF TALK & GAME SHOWS O. Javed, S. Khan, Z. Rasheed, M.Shah {ojaved, khan, zrasheed, shah}@cs.ucf.edu Computer Vision Lab School of Electrical Engineering and Computer
More informationAdaptive Key Frame Selection for Efficient Video Coding
Adaptive Key Frame Selection for Efficient Video Coding Jaebum Jun, Sunyoung Lee, Zanming He, Myungjung Lee, and Euee S. Jang Digital Media Lab., Hanyang University 17 Haengdang-dong, Seongdong-gu, Seoul,
More informationFree Viewpoint Switching in Multi-view Video Streaming Using. Wyner-Ziv Video Coding
Free Viewpoint Switching in Multi-view Video Streaming Using Wyner-Ziv Video Coding Xun Guo 1,, Yan Lu 2, Feng Wu 2, Wen Gao 1, 3, Shipeng Li 2 1 School of Computer Sciences, Harbin Institute of Technology,
More informationA Bootstrap Method for Training an Accurate Audio Segmenter
A Bootstrap Method for Training an Accurate Audio Segmenter Ning Hu and Roger B. Dannenberg Computer Science Department Carnegie Mellon University 5000 Forbes Ave Pittsburgh, PA 1513 {ninghu,rbd}@cs.cmu.edu
More informationGYROPHONE RECOGNIZING SPEECH FROM GYROSCOPE SIGNALS. Yan Michalevsky (1), Gabi Nakibly (2) and Dan Boneh (1)
GYROPHONE RECOGNIZING SPEECH FROM GYROSCOPE SIGNALS Yan Michalevsky (1), Gabi Nakibly (2) and Dan Boneh (1) (1) Stanford University (2) National Research and Simulation Center, Rafael Ltd. 0 MICROPHONE
More informationEffects of acoustic degradations on cover song recognition
Signal Processing in Acoustics: Paper 68 Effects of acoustic degradations on cover song recognition Julien Osmalskyj (a), Jean-Jacques Embrechts (b) (a) University of Liège, Belgium, josmalsky@ulg.ac.be
More informationPattern Based Melody Matching Approach to Music Information Retrieval
Pattern Based Melody Matching Approach to Music Information Retrieval 1 D.Vikram and 2 M.Shashi 1,2 Department of CSSE, College of Engineering, Andhra University, India 1 daravikram@yahoo.co.in, 2 smogalla2000@yahoo.com
More informationNEW QUERY-BY-HUMMING MUSIC RETRIEVAL SYSTEM CONCEPTION AND EVALUATION BASED ON A QUERY NATURE STUDY
Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-), Limerick, Ireland, December 6-8,2 NEW QUERY-BY-HUMMING MUSIC RETRIEVAL SYSTEM CONCEPTION AND EVALUATION BASED ON A QUERY NATURE
More informationEnhancing Music Maps
Enhancing Music Maps Jakob Frank Vienna University of Technology, Vienna, Austria http://www.ifs.tuwien.ac.at/mir frank@ifs.tuwien.ac.at Abstract. Private as well as commercial music collections keep growing
More informationAudio Structure Analysis
Lecture Music Processing Audio Structure Analysis Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Music Structure Analysis Music segmentation pitch content
More informationA repetition-based framework for lyric alignment in popular songs
A repetition-based framework for lyric alignment in popular songs ABSTRACT LUONG Minh Thang and KAN Min Yen Department of Computer Science, School of Computing, National University of Singapore We examine
More informationBilbo-Val: Automatic Identification of Bibliographical Zone in Papers
Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers Amal Htait, Sebastien Fournier and Patrice Bellot Aix Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,13397,
More informationImproving Performance in Neural Networks Using a Boosting Algorithm
- Improving Performance in Neural Networks Using a Boosting Algorithm Harris Drucker AT&T Bell Laboratories Holmdel, NJ 07733 Robert Schapire AT&T Bell Laboratories Murray Hill, NJ 07974 Patrice Simard
More informationIMPROVED MELODIC SEQUENCE MATCHING FOR QUERY BASED SEARCHING IN INDIAN CLASSICAL MUSIC
IMPROVED MELODIC SEQUENCE MATCHING FOR QUERY BASED SEARCHING IN INDIAN CLASSICAL MUSIC Ashwin Lele #, Saurabh Pinjani #, Kaustuv Kanti Ganguli, and Preeti Rao Department of Electrical Engineering, Indian
More informationStory Tracking in Video News Broadcasts. Ph.D. Dissertation Jedrzej Miadowicz June 4, 2004
Story Tracking in Video News Broadcasts Ph.D. Dissertation Jedrzej Miadowicz June 4, 2004 Acknowledgements Motivation Modern world is awash in information Coming from multiple sources Around the clock
More informationMusic Segmentation Using Markov Chain Methods
Music Segmentation Using Markov Chain Methods Paul Finkelstein March 8, 2011 Abstract This paper will present just how far the use of Markov Chains has spread in the 21 st century. We will explain some
More informationColor Quantization of Compressed Video Sequences. Wan-Fung Cheung, and Yuk-Hee Chan, Member, IEEE 1 CSVT
CSVT -02-05-09 1 Color Quantization of Compressed Video Sequences Wan-Fung Cheung, and Yuk-Hee Chan, Member, IEEE 1 Abstract This paper presents a novel color quantization algorithm for compressed video
More informationAutomatic Singing Performance Evaluation Using Accompanied Vocals as Reference Bases *
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 31, 821-838 (2015) Automatic Singing Performance Evaluation Using Accompanied Vocals as Reference Bases * Department of Electronic Engineering National Taipei
More informationMusic Genre Classification and Variance Comparison on Number of Genres
Music Genre Classification and Variance Comparison on Number of Genres Miguel Francisco, miguelf@stanford.edu Dong Myung Kim, dmk8265@stanford.edu 1 Abstract In this project we apply machine learning techniques
More informationFigure 1: Feature Vector Sequence Generator block diagram.
1 Introduction Figure 1: Feature Vector Sequence Generator block diagram. We propose designing a simple isolated word speech recognition system in Verilog. Our design is naturally divided into two modules.
More informationVoice & Music Pattern Extraction: A Review
Voice & Music Pattern Extraction: A Review 1 Pooja Gautam 1 and B S Kaushik 2 Electronics & Telecommunication Department RCET, Bhilai, Bhilai (C.G.) India pooja0309pari@gmail.com 2 Electrical & Instrumentation
More information