GEOGRAPHICAL ORIGIN PREDICTION OF FOLK MUSIC RECORDINGS FROM THE UNITED KINGDOM

Vytaute Kedyte 1, Maria Panteli 2, Tillman Weyde 1, Simon Dixon 2
1 Department of Computer Science, City University of London, United Kingdom
2 Centre for Digital Music, Queen Mary University of London, United Kingdom
{Vytaute.Kedyte, T.E.Weyde}@city.ac.uk, {m.panteli, s.e.dixon}@qmul.ac.uk

ABSTRACT

Field recordings from ethnomusicological research since the beginning of the 20th century are available today in large digitised music archives. The application of music information retrieval and data mining technologies can aid large-scale data processing, leading to a better understanding of the history of cultural exchange. In this paper we focus on folk and traditional music from the United Kingdom and study the correlation between spatial origins and musical characteristics. In particular, we investigate whether the geographical location of music recordings can be predicted solely from the content of the audio signal. We build a neural network that takes as input a feature vector capturing musical aspects of the audio signal and predicts the latitude and longitude of the origins of the music recording. We explore the performance of the model for different sets of features and compare the prediction accuracy between geographical regions of the UK. Our model predicts the geographical coordinates of music recordings with an average error of less than 120 km. The model can be used in a similar manner to identify the origins of recordings in large unlabelled music collections and reveal patterns of similarity in music from around the world.

© Vytaute Kedyte, Maria Panteli, Tillman Weyde, Simon Dixon. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Vytaute Kedyte, Maria Panteli, Tillman Weyde, Simon Dixon. "Geographical origin prediction of folk music recordings from the United Kingdom", 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.

1. INTRODUCTION

Since the beginning of the 20th century, ethnomusicological research has contributed significantly to the collection of recorded music from around the world. Collections of field recordings are preserved today in digital archives such as the British Library Sound Archive. The advances of Music Information Retrieval (MIR) technologies make it possible to process large numbers of music recordings. We are interested in applying these computational tools to study a large collection of folk and traditional music from the United Kingdom (UK). We focus on exploring music attributes with respect to geographical regions of the UK and investigate patterns of music similarity.

The comparison of music from different geographical regions has been the topic of several studies from the field of ethnomusicology and in particular the branch of comparative musicology [13]. Savage et al. [17] studied stylistic similarity within music cultures of Taiwan. In particular, they formed music clusters for a collection of 259 traditional songs from twelve indigenous populations of Taiwan and studied the distribution of these clusters across geographical regions of Taiwan. They showed that songs of Taiwan can be grouped into 5 clusters correlated with geographical factors and repertoire diversity. Savage et al. [18] analysed 304 recordings contained in the Garland Encyclopedia of World Music [14] and investigated the distribution of music attributes across music recordings from around the world.
They proposed 18 music features that are shared amongst many music cultures of the world and a network of 10 features that often occur together. The aforementioned studies incorporated knowledge from human experts in order to annotate music characteristics for each recording. While expert knowledge provides reliable and in-depth insights into the music, the amount of human labour involved in the process makes it impractical for large-scale music corpora. Computational tools, on the other hand, provide an efficient solution to processing large numbers of music recordings.

In the field of MIR several studies have used computational tools to study large music corpora. For example, Mauch et al. [10] studied the evolution of popular music in the USA in a collection of approximately 17,000 recordings. They concluded that popular music in the US evolved with particular rapidity during three stylistic revolutions, around 1964, 1983 and 1991. With respect to non-Western music repertoires, Moelants et al. [12] studied pitch distributions in 901 recordings from Central Africa spanning the beginning until the end of the 20th century. They observed that recent recordings tend to use more equally-tempered scales than older recordings.

Computational studies have also focused on predicting the geographic location of recordings from their music content. Gómez et al. [3] approached prediction of musical cultures as a classification problem, and classified music tracks into Western and non-Western. They identified correlations between the latitude and tonal features, and between the longitude and rhythmic descriptors. Their work illustrates the complexity of using regression to predict the geographical coordinates of music origin. Zhou et al. [23] also approached this as a regression problem, predicting latitudes and longitudes of the capital city of the music's country of origin, for pieces of music from 73 countries. They used K-nearest neighbours and Random Forest regression techniques, and achieved a mean distance error between predicted and target coordinates of 3113 kilometres (km). The advantage of treating geographic origin prediction as a regression problem is that it allows the latitude and longitude correlations found by Gómez et al. [3] to be considered, as well as the topology of the Earth. The disadvantage is not accounting for latitudes getting distorted towards the poles, and longitudes diverging at ±180 degrees. Location is usually used as an input feature in regression models; however, some studies have explored prediction of geographical origin in a continuous space in the domains of linguistics [2], criminology [22], and genetics [15, 21].

In this paper we study the correlation between spatial origins and musical characteristics of field recordings from the UK. We investigate whether the geographical location of a music recording can be predicted solely based on its audio content. We extract features capturing musical aspects of the audio signal and train a neural network to predict the latitude and longitude of the origins of the recording. We investigate the model's performance for different network architectures and learning parameters. We also compare the performance accuracy for several feature sets as well as the accuracy across different geographical regions of the UK. Our developments contribute to the evaluation of existing audio features and their applicability to folk music analysis. Our results provide insights for music patterns across the UK, but the model can be expanded to process music recordings from all around the world. This could contribute to identifying the location of recordings in large unlabelled music collections as well as studying patterns of music similarity in world music.

This paper is organised as follows: Section 2 provides an overview of the music collection and Section 3 describes the different sets of audio features considered in this study. Section 4 provides a detailed description of the neural network architecture as well as the training and testing procedures. Section 5 presents the results of the model for different learning parameters, audio features, and geographical areas. We conclude with a discussion and directions for future work.

2. DATASET

Our music dataset is drawn from the World & Traditional music collection of the British Library Sound Archive (http://sounds.bl.uk/world-and-traditional-music), which includes thousands of music recordings collected over decades of ethnomusicological research. In particular, we use a subset of the World & Traditional music collection curated for the Digital Music Lab project [1]. This subset consists of more than 29,000 audio recordings with a large representation (17,000) from the UK. We focus solely on recordings from the UK and process information on the recording's location (if available) to extract the latitude and longitude coordinates. We keep only those tracks whose extracted coordinates lie within the spatial boundaries of the UK. The final dataset consists of a total of 10,055 recordings. The recordings span the years between 1940 and 2000 with median year 1983 and standard deviation 12.3 years. See Figure 1 for an overview of the geographical and temporal distribution of the dataset.

Figure 1: Geographical spread (a) and year distribution (b) in our dataset of 10,055 traditional music recordings from the UK.

The origins of the recordings span a range of maximum 1222 km. From the origins of all 10,055 recordings we compute the average latitude and average longitude coordinates and estimate the distance between each recording's location and the average latitude and longitude. This results in a mean distance of 167 km with a standard deviation of 85 km. A similar estimate is computed from the recordings in the training set and used as the random baseline for our regression predictions (Section 5).
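The baseline can be made concrete with a short NumPy sketch (an illustration rather than the authors' code; function names are ours, and the 6367 km Earth radius anticipates Equation (1) in Section 4):

    import numpy as np

    EARTH_RADIUS_KM = 6367.0  # radius used in Equation (1)

    def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
        # Great-circle distance in km; inputs are in degrees.
        lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
        a = (np.sin((lat2 - lat1) / 2) ** 2
             + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
        return 2 * r * np.arcsin(np.sqrt(a))

    def centroid_baseline(lats, lons):
        # Mean and std of distances from each recording to the average
        # latitude/longitude, as used for the random baseline.
        lats, lons = np.asarray(lats), np.asarray(lons)
        d = haversine_km(lats, lons, lats.mean(), lons.mean())
        return d.mean(), d.std()

On the full dataset this estimate gives the 167 km mean distance quoted above; applied to the training set only, it yields the baseline used in Section 5.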
3. AUDIO FEATURES

We aim to process music recordings to extract audio features that capture relevant music characteristics. We use a speech/music segmentation algorithm as a preprocessing step and extract features from the music segments using available VAMP plugins (http://vamp-plugins.org). We post-process the output of the VAMP plugins to compute musical descriptors based on state-of-the-art MIR research. Additional dimensionality reduction and scaling is considered as a final step. The methodology is summarised in Figure 2 and details are explained below.

Figure 2: Summary of the methodology: UK folk music recordings are processed with a speech/music segmentation algorithm and VAMP plugins are applied to the music segments. Audio features (IOI ratio histogram, pitch histogram, contour features, chromagram and MFCC summary statistics) are derived from the output of the VAMP plugins, PCA is applied, and the output is fed to a neural network that predicts the latitude and longitude of the recording.

Several recordings in our dataset consist of compilations of multiple songs or a mixture of speech and music segments. The first step in our methodology is to use a speech/music segmentation algorithm to extract relevant music segments from which the rest of the analysis is derived. We choose the best performing segmentation algorithm [9] based on the results of the Music/Speech Detection task of the MIREX 2015 evaluation (http://www.music-ir.org/mirex/wiki/2015:Music/Speech_Classification_and_Detection). We apply the segmentation algorithm to extract music segments from each recording in our dataset. We require a minimum of 10 seconds of music for each recording and discard any recordings with total duration of music segments less than this threshold.
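The duration filter can be sketched as follows (our illustration; we assume the segmenter output has been parsed into (start, end, label) tuples, which is an assumption about the interface rather than a documented format):

    MIN_MUSIC_SECONDS = 10.0  # threshold from the text

    def music_segments(segments):
        # Keep only the segments labelled as music.
        return [(start, end) for (start, end, label) in segments
                if label == "music"]

    def has_enough_music(segments, min_seconds=MIN_MUSIC_SECONDS):
        # A recording is retained only if its music segments total
        # at least `min_seconds` of audio.
        total = sum(end - start for (start, end) in music_segments(segments))
        return total >= min_seconds

Recordings failing this check are discarded before feature extraction.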

Our analysis aims to capture relevant musical characteristics which are informative for the spatial origins of the music. We focus on aspects of rhythm, melody, timbre, and harmony. We derive audio features from the following VAMP plugins: MELODIA - Melody Extraction (http://mtg.upf.edu/technologies/melodia), Queen Mary - Chromagram (http://vamp-plugins.org/plugin-doc/qm-vamp-plugins.html#qm-chromagram), Queen Mary - Mel-Frequency Cepstral Coefficients (http://vamp-plugins.org/plugin-doc/qm-vamp-plugins.html#qm-mfcc), and Queen Mary - Note Onset Detector (http://vamp-plugins.org/plugin-doc/qm-vamp-plugins.html#qm-onsetdetector). We apply these plugins to each recording in our dataset and omit frames that correspond to non-music segments as annotated by the previous step of speech/music segmentation. The raw output of the VAMP plugins cannot be directly incorporated in our regression model. We post-process the output into low-dimensional and musically meaningful descriptors as explained below.

Rhythm. We post-process the output of the Queen Mary - Note Onset Detector plugin to derive histograms of inter-onset interval (IOI) ratios [4]. Let O = {o_1, ..., o_n} denote a sequence of n onset locations (in seconds) as output by the VAMP plugin. The IOIs are defined as IOI_i = o_{i+1} - o_i for index i = 1, ..., n-1. The IOI ratios are defined as IOIR_j = IOI_{j+1} / IOI_j for index j = 1, ..., n-2. The IOI ratios are tempo-independent descriptors because the tempo information carried with the magnitude of the IOIs vanishes with the ratio estimation. We compute a histogram of the IOIR values with 100 bins uniformly distributed over [0, 1).

Timbre. We extract summary statistics from the output of the Queen Mary - Mel-Frequency Cepstral Coefficients (MFCC) plugin [8] with the default values of frame and hop size. In particular, we remove the first coefficient (DC component) and extract the min, max, mean, and standard deviation of the remaining 19 MFCCs over time.

Melody. The output of the MELODIA - Melody Extraction plugin denotes the frequency estimates over time of the lead melody. We extract a set of features capturing characteristics of the pitch contour shape and melodic embellishments [16]. In particular, we extract statistics of the pitch range and duration, fit a polynomial curve to model the overall shape and turning points of the contour, and estimate the vibrato range and extent of melodic embellishments. Each recording may consist of multiple shorter pitch contours. We keep the mean and standard deviation of features across all pitch contours extracted from the audio recording. We also post-process the output from MELODIA to compute an octave-wrapped pitch histogram [20] with 1-cent resolution.
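The post-processing can be illustrated with the following sketch (our reconstruction from the definitions above; the 440 Hz reference for the cent scale is our assumption, and IOI ratios outside [0, 1) are simply discarded, a detail the text does not specify):

    import numpy as np

    def ioi_ratio_histogram(onsets, n_bins=100):
        # Histogram of IOI ratios IOIR_j = IOI_{j+1} / IOI_j over [0, 1).
        ioi = np.diff(np.asarray(onsets))              # IOI_i = o_{i+1} - o_i
        ratios = ioi[1:] / ioi[:-1]                    # tempo-independent
        ratios = ratios[(ratios >= 0) & (ratios < 1)]  # assumption: drop >= 1
        hist, _ = np.histogram(ratios, bins=n_bins, range=(0.0, 1.0))
        return hist / max(hist.sum(), 1)

    def summary_stats(frames):
        # Min, max, mean and standard deviation over time of a
        # (n_frames, n_dims) matrix, e.g. MFCC or chroma frames.
        frames = np.asarray(frames)
        return np.concatenate([frames.min(0), frames.max(0),
                               frames.mean(0), frames.std(0)])

    def pitch_histogram(f0_hz, n_bins=1200):
        # Octave-wrapped pitch histogram with 1-cent resolution from a
        # melody frequency track; unvoiced frames (<= 0) are discarded.
        f0 = np.asarray(f0_hz)
        f0 = f0[f0 > 0]
        cents = 1200 * np.log2(f0 / 440.0)  # cents relative to A4 (assumed)
        hist, _ = np.histogram(np.mod(cents, 1200),
                               bins=n_bins, range=(0, 1200))
        return hist / max(hist.sum(), 1)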
Harmony. The output of the Queen Mary - Chromagram plugin is an octave-wrapped chromagram with 100-cent resolution [5]. We use the default frame and hop size and extract summary statistics denoting the min, max, mean, and standard deviation of the chroma vectors over time.

The above process results in a total of 1484 features per recording. Before further processing, the features were standardised with z-scores. Dimensionality reduction was also applied with Principal Component Analysis (PCA), including whitening and keeping enough components to represent 99% of the variance.
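A minimal scikit-learn sketch of the standardisation and PCA step (our illustration; the paper does not name the library used):

    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    def reduce_features(X):
        # X is the (n_recordings, 1484) feature matrix; z-score each
        # feature, then apply whitened PCA keeping 99% of the variance.
        X_std = StandardScaler().fit_transform(X)
        pca = PCA(n_components=0.99, whiten=True)
        return pca.fit_transform(X_std), pca

On the full dataset this step reduces the 1484 dimensions to 368 (Section 4.2).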

4. REGRESSION MODEL

The prediction of spatial coordinates from music data has been treated as a regression problem in previous research using K-nearest neighbours and Random Forest regression methods [23]. We explore the application of a neural network method. Neural networks have been shown to outperform existing methods in supervised tasks of music similarity [7, 11, 19]. We evaluate the performance of a neural network under different parameters for the regression problem of predicting latitudes and longitudes from music features.

A neural network with two continuous-valued outputs, the latitude and longitude predictions, was built in TensorFlow. We used the Adaptive Moment Estimation (Adam) algorithm for optimisation, the Rectified Linear Unit (ReLU) as activation function, and a drop-out rate of 0.5 for regularisation. The evaluation of the model performance was based on the mean distance error in km, calculated using the Haversine formula [6]. The Haversine distance d between two points in km is given by

d = 2r \arcsin\left( \left[ \sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos(\varphi_1)\cos(\varphi_2)\,\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right) \right]^{1/2} \right)    (1)

where φ represents the latitude, λ the longitude, and r the radius of the sphere (with r fixed to 6367 km in this study). We further explored the performance of the model under architectures with different numbers of hidden layers, two different cost functions, and a range of regularisation parameters, as explained below.

4.1 Parameter Optimisation

A grid-search of model hyper-parameters was performed to identify the combination that achieves best performance in cross-validation. The following hyper-parameters were considered for optimisation: whether or not to scale the targets (i.e., z-score standardisation of the ground truth latitude/longitude coordinates of each recording), the number of hidden layers, two possible cost functions, namely the Haversine distance in km and the Mean Squared Error (MSE), and a range of values for the learning rate and the L1 and L2 regularisation parameters. The parameter optimisation is summarised in Table 1. We tested in total 216 combinations of hyper-parameters and selected the best performing combination to tune parameters and retrain the model for the final results.

Parameters | Values
Target scaling | True or False
Number of hidden layers | {3, 4}
Cost function | Haversine or MSE
Learning rate | {0.005, 0.01, 0.05}
L1 regularisation | {0, 0.005, 0.05}
L2 regularisation | {0, 0.005, 0.05}

Table 1: The hyper-parameters and their range of values for optimisation.

4.2 Train-test splits

The training of the model was done in two phases. First the model was trained using the full set of features (Section 3) and the different hyper-parameters as defined in Table 1. The hyper-parameters were tuned based on the optimal performance obtained through cross-validation. In the second phase, the hyper-parameters were fixed to their optimal values and the model was retrained for different sets of features. Each new model's performance was assessed on a test set unique to that model.

In the first training phase, we sampled at random 70% of the total number of 10,055 recordings for training. This resulted in a total of 7,038 samples in the training set, of which 30% (2,111) were set aside for validation. Following PCA, the feature dimensionality of the dataset was 368. We used cross-validation with K = 5 folds and tuned parameters based on the mean of the distance error on the validation set (Equation 1). In the second phase we retrained the model for different feature sets. For each feature set, the dataset was split into training (a random 70%) and test (the remaining 30%) and the performance of the model was assessed on the test set.
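A compact TensorFlow/Keras sketch of such a network and cost function is given below (our reconstruction: the hidden-layer width, epochs and batch size are not reported in the paper and are placeholders, and when target scaling is used the predictions must be mapped back to degrees before distances in km are meaningful):

    import math
    import tensorflow as tf

    R_KM = 6367.0  # Earth radius from Equation (1)

    def haversine_loss(y_true, y_pred):
        # Mean great-circle distance in km; rows are (lat, lon) in degrees.
        rad = math.pi / 180.0
        lat1, lon1 = y_true[:, 0] * rad, y_true[:, 1] * rad
        lat2, lon2 = y_pred[:, 0] * rad, y_pred[:, 1] * rad
        a = (tf.math.sin((lat2 - lat1) / 2.0) ** 2
             + tf.math.cos(lat1) * tf.math.cos(lat2)
             * tf.math.sin((lon2 - lon1) / 2.0) ** 2)
        return tf.reduce_mean(2.0 * R_KM * tf.math.asin(tf.math.sqrt(a)))

    def build_model(n_inputs=368, n_hidden=3, units=256,
                    lr=0.005, l1=0.0, l2=0.05):
        # `units` (hidden-layer width) is a placeholder, not a reported value.
        reg = tf.keras.regularizers.L1L2(l1=l1, l2=l2)
        model = tf.keras.Sequential()
        model.add(tf.keras.Input(shape=(n_inputs,)))
        for _ in range(n_hidden):                       # 3 or 4 hidden layers
            model.add(tf.keras.layers.Dense(units, activation="relu",
                                            kernel_regularizer=reg))
            model.add(tf.keras.layers.Dropout(0.5))     # drop-out rate 0.5
        model.add(tf.keras.layers.Dense(2))             # latitude, longitude
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                      loss=haversine_loss)
        return model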
5. RESULTS

5.1 Parameter Optimisation

The model that produced the lowest mean error on the validation set (119 km) used the following hyper-parameters: target scaling, 3 hidden layers, the Haversine distance as cost function, a learning rate of 0.005, and L1, L2 regularisation parameters of 0 and 0.05, respectively. The main hyper-parameters that determined the accuracy of the model were the use of the Haversine distance as the cost function, and the application of target scaling. The performance of the model for different parameter values is shown in Table 2.

Table 2: Results for parameter optimisation, reporting training and validation error (km) for the eight combinations of target scaling (True or False), number of hidden layers (3 or 4), and cost function (Haversine or MSE). Learning rate, L1, and L2 regularisation parameters are fixed to 0.005, 0, and 0.05 respectively. Best performance is obtained when target scaling is combined with 3 hidden layers and the Haversine distance as cost function.

5.2 Results for different feature sets

The second set of experiments explored the performance of the model when trained on different sets of features. We estimated the random baseline from the origins of the recordings in the training set. In particular, we computed the average latitude and average longitude coordinates of the recordings and estimated the distance between each recording's location and the average latitude and longitude. Based on this estimate, the mean distance error of the baseline approach was close to the 167 km computed over the full dataset (Section 2). Each model was compared to the baseline approach (i.e., the mean distance error of its test targets) with a Wilcoxon signed-rank test. The performances of the models trained on different sets of features and evaluated on separate test sets were compared with pairwise Wilcoxon rank sum tests (also known as Mann-Whitney tests) with Bonferroni correction for multiple comparisons. We consider a significance level of α = 0.05 and denote the Bonferroni corrected level by α̂.
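The two tests can be sketched with SciPy (our illustration; the per-track error arrays are hypothetical inputs):

    from itertools import combinations
    from scipy.stats import mannwhitneyu, wilcoxon

    ALPHA = 0.05

    def against_baseline(model_errors, baseline_errors):
        # Paired Wilcoxon signed-rank test of a model's per-track distance
        # errors against the baseline errors on the same test tracks.
        return wilcoxon(model_errors, baseline_errors)

    def pairwise_tests(errors_by_model, alpha=ALPHA):
        # Unpaired Mann-Whitney tests between models (their test sets
        # differ), with a Bonferroni-corrected threshold alpha_hat.
        pairs = list(combinations(sorted(errors_by_model), 2))
        alpha_hat = alpha / len(pairs)
        results = {}
        for a, b in pairs:
            _, p = mannwhitneyu(errors_by_model[a], errors_by_model[b],
                                alternative="two-sided")
            results[(a, b)] = (p, p < alpha_hat)
        return results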

Model | Feature Set | Error (km)
1 | All features | -
2 | Rhythm: IOIR histogram | -
3 | Harmony: Chromagram statistics | -
4 | Timbre: MFCC statistics | -
5 | Pitch histogram | -
6 | Contour features mean | -
7 | Contour features standard deviation | -
8 | Melody: Pitch hist., contour features | -
9 | Rhythm and Harmony | -
10 | Rhythm and Timbre | -
11 | Rhythm and Melody | -
12 | Melody and Harmony | -
13 | Melody and Timbre | -
14 | Timbre and Harmony | 114.0
15 | Rhythm, Harmony, and Timbre | -
16 | Rhythm, Harmony, and Melody | -
17 | Rhythm, Timbre, and Melody | -
18 | Harmony, Timbre, and Melody | 140.3
- | Baseline | -

Table 3: The mean distance error (in km) on the test set for 18 models trained on different sets of features.

Figure 3: Distance error of predictions for different sets of features (see Table 3 for the feature set used to train each model). Labels a to l indicate feature sets that have non-significantly different results (p > α̂) where they share the same letter. For example, feature set 3 shares the label a with feature set 8 but shares no label with any other feature set, indicating that results from model 3 are significantly different from those of all other models except model 8.

Figure 4: (a) Ground truth and (b) predicted music recording origins, coloured by the distance error (in km) for the best performing model (no. 14).

All models achieved results significantly different from the baseline approach (p < 0.001). The best performance (lowest error of 114.0 km) was achieved when combining the timbral and harmonic descriptors (model 14). This combines the summary statistics of the chromagram and the summary statistics of the MFCCs. The performance of this model was significantly different (p < α̂) from all other models except models 13 and 15, trained on melodic and timbral, and rhythmic, harmonic and timbral descriptors, respectively. When all features (Section 3) were used (model 1), the model achieved a larger mean error on the test set. The results from model 3, trained on harmonic descriptors, were significantly different from all other models except model 8, trained on melodic features. The model trained on rhythmic descriptors (model 2) is amongst the weakest predictors. However, adding rhythmic features to any of the melodic, harmonic, or timbral features, as in models 9, 10 and 11, significantly improves the performance of the model (p < α̂ for pairwise comparisons between models 3 and 9, 4 and 10, and 8 and 11). Models 5, 6 and 7, trained on pitch histograms, contour feature means, and contour feature standard deviations, respectively, are also amongst the weakest predictors, but when all these features are combined together as in model 8, the performance is improved. See Table 3 for an overview of the prediction accuracy of models trained on different feature sets. Figure 3 provides a box-plot visualisation of the results from the different feature sets and marks statistical significance between results.

5.3 Results for different regions

The last analyses aim to study the prediction accuracy with respect to the geographical origins of the recordings. Figure 4 shows the ground truth and predicted coordinates for the best performing model (model no. 14 as denoted in Table 3), coloured by the distance error in km. We observe that data points with the lowest predictive accuracy originate from the north-eastern and the south-western areas of the UK (Figure 4a). Predictions are mostly concentrated in the southern part of the UK.

Data points predicted towards the eastern areas indicate a larger distance error (Figure 4b).

Figure 5: Music recording origins coloured by the distance error (in km) for models trained on (a) rhythmic, (b) harmonic, (c) timbral, and (d) melodic features (models no. 2, 3, 4, 8 respectively as defined in Table 3).

In Figure 5 we visualise the prediction accuracy of models trained on different feature sets with respect to geography. We observe that for all models the northern areas of the UK (i.e., in the region of Scotland) are predicted with a relatively large distance error (lowest accuracy). For the model trained on timbral features (Figure 5c) we also observe the south west of England predicted with lower accuracy than for the models trained on harmonic and melodic features (Figures 5b and 5d).

6. DISCUSSION

Our results provide insights on the contribution of different feature sets and suggest patterns of music similarity across geographical regions. The methodology can be improved in various ways. The initial corpus of folk and traditional music from the UK consisted of a total of 17,000 recordings, of which only 10,055 were processed in this study. The final dataset had a skewed geographical distribution, with over-representation of the south-eastern and south-western UK regions, e.g., Devon and Suffolk, and under-representation of the north-eastern and north-western areas, e.g., Scotland and Northern Ireland. Effects of the skewness of the dataset could be observed in the distribution of predicted latitude and longitude coordinates (Figure 4b). A larger and more representative corpus can be used in future work.

We used features derived from the output of VAMP plugins to describe the musical content of audio recordings. Some of these plugins were designed for different music styles and their application to folk music might not give robust results. A thorough evaluation of the suitability of the features can give valuable insights for improving their robustness to different corpora such as the one used in this study. We used feature representations averaged over time, but in future work preserving temporal information in the features could provide better music content description.

We observed that results from models trained on individual features showed on average larger distance errors. When, however, combinations of features were considered, the model achieved on average higher accuracies. An exception is the case when all features were considered, where the performance of the model had a relatively large distance error. This could be due to limitations of the model, especially with regards to over-fitting, or to a lack of adequate music information captured by the features. Integrating additional audio features could help capture more of the variance of the data and improve the model.

The model was validated for a range of parameters and several approaches were considered to avoid over-fitting. However, evidence of over-fitting could still be observed in the final results. Training with more data could help make the model more generalisable in future work. What is more, oversampling techniques could be explored to overcome the problem of under-represented geographical regions in our dataset.

Neural networks, in combination with audio features as proposed in this study, can provide good predictions of the origins of the music. This can aid musicological research as well as improve the spatial metadata associated with large music collections.
7. CONCLUSION

We studied a collection of field recordings from the UK and investigated whether the geographical origins of recordings can be predicted from the music attributes of the audio signal. We treated this as a regression problem and trained a neural network that takes audio features as input and predicts the latitude and longitude of the music's origin. We trained the model under different hyper-parameters and tested its performance for different feature sets. The highest accuracy was achieved by the model trained on timbral and harmonic features, but no significant differences were found to the same model with rhythm features added or with melody replacing harmony. The southern regions of the UK were predicted with relatively high accuracy whereas northern regions were predicted with low accuracy. Effects of the skewness of the dataset and the reliability of the audio features were discussed. The corpus and methodology can be improved in future work and the applicability of the model could be extended to music from around the world.

8. ACKNOWLEDGEMENTS

MP is supported by a Queen Mary research studentship.

REFERENCES

[1] S. Abdallah, E. Benetos, N. Gold, S. Hargreaves, T. Weyde, and D. Wolff. The Digital Music Lab: A Big Data Infrastructure for Digital Musicology. ACM Journal on Computing and Cultural Heritage, 10(1), 2017.
[2] J. Eisenstein, B. O'Connor, N. A. Smith, and E. P. Xing. A Latent Variable Model for Geographic Lexical Variation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010.
[3] E. Gómez, M. Haro, and P. Herrera. Music and geography: Content description of musical audio from different parts of the world. In Proceedings of the International Society for Music Information Retrieval Conference, 2009.
[4] F. Gouyon, S. Dixon, E. Pampalk, and G. Widmer. Evaluating rhythmic descriptors for musical genre classification. In Proceedings of the AES 25th International Conference, 2004.
[5] C. Harte and M. Sandler. Automatic chord identification using a quantised chromagram. In 118th Audio Engineering Society Convention, 2005.
[6] J. Inman. Navigation and Nautical Astronomy: For the Use of British Seamen. F. & J. Rivington, 1835.
[7] I. Karydis, K. Kermanidis, S. Sioutas, and L. Iliadis. Comparing content and context based similarity for musical data. Neurocomputing, 107:69-76, 2013.
[8] B. Logan. Mel-Frequency Cepstral Coefficients for Music Modeling. In Proceedings of the International Symposium on Music Information Retrieval, 2000.
[9] M. Marolt. Music/speech classification and detection submission for MIREX 2015. In MIREX, 2015.
[10] M. Mauch, R. M. MacCallum, M. Levy, and A. M. Leroi. The evolution of popular music: USA 1960-2010. Royal Society Open Science, 2(5):150081, 2015.
[11] C. McKay and I. Fujinaga. Automatic genre classification using large high-level musical feature sets. In Proceedings of the International Society for Music Information Retrieval Conference, 2004.
[12] D. Moelants, O. Cornelis, and M. Leman. Exploring African Tone Scales. In Proceedings of the International Society for Music Information Retrieval Conference, 2009.
[13] B. Nettl. The Study of Ethnomusicology: Thirty-one Issues and Concepts. University of Illinois Press, Urbana and Chicago, 2nd edition, 2005.
[14] B. Nettl, R. M. Stone, J. Porter, and T. Rice, editors. The Garland Encyclopedia of World Music. Garland Pub, New York, 1998.
[15] J. Novembre, K. Bryc, S. Bergmann, A. R. Boyko, C. D. Bustamante, A. Auton, M. Stephens, Z. Kutalik, A. Indap, T. Johnson, M. R. Nelson, and K. S. King. Genes mirror geography within Europe. Nature, 456(7218):98-101, 2008.
[16] M. Panteli, R. Bittner, J. P. Bello, and S. Dixon. Towards the characterization of singing styles in world music. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2017.
[17] P. E. Savage and S. Brown. Mapping Music: Cluster Analysis of Song-Type Frequencies within and between Cultures. Ethnomusicology, 58(1), 2014.
[18] P. E. Savage, S. Brown, E. Sakai, and T. E. Currie. Statistical universals reveal the structures and functions of human music. Proceedings of the National Academy of Sciences of the United States of America, 112(29), 2015.
[19] D. Turnbull and C. Elkan. Fast recognition of musical genres using RBF networks. IEEE Transactions on Knowledge and Data Engineering, 17(4):580-584, 2005.
[20] G. Tzanetakis, A. Ermolinskyi, and P. Cook. Pitch histograms in audio and symbolic music information retrieval. Journal of New Music Research, 32(2), 2003.
[21] W. Yang, J. Novembre, E. Eskin, and E. Halperin. A model-based approach for analysis of spatial structure in genetic data. Nature Genetics, 44(6), 2012.
[22] J. M. Young, L. S. Weyrich, J. Breen, L. M. Macdonald, and A. Cooper. Predicting the origin of soil evidence: High throughput eukaryote sequencing and MIR spectroscopy applied to a crime scene scenario. Forensic Science International, 251:22-31, 2015.
[23] F. Zhou, Q. Claire, and R. D. King. Predicting the Geographical Origin of Music. In IEEE International Conference on Data Mining, 2014.


More information

A Large Scale Experiment for Mood-Based Classification of TV Programmes

A Large Scale Experiment for Mood-Based Classification of TV Programmes 2012 IEEE International Conference on Multimedia and Expo A Large Scale Experiment for Mood-Based Classification of TV Programmes Jana Eggink BBC R&D 56 Wood Lane London, W12 7SB, UK jana.eggink@bbc.co.uk

More information

An Adaptive Length Frame Synchronization Scheme

An Adaptive Length Frame Synchronization Scheme Send Orders for Reprints to reprints@bentamscience.ae 244 Te Open Automation and Control Systems Journal, 2014, 6, 244-249 An Adaptive Lengt Frame Syncronization Sceme Open Access Gang Li 1 and Jin Xia

More information

Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn

Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn Reconstruction of Ca 2+ dynamics from low frame rate Ca 2+ imaging data CS229 final project. Submitted by: Limor Bursztyn Introduction Active neurons communicate by action potential firing (spikes), accompanied

More information

Lecture 9 Source Separation

Lecture 9 Source Separation 10420CS 573100 音樂資訊檢索 Music Information Retrieval Lecture 9 Source Separation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw Music & Audio Computing Lab, Research

More information

Evaluating Melodic Encodings for Use in Cover Song Identification

Evaluating Melodic Encodings for Use in Cover Song Identification Evaluating Melodic Encodings for Use in Cover Song Identification David D. Wickland wickland@uoguelph.ca David A. Calvert dcalvert@uoguelph.ca James Harley jharley@uoguelph.ca ABSTRACT Cover song identification

More information

Towards Music Performer Recognition Using Timbre Features

Towards Music Performer Recognition Using Timbre Features Proceedings of the 3 rd International Conference of Students of Systematic Musicology, Cambridge, UK, September3-5, 00 Towards Music Performer Recognition Using Timbre Features Magdalena Chudy Centre for

More information

CAE, May London Exposure Rating and ILF. Stefan Bernegger, Dr. sc. nat., SAV Head Analytical Services & Tools Swiss Reinsurance Company Ltd

CAE, May London Exposure Rating and ILF. Stefan Bernegger, Dr. sc. nat., SAV Head Analytical Services & Tools Swiss Reinsurance Company Ltd CAE, May 5 London Eposure Rating and ILF Stean Bernegger, Dr. sc. nat., SAV ead Analytical Services & Tools Swiss Reinsurance Company Ltd Tale o Contents / Agenda Introduction ISO Curves or US Casualty

More information

A Computational Model for Discriminating Music Performers

A Computational Model for Discriminating Music Performers A Computational Model for Discriminating Music Performers Efstathios Stamatatos Austrian Research Institute for Artificial Intelligence Schottengasse 3, A-1010 Vienna stathis@ai.univie.ac.at Abstract In

More information

Speech To Song Classification

Speech To Song Classification Speech To Song Classification Emily Graber Center for Computer Research in Music and Acoustics, Department of Music, Stanford University Abstract The speech to song illusion is a perceptual phenomenon

More information

Enabling editors through machine learning

Enabling editors through machine learning Meta Follow Meta is an AI company that provides academics & innovation-driven companies with powerful views of t Dec 9, 2016 9 min read Enabling editors through machine learning Examining the data science

More information

Video-based Vibrato Detection and Analysis for Polyphonic String Music

Video-based Vibrato Detection and Analysis for Polyphonic String Music Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International

More information

About Giovanni De Poli. What is Model. Introduction. di Poli: Methodologies for Expressive Modeling of/for Music Performance

About Giovanni De Poli. What is Model. Introduction. di Poli: Methodologies for Expressive Modeling of/for Music Performance Methodologies for Expressiveness Modeling of and for Music Performance by Giovanni De Poli Center of Computational Sonology, Department of Information Engineering, University of Padova, Padova, Italy About

More information

Musical Instrument Identification Using Principal Component Analysis and Multi-Layered Perceptrons

Musical Instrument Identification Using Principal Component Analysis and Multi-Layered Perceptrons Musical Instrument Identification Using Principal Component Analysis and Multi-Layered Perceptrons Róisín Loughran roisin.loughran@ul.ie Jacqueline Walker jacqueline.walker@ul.ie Michael O Neill University

More information

Music Emotion Recognition. Jaesung Lee. Chung-Ang University

Music Emotion Recognition. Jaesung Lee. Chung-Ang University Music Emotion Recognition Jaesung Lee Chung-Ang University Introduction Searching Music in Music Information Retrieval Some information about target music is available Query by Text: Title, Artist, or

More information

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES PACS: 43.60.Lq Hacihabiboglu, Huseyin 1,2 ; Canagarajah C. Nishan 2 1 Sonic Arts Research Centre (SARC) School of Computer Science Queen s University

More information

Music Mood Classification - an SVM based approach. Sebastian Napiorkowski

Music Mood Classification - an SVM based approach. Sebastian Napiorkowski Music Mood Classification - an SVM based approach Sebastian Napiorkowski Topics on Computer Music (Seminar Report) HPAC - RWTH - SS2015 Contents 1. Motivation 2. Quantification and Definition of Mood 3.

More information

Automatic Labelling of tabla signals

Automatic Labelling of tabla signals ISMIR 2003 Oct. 27th 30th 2003 Baltimore (USA) Automatic Labelling of tabla signals Olivier K. GILLET, Gaël RICHARD Introduction Exponential growth of available digital information need for Indexing and

More information

Analysing Musical Pieces Using harmony-analyser.org Tools

Analysing Musical Pieces Using harmony-analyser.org Tools Analysing Musical Pieces Using harmony-analyser.org Tools Ladislav Maršík Dept. of Software Engineering, Faculty of Mathematics and Physics Charles University, Malostranské nám. 25, 118 00 Prague 1, Czech

More information

Recognising Cello Performers using Timbre Models

Recognising Cello Performers using Timbre Models Recognising Cello Performers using Timbre Models Chudy, Magdalena; Dixon, Simon For additional information about this publication click this link. http://qmro.qmul.ac.uk/jspui/handle/123456789/5013 Information

More information

arxiv: v1 [cs.sd] 5 Apr 2017

arxiv: v1 [cs.sd] 5 Apr 2017 REVISITING THE PROBLEM OF AUDIO-BASED HIT SONG PREDICTION USING CONVOLUTIONAL NEURAL NETWORKS Li-Chia Yang, Szu-Yu Chou, Jen-Yu Liu, Yi-Hsuan Yang, Yi-An Chen Research Center for Information Technology

More information

EVALUATION OF FEATURE EXTRACTORS AND PSYCHO-ACOUSTIC TRANSFORMATIONS FOR MUSIC GENRE CLASSIFICATION

EVALUATION OF FEATURE EXTRACTORS AND PSYCHO-ACOUSTIC TRANSFORMATIONS FOR MUSIC GENRE CLASSIFICATION EVALUATION OF FEATURE EXTRACTORS AND PSYCHO-ACOUSTIC TRANSFORMATIONS FOR MUSIC GENRE CLASSIFICATION Thomas Lidy Andreas Rauber Vienna University of Technology Department of Software Technology and Interactive

More information

Feature-Based Analysis of Haydn String Quartets

Feature-Based Analysis of Haydn String Quartets Feature-Based Analysis of Haydn String Quartets Lawson Wong 5/5/2 Introduction When listening to multi-movement works, amateur listeners have almost certainly asked the following situation : Am I still

More information