A Music Search Engine based on Semantic Text-Based Query

Michele Buccoli (1), Massimiliano Zanoni (2), Augusto Sarti (2), Stefano Tubaro (2)
Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, piazza Leonardo da Vinci, Milano, Italy
(1) michele.buccoli@mail.polimi.it, (2) {zanoni, sarti, tubaro}@elet.polimi.it

Abstract: Search and retrieval of songs from a large music repository usually relies on added meta-information (e.g., title, artist or musical genre) or on categorical music descriptors (e.g., mood), none of which can specify the desired intensity of a descriptor. In this work, we propose an early example of a semantic text-based music search engine. The semantic description takes into account emotional and non-emotional musical aspects. The method also includes a query-by-similarity search approach performed using semantic cues. We model both concepts and musical content in dimensional spaces that are suitable for carrying intensity information on the descriptors. We process the semantic query with a Natural Language parser to capture only the relevant words and qualifiers. We rely on Bayesian Decision Theory to model concepts and songs as probability distributions. The resulting ranked list of songs is produced through a posterior probability model. A prototype of the system has been proposed to 53 subjects for evaluation, with good ratings on performance, usefulness and potential.

I. INTRODUCTION

In the past decade the way we retrieve, organize, browse and listen to music has undergone a deep transformation. Commercial solutions where the desired musical content is just one click away have multiplied. This enables new scenarios for music diffusion, fruition and democratization, and paves the way to new forms of social networks for music: Soundcloud, Spotify, etc. However, with such a formidable amount of content at hand, the user is exposed to the risk of information overload, which leads to the paradoxical situation of freedom and ease of access becoming an obstacle to finding what we are looking for. Until not so long ago, the role of mediator between the user and the content was played by vendors, promoters (magazines, radio stations, etc.) and geographical constraints. They all contributed to creating a perspective, a hierarchy of importance on musical content. In the current scenario users have direct access to content, thus weakening the role of mediators and, as a consequence, flattening our perspective. This problem was understood quite early by scientific communities (particularly that of Music Information Retrieval, MIR), which focused from the start on the creation of novel mediators. Despite the availability of solutions for content-based music information retrieval, music collections are usually still managed and accessed through techniques that use meta-information such as artist name, title, etc. (e.g., Last.fm [1] and iTunes). There are also solutions that enable the search for meta descriptors starting from partial content descriptors. These solutions fall within the area of music search, within MIR. Shazam, for example, is an application that retrieves songs from a recorded sample [2]; Soundhound performs music search and retrieval with a query-by-humming paradigm. Recent studies [3], however, have pointed out the emerging and growing need of users to interact with systems at a higher level of abstraction.
In [3] the author proposes an audio fragment search engine based on semantic descriptors and on an acoustic feature space. In [4] the authors introduce a music search system with query by semantic description, based on a vocabulary of 159 descriptors related to emotion, genre, usage, etc. Stereomood is a web service that retrieves music that exhibits a desired mood. These tools introduce a semantic description of music content, but the paradigm still relies on the categorical approach, which cannot express how well a concept describes the excerpt. This precludes the use of qualifiers such as "not so happy", "very aggressive", etc. Moreover, queries are generally expressed in a pre-structured form. In this work we address music search based on query by semantic description. We refer to this paradigm as a semantic text-based search engine. The paradigm provides natural-language queries to exploit the richness of language and capture the significant concepts and qualifiers.

Natural Language Processing (NLP) [5] is the discipline concerned with making machines able to understand human natural language. In the literature, semantic descriptors are classified as Emotional Descriptors (ED) and Non-Emotional Descriptors (NED). We use a dimensional approach for both EDs and NEDs. The dimensional approach to emotion conceptualization aims at mapping emotions onto 2D or 3D spaces. The most widespread choice is the Valence-Arousal (VA) space [6], where each affective term is represented as a point. While a relation between affective terms is rather apparent for EDs, the same cannot be said for NEDs: no space able to represent such descriptors has been proposed, which is why we model each NED on a 1D semantic space.

Query-by-similarity is based on a different paradigm for music description. iTunes Genius [7] suggests music that appears to be similar to the user's collection, though it does so based on meta-data descriptors. In [8] and in the Spotify application, query by similarity is performed using both acoustic descriptors and meta-information. In our paradigm the similarity is computed in the ED and NED semantic spaces. In order to be able to compare different types of descriptions and queries, we propose a combined dimensional space that includes both approaches. Using Bayesian theory on the semantic space, we model EDs, NEDs and songs as a-priori probability distributions, and the final score (used for ranking the playlist) as a posterior probability. We implemented a prototype of the system as a web search engine that returns a ranked list of songs.

II. APPROACH OVERVIEW

In the semantic text-based search engine, there are three key aspects to consider: the modeling of concepts, the modeling of music content, and the modeling of queries. The model we propose in this paper relies on Bayesian Decision Theory. Once the query is modeled, the posterior probability is computed for each song to produce the final ranked list. The general scheme of the approach is shown in Fig. 1.

Fig. 1. System block diagram with detail view of the computational core.

A. Dimensional Approach to Concept Modeling

In this paper we model both Emotional and Non-Emotional Descriptors.

1) Emotional Descriptors: One of the key properties of music is its ability to convey emotions. This is the main reason that pushed psychologists and musicologists to investigate paradigms for the representation of emotions. The most influential dimensional model of music emotion conceptualization so far is the circumplex model of affect proposed by Russell [6] (Fig. 2). This model consists of a two-dimensional space composed of Arousal (A), linked to the degree of activation or excitement, and Valence (V), linked to the degree of pleasantness. Distances between points in the space are proportional to the semantic distances between words.

Fig. 2. Russell's circumplex model of affect. Affective terms are described in terms of Arousal and Valence. Arousal represents the degree of activation; Valence is related to the degree of pleasantness.

In [9] the authors collected a set of English affective words (ANEW), manually tagged. The mean and standard deviation of the Valence and Arousal annotations related to each concept are computed as a measure of the consensus among people. We exploit this consensus to model each term in the ANEW collection as a normal probability distribution.
Given an affective term $w_{ED} \in W_{ED}$, with $W_{ED}$ the set of affective terms,

$$w_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\boldsymbol{\mu}^{w}_{VA}, \Sigma^{w}_{VA}) \quad (1)$$

where $\mathcal{N}(\cdot)$ denotes a normal distribution, $\mathbf{n}_{VA} = [n_V, n_A]^T$ represents a point in the Valence-Arousal plane, $\boldsymbol{\mu}^{w}_{VA} = [\mu^{w}_V, \mu^{w}_A]^T$ are the mean values of Valence and Arousal from the ANEW dataset, and $\Sigma^{w}_{VA} = \mathrm{diag}(\boldsymbol{\sigma}^{w}_{VA}) = \mathrm{diag}([\sigma^{w}_V, \sigma^{w}_A]^T)$ is the covariance matrix. Since the support of the probability distribution is limited, the distributions $w_{ED}$ are normalized so that they integrate to 1.
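As an illustration of Eq. (1), the following sketch models a single affective term as a 2D Gaussian over a discretized Valence-Arousal plane and renormalizes it over the limited support. The mean and standard-deviation values, the [0, 1] range and the interpretation of the diagonal entries as variances are assumptions made for the example, not values taken from the ANEW dataset.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical ANEW-style annotation statistics for the term "happy"
# (illustrative numbers, assuming Valence and Arousal rescaled to [0, 1]).
mu_VA = np.array([0.85, 0.65])        # [mu_V, mu_A]
sigma_VA = np.array([0.10, 0.15])     # [sigma_V, sigma_A]

# Eq. (1): w_ED(n_VA) ~ N(mu_VA, diag(sigma_VA)); here the diagonal entries
# are treated as the variances of the 2D normal distribution.
w_happy = multivariate_normal(mean=mu_VA, cov=np.diag(sigma_VA ** 2))

# Evaluate the concept on a grid covering the VA plane and renormalize so the
# distribution integrates to 1 over its limited support, as stated above.
step = 0.01
v, a = np.meshgrid(np.arange(0, 1 + step, step), np.arange(0, 1 + step, step))
density = w_happy.pdf(np.dstack([v, a]))
density /= density.sum() * step ** 2
```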

2) Non-Emotional Descriptors: In [10] the authors proposed 27 semantic descriptors divided into affective/emotive, structural, kinaesthetic and judgement categories. For our study we chose to model some of the structural and judgement bipolar descriptors and one kinaesthetic descriptor, as shown in Table I. Each concept is modeled independently. Since the definition of gesture is equivalent to the definition of grooviness proposed in [11], in this study we prefer the term groovy, which is more widely used in a musical context.

TABLE I. LIST OF NON-EMOTIONAL DESCRIPTORS CHOSEN FROM THE ANNOTATION EXPERIMENT IN [10]
Structural: Soft/Hard, Clear/Dull, Rough/Harmonious, Void/Compact, Flowing/Stuttering, Dynamic/Static
Kinaesthetic: Gesture
Judgement: Easy/Difficult

Moreover, given the importance of rhythmic information, we include the bipolar Tempo descriptor Fast/Slow. Tempo is a general description of the speed of a song and is generally expressed in beats per minute (BPM).

Non-Emotional Descriptors are modeled using normal distributions:

$$w^{+}_{d}(n) \sim \mathcal{N}(\mu^{+}_{d}, \sigma_{d}) \quad \text{and} \quad w^{-}_{d}(n) \sim \mathcal{N}(\mu^{-}_{d}, \sigma_{d}), \quad (2)$$

where $w^{+}_{d}$ and $w^{-}_{d}$ are the first and the second term, respectively, of the bipolar descriptor $d$, $n \in [0, 1]$; $\mu^{+}_{d} = 0$ and $\mu^{-}_{d} = 1$ are the mean values, placed at the left and right bounds of $n$; and $\sigma_{d}$ is the standard deviation, set to 0.5, as this is the value that splits the space between the two opposite terms. Concerning the Fast/Slow descriptor, the Tempo value in BPM is normalized to the range $[0, 1]$. As Grooviness is not expressed as a bipolar concept, we formalize it as $w_{d}(n) \sim \mathcal{N}(1, 0.5)$, with $n \in [0, 1]$. All the distributions are finally normalized to their maximum value.

In the context of music transcription, a common way to express tempo is through Tempo Markings, which indicate the pace of the piece. We include Tempo Markings in the search engine and we model them by exploiting their correlation with BPM ranges, as proposed in [12] (Table II). To capture possible tempo fluctuations in a song, we model the Tempo Markings partially as normal distributions and partially as uniform distributions, as shown in Fig. 3.

TABLE II. TEMPO MARKINGS (TM) AND CORRESPONDING RANGES OF BPM (Adagio, Andante, Moderato, Allegro, Presto).

In particular, we fix the standard deviation as

$$\sigma^{w}_{T} = \beta\,(T_{2} - T_{1}), \quad (3)$$

where $T_{1}$ and $T_{2}$ are the bounds of the BPM range of the word $w$, with $w \in \{\text{adagio, andante, moderato, allegro, presto}\}$, and $\beta$ is experimentally determined as $\beta = 0.25$. The model is formalized as

$$w_{T}(n) \sim \begin{cases} \mathcal{N}(T_{1}, \sigma^{w}_{T}) & \text{if } n \leq T_{1} \\ 1 & \text{if } n \in (T_{1}, T_{2}) \\ \mathcal{N}(T_{2}, \sigma^{w}_{T}) & \text{if } n \geq T_{2} \end{cases} \quad (4)$$

The normal distributions are normalized such that

$$w_{T}(T_{1}) = w_{T}(T_{2}) = 1. \quad (5)$$

Fig. 3. Concept modeling for the tempo-marking words listed in Table II (adagio, andante, moderato, allegro, presto): the x-axis shows BPM, the y-axis the tempo markings modeled in the mono-dimensional BPM space.

B. Music Content Semantic Description Modeling

We model songs in both the ED and NED spaces in order to compare them to concepts.

1) Emotional Descriptors: Songs are manually annotated in the VA space. For each song, the mean $\boldsymbol{\mu}^{s}_{VA}$ and standard deviation $\boldsymbol{\sigma}^{s}_{VA}$ of the annotations are computed. As for concepts, in order to account for the consensus and the variation of the annotations, we model songs as normal distributions in the VA plane:

$$s_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\boldsymbol{\mu}^{s}_{VA}, \Sigma^{s}_{VA}), \quad (6)$$

where $\Sigma^{s}_{VA}$ is the covariance matrix. Since the support of the probability distribution is limited, the distributions $s_{ED}$ are normalized so that they integrate to 1.

2) Non-Emotional Descriptors: We compute the mean $\mu^{s}_{d}$ and standard deviation $\sigma^{s}_{d}$ for each descriptor and we model the songs in the dataset as normal distributions in a mono-dimensional space:

$$s_{d}(n) \sim \mathcal{N}(\mu^{s}_{d}, \sigma^{s}_{d}), \quad (7)$$

where $n$ is a point in the space and $d \in D = \{\text{hard, clear, rough, comp, dyn, stutt, diff, groovy, BPM}\}$.
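Returning to the tempo-marking model of Eqs. (3)-(5), a minimal sketch of the piecewise normal/uniform distribution is given below; the BPM bounds used for "andante" are placeholders, since the numeric ranges of Table II are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def tempo_marking(bpm, t1, t2, beta=0.25):
    """Piecewise tempo-marking model of Eqs. (3)-(5): uniform (value 1) inside
    the [t1, t2] BPM range, normal tails outside it, with the tails rescaled
    so that w_T(t1) = w_T(t2) = 1."""
    sigma = beta * (t2 - t1)                       # Eq. (3)
    bpm = np.asarray(bpm, dtype=float)
    w = np.ones_like(bpm)                          # middle case of Eq. (4)
    left, right = bpm <= t1, bpm >= t2
    w[left] = norm.pdf(bpm[left], t1, sigma) / norm.pdf(t1, t1, sigma)
    w[right] = norm.pdf(bpm[right], t2, sigma) / norm.pdf(t2, t2, sigma)
    return w

# Placeholder BPM bounds for "andante"; the actual ranges come from [12].
w_andante = tempo_marking(np.arange(40, 201), t1=76, t2=108)
```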
C. Query Modeling

In our paradigm, queries are expressed as sentences in natural English, based on: EDs and NEDs; qualifiers; similarity with a song in the database. In the latter case, it is also possible to use the relative qualifiers less and more. Sentences are parsed using NLP techniques [5] to extract keywords. The output of the semantic parser is a semantic tree. We use part-of-speech (POS) tagging to analyze only adjectives, foreign words (for Italian tempo markings) and qualifiers. Once a word $w$ is found to be relevant, the semantic tree is parsed to capture its qualifiers $\psi_w$, if any. Qualifiers are used to alter the probability distribution of the related concept, using a rescaled version of the mapping to an 11-point scale proposed in [13] (Table III).
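A minimal sketch of the keyword/qualifier extraction step is shown below. The paper relies on the Stanford PCFG parser and its semantic tree; here plain POS tagging with NLTK stands in for illustration only, and the qualifier list is a small subset of Table III, so the heuristics are assumptions rather than the authors' implementation.

```python
import nltk  # requires: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

QUALIFIERS = {"not", "hardly", "slightly", "very", "quite", "extremely"}  # subset of Table III

def extract_keywords(query):
    """POS-tag the query and keep adjectives (JJ*) and foreign words (FW, e.g.
    Italian tempo markings), pairing each with an immediately preceding
    qualifier when one is present."""
    tokens = nltk.word_tokenize(query.lower())
    tagged = nltk.pos_tag(tokens)
    keywords = []
    for i, (word, tag) in enumerate(tagged):
        if tag.startswith("JJ") or tag == "FW":
            qualifier = tokens[i - 1] if i > 0 and tokens[i - 1] in QUALIFIERS else None
            keywords.append((word, qualifier))
    return keywords

print(extract_keywords("I want a very groovy and quite slow song"))
# e.g. [('groovy', 'very'), ('slow', 'quite')]; multi-word qualifiers are not handled in this sketch
```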

TABLE III. VERBAL LABELS AND CORRESPONDING MEAN VALUES FROM [13]
not at all 0.0; not 0.4; hardly 1.5; a little 2.5; slightly 2.5; partly 3.5; somewhat 4.5; in-between 4.8; average 4.8; medium 4.9; moderately 5.0; fairly 5.3; rather 5.8; quite 5.9; quite a bit 6.5; mainly 6.8; considerably 7.6; very 7.9; highly 8.6; very much 8.7; fully 9.4; extremely 9.6; completely 9.8

As far as the application of qualifiers to NEDs is concerned, the alteration affects the concept as follows:
- $\psi_w = [\;]$: the semantically opposite descriptor is considered and more evidence is assigned to values closer to the opposite bound:
$$w'_{d}(n) \sim \mathcal{N}(1 - \mu^{w}_{d}, \alpha\sigma^{w}_{d}), \quad \alpha \in [0.5, 1], \quad (8)$$
where $\alpha$ is a scale factor for the standard deviation, directly proportional to $\psi_w$;
- $\psi_w = [\;]$: the novel concept is modeled as a normal distribution centered on the rescaled version of $\psi_w$:
$$w'_{d}(n) \sim \mathcal{N}(\mu'^{w}_{d}, \sigma'^{w}_{d}), \quad (9)$$
where $\mu'^{w}_{d} = 0.1\,\psi_w$ and $\sigma'^{w}_{d}$ is fixed and experimentally set to 0.2;
- $\psi_w = [\;]$: same as for the previous case;
- $\psi_w = [\;]$: more evidence is assigned to values closer to the bound:
$$w'_{d}(n) \sim \mathcal{N}(\mu^{w}_{d}, \alpha\sigma^{w}_{d}), \quad \alpha \in [0.5, 1], \quad (10)$$
where $\alpha$ is inversely proportional to $\psi_w$.

Fig. 4. Example of the application of the four categories of qualifiers to the concept groovy.

With respect to the application of qualifiers to EDs, the alteration is modeled as follows:
- $\psi_w = [\;]$: the concept at the antipodes of the VA plane is considered, focused on its mean:
$$w'_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\mathbf{1} - \boldsymbol{\mu}^{w}_{VA}, \alpha\Sigma^{w}_{VA}), \quad \alpha \in [0.5, 1], \quad (11)$$
where $\mathbf{1} = [1, 1]^T$ and $\alpha$ is a scale factor directly proportional to $\psi_w$;
- $\psi_w = [\;]$: the final concept to consider should not be the original one, but a conceptually similar one. For this reason a ring around the distribution of the original concept is generated:
$$w'_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\boldsymbol{\mu}^{w}_{VA}, \alpha\Sigma^{w}_{VA}) - w_{ED}(\mathbf{n}_{VA}), \quad \alpha \in [1.5, 3], \quad (12)$$
where $\alpha$ is inversely proportional to $\psi_w$;
- $\psi_w = [\;]$: the distribution is relaxed:
$$w'_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\boldsymbol{\mu}^{w}_{VA}, \alpha\Sigma^{w}_{VA}), \quad \alpha \in [1.5, 3], \quad (13)$$
where $\alpha$ is a scale factor directly proportional to $\psi_w$;
- $\psi_w = [\;]$: values closer to the center of the distribution are highlighted:
$$w'_{ED}(\mathbf{n}_{VA}) \sim \mathcal{N}(\boldsymbol{\mu}^{w}_{VA}, \alpha\Sigma^{w}_{VA}), \quad \alpha \in [0.5, 1], \quad (14)$$
where $\alpha$ is inversely proportional to $\psi_w$.

The qualifiers more and less, applied to query-by-similarity, are modeled in the score-computation phase and are described later in this paper. In order to obtain unique distributions for the ED and similarity query spaces, we compute the joint distributions as the product of the original distributions:

$$\bar{w}_{ED}(\mathbf{n}_{VA}) = \prod_{w_{ED} \in Z_{ED}} w_{ED}(\mathbf{n}_{VA}), \quad (15)$$
$$\bar{s}_{ED}(\mathbf{n}_{VA}) = \prod_{s \in Z_{S}} s_{ED}(\mathbf{n}_{VA}), \quad (16)$$
$$\bar{s}_{NED}(n) = \prod_{s \in Z_{S}} s_{NED}(n), \quad (17)$$

where $Z_{ED}$ is the set of EDs in the query and $Z_{S}$ is the set of songs retrieved by the query-by-similarity. The use of the product of normal distributions guarantees that the final result is still a normal distribution. Since NEDs are modeled independently, there is no need to produce a unique distribution for them.
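As a small numerical sketch of the qualifier re-centring of Eq. (9) for NEDs: the altered concept becomes a 1D normal centred on the rescaled qualifier value 0.1·ψ_w with the fixed σ = 0.2. The verbal-label values are taken from Table III, while the mapping of verbal labels to the four qualifier categories is left implicit here.

```python
from scipy.stats import norm

# Verbal label -> mean value on the 11-point scale of Table III (subset)
QUALIFIER_SCALE = {"slightly": 2.5, "moderately": 5.0, "very": 7.9, "extremely": 9.6}

def requalified_ned(psi_w):
    """Eq. (9): altered NED concept, a normal centred on 0.1 * psi_w with the
    experimentally set standard deviation of 0.2."""
    return norm(loc=0.1 * psi_w, scale=0.2)

w_very_groovy = requalified_ned(QUALIFIER_SCALE["very"])
print(w_very_groovy.mean())  # 0.79: "very groovy" concentrates evidence near the groovy bound (n = 1)
```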
D. Overall Score Computation

An overall score for each song is needed to produce the resulting ranked list; it is the conjunction of the text-based query score and the query-by-similarity score. The text-based query score is proportional to the probability of matching the query and is computed as the posterior probability given the ED, NED and similarity models. The ED and NED scores for a song $s$ are computed as

$$\xi^{s}_{ED} = \bar{w}_{ED}(\boldsymbol{\mu}^{s}_{VA})\,P(s_{ED}), \quad (18)$$

$$\xi^{s}_{NED} = \left[\prod_{d \in Z_{NED}} w_{d}(\mu^{s}_{d})\,P(s_{d})\right]^{\frac{1}{|Z_{NED}|}}, \quad (19)$$

where $P(s_{ED})$ and $P(s_{d})$ are the a-priori probabilities of $s$, $Z_{NED}$ is the set of NEDs in the query, and $|Z_{NED}|$ its cardinality.
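A sketch of how Eqs. (18) and (19) can be evaluated is given below; the distribution objects, dictionary keys and the assumption that the priors are already normalized to [0.8, 1] are illustrative choices, not the paper's actual data structures.

```python
import numpy as np

def ed_score(w_ED_joint, song):
    """Eq. (18): the joint ED query distribution evaluated at the song's mean
    VA annotation, weighted by the song's a-priori probability."""
    return w_ED_joint.pdf(song["mu_VA"]) * song["prior_ED"]

def ned_score(w_NED, song, z_NED):
    """Eq. (19): geometric mean, over the NEDs mentioned in the query (z_NED),
    of the query densities at the song's per-descriptor means times the
    corresponding priors."""
    terms = [w_NED[d].pdf(song["mu_NED"][d]) * song["prior_NED"][d] for d in z_NED]
    return float(np.prod(terms)) ** (1.0 / len(terms)) if terms else 1.0
```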

As the strength of the annotation consensus is related to the variance, $P(s_{ED})$ and $P(s_{d})$ are inversely proportional to $\Sigma^{s}_{VA}$ and $\sigma^{s}_{d}$. In order to cushion the impact of the annotation consensus on the posterior probability, $P(s_{ED})$ and $P(s_{d})$ are normalized to $[0.8, 1]$. As no consensus is available for the Tempo descriptor, its a-priori probability is set to 1.

As for the query-by-similarity score for a song $s$, this can be computed as the product of the song-similarity ED and NED scores:

$$\xi^{s}_{S} = \xi^{s}_{ED,S}\,\xi^{s}_{NED,S}, \quad (20)$$

where $\xi^{s}_{ED,S}$ and $\xi^{s}_{NED,S}$ are computed similarly to Eqs. (18) and (19), using $s$ instead of $w$. The qualifiers more and less are applied only to query-by-similarity and represent a constraint on the set of songs to consider. Given the $d$-th NED descriptor or the ED descriptor $w_{ED}$ that the qualifier is applied to, scores for the qualifier more are computed only for songs satisfying $\mu^{s}_{d} > \mu^{\hat{s}}_{d}$ or $w_{ED}(\boldsymbol{\mu}^{s}_{VA}) > w_{ED}(\boldsymbol{\mu}^{\hat{s}}_{VA})$ for each $\hat{s} \in Z_{S}$. The qualifier less is the dual of more. In order to make scores from different spaces comparable, the overall score for $s$ is computed as the geometric mean of the partial scores:

$$\xi^{s} = \sqrt[3]{\xi^{s}_{S}\,\xi^{s}_{NED}\,\xi^{s}_{ED}}. \quad (21)$$

III. IMPLEMENTATION

The ANEW [9] dataset of affective words includes over 2000 terms. However, many of them are not strictly related to mood, but rather to application contexts (e.g., school or city). We filtered the ANEW terms using the WordNet-Affect lexical database of the English language [14], a subset of the WordNet database [15], to retain only the concepts that are strictly related to mood.

One of the main issues in building a semantic music search engine is collecting a representative set of songs annotated with high-level descriptors. In [16], the authors collected a set of 240 excerpts of 15 seconds each. Each excerpt is annotated in the VA plane for each second. We averaged the annotations related to each song (over seconds and over all testers) to produce the mean and the standard deviation. We expanded the dataset by adding annotations for NEDs through an online listening test. Five excerpts among the 240 songs were randomly proposed, and testers were asked to rate each of the descriptors on a 9-point Likert scale, except for Tempo, which is not manually annotated. Ratings in the range [1, ..., 4] assign a graded prevalence to the first concept, ratings in the range [6, ..., 9] to the second, and 5 asserts no preference. 166 people completed the test. In order to clean the set of annotations from possible outliers, we applied the Modified Z-score (MZ-score) algorithm [17]. Excerpts that collected fewer than three ratings for a descriptor after the MZ-score outlier analysis were discarded. We obtained annotations for 130 songs. For each Non-Emotional Descriptor, the mean and standard deviation were finally computed. In order to provide NED annotations for the remaining 110 songs in the dataset, an automatic annotation system was applied. We used the 130 annotated excerpts to train a set of linear regressors and a set of robust linear regressors (one for each descriptor) [18]. The linear regressors exhibited the best performance, hence we used them for the annotation. We consider the root-mean-square error as the standard deviation of the annotation. As far as the extraction of Tempo information from songs is concerned, we used a VAMP plugin for Sonic Annotator, which is based on [19]. We then manually corrected the wrongly estimated tempos.
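A minimal sketch of the outlier filtering applied to the listening-test ratings is shown below, using the commonly cited form of the Modified Z-score; the 0.6745 constant and the 3.5 threshold are the standard choices from the statistics literature, not values stated in the paper, which only cites [17].

```python
import numpy as np

def modified_z_outliers(ratings, threshold=3.5):
    """Flag outlier ratings with the Modified Z-score, 0.6745 * (x - median) / MAD;
    values whose absolute score exceeds the threshold are marked as outliers."""
    x = np.asarray(ratings, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        return np.zeros_like(x, dtype=bool)   # no spread: nothing is flagged
    return np.abs(0.6745 * (x - med) / mad) > threshold

# Illustrative 9-point Likert ratings for one descriptor of one excerpt.
ratings = [7, 8, 7, 6, 8, 1]
kept = np.asarray(ratings)[~modified_z_outliers(ratings)]   # drops the stray rating of 1
```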
We used the Stanford Natural Language Processing parser [20] to analyze the query into a semantic tree. The Stanford parser is based on a Probabilistic Context-Free Grammar (PCFG). Qualifiers are also checked in their -er form. The parser is also used in query-by-similarity to identify titles and authors. In order to make the system robust against typos, we use the Jaccard similarity metric [17] to compare the found authors and titles with those in the database. The system also performs a synonym analysis using the Natural Language Toolkit (NLTK) [21] to be robust to terms missing from the ED and NED concept spaces. The ranked list of retrieved songs is presented in playlist form.

IV. EXPERIMENTAL RESULTS AND EVALUATIONS

The method that we propose in this study is based on a semantic description of songs using a large set of concepts and qualifiers. Unfortunately, no ground truth is available and, due to the complexity of the model, producing one is a hard process that goes beyond the scope of this contribution. For this reason, the system has been evaluated through a subjective test. 53 tests were collected in two phases. In the first phase, testers were asked to rate the quality of the ranked list produced by five predefined queries. In the second phase, testers were asked to evaluate the general performance through free use of the system. Evaluations were given on a 9-point Likert scale. Testers were categorized according to their musical knowledge; no substantial differences emerged between categories, which is why merged results are presented in this Section.

A. Pre-defined Queries

We chose representative queries aimed at testing all the functionalities of the system. A summary of the evaluations is listed in Table IV. In general, subjects gave a positive evaluation: 4 of the 5 tests reached a mode value of 8 with a small standard deviation.

B. General Evaluation

An overview of the evaluation concerning the general performance of the system is shown in Table V. The mode of the rating is 7, agreed upon by 46.67% of testers, while only 19% of the testers gave the experiments an evaluation below 5. The idea of a music search engine based on semantic natural-language queries was widely appreciated: 32% of testers considered the system useful and assigned it a top mark of 9. 79% stated that they would use this kind of system and, in particular, 26% assigned it a top rating.

TABLE IV. EVALUATION FOR THE PREDEFINED QUERIES (Query, Mode, Mean, Std)
I want a very groovy and happy song
I want a not happy at all, dull and flowing song
I want a playlist that sounds angry, fast and rough
I would like to listen to calm, flowing and slow songs like Orinoco Flow
I want a playlist not angry, not stuttering and with a slow tempo

TABLE V. EVALUATION FOR THE GENERAL ASPECTS OF THE SYSTEM (Question, Mode, Mean, Std)
Please indicate the general evaluation of the results obtained when using free queries
Do you think this system is useful?
Would you ever use this kind of system?
How do you evaluate the system in general?

The standard deviation of 2.02 is explained by a certain reluctance to use and learn new tools; this conclusion is based on numerous comments spontaneously left by testers. Finally, subjects were asked to provide a global evaluation of the system concerning the results, the idea, the functionalities, the usefulness and the potential. 90% evaluated this work and its potential positively; 7 is the mode, agreed upon by 42% of testers. Subjects seemed positively impressed by this type of system. A histogram of the collected evaluations is shown in Fig. 5.

Fig. 5. Histogram of evaluation rates concerning the general concept of the system.

V. CONCLUSIONS

We proposed a music search engine based on textual natural-language queries using emotional descriptors, non-emotional descriptors and semantic song similarity. We used a dimensional approach to term conceptualization to provide a degree of intensity in the music description. This allows us to use qualifiers to alter the semantics of the related concepts. The adopted parsing solution relies on Natural Language Processing techniques. Concepts and songs are modeled as probability distributions in the emotional and non-emotional spaces. The ranked list of songs is obtained by computing the final score for each song as a posterior probability, based on Bayesian Decision Theory. We finally collected subjective evaluations of a prototype of the system. The subjective tests returned good performance evaluations, with promising results for future developments.

REFERENCES

[1] H. H. Kim, "A semantically enhanced tag-based music recommendation using emotion ontology," in Proceedings of the 5th Asian Conference on Intelligent Information and Database Systems - Volume Part II, 2013.
[2] A. L.-C. Wang, "An industrial-strength audio search algorithm," in Proceedings of the 4th International Conference on Music Information Retrieval.
[3] M. Slaney, "Semantic-audio retrieval," in 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 4.
[4] D. Turnbull, L. Barrington, D. Torres, and G. Lanckriet, "Towards musical query-by-semantic-description using the CAL500 data set," in SIGIR 2007 Proceedings, Session 18: Music Retrieval, 2007.
[5] R. Dale, H. L. Moisl, and H. L. Somers, Handbook of Natural Language Processing. CRC Press.
[6] J. A. Russell, "A circumplex model of affect," Journal of Personality and Social Psychology, vol. 39, no. 6, 1980.
[7] L. Barrington, R. Oda, and G. Lanckriet, "Smarter than genius? Human evaluation of music recommender systems," in Proc. International Symposium on Music Information Retrieval.
[8] L. Chiarandini, M. Zanoni, and A. Sarti, "A system for dynamic playlist generation driven by multimodal control signals and descriptors," in Multimedia Signal Processing (MMSP), 2011 IEEE 13th International Workshop on, 2011.
[9] M. M. Bradley and P. J. Lang,
"Affective norms for English words (ANEW): Instruction manual and affective ratings," NIMH Center for the Study of Emotion and Attention, Tech. Rep.
[10] M. Lesaffre, L. D. Voogdt, M. Leman, B. D. Baets, H. D. Meyer, and J. P. Martens, "How potential users of music search and retrieval systems describe the semantic quality of music," Journal of the American Society for Information Science and Technology, vol. 59, no. 5.
[11] L. M. Zbikowski, "Modelling the groove: Conceptual structure and popular music," Journal of the Royal Musical Association, vol. 129, no. 2.
[12] J. Cu, R. Cabredo, R. Legaspi, and M. Suarez, "On modelling emotional responses to rhythm features," in PRICAI 2012: Trends in Artificial Intelligence, Lecture Notes in Computer Science, vol. 7458.
[13] B. Rohrmann, "Verbal qualifiers for rating scales: Sociolinguistic considerations and psychometric data," Project Report, University of Melbourne, Australia, Tech. Rep.
[14] C. Strapparava and A. Valitutti, "WordNet-Affect: An affective extension of WordNet," in Proceedings of LREC, vol. 4, 2004.
[15] G. A. Miller, "WordNet: A lexical database for English," vol. 38, no. 11.
[16] Y. E. Kim, E. Schmidt, and L. Emelle, "Moodswings: A collaborative game for music mood label collection," in Proceedings of the International Symposium on Music Information Retrieval, 2008.
[17] P.-N. Tan, Introduction to Data Mining, 2nd ed. Addison-Wesley.
[18] E. M. Schmidt, D. Turnbull, and Y. E. Kim, "Feature selection for content-based, time-varying musical emotion regression," in Proc. ACM SIGMM International Conference on Multimedia Information Retrieval, Philadelphia, PA.
[19] M. E. Davies and M. D. Plumbley, "Context-dependent beat tracking of musical audio," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 3, 2007.
[20] D. Klein and C. D. Manning, "Accurate unlexicalized parsing," in Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics - Volume 1, 2003.
[21] E. Loper and S. Bird, "NLTK: The Natural Language Toolkit," in Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics - Volume 1, 2002.


More information

Automatic Identification of Instrument Type in Music Signal using Wavelet and MFCC

Automatic Identification of Instrument Type in Music Signal using Wavelet and MFCC Automatic Identification of Instrument Type in Music Signal using Wavelet and MFCC Arijit Ghosal, Rudrasis Chakraborty, Bibhas Chandra Dhara +, and Sanjoy Kumar Saha! * CSE Dept., Institute of Technology

More information

Detecting Musical Key with Supervised Learning

Detecting Musical Key with Supervised Learning Detecting Musical Key with Supervised Learning Robert Mahieu Department of Electrical Engineering Stanford University rmahieu@stanford.edu Abstract This paper proposes and tests performance of two different

More information

WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY

WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY (Invited Paper) Anne Aaron and Bernd Girod Information Systems Laboratory Stanford University, Stanford, CA 94305 {amaaron,bgirod}@stanford.edu Abstract

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Musical Acoustics Session 3pMU: Perception and Orchestration Practice

More information

Music Information Retrieval. Juan P Bello

Music Information Retrieval. Juan P Bello Music Information Retrieval Juan P Bello What is MIR? Imagine a world where you walk up to a computer and sing the song fragment that has been plaguing you since breakfast. The computer accepts your off-key

More information

Music Similarity and Cover Song Identification: The Case of Jazz

Music Similarity and Cover Song Identification: The Case of Jazz Music Similarity and Cover Song Identification: The Case of Jazz Simon Dixon and Peter Foster s.e.dixon@qmul.ac.uk Centre for Digital Music School of Electronic Engineering and Computer Science Queen Mary

More information

Citation Proximity Analysis (CPA) A new approach for identifying related work based on Co-Citation Analysis

Citation Proximity Analysis (CPA) A new approach for identifying related work based on Co-Citation Analysis Bela Gipp and Joeran Beel. Citation Proximity Analysis (CPA) - A new approach for identifying related work based on Co-Citation Analysis. In Birger Larsen and Jacqueline Leta, editors, Proceedings of the

More information

Improving MeSH Classification of Biomedical Articles using Citation Contexts

Improving MeSH Classification of Biomedical Articles using Citation Contexts Improving MeSH Classification of Biomedical Articles using Citation Contexts Bader Aljaber a, David Martinez a,b,, Nicola Stokes c, James Bailey a,b a Department of Computer Science and Software Engineering,

More information

A repetition-based framework for lyric alignment in popular songs

A repetition-based framework for lyric alignment in popular songs A repetition-based framework for lyric alignment in popular songs ABSTRACT LUONG Minh Thang and KAN Min Yen Department of Computer Science, School of Computing, National University of Singapore We examine

More information

The relationship between properties of music and elicited emotions

The relationship between properties of music and elicited emotions The relationship between properties of music and elicited emotions Agnieszka Mensfelt Institute of Computing Science Poznan University of Technology, Poland December 5, 2017 1 / 19 Outline 1 Music and

More information

Research Article. ISSN (Print) *Corresponding author Shireen Fathima

Research Article. ISSN (Print) *Corresponding author Shireen Fathima Scholars Journal of Engineering and Technology (SJET) Sch. J. Eng. Tech., 2014; 2(4C):613-620 Scholars Academic and Scientific Publisher (An International Publisher for Academic and Scientific Resources)

More information

POLITECNICO DI TORINO Repository ISTITUZIONALE

POLITECNICO DI TORINO Repository ISTITUZIONALE POLITECNICO DI TORINO Repository ISTITUZIONALE MoodyLyrics: A Sentiment Annotated Lyrics Dataset Original MoodyLyrics: A Sentiment Annotated Lyrics Dataset / Çano, Erion; Morisio, Maurizio. - ELETTRONICO.

More information

BIBLIOGRAPHIC DATA: A DIFFERENT ANALYSIS PERSPECTIVE. Francesca De Battisti *, Silvia Salini

BIBLIOGRAPHIC DATA: A DIFFERENT ANALYSIS PERSPECTIVE. Francesca De Battisti *, Silvia Salini Electronic Journal of Applied Statistical Analysis EJASA (2012), Electron. J. App. Stat. Anal., Vol. 5, Issue 3, 353 359 e-issn 2070-5948, DOI 10.1285/i20705948v5n3p353 2012 Università del Salento http://siba-ese.unile.it/index.php/ejasa/index

More information