VECTOR REPRESENTATION OF EMOTION FLOW FOR POPULAR MUSIC

Chia-Hao Chung and Homer Chen
National Taiwan University

ABSTRACT

The flow of emotion expressed by music through time is a useful feature for music information indexing and retrieval. In this paper, we propose a novel vector representation of emotion flow for popular music. It exploits the repetitive verse-chorus structure of popular music and connects a verse (represented by a point) and its corresponding chorus (another point) in the valence-arousal emotion plane. The proposed vector representation gives users a visual snapshot of the emotion flow of a popular song in an intuitive and instant manner, making it more effective than the point and curve representations of music emotion flow. Because many other genres also have repetitive music structure, the vector representation has a wide range of applications.

Index Terms: Affective content, emotion flow, music emotion representation, music structure.

1. INTRODUCTION

It is commonly agreed that music listening is an appealing experience for most people because music evokes emotion in listeners. As the emotion conveyed by music is important to music listening, there is a strong need for effective extraction and representation of music emotion from the music organization and retrieval perspective. This paper focuses on music emotion representation.

A typical approach to music emotion representation condenses the entire emotion flow of a song into a single emotion. This approach is adopted by most music emotion recognition (MER) systems [1]-[3]. It works by selecting a certain segment from the song and mapping the musical features extracted from the segment to a single emotion. The emotion representation is either a label, such as happy, angry, sad, or relaxed, or the coordinates of a point in, for example, the valence-arousal (VA) emotion plane [4]. The former is a categorical representation, while the latter is a dimensional representation [5]. A user can query songs through either form of single-point music emotion representation, and a music retrieval system responds to the query with songs that match the emotion specified by the user [6], [7].

However, the emotion of a music piece varies as it unrolls in time [8]. This dynamic nature has not been fully explored for music emotion representation, perhaps because the emotion flow of music is difficult to qualify or quantify in data collection and model training [1]. The work that comes closest is music emotion tracking [9]-[12], which generates a sequence of points at regular intervals to form an affect curve in the emotion plane [13]. Four examples are shown in Fig. 1, where each curve is generated by dividing a full song into 30-second segments with a 10-second hop size and by predicting the VA values of all segments. Each curve depicts the emotion of a song from the beginning to the end. We can see that the variation of music emotion can be quite complex and that a point representation cannot properly capture the dynamics of music emotion. The representation of emotion flow for music should be easy to visualize, yet sufficiently informative to convey the dynamics of music emotion.

Fig. 1. Affect curves of four songs in the VA plane, where diamonds indicate the beginning and circles indicate the end of the songs. The black curve is Smells Like Teen Spirit by Nirvana. The blue curve is Are We the Waiting by Green Day. The green curve is Dying in the Sun by The Cranberries. The red curve is Barriers by Aereogramme.
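For concreteness, the following minimal sketch (not the authors' code) shows how such an affect curve could be computed, assuming a feature extractor and two trained VA regressors of the kind described in Section 4 are available; extract_features and the two model objects are hypothetical placeholders.

```python
# Minimal sketch: build an affect curve by sliding a 30-second window with a
# 10-second hop over a song and predicting one (valence, arousal) point per segment.
# `extract_features`, `valence_model`, and `arousal_model` are assumed to exist.
import numpy as np
import librosa

SEGMENT_SEC, HOP_SEC = 30.0, 10.0

def affect_curve(audio_path, extract_features, valence_model, arousal_model):
    y, sr = librosa.load(audio_path, sr=22050, mono=True)
    seg_len, hop_len = int(SEGMENT_SEC * sr), int(HOP_SEC * sr)
    curve = []
    for start in range(0, max(1, len(y) - seg_len + 1), hop_len):
        segment = y[start:start + seg_len]
        x = extract_features(segment, sr)       # 1-D feature vector for one segment
        v = valence_model.predict([x])[0]       # valence in [-1, 1]
        a = arousal_model.predict([x])[0]       # arousal in [-1, 1]
        curve.append((v, a))
    return np.asarray(curve)                    # shape: (num_segments, 2)
```

The resulting array of (valence, arousal) points can be plotted directly to obtain curves like those in Fig. 1.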
The conventional point representation of music emotion is the simplest one; however, it does not contain any dynamic information about music emotion. On the other hand, the affect curve can fairly show the dynamics of music emotion, but it is too complex for users to specify. Clearly, simplicity and informativeness are two competing criteria, and a certain degree of tradeoff between them is necessary in practice. It has been reported that the emotion expressed by a music piece has to do with music structure. Schubert et al. [14] showed that music emotion flow can be attributed to changes of music structure. Yang et al. [15] reported that the boundaries between contrasting segments of a music piece exhibit rapid changes of VA values.

Fig. 2. (a) Music structure of Smells Like Teen Spirit by Nirvana. (b) The arousal values and (c) the valence values of all 30-second segments of the song.

Wang et al. [16] showed that exploiting the music structure of popular music for segment selection improves the performance of an MER system. For popular music, the music structure usually consists of a number of repetitive musical sections [17]. Each musical section refers to a song segment that has its own musical role, such as verse or chorus. As shown in Fig. 2, popular music typically has a repetitive verse-chorus structure, and its emotion flow changes significantly during the transition between verse and chorus sections. The burgeoning evidence of the strong relation between music structure and emotion flow motivates us to develop an effective representation of emotion flow for music retrieval.

The proposed emotion flow representation of a song is a vector in the VA emotion plane, pointing from the emotion of a verse to the emotion of its corresponding chorus. This representation is simple and intuitive, which is made possible by exploiting the repetitive music structure of popular music. We focus on popular music in this paper because it has perhaps the largest daily user base and because its structure normally falls within a finite set of well-known patterns [18]-[22].

In summary, the primary contributions of this paper are as follows. A study of the music structure of popular music, such as pop, R&B, and rock songs, is conducted to demonstrate its repetitive property (Section 2). A novel vector representation of emotion flow for popular music is proposed, and a comprehensive comparison of the proposed vector representation with the point and curve representations is presented (Sections 3 and 4). A performance study is conducted to demonstrate the accuracy and effectiveness of the vector representation in capturing the emotion flow of a song (Section 5).

2. MUSIC STRUCTURE OF POPULAR MUSIC

Music is an art form of organized sounds. A popular song can be divided into a number of musical sections, such as introduction (intro), verse, chorus, bridge, instrumental solo, and ending (outro) [18]. Such sections are arranged (possibly repeatedly) in a particular pattern referred to as musical form. Recovering the musical form is called music structure analysis and can be considered a segmentation process that detects the temporal position and duration of each section [19]. Here, we briefly review the common musical sections and their musical roles.

Intro and outro indicate the beginning and ending sections, respectively, of a song and usually contain only instrumental sounds without singing voice and lyrics. However, not every song has an intro or an outro. For example, composers may place a verse or a chorus at the beginning or end of a song to make it sound special. The sections corresponding to verse or chorus normally express a flow of emotion as the music unfolds. The verse usually has low energy, and it is where the story of the song is narrated. Compared to the verse, the chorus is emotive and leaves a significant impression on listeners [20].
Other structural elements, such as the bridge and the instrumental solo, are optional and function as transitional sections that avoid monotonous composition and add color to the song. A bridge is a transition between other types of sections, and an instrumental solo is a transitional section consisting predominantly of instrumental sounds.

To investigate music structure, we conduct an analysis of NTUMIR-60, a dataset consisting of 60 English popular songs [23]. Because state-of-the-art automatic music structure analysis is not as accurate as expected [19], [21], we perform the analysis manually. The results are shown in Table 1. We can see that verse and chorus indeed make up a large portion of a song and on average appear 3.13 and 2.37 times per song, respectively. This is consistent with the finding by musicologists that the verse-chorus form is widely used by song writers of popular music [20]. It also suggests that verse and chorus are the most memorable sections of a song [22] and represent its main affect. The corresponding emotion flow gives listeners an affective sensation.

Table 1. Music structure statistics of the 60 English popular songs of the NTUMIR-60 dataset: times per song and proportion of the song for the intro, verse, chorus, other, and outro sections.
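As an illustration of how such statistics can be derived from manual annotations, the following sketch assumes a simple, hypothetical annotation format (one list of labeled time spans per song); it is not the tooling used to produce Table 1.

```python
# Minimal sketch: given manual section annotations for each song, compute
# Table 1 style statistics, i.e., how many times each section type occurs per
# song and what proportion of the song's duration it covers.
from collections import defaultdict

# One song = list of (section_label, start_sec, end_sec) tuples, e.g.
# [("intro", 0.0, 12.5), ("verse", 12.5, 40.0), ("chorus", 40.0, 65.0), ...]
def structure_statistics(annotated_songs):
    counts, proportions = defaultdict(list), defaultdict(list)
    for sections in annotated_songs:
        song_len = max(end for _, _, end in sections)
        per_song_count, per_song_dur = defaultdict(int), defaultdict(float)
        for label, start, end in sections:
            per_song_count[label] += 1
            per_song_dur[label] += end - start
        for label in per_song_count:
            counts[label].append(per_song_count[label])
            proportions[label].append(per_song_dur[label] / song_len)
    # Average over all songs (songs lacking a section type contribute zero).
    n = len(annotated_songs)
    return {label: (sum(counts[label]) / n, sum(proportions[label]) / n)
            for label in counts}
```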

3. MUSIC EMOTION REPRESENTATION

In either the categorical or the dimensional approach, the typical representation of music emotion represents the affective content of a song by a single emotion. The categorical approach describes emotion using a finite number of discrete affective terms [24], [25], whereas the dimensional approach defines emotion in a continuous space, such as the VA plane [26], [27]. In this section, we first review the point and curve representations in the dimensional approach and then present the vector representation in detail.

3.1. Point and curve representations

In the dimensional approach, the VA values of a music segment can be predicted from its extracted features through a regression formulation of the MER problem [26]. The emotion of the music segment is then represented by a point in the VA plane. Given all the music segments of a song, one may select one of them to represent the whole song. This gives rise to the single-point representation of music emotion in the VA plane, and a user only has to specify the coordinates of a point in the VA plane to retrieve the corresponding song. Although this method provides an intuitive way for music retrieval, as discussed in Section 1, it is impossible to represent the emotion flow of a whole song by a single point in the VA plane. In addition, which music segment best represents the entire song is difficult to determine automatically.

By dividing a song into a number of segments and predicting the VA values of each segment [11], [12], the collection of VA points forms an affect curve of the song in the VA plane. One may also represent the valence and arousal of the song separately, each as a function of time. Although such affect curves can indeed show the emotion flow of a song, the representation is too complex to be adopted in a music retrieval system, because most users are unable to precisely specify the affect curve of a song even if it is a familiar one. In addition, how to measure the similarity (or distance) between two affect curves of different lengths is an open issue. Therefore, a simpler approach is desirable.

3.2. Vector representation

By exploiting the repetitive music structure of popular music, we can represent the characteristics of emotion flow in a much simpler way than the affect curve representation. As discussed in Section 2, the verse-chorus form is a common music structure of popular music and has a strong relation to the emotion flow of a song. Therefore, we leverage it to construct the emotion flow representation of a song. The resulting representation is a vector pointing from a verse to its corresponding chorus in the VA emotion plane, as illustrated in Fig. 3.

Fig. 3. Illustration of the vector representation of music emotion flow. The two terminals of the vector represent a verse and its corresponding chorus in the VA plane.

Besides the positional information of the verses and choruses in the VA plane, the vector representation indicates the direction and strength of the emotion flow of a song. Therefore, the vector representation is more informative than the point representation. Since the two terminals of a vector represent the emotions of a verse and its corresponding chorus, this representation is also more intuitive and simpler to use than the affect curve, which does not explicitly present the structural information of a song. Indeed, the vector representation expresses the main emotion flow of a song characterized by the verse-chorus form. Table 2 shows a qualitative comparison of the point representation, the affect curve representation, and the proposed vector representation. We can see that the vector representation of emotion flow is novel, simple, and intuitive.

Table 2. A comparison of the point, the curve, and the proposed vector representations ("Yes" means the representation carries that type of information).
                         Point    Curve    Vector
Locational information   Yes      Yes      Yes
Dynamic information      No       Yes      Yes
Structural information   No       No       Yes
Complexity               Low      High     Medium
Users can easily search for songs by specifying a vector in the VA plane as the query, and a music retrieval system can quickly respond to the query according to the proximity of each candidate song to the query vector. In practice, a set of candidate songs can be generated and ordered according to this proximity when presented to the user. With this representation of music emotion flow, many innovative music retrieval mechanisms can be developed to match the needs of a specific application. Although we focus on popular music in this paper, the repetitive property of music structure can also be found in other genres, such as the sonata form and the rondo form of classical music [28]. The vector representation is suitable for visualizing the emotion flow of such music as well.
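The paper does not fix a concrete proximity measure for retrieval; as one possible reading, the sketch below combines the endpoint (Euclidean) distance and the angular (cosine) difference that Section 5 uses for evaluation. The weighting and the combination are illustrative assumptions, not the authors' specification.

```python
# Minimal sketch: rank candidate songs by how close their emotion-flow vectors
# are to a query vector. Each song is represented as (verse_va, chorus_va),
# two 2-D points in the VA plane. The proximity measure below is an assumption.
import numpy as np

def vector_proximity(query, candidate, angle_weight=0.5):
    (qv, qc), (cv, cc) = query, candidate
    q_dir, c_dir = np.subtract(qc, qv), np.subtract(cc, cv)
    endpoint_dist = (np.linalg.norm(np.subtract(qv, cv))
                     + np.linalg.norm(np.subtract(qc, cc)))
    cos_sim = np.dot(q_dir, c_dir) / (
        np.linalg.norm(q_dir) * np.linalg.norm(c_dir) + 1e-9)
    return endpoint_dist + angle_weight * (1.0 - cos_sim)   # smaller = more similar

def rank_songs(query, library):
    """library: dict of song_id -> (verse_va, chorus_va)."""
    return sorted(library, key=lambda sid: vector_proximity(query, library[sid]))
```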

4. IMPLEMENTATION

The MER system described in [26] serves as the platform to generate the VA values of musical sections (segments). The MER system consists of two main steps, as shown in Fig. 4. The first step performs regression model training, and the second step takes musical sections as inputs and generates their VA values. The details of regression model training and vector representation generation are described in this section.

Fig. 4. Overview of an MER system.

4.1. Regression model training

Adopting the dimensional approach for MER, we define valence and arousal as real values in [-1, 1] and formulate the prediction of VA values as a regression problem. Denote the input training data by (x_i, y_i), where 1 ≤ i ≤ N, x_i is the feature vector of the ith input, and y_i is the real value to be predicted for the ith input. A regression model (regressor) is trained by minimizing the mean squared difference between the prediction and the annotated value [26].

The dataset NTUMIR-60, which is composed of 60 English popular songs, is used for training and testing. For fair comparison, each song is converted to a uniform format (22,050 Hz, 16-bit, mono PCM WAV) and normalized to the same volume level. Then, each song is manually trimmed to a 30-second segment for the subjective test and feature extraction. In the subjective test, each segment is annotated by 40 participants, and the mean of the annotated VA values is used as the ground truth of the segment. The MIRToolbox [29] is then applied to extract 177 features covering five types of acoustic features: two dynamic features (the mean and the standard deviation of root-mean-squared energy), five rhythmic features (fluctuation peak, fluctuation centroid, tempo, pulse clarity, and event density), 142 spectral features (the mean and the standard deviation of centroid, brightness, spread, skewness, kurtosis, rolloff 85%, rolloff 95%, entropy, flatness, roughness, irregularity, 20 MFCCs, 20 delta MFCCs, and 20 delta-delta MFCCs), six timbre features (the mean and the standard deviation of zero crossing rate, low energy, and spectral flux), and 22 tonal features (a 12-bin chromagram concatenated with the mean and the standard deviation of chromagram peak, chromagram centroid, key clarity, HCDF, and mode). The quality of NTUMIR-60 for MER is evaluated and reported in [23].

The regression models of arousal and valence are trained independently. For accuracy, support vector regression (SVR) [30], [31] with a radial basis kernel function is adopted to train the regressors. A grid search is applied to find the best kernel parameter γ and the best penalty parameter C [32], where γ is selected from {10^-4, 10^-3, 10^-2, 10^-1} and C from {1, 10, 10^2, 10^3, 10^4}. To evaluate the performance of the regressors, tenfold cross validation is conducted: the whole dataset is randomly divided into 10 parts, nine of them for training and the remaining one for testing, and the process is repeated 50 times. The average performance in terms of the R-squared value [33] is 0.21 for valence and 0.76 for arousal. This result is comparable to those reported in previous work [23], [26].
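A minimal sketch of this training procedure follows, using scikit-learn's SVR as a stand-in for the LIBSVM setup referenced above; the feature matrix X (one 177-dimensional row per clip) and the annotation vector y (mean valence or arousal per clip) are assumed to be precomputed.

```python
# Minimal sketch: grid search over the RBF-kernel SVR parameters gamma and C,
# then evaluation with repeated tenfold cross validation and the R-squared score.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, RepeatedKFold, cross_val_score

def train_va_regressor(X, y, n_repeats=50):
    param_grid = {"gamma": [1e-4, 1e-3, 1e-2, 1e-1],
                  "C": [1, 10, 100, 1000, 10000]}
    search = GridSearchCV(SVR(kernel="rbf"), param_grid, scoring="r2", cv=10)
    search.fit(X, y)
    # Average R-squared over repeated tenfold cross validation.
    cv = RepeatedKFold(n_splits=10, n_repeats=n_repeats)
    scores = cross_val_score(search.best_estimator_, X, y, scoring="r2", cv=cv)
    return search.best_estimator_, float(np.mean(scores))
```

Valence and arousal regressors would be obtained by calling this function once for each annotation vector.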
4.2. Generating vector representation

The audio segmentation method proposed in [34] is applied to segment each song of the NTUMIR-60 dataset. All verses and choruses are manually selected from each song based on the segmentation result, and their VA values are estimated independently. In our current implementation, the vector representation of the song in the VA plane is generated by connecting the point representing the average verse to the point representing the average chorus.

Fig. 5 shows the resulting vector representations of all songs of the NTUMIR-60 dataset. We can see that each vector clearly describes the emotion flow of a song. For example, a vector in the first quadrant pointing toward the upper right corner indicates that the corresponding song drives listeners toward a positive and exciting feeling, whereas a vector in the second quadrant pointing toward the upper left corner indicates that the song drives listeners toward a negative and aggressive mood. We also see that, for most songs, the arousal value of the representative chorus is higher than that of the corresponding verse; that is, the emotion vectors usually point upward. This reflects the fact that the chorus is typically more exciting than its corresponding verse [20].

Fig. 5. The proposed vector representation provides an intuitive visualization of music emotion flow in the VA plane. This chart shows the emotion flows of all the songs in the NTUMIR-60 dataset. Each blue diamond represents the emotion of the verses, and each red circle represents the emotion of the choruses, connected to the corresponding verses by a line segment.
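The averaging step can be summarized by the following sketch, where predict_va is a hypothetical helper that returns the (valence, arousal) pair of one musical section.

```python
# Minimal sketch of the vector generation step: the emotion-flow vector of a song
# connects the average VA point of its verses to the average VA point of its choruses.
import numpy as np

def emotion_flow_vector(verse_segments, chorus_segments, predict_va):
    verse_pts = np.array([predict_va(seg) for seg in verse_segments])
    chorus_pts = np.array([predict_va(seg) for seg in chorus_segments])
    verse_avg = verse_pts.mean(axis=0)      # tail of the vector
    chorus_avg = chorus_pts.mean(axis=0)    # head of the vector
    return verse_avg, chorus_avg            # vector points from verse_avg to chorus_avg
```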

5. EVALUATION

An experiment is conducted to evaluate the effectiveness of the proposed vector representation of music emotion flow in comparison with two ad hoc methods. The effectiveness of a method is measured in terms of the approximation error between the method's output and the emotion flow of a song. All songs of the NTUMIR-60 dataset are considered in this experiment.

As discussed in Section 1, the emotion flow of a song is difficult for a subject to specify; therefore, we use the affect curve generated by MER as the ground truth. Specifically, the affect curve of each song is generated by dividing the full song into 30-second segments with a 10-second hop size and by predicting the VA values of all segments. Then, a k-means algorithm [35] is applied to partition the collection of VA points into two clusters. The center points of these two clusters are used as the reference to calculate the approximation error of the proposed vector representation and to compare it with that of the two ad hoc methods. The first ad hoc method randomly selects two 30-second segments from a song and constructs a vector representation from them. The second ad hoc method (F30L30) selects the first segment from the 30th to the 60th second of a song and the second segment from the last 60th to the last 30th second of the song. The VA values of the two selected segments are predicted independently.

Two distance measures are considered: Euclidean distance and cosine similarity [36]. The former computes the difference of two vectors in length, and the latter computes the angular difference of two vectors. The experimental results are shown in Table 3. Note that the process of randomly selecting two segments from a song is repeated 100 times, and the average results are presented in the first column of Table 3. Compared with the two ad hoc methods, the vector representation has the smallest approximation error in both Euclidean distance and cosine distance. This shows the effectiveness of the vector representation in capturing the emotion flow of popular music.

Table 3. Results of Euclidean and cosine distances between the ground truth and three different approaches (Random, F30L30, and Vector). F30L30 means that the first segment is from the 30th to the 60th second and the second segment is from the last 60th to the last 30th second of a song. Cosine distance is defined as 1 minus cosine similarity.

In Fig. 6, the vector representation of the emotion flow of each song is plotted together with the affect curve of the song and the emotion of each verse and chorus identified for the song. We can see that most vectors are located in the repetitive region of the affect curves. The dangling parts of an affect curve normally correspond to the intro and outro sections of the song, and hence they are of no concern. We can also see that the verses are located on one side of the affect curve of a song while the choruses are located on the other side. Thus, using the average verse and the average chorus for the vector representation can effectively characterize the affect curve and the emotion flow.

Fig. 6. Most vectors (represented by a diamond-circle pair) generated by our method are in the repetitive region of the affect curves (shown in grey). The hollow diamond represents the emotion of a verse, and the hollow circle represents the emotion of a chorus of a song.
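A minimal sketch of this evaluation step is given below; since the paper does not state how the two cluster centers are paired with a candidate vector's endpoints, the pairing convention used here (the center nearer the candidate's tail is treated as the reference tail) is an assumption.

```python
# Minimal sketch: the ground-truth reference is taken from the two k-means cluster
# centers of a song's affect curve, and a candidate vector is scored by Euclidean
# distance and cosine distance (1 - cosine similarity).
import numpy as np
from sklearn.cluster import KMeans

def approximation_error(affect_curve, candidate_tail, candidate_head):
    centers = KMeans(n_clusters=2, n_init=10).fit(affect_curve).cluster_centers_
    # Pair the reference centers with the candidate endpoints (assumed convention).
    if (np.linalg.norm(centers[0] - candidate_tail)
            > np.linalg.norm(centers[1] - candidate_tail)):
        centers = centers[::-1]
    ref_vec = centers[1] - centers[0]
    cand_vec = np.asarray(candidate_head) - np.asarray(candidate_tail)
    euclidean = np.linalg.norm(ref_vec - cand_vec)
    cosine_dist = 1.0 - np.dot(ref_vec, cand_vec) / (
        np.linalg.norm(ref_vec) * np.linalg.norm(cand_vec) + 1e-9)
    return euclidean, cosine_dist
```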

6. CONCLUSION

In this paper, we have investigated the repetitive property of music structure and described a novel approach that represents the emotion flow of popular music by a vector in the VA plane. The vector emerges from a representative verse of a song and ends at the corresponding chorus. We have also compared the proposed vector representation with the point and curve representations of music emotion and shown that the proposed method is an intuitive and effective representation of emotion flow for popular music, a property supported by the experimental results. This work is motivated by the increasing need for effective music content representation and analysis in response to explosive content growth. With the proposed vector representation, the proximity of emotion flow between two songs can be easily measured, which is essential to music retrieval, and many innovative music retrieval applications can be developed.

REFERENCES

[1] Y.-H. Yang and H. H. Chen, Music Emotion Recognition, CRC Press.
[2] Y.-H. Yang and H. H. Chen, "Machine recognition of music emotion: A review," ACM Trans. Intell. Syst. Technol., vol. 3, no. 3, article 40.
[3] Y. E. Kim, E. M. Schmidt, R. Migneco, B. G. Morton, P. Richardson, J. Scott, J. A. Speck, and D. Turnbull, "Music emotion recognition: A state of the art review," in Proc. 11th Int. Soc. Music Inform. Retrieval Conf., Utrecht, Netherlands.
[4] J. A. Russell, "A circumplex model of affect," J. Pers. Soc. Psychol., vol. 39, no. 6.
[5] T. Eerola and J. K. Vuoskoski, "A comparison of the discrete and dimensional models of emotion in music," Psychol. Music, vol. 39, no. 1.
[6] X. Zhu, Y.-Y. Shi, H.-G. Kim, and K.-W. Eom, "An integrated music recommendation system," IEEE Trans. Consum. Electron., vol. 53, no. 2.
[7] Y.-H. Yang, Y.-C. Lin, H.-T. Cheng, and H. H. Chen, "Mr. Emo: Music retrieval in the emotion plane," in Proc. ACM Multimedia, Vancouver, Canada.
[8] E. Schubert, "Measurement and time series analysis of emotion in music," Ph.D. dissertation, School of Music & Music Education, University of New South Wales, Sydney, Australia.
[9] L. Lu, D. Liu, and H.-J. Zhang, "Automatic mood detection and tracking of music audio signals," IEEE Trans. Audio, Speech, Language Process., vol. 14, no. 1, pp. 5-18.
[10] M. D. Korhonen, D. A. Clausi, and M. E. Jernigan, "Modeling emotional content of music using system identification," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 3.
[11] R. Panda and R. P. Paiva, "Using support vector machines for automatic mood tracking in audio music," Audio Engineering Soc. Convention 130, London, UK.
[12] E. M. Schmidt, D. Turnbull, and Y. E. Kim, "Feature selection for content-based, time-varying musical emotion regression," in Proc. ACM Int. Conf. Multimedia Inform. Retrieval, Philadelphia, USA.
[13] A. Hanjalic and L.-Q. Xu, "Affective video content representation and modeling," IEEE Trans. Multimedia, vol. 7, no. 1.
[14] E. Schubert, S. Ferguson, N. Farrar, D. Taylor, and G. E. McPherson, "Continuous response to music using discrete emotion faces," in Proc. 9th Int. Symp. Computer Music Modelling and Retrieval, pp. 1-17, London, UK.
[15] Y.-H. Yang, C.-C. Liu, and H. H. Chen, "Music emotion classification: A fuzzy approach," in Proc. ACM Multimedia, Santa Barbara, USA.
[16] X. Wang, Y. Wu, X. Chen, and D. Yang, "Enhance popular music emotion regression by importing structure information," in Proc. Asia-Pacific Signal and Inform. Process. Association Annu. Summit and Conf., pp. 1-4, Kaohsiung, Taiwan.
[17] B. Horner and T. Swiss, Key Terms in Popular Music and Culture, Blackwell Publishing.
[18] N. C. Maddage, C. Xu, M. S. Kankanhalli, and X. Shao, "Content-based music structure analysis with applications to music semantics understanding," in Proc. ACM Multimedia, NY, USA.
[19] J. Paulus, M. Müller, and A.
Klapuri, "Audio-based music structure analysis," in Proc. 11th Int. Soc. Music Inform. Retrieval Conf., Utrecht, Netherlands.
[20] C. Doll, "Rockin' out: Expressive modulation in verse-chorus form," Music Theory Online, vol. 17.
[21] J. B. L. Smith, C.-H. Chuan, and E. Chew, "Audio properties of perceived boundaries in music," IEEE Trans. Multimedia, vol. 16, no. 5.
[22] M. Cooper and J. Foote, "Summarizing popular music via structural similarity analysis," in Proc. IEEE Workshop on Applications of Signal Process. to Audio and Acoustics, New Paltz, NY, USA.
[23] Y.-H. Yang, Y.-F. Su, Y.-C. Lin, and H. H. Chen, "Music emotion recognition: The role of individuality," in Proc. ACM Int. Workshop on Human-centered Multimedia, Augsburg, Bavaria, Germany.
[24] X. Hu, J. S. Downie, C. Laurier, M. Bay, and A. F. Ehmann, "The 2007 MIREX audio mood classification task: Lessons learned," in Proc. 9th Int. Conf. Music Inform. Retrieval, Philadelphia, USA.
[25] C. Laurier, J. Grivolla, and P. Herrera, "Multimodal music mood classification using audio and lyrics," in Proc. IEEE 7th Int. Conf. Machine Learning and Applications, San Diego, California, USA.
[26] Y.-H. Yang, Y.-C. Lin, Y.-F. Su, and H. H. Chen, "A regression approach to music emotion recognition," IEEE Trans. Audio, Speech, Language Process., vol. 16, no. 2.
[27] E. M. Schmidt and Y. E. Kim, "Projection of acoustic features to continuous valence-arousal mood labels via regression," in Proc. 10th Int. Soc. Music Inform. Retrieval Conf., Kobe, Japan.
[28] M. Hickey, "Assessment rubrics for music composition," Music Educators Journal, vol. 85, no. 4.
[29] O. Lartillot and P. Toiviainen, "A MATLAB toolbox for musical feature extraction from audio," in Proc. Int. Conf. Digital Audio Effects, Bordeaux, France.
[30] A. J. Smola and B. Schölkopf, "A tutorial on support vector regression," Stat. Comput., vol. 4, no. 3.
[31] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, article 27.
[32] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, "A practical guide to support vector classification," Technical report, National Taiwan University, 2010.
[33] A. Sen and M. S. Srivastava, Regression Analysis: Theory, Methods, and Applications, Springer Science & Business Media.
[34] J. Foote and M. Cooper, "Media segmentation using self-similarity decomposition," in Proc. SPIE Storage and Retrieval for Multimedia Databases, vol. 5021.
[35] S. P. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inform. Theory, vol. 28, no. 2.
[36] L. Lee, "Measures of distributional similarity," in Proc. 37th Annu. Meeting of the Association for Computational Linguistics, PA, USA, 1999.


WHEN LYRICS OUTPERFORM AUDIO FOR MUSIC MOOD CLASSIFICATION: A FEATURE ANALYSIS WHEN LYRICS OUTPERFORM AUDIO FOR MUSIC MOOD CLASSIFICATION: A FEATURE ANALYSIS Xiao Hu J. Stephen Downie Graduate School of Library and Information Science University of Illinois at Urbana-Champaign xiaohu@illinois.edu

More information

Music Structure Analysis

Music Structure Analysis Overview Tutorial Music Structure Analysis Part I: Principles & Techniques (Meinard Müller) Coffee Break Meinard Müller International Audio Laboratories Erlangen Universität Erlangen-Nürnberg meinard.mueller@audiolabs-erlangen.de

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005.

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005. Wang, D., Canagarajah, CN., & Bull, DR. (2005). S frame design for multiple description video coding. In IEEE International Symposium on Circuits and Systems (ISCAS) Kobe, Japan (Vol. 3, pp. 19 - ). Institute

More information

Discovering Similar Music for Alpha Wave Music

Discovering Similar Music for Alpha Wave Music Discovering Similar Music for Alpha Wave Music Yu-Lung Lo ( ), Chien-Yu Chiu, and Ta-Wei Chang Department of Information Management, Chaoyang University of Technology, 168, Jifeng E. Road, Wufeng District,

More information

Interactive Classification of Sound Objects for Polyphonic Electro-Acoustic Music Annotation

Interactive Classification of Sound Objects for Polyphonic Electro-Acoustic Music Annotation for Polyphonic Electro-Acoustic Music Annotation Sebastien Gulluni 2, Slim Essid 2, Olivier Buisson, and Gaël Richard 2 Institut National de l Audiovisuel, 4 avenue de l Europe 94366 Bry-sur-marne Cedex,

More information

10 Visualization of Tonal Content in the Symbolic and Audio Domains

10 Visualization of Tonal Content in the Symbolic and Audio Domains 10 Visualization of Tonal Content in the Symbolic and Audio Domains Petri Toiviainen Department of Music PO Box 35 (M) 40014 University of Jyväskylä Finland ptoiviai@campus.jyu.fi Abstract Various computational

More information

Figure.1 Clock signal II. SYSTEM ANALYSIS

Figure.1 Clock signal II. SYSTEM ANALYSIS International Journal of Advances in Engineering, 2015, 1(4), 518-522 ISSN: 2394-9260 (printed version); ISSN: 2394-9279 (online version); url:http://www.ijae.in RESEARCH ARTICLE Multi bit Flip-Flop Grouping

More information

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,

More information

A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES

A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES 12th International Society for Music Information Retrieval Conference (ISMIR 2011) A PERPLEXITY BASED COVER SONG MATCHING SYSTEM FOR SHORT LENGTH QUERIES Erdem Unal 1 Elaine Chew 2 Panayiotis Georgiou

More information

EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach

EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach Song Hui Chon Stanford University Everyone has different musical taste,

More information

Wipe Scene Change Detection in Video Sequences

Wipe Scene Change Detection in Video Sequences Wipe Scene Change Detection in Video Sequences W.A.C. Fernando, C.N. Canagarajah, D. R. Bull Image Communications Group, Centre for Communications Research, University of Bristol, Merchant Ventures Building,

More information

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES Jun Wu, Yu Kitano, Stanislaw Andrzej Raczynski, Shigeki Miyabe, Takuya Nishimoto, Nobutaka Ono and Shigeki Sagayama The Graduate

More information

Detection of Panoramic Takes in Soccer Videos Using Phase Correlation and Boosting

Detection of Panoramic Takes in Soccer Videos Using Phase Correlation and Boosting Detection of Panoramic Takes in Soccer Videos Using Phase Correlation and Boosting Luiz G. L. B. M. de Vasconcelos Research & Development Department Globo TV Network Email: luiz.vasconcelos@tvglobo.com.br

More information

Popular Song Summarization Using Chorus Section Detection from Audio Signal

Popular Song Summarization Using Chorus Section Detection from Audio Signal Popular Song Summarization Using Chorus Section Detection from Audio Signal Sheng GAO 1 and Haizhou LI 2 Institute for Infocomm Research, A*STAR, Singapore 1 gaosheng@i2r.a-star.edu.sg 2 hli@i2r.a-star.edu.sg

More information