Automatic Detection of Emotion in Music: Interaction with Emotionally Sensitive Machines


Cyril Laurier, Perfecto Herrera
Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain

Abstract

Creating emotionally sensitive machines will significantly enhance the interaction between humans and machines. In this chapter we focus on enabling this ability for music. Music is extremely powerful for inducing emotions, and if machines can apprehend the emotions in music, they gain a relevant competence for communicating with humans. In this chapter we review the theories of music and emotions. We detail different representations of musical emotions from the literature, together with the related musical features. Then, we focus on techniques to detect the emotion in music from audio content. As a proof of concept, we detail a machine learning method to build such a system. We also review the current state-of-the-art results, provide evaluations and give some insights into the possible applications and future trends of these techniques.

Introduction

Why do people enjoy music? One of the main factors is that music easily induces emotions and affects the listener. Can machines enjoy music as people do? Or, more simply and less philosophically debatable, can we develop systems that are capable of detecting emotions in music and of using this ability to improve human-machine interaction? Stating that music and emotions have a close relationship is no revelation: one significant motivation of composers is to express their sentiments, of performers to induce feelings, and of listeners to feel emotions. There are also findings that show a direct link between audio processing and emotion in the brain, which is not so clear for other sensory modalities. Moreover, music has a noticeable social role and is ubiquitous in everyday life.

To communicate with humans using musical emotions, machines should be able to detect and predict them. Enabling this ability will enhance the communication between the machine and its environment. First, a machine can sense the acoustic scene with a microphone; the surrounding music can then be understood in terms of emotions and the machine can react accordingly. The face of a robot can give explicit feedback about the emotions it detects. Moreover, robots with musical abilities can select, play and even compose music conveying targeted emotions. The technology we detail in the remainder of this chapter enables machines to detect emotions from raw audio material, that is, directly from the digital signal.

In this chapter we expose the main findings about music and emotions, together with techniques from artificial intelligence, and more specifically from machine learning, to create emotionally sensitive machines. The chapter is structured in four parts. In the first section we comment on the relationship between emotion and music and review theories from different fields of expertise. In section 2 we define the machine learning techniques that can be used to create emotion-aware machines; we also detail the methodology and give evaluation results from state-of-the-art research in this area. Then, in section 3, we develop some ideas around emotion-based music assistants. Finally, in the last part, we present some general observations and give future perspectives.

Section 1. Music and emotions: emotion in music & emotions from music

To study the relationship between music and emotion, we have to consider the literature from many fields. Indeed, relevant scientific publications about this topic can be found in psychology, sociology, neuroscience, cognitive science, biology, musicology, machine learning and philosophy. We focus here on works aiming to understand the emotional process in music, and to represent and model the emotional space. We also detail the main results regarding the pertinent musical features and how they can be used to describe and convey emotions.

Why does music convey emotion?

The emotional and expressive properties of musical elements have been studied since the time of ancient Greece (Juslin and Laukka, 2004). The fact that music induces emotions is evident for everyone; however, we do not intuitively apprehend why. Emotions are mostly said to be complex and to involve a complicated combination of cognition, positive or negative feeling changes, appraisal, motivation, autonomic arousal, and bodily action tendency or change in action readiness. One of the first things to clarify is the definition of an emotion and the difference between emotions and moods. The concept of emotion is not simple to define: "Everyone knows what an emotion is, until asked to give a definition" (Fehr and Russell, 1984, p. 464). It could be defined as an intense mental state arousing the nervous system and invoking physiological responses. According to Damasio (1994), emotions are a series of body state changes that are connected to mental images that have activated a given brain subsystem (e.g., the music processing subsystem). So emotions involve physiological reactions, but they are also object-oriented and provoke a categorization of their object: if the emotion is one of fear, its object must be viewed as harmful (Davies, 2001, p. 26). Emotions also induce an attitude towards the object. Moods, in contrast, can be considered as lasting emotional states; they are not object-oriented and reflect quite general feelings. Moods and emotions can be very similar concepts in some cases; for instance, happiness, sadness and anger can be seen as both moods and emotions. However, some emotions, such as surprise, can only be considered as transient.

Understanding how music conveys emotion is not trivial. Kivy (1989) gives two hypotheses. The first is a heard resemblance between the music and the natural expression of an emotion: some musical cues can induce emotions because of their similarity to speech. One example is anger, where loudness and spectral dissonance (derived from frequency ratios and harmonic coincidence in the sound spectrum, and based on psychoacoustic tests) are two components we can find both in an angry voice and in music. However, it might not always be that simple. The second hypothesis Kivy gives is the accumulated connotations that certain musical phenomena acquire in a culture: in that case, we learn in our culture which musical cues correspond to which feeling. Most probably, both hypotheses are valid. Frijda (1987, p. 469) argues for a notion of emotions as action tendencies, where the various emotions humans or animals can have (the various action readiness modes they may experience or show) depend upon what action programs, behavior systems, and activation or deactivation mechanisms the organism has at its disposal.
As pointed out by Nussbaum (2007), this correlates with results in neuroscience from scientists such as Damasio (1994). Grewe et al. (2007) demonstrated that the intensity of the emotion induced by music can vary depending on personal experience and musical background: if musicians know and have studied a piece for a performance, they are more likely to rate the intensity of the emotion higher. This is a self-reinforcement by training. We can also imagine that listening to a musical piece too many times can create the opposite behavior: almost everyone has experienced becoming bored with, or less and less sensitive to, a musical piece they used to love.

Besides, it is important to notice that emotions in music are not restricted to adults or musically trained people; the emotional processing of music starts at an early age. Four-month-old children have a preference for consonant (pleasant) over dissonant (unpleasant) music (Trainor, Tsang and Cheung, 2002). At five years old, they can distinguish between happy and sad music using the tempo (sad = slow, happy = fast), but at six they use information from the mode (sad = minor, happy = major), as adults do (Dalla Bella et al., 2001). Studies in neuroscience, exploiting current brain imaging techniques, also give hints about the emotional processing of music, with some schemas of the brain functions involved (Koelsch et al., 2006). Gosselin et al. (2005) demonstrated that the amygdala, well established to have an important role in the recognition of fear, is determinant in the recognition of scary music. Blood and Zatorre (2001) revealed that music creating highly pleasurable experiences, like shivers down the spine or chills, activates regions of the brain involved in reward and motivation; it is worth noticing that these areas are also active in response to other euphoria-inducing stimuli like food, sex and drugs. Huron (2006) simply states that music making and listening are primarily motivated by pleasure and that the contrary is biologically implausible (p. 373). Meyer (1956) describes the importance of expectation as a tool for the composer to create emotions. This work has been continued and formalized as the ITPRA theory by Huron (2006), where ITPRA stands for the Imagination, Tension, Prediction, Reaction and Appraisal responses. One important way to control the pleasure in a musical piece is to play with this feature, by delaying expected outcomes and fulfilling our expectations. Additional research (Menon and Levitin, 2005) seems to have also found the physical connections between music and mood alteration by means of antidepressants: the latter act on the dopaminergic system, which has one of its main centers in the so-called nucleus accumbens, a brain structure that also receives a dramatic degree of activation when listening to music. These results are coherent with Lazarus (1991), when he argues that emotions are evolutionary adaptations to evoke behaviors that improve chances for survival and procreation, and with Tomkins's (1980) view that emotions can be understood as motivational amplifiers. This links music with survival-related stimuli. Damage to the emotional controls that limits normal emotional behavior is often disastrous for people (Damasio, 1994); moreover, people who did not develop social emotions seem incapable of appreciating music (Sacks and Freeman, 1994). However, this evolutionary adaptation theory can be balanced by the fact that most emotional responses to music are neither used to achieve goals nor practically related to survival issues. This argument is used by researchers who assume that music cannot induce basic survival emotions, but rather music-specific emotions (Scherer and Zentner, 2001, p. 381). Nonetheless, other notable researchers affirm that it is remarkable that any medium could so readily evoke all the basic emotions of our brain (Panksepp and Bernatzky, 2002). This is one of the multiple contradictions we can observe in current research on music and emotions: as pointed out by Juslin and Västfjäll (2008), the literature presents a confusing picture with conflicting views. Nevertheless, there is no doubt that music also induces emotion because of its related context.
It evokes emotions from past events because it is associated in our memory with emotional events. When talking about emotion and music, one important distinction to make is the difference between induced and perceived emotions (Juslin and Laukka, 2004). That is what we define as emotion in music and emotion from music: the former represents the intended emotion and the latter the emotion felt while listening to a musical piece. A typical example of the differentiation between both is the expression of anger: when someone is angry, people might perceive anger but feel scared or defensive. The induced emotion is radically different from the perceived one.

Different factors can influence both types; for instance, the symbolic aspect or the social context of a song will influence the induced emotion more (as for a national anthem). As noticed by Bigand et al. (2005), both aspects are not strictly independent, and there will always be an influence of the induced emotion on someone asked to judge the perceived one. Nevertheless, it should be observed that people tend to agree more on the perceived emotion than on the induced emotion (Juslin and Laukka, 2004). It is also worth noticing that a relevant part of the emotion in songs comes from the lyrics. Psychological studies have shown that part of the semantic information of songs resides exclusively in the lyrics (Besson et al., 1998). This means that lyrics can contain relevant information to express emotions that is not included in the audio. Indeed, Juslin and Laukka (2004) reported that 29% of people mentioned the lyrics as a factor in how music expresses emotions. Although there is an increase in research about the causal links between music and emotion, many open questions remain (Patel, 2007). In addition to the biological substrate, there are important links related to the musical features that are present or absent when perceiving or feeling a given music-related emotion. In section 2, we give some results about these musical features, but first we discuss the different representations of musical emotions that arise from psychological studies.

Emotional Representations

One main issue in making machines emotionally sensitive is to find models of the human representation of emotion in music. In the music psychology literature, there exist two main paradigms to represent emotions. This distinction is quite general and not only about musical emotions, but studies have been designed specifically to test and refine these models for music. The first paradigm is the categorical representation, which distinguishes among several emotion classes. The other is the dimensional representation, which defines an emotional space. We detail here the main theories using both approaches and make explicit the special case of musically-related emotional representations.

Categorical representation

The categorical representation aims to divide emotions into categories, where each emotion is labeled with one or several adjectives. The most canonical model is the concept of basic emotions, where several distinct categories are the basis of all possible emotions. This concept is illustrated by Ekman's basic emotion theory, distinguishing between anger, fear, sadness, happiness and disgust (Ekman, 1992). Nevertheless, other categorical approaches are possible; indeed, many psychologists propose that their emotion adjective set is applicable to music. One of the most relevant works in this domain is the study by Hevner (1936) and her adjective circle shown in figure 1. Hevner's adjective list is composed of 67 words arranged into eight clusters. In this study, each cluster includes adjectives that have a close relationship. This similarity between words of the same cluster enables one to work at the cluster level, reducing the taxonomy to eight categories. Farnsworth (1954) modified Hevner's list into ten clusters. These categories were defined by conducting listening tests and collecting subjective answers. Moreover, we should note that most of these studies were conducted using classical music from the western culture, mainly of the baroque and romantic periods; we can imagine that the emotions evoked by popular music are different.
A problem of the categorical approach is that classifying a musical piece into one or several categories is sometimes rather difficult, as pointed out by Hevner (1936). For instance, in one of her studies, a musical piece (Reflections on the Water by Debussy) was rated as belonging to all the clusters unless a continuous measure was considered. Although it has been argued that a word list cannot describe the variety of possible emotions in music, using a reduced set helps to achieve an agreement between people (even if it conveys less meaning) and offers the possibility for automatic systems to model the general consensus about musical pieces.

Figure 1. Adjectives and clusters, adapted from Hevner (1936)

Figure 2. Circumplex model of affect, adapted from Russell (1980)

Dimensional representation

In a dimensional representation, the emotions are classified along axes. Most of the representations proposed in the literature are inspired by Russell's (1980) circumplex model of affect, using a two-dimensional space spanned by arousal (the activity or excitation of the emotion) and valence (the positivity or negativity of the emotion). In figure 2, we represent this bipolar model with the different adjectives placed in the emotional space. In this two-dimensional space, a point in the upper-right corner has high valence and high arousal, corresponding to a happy and highly active state such as excited. Opposite to this, the lower-left part is negative with low activity, like bored or depressed. Several researchers such as Thayer (1989) applied this dimensional approach and developed the idea of an energy-stress model. Other studies propose other dimensional representations; however, they all somehow relate to the models presented above, as in the case of Schubert's (1999) two-dimensional emotion space (called 2DES), with valence on the x-axis and arousal on the y-axis and a mapping of adjectives from different psychological references. The main advantage of representing emotion in a dimensional form is that any emotion can then be mapped in that space: it allows a model where any emotion can be represented, within the limitations of these dimensions. One common criticism of this approach is that emotions that are very different in terms of semantic meaning (but also in terms of the psychological and cognitive mechanisms involved) can be close in the emotional space. For instance, looking at the circumplex model of affect in figure 2, we observe that the distance between angry and afraid is small, although these two emotions are quite different. Nevertheless, even if both the categorical and the dimensional approaches are criticized and neither is perfect, both are used and can be considered valid, as partial evidence for and against each one can be found in the available experimental literature.
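
To make the dimensional representation concrete, the short Python sketch below places a few adjectives at hypothetical valence-arousal coordinates and computes distances between them. The coordinates are illustrative assumptions, not values taken from Russell (1980), but they reproduce the issue discussed above: angry and afraid end up very close in the plane even though they are semantically quite different.

```python
import math

# Hypothetical valence-arousal coordinates in [-1, 1] x [-1, 1]
# (illustrative positions only, not Russell's published values).
emotions = {
    "happy":   ( 0.8,  0.5),
    "excited": ( 0.6,  0.9),
    "angry":   (-0.6,  0.8),
    "afraid":  (-0.7,  0.7),
    "sad":     (-0.7, -0.5),
    "relaxed": ( 0.7, -0.6),
}

def distance(a, b):
    """Euclidean distance between two emotions in the valence-arousal plane."""
    (v1, a1), (v2, a2) = emotions[a], emotions[b]
    return math.hypot(v1 - v2, a1 - a2)

# "angry" and "afraid" are neighbors in the plane, "happy" and "sad" are far apart.
print(distance("angry", "afraid"))
print(distance("happy", "sad"))
```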

Musical features and emotion

Several studies have investigated musical features and their relations to particular emotions. However, most of the available research is centered on the western musical culture and mainly on classical music. Note that both composers and performers use these musical features. In table 1, we report the main mapping between musical features and emotion categories found in the literature (Juslin and Laukka, 2004). Each feature taken independently is probably not sufficient to conclude about one emotion; on the contrary, this may require a rich set of musical descriptors. It is interesting to notice that these features correlate with research made on speech by Scherer (1991, p. 206); of course, the comparison is limited to the small set of attributes that are meaningful for speech, like pitch, loudness and tempo. From the list shown in table 1, we observe that some features can be automatically extracted from polyphonic audio content (like commercial CD tracks or mp3 files) with existing technology (for a review on the automatic extraction of audio features, see Herrera et al., 2005, and Gouyon et al., 2008). These features are marked with an asterisk. For instance, the tempo can be estimated by locating the beats. Of course, this works better on music with an evident tempo and prominent percussion on the beats (rock or techno, for example); the results are less reliable for music with a smooth and subtle rhythm (some classical music, for instance). The reliability of these features extracted from audio content is not always optimal, but it still makes sense to use them, as they are informative. The key and the mode can also be extracted with satisfying correctness (Gómez, 2006) by analyzing frequency distributions and comparing them with tonal profiles. Other attributes are more difficult to extract from a complex mix of instruments and would be reliable only on monophonic tracks (one instrument); they are marked with two asterisks in table 1. For example, the vibrato or the singer's formant changes can be detected if we work on audio containing just the singer's voice, but this becomes too complex on a mix containing all the instruments. From these results, can we seriously think about automatically predicting the emotion from music? Can machines have an emotional understanding close to ours? Depending on the information an automatic system can get from the environment, the answer may vary: an audio signal taken from a microphone and a musical score give very different information. In recent years, research in machine learning and signal processing has made it possible to extract relevant and robust high-level musical features with techniques we detail in the next section.
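
As a simple illustration of how one of these starred features can be obtained from polyphonic audio, the sketch below estimates the tempo of an audio file by locating beats, using the open-source librosa library; the file name is a placeholder, and this is only one possible tool, not necessarily the one used in the studies cited here.

```python
import librosa

# Load a (hypothetical) audio file; librosa resamples to 22050 Hz by default.
y, sr = librosa.load("song.mp3")

# Estimate the global tempo by tracking beats on an onset-strength curve.
# This tends to be more reliable for music with prominent percussion.
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

print("Estimated tempo:", tempo, "BPM;", len(beat_times), "beats found")
```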

Table 1. The most frequent musical features mapped to the emotion categories, based on Juslin and Laukka (2004). An asterisk (*) means that some information can be extracted from polyphonic audio content; two asterisks (**) mean that it can be extracted only from monophonic audio content (one instrument), in both cases using state-of-the-art technology. In parentheses is the quadrant number in Russell's dimensional space (see figure 2).

Section 2. Music Information Retrieval: building automatic detectors of musical emotions

Several studies have demonstrated that musical emotions are not too subjective or too variable to deserve a mathematical modeling approach (Bigand et al., 2005; Juslin and Laukka, 2004; Krumhansl, 1997; Peretz, Gagnon and Bouchard, 1998). Indeed, within a common culture, the emotional responses to music can be highly consistent within and between listeners, but also accurate, quite immediate and precocious (Vieillard et al., 2008). This opens the door to reproducing this consistent behavior with machines. In this section we give a technical explanation of how to build a system that automatically detects musical emotions from audio. To achieve this goal, we use machine learning techniques, and more specifically supervised learning methods. The overall idea of supervised learning is to learn by example: it requires that the system is presented with enough examples of a given emotional category. We focus here on the categorical representation because it seems easier for people to categorize using simple emotions rather than to give a value for each dimension (arousal, valence). An important part of the work is to gather a substantial amount of reliably labeled examples (called the ground truth). Then we extract acoustical and musical information (called features) from the audio of each example file, and finally we learn the mapping between the features and the labels (emotions in our case). This mapping is validated using cross-validation methods or an independent test database. These methods ensure that our system builds general models of the emotional classes (i.e., that the model is not overfitting the training data). Using this procedure, along with standard automatic classifiers, we can build a system able to reliably and consistently predict the emotion in music, to a certain extent. This type of methodology is part of the research conducted by the Music Information Retrieval (MIR) community. The most studied problem in this field is genre classification (Tzanetakis and Cook, 2002; Guaus and Herrera, 2006), but recent trends focus on emotion or mood detection. We review and compare the existing systems to our approach at the end of this section. If we can work on a symbolic representation (like a musical score, a MIDI file or other), we can use accurate representations of the melody, chords, rhythm and other musical dimensions. This allows generating new versions of the music, modifying the emotional content in a more flexible and efficient way than from audio content; indeed, one can operate directly on the relevant musical aspects, like Friberg, Bresin and Sundberg (2006). In our system, we want to deal directly with the audio signal, as we cannot always have access to symbolic information. On the one hand we lose the precision in notes and measures mentioned before, but on the other we can process the vast amount of musical data available in digital audio format. Although it seems more complicated, it corresponds to a realistic usage: the machine can then analyze any kind of music, from audio files but also from the sonic environment using a microphone.

Methodology

To detect emotions in music, we use statistical classification. Classification algorithms need a large amount of labeled examples, but also a good and rich musical description of each example, in order to learn how to classify it properly.
In order to assess the ability of the system to predict a label for new and unseen music files, the training of the system uses only a portion of all the available data, and its testing is done using the remaining data. In n-fold cross-validation, the data is split into n portions: n-1 folds are used for training and the remaining fold is used for testing. This is done n times, each time using a different fold for testing and the remaining folds for training; finally, the average of the n tests is used to estimate the mean error of the classification system.
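
The minimal scikit-learn sketch below illustrates this train-and-validate procedure, assuming a feature matrix X (one row of audio descriptors per song) and a binary label vector y (for instance happy vs. not happy) have already been prepared; the data here is randomly generated placeholder data, so it only shows the mechanics of the evaluation, not real results.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: 200 songs, 40 audio descriptors each, binary labels.
# In practice X comes from feature extraction and y from the annotated ground truth.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = rng.integers(0, 2, size=200)

# Standardize the descriptors, then train a Support Vector Machine classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 10-fold cross-validation: train on 9 folds, test on the remaining one,
# repeat 10 times and average the accuracies.
scores = cross_val_score(model, X, y, cv=10)
print("Mean accuracy: %.2f (std %.2f)" % (scores.mean(), scores.std()))
```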

The information gathered from the examples is numerical data called features (or descriptors). They are computed directly from the audio signal and can describe different aspects such as timbre, rhythm or tonality. (Even though it plays a crucial role in any music cognition aspect, melodic information is still out of the scope of current state-of-the-art automatic music content description; it can only be addressed very roughly or unreliably when polyphonic music files are analysed. Even with this limitation, the current audio descriptors can deal with many practical applications, such as the one we describe here.) With this information and enough realistic data, the classifier can learn anything from simple rules to complex functions to predict the emotional label of any new music. We specify here each step of this approach and summarize it in figure 3.

Figure 3. Schema of the supervised learning approach. From the manually annotated ground truth, features are extracted to train a classifier. This trained system can then automatically annotate any new music collection.

Ground Truth

The first step is to create the database of examples. In the case of emotion, the representation chosen will determine the rest of the process. From music and emotion theories, and from the psychological studies mentioned in section 1, we can understand the pros and cons of each representation for this purpose. In the MIR field, the representation most often chosen is the categorical approach based on basic emotion theories. Each emotion is considered as independent from the others, and all existing emotions would be a combination of these basic emotions. In that case the categories are considered mutually exclusive. This categorization fits particularly well with the automatic classification way of thinking, where we have several classes and one element can belong to only one class. Other studies consider this approach too restrictive, as emotions are more complex and one piece of music can evoke both happiness and sadness at the same time.

For our experiments, we decided to use a categorical approach, to ease the process of annotating the data and to make it clearer for the people involved in that process. However, choosing one type of representation does not totally solve the problem: deciding on the categories is not trivial, and each theory of basic emotions in music gives a different set of emotions. To decide which taxonomy to use, and to study the overlap between categories, we conducted a small experiment: we asked 16 people to annotate 100 musical excerpts by choosing one or several adjectives from a set. The results showed that even with a simple set and a few people, it was not easy to come to an agreement; moreover, some categories are difficult to consider separately. From the literature and our preliminary experiments, we decided to use a simple approach based on emotion categories that are well distinguished by people (Laurier and Herrera, 2007). This allows the best agreement between people when labeling, and makes the system as general as possible. We also decided on a binary approach: each category is considered boolean, so that, for instance, a song is either happy or not happy. With this approach we have multiple binary classifiers, one for each emotion (instead of a single multi-class classifier). This avoids the strict separation of so-called basic emotions as if they were mutually exclusive. In fact, this approach is closer to the theory considering that each emotion can be a combination of basic emotions. We therefore consider that we have an expert for each basic emotion, which estimates the amount of this particular emotion in a given music file. This allows a more detailed description using an ensemble of multiple boolean experts.

Once we have chosen a proper representation, we need to build the database of examples. In our case the examples are musical pieces labeled with emotions. This step is very time consuming, because people have to listen to and manually annotate the music, and we want to have as many annotations as possible on a large amount of musical examples. There are several ways to gather this data. The main method is a questionnaire, either web-based or in laboratory settings that give more control over the factors that can influence the annotation. The effect of using web-based experiments instead of laboratory settings for music perception studies is discussed in Honing and Ladinig (2008). Another way is to design games to gather this data. Kim, Schmidt and Emelle (2008) created a flash-based game using the dimensional paradigm, called MoodSwings: in the arousal-valence plane, users mark the perceived emotion in the music and get points if they agree. Mandel and Ellis (2007) invented a web-based game using the categorical paradigm; this game is not limited to mood but open to any music labeling, and when different users use the same tag to describe a musical piece, they get points. These online games are useful to gather much more data than asking people to annotate with no special motivation. In all cases, several issues have to be addressed, because many different factors can have an impact on the annotation reliability. On the one hand, in laboratory settings it is easier to control these factors than when using web-based interfaces; on the other hand, in a laboratory environment one might not react as one would in everyday conditions. Beyond these considerations, in the case of emotion in music, several further factors have to be controlled.
Indeed, the emotion in the music depends on many different elements, such as the cultural background, the social context, the lyrics, the temporal evolution of the music, or personal preferences. The cultural background can be understood as the experience one has with music. By considering mainstream popular music from the western culture, we can limit the cultural impact so that the system would work for many people (but maybe not for people not exposed to western popular music). Information about the social context is by definition not included in the music itself but relies on the context around the music. This is particularly difficult to control, as one might have a very personal relationship with a given musical piece. That is also one reason to focus our system on detecting emotions in the music and not from the music: the induced emotion can be quite different from the one perceived, especially because of the social context or the personal history of the listener with that particular music. Moreover, in the annotation process, we can limit this influence by checking whether the annotator knows the music.

Finally, as to the effect of lyrics, one possibility is to use instrumental music; we can also reduce the song to a short excerpt so that the whole meaning of the lyrics cannot influence the annotation process. Although all these factors are important in the way a ground truth is constructed, they are almost never mentioned in the current MIR literature. In our case, we have built a ground truth of popular music with four categories: happy, sad, angry and relaxed. We have chosen these emotions because they are related to basic emotions from psychological theories and also because they cover the four quadrants of the 2D valence/arousal representation. But as we do not want to restrict ourselves to exclusive categories, we consider the problem as a binary classification for each term: one song can be happy or not happy, but also, independently, angry or not angry, and so on. Our collection is made of popular music pre-selected from a large online community (Last.fm), which is active in associating labels (tags) with the music its members listen to. We looked for the songs most frequently tagged with our categories and their synonyms, and we asked a small group of listeners in our lab to validate this selection. We included this manual confirmation in order to exclude songs that could have been wrongly tagged, tagged to express something else, or tagged because of a "following the majority" type of effect. The annotators were asked to listen to 30 seconds of the songs, first to avoid as much as possible changes in the emotion, then to reduce the influence of the lyrics, and finally to speed up the annotation process. In total, 17 different evaluators participated, and the final database is composed of 1000 songs divided between the four categories of interest plus their complementary categories (not happy, not sad, not angry and not relaxed).

Feature Extraction

While early MIR systems were able to process only symbolic data like MIDI (a symbolic musical standard which provides a score-like music representation), the evolution of Digital Signal Processing (DSP) techniques has provided new tools to extract audio features. DSP techniques combined with perceptual and musical knowledge allow us to compute descriptors of timbre, rhythm, harmony, loudness or pitch. An audio file or stream is digitally represented as a waveform, basically a succession of values between -1 and 1 at a rate of several thousand values per second; typically (as with the Compact Disc format) we consider 44,100 values per second, for psychoacoustic reasons. In the last decade, MIR researchers have been very active in extracting meaningful information from this raw data. Several levels of abstraction can be addressed, from low level (close to the signal) to high level (the semantic level, like musical concepts). Taking advantage of expertise in signal processing, psychoacoustics, musicology, statistics, machine learning and information retrieval, many descriptors have been proposed (Herrera et al., 2005; Gouyon et al., 2008). Each descriptor can be computed as a series of values over time windows and summarized for the entire music file using statistical measures like the mean or the variance. It can also be computed directly as a global value corresponding to a song (like the estimation of the key and mode, e.g., C major). Some widely used descriptors are the Mel-Frequency Cepstral Coefficients (MFCCs) (Logan, 2000), because they are very informative about the timbre of the acoustic signal. This type of spectral descriptor is useful to classify music by genre and for many other tasks.
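
A minimal sketch of this kind of descriptor computation is given below, assuming the librosa library: MFCCs are extracted frame by frame and then summarized by their mean and variance, giving a fixed-length timbre descriptor for a whole excerpt (the file name is a placeholder).

```python
import numpy as np
import librosa

# Load a (hypothetical) 30-second excerpt.
y, sr = librosa.load("excerpt.wav", duration=30.0)

# Frame-level timbre description: 13 MFCCs per analysis frame.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, n_frames)

# Summarize the time series into a fixed-length vector for the whole excerpt.
song_descriptor = np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])
print(song_descriptor.shape)   # (26,): 13 means followed by 13 variances
```
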
Another example is the Harmonic Pitch Class Profiles (HPCP) from Gómez (2006), or chroma features, for tonality. They describe how the energy in the audio is spread over the notes, allowing estimations of the chord and the key and, with an appropriate algorithm, the detection of different versions of the same song (Serrà et al., 2008).
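
In the same spirit, the sketch below approximates this kind of tonality description with chroma features and estimates the key and mode by correlating the averaged chroma vector with the Krumhansl-Kessler major and minor profiles. It is only a rough stand-in for the HPCP-based method of Gómez (2006), assuming librosa and numpy are available.

```python
import numpy as np
import librosa

# Krumhansl-Kessler key profiles (probe-tone ratings from the literature).
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def estimate_key(path):
    """Estimate key and mode by correlating averaged chroma with tonal profiles."""
    y, sr = librosa.load(path)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr).mean(axis=1)  # 12-dim vector
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so that index `tonic` becomes the tonic.
            r = np.corrcoef(chroma, np.roll(profile, tonic))[0, 1]
            if best is None or r > best[0]:
                best = (r, NOTES[tonic], mode)
    return best[1], best[2]

print(estimate_key("excerpt.wav"))   # e.g. ('C', 'major')
```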

For mood detection, many features are relevant. It is important to keep in mind that we use these techniques to extract information of different kinds: timbral (for instance MFCCs or the spectral centroid), rhythmic (for example the tempo), tonal (like HPCP) and temporal descriptors. Among others, we also have an estimation of the dissonance, the mode, the onset rate and the loudness. Not all the musical features detailed in section 1 can be accurately retrieved from audio content only; nevertheless, the audio descriptors studied and developed by MIR researchers are sufficient to model many aspects of music. Other kinds of information can be gathered, such as text from the lyrics (we present some results about this later), reviews, blogs or symbolic musical data like the score or a MIDI file. However, we restrict our starting point to the raw audio data.

Classification

Statistical classification algorithms use the features extracted from the examples and try to derive a mathematical or predictive relationship between each of them and its label (an emotion in our case). In a supervised learning approach, the descriptors from each example of the database are used to train a classifier that learns a statistical mapping and models the problem. For instance, it may automatically learn from many examples that happy music is more likely to be in a major mode and sad music in a minor mode. To achieve the classification task, we use well-known methods for statistical classification like k-Nearest Neighbors (k-NN) or Support Vector Machines (SVM). Most of the standard algorithms are included in the WEKA software (Witten and Frank, 1999); no particular classifier is to be preferred by default, and several approaches should be tested. However, in machine learning in general and in music information retrieval in particular, SVMs seem to be one of the best options: they are known to be efficient, to perform relatively well and to be reliable in many cases. In the emotion classification literature, the main differences lie in the representation chosen and in the methodology used to build a ground truth and to evaluate the results. The classification stage is largely standardized around SVM and sometimes other classifiers, but with no dramatic improvement in the classification results.

Results

In this part we present evaluation results from different experiments and relevant empirical studies found in the literature. Although predicting the emotion from audio is feasible, it is quite arduous to compare all the different approaches because they use different representations, databases and evaluation schemas. The Music Information Retrieval Evaluation eXchange (MIREX) attempts to make this comparison possible (Downie, 2006) by providing evaluation frameworks and metrics with which researchers can scientifically compare their approaches and algorithms. In 2007, a first evaluation in Audio Music Mood Classification was organized. The representation chosen for this contest was a categorical approach with mood clusters, where the clusters were mutually exclusive (one instance could only belong to one mood cluster). There were five categories, or mood clusters, shown in table 2, and the best results achieved were around 60% accuracy (Laurier and Herrera, 2007), meaning that the best systems were able to correctly classify 60% of the music given for testing. This percentage is a mean obtained using a 3-fold cross-validation.
Almost all the systems submitted to this evaluation used SVM for classification, with different sets of descriptors (Hu et al., 2008).

Cluster 1: passionate, rousing, confident, boisterous, rowdy (accuracy 45.8%)
Cluster 2: rollicking, cheerful, fun, sweet, amiable/good natured (accuracy 50.0%)
Cluster 3: literate, poignant, wistful, bittersweet, autumnal, brooding (accuracy 82.5%)
Cluster 4: humorous, silly, campy, quirky, whimsical, witty, wry (accuracy 53.3%)
Cluster 5: aggressive, fiery, tense/anxious, intense, volatile, visceral (accuracy 70.8%)

Table 2. Clusters of adjectives used for the MIREX 2007 mood evaluation task and mean accuracy of our classifier.

In the literature, other results are available and can be of interest, especially when the approach is different; basically, almost every scientific contribution differs in at least one key aspect. Several works consider the categorical representation based on basic emotions (Laurier and Herrera, 2007; Sordo, Laurier and Celma, 2007; Shi et al., 2006; Lu, Liu and Zhang, 2006), while others treat the categories in a multi-labeling approach, like Wieczorkowska et al. (2005). The basic emotion approach gives simple but relatively satisfying results, with accuracies around 80-90% depending on the data and the number of categories. The lower accuracies for the MIREX approach mentioned before might be due to an overlap in the concepts included in the class labels (Hu et al., 2008); they could also be due to a stricter evaluation on more data than in the other mentioned works. The multi-labeling approaches suffer from a difficult evaluation in general, as the annotated data needed should be much larger: if we want to use precision and recall in an appropriate way, we need to annotate all the data we evaluate with all categories (presence or absence), otherwise we might count as wrong results that are actually correct. (Precision and recall are two typical measures in information retrieval: precision is a measure of exactness, the ratio of correct instances in the retrieved set, and recall a measure of completeness, the proportion of correct instances retrieved over the whole set of correct instances.) There are also approaches similar to ours, such as the work by Li and Ogihara (2003), who extracted timbre, pitch and rhythm features and trained Support Vector Machines. They used 13 categories, 11 from Farnsworth (1954) and 2 additional ones. However, the results were not satisfying (it was one of the very first studies of mood classification), with low precision (around 0.32) and recall (around 0.54). This might be due to the small dataset labeled by only one person, and to the large adjective set. Another similar work should be mentioned: Skowronek et al. (2007) used spectral, tempo, rhythm, tonal and percussion-detection features together with a quadratic discriminant analysis to model emotions, and achieved a mood predictor with 12 binary categories and an average accuracy of around 85%. Other studies concentrated on the dimensional representation. Lu, Liu and Zhang (2006) used Thayer's (1996) model based on the energy and stress dimensions and modeled the four parts of the space (contentment, depression, exuberance and anxiety) using Gaussian Mixture Models. The system was trained with 800 excerpts of classical music and achieved around 85% accuracy (trained with three fourths and tested on the remaining fourth of the data). Although it was based on a dimensional model, the prediction was made on the four quadrants as exclusive categories. Another relevant study (Yang et al., 2008a) used Thayer's arousal-valence emotion plane, but with a regression approach to model each of the two dimensions; they used mainly spectral and tonal descriptors together with loudness features.
With these tools, they modeled arousal and valence using annotated data and regression functions (Support Vector Regression).
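
A hedged sketch of this regression idea is given below: one Support Vector Regression model per dimension is trained on descriptor vectors annotated with arousal and valence values. The data is synthetic placeholder data for illustration only and does not reproduce the setup of Yang et al. (2008a).

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder training data: 300 songs, 40 audio descriptors each, with
# annotated arousal and valence values in [-1, 1] (synthetic, for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 40))
arousal = np.tanh(X[:, 0] + 0.1 * rng.normal(size=300))
valence = np.tanh(X[:, 1] + 0.1 * rng.normal(size=300))

# One regression model per emotional dimension.
arousal_model = make_pipeline(StandardScaler(), SVR(kernel="rbf")).fit(X, arousal)
valence_model = make_pipeline(StandardScaler(), SVR(kernel="rbf")).fit(X, valence)

# A new song is then mapped to a point in the arousal-valence plane.
new_song = rng.normal(size=(1, 40))
print(arousal_model.predict(new_song), valence_model.predict(new_song))
```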

The overall results were very encouraging and demonstrated that a dimensional approach is also feasible (see figure 4 for an application of this research). In another work worth mentioning here, Mandel, Poliner and Ellis (2006) designed a system using MFCCs and SVM. The interesting aspect of this work is the application of an active learning approach: the system learns according to the feedback given by the user, and the algorithm chooses the examples to be labeled in a smart manner, hence reducing the amount of data needed to build a model that achieves an accuracy similar to that of a standard method.

Our ground truth is based on songs already tagged by hundreds of people (Last.fm users), with an added manual validation step to ensure the quality and reliability of the data. The evaluation was conducted on the 1000 annotated examples mentioned previously. We extracted audio features and performed classification with an SVM. Four categories were considered: happy, sad, relaxed and angry, each one approached as a binary problem (either an instance belongs to the category or it does not). This means that each category is a boolean problem with a random baseline of 50% accuracy (i.e., a classifier based on a random choice between the two categories would give an average accuracy of 50%). In table 3, we report the results of our evaluation using SVM and 10-fold cross-validation. The evaluation data were obtained after 10 runs of the same experimental setup (i.e., a random seed changed the allocation of files to folds for each run).

Angry: 98.1% (3.8)
Happy: 81.5% (11.5)
Sad: 87.7% (11.0)
Relaxed: 91.4% (7.3)

Table 3. Accuracy of our classifiers on the different categories (standard deviation in parentheses). Each category implies a binary decision (for instance angry vs. not angry). This was obtained using SVM and 10 runs of 10-fold cross-validation.

The performances we obtained using audio-based classifiers are quite satisfying, and even exceptional for the angry category with 98%. It is difficult to directly compare this with the results from the MIREX evaluation, because we use different categories here and each one is considered binary. All four categories reached accuracies above 80%, and two categories (angry and relaxed) above 90%. Even though these results can seem surprisingly high, they are coherent with similar studies (Skowronek et al., 2007). Moreover, as we deal with binary comparisons on a balanced dataset, the random baseline is 50%. Also, the examples were selected and validated only when they clearly belonged to the category or its complementary, which can bias the database towards very clear differences. We should also notice that these models might work only for popular music (there was no classical music in our database), so they can generalize only to a certain extent. We also conducted an experiment using the lyrics information, combining two classifiers: one for audio and one for lyrics. For the lyrics we used a text information retrieval method to detect the words that discriminate best between categories (Laurier, Grivolla and Herrera, 2008). We obtained the results presented in table 4.

Angry: 99.1% (2.2), +1% compared to audio alone
Happy: 86.8% (10.6)
Sad: 92.8% (8.7)
Relaxed: 91.7% (7.1)

Table 4. Accuracy of a multimodal system using audio and lyrics (standard deviation in parentheses), with the difference obtained by adding lyrics information to the audio.

The results presented in table 4 show that lyrics contribute positively to correctly classifying emotions, especially for the happy and sad categories. This may be because lyrics are more informative about the valence. But we should also notice that the highest improvement occurs where there was more room for improvement.
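
A minimal sketch of how such a combination can be done is shown below: one classifier trained on audio descriptors and one trained on lyrics features are fused at the decision level by averaging their class probabilities. The feature extraction steps are omitted and the data is placeholder data, so this only illustrates the late-fusion idea, not the exact method of Laurier, Grivolla and Herrera (2008).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Placeholder data: 200 songs with 40 audio descriptors and 300 lyrics features
# (e.g. word counts), plus binary labels ("happy" vs. "not happy").
rng = np.random.default_rng(2)
X_audio = rng.normal(size=(200, 40))
X_lyrics = rng.random(size=(200, 300))
y = rng.integers(0, 2, size=200)

# One classifier per modality; probability=True lets the SVM output
# class probabilities that can be combined with the lyrics classifier.
audio_clf = SVC(kernel="rbf", probability=True).fit(X_audio, y)
lyrics_clf = LogisticRegression(max_iter=1000).fit(X_lyrics, y)

# Late fusion: average the probability of the positive class from both models.
p_audio = audio_clf.predict_proba(X_audio)[:, 1]
p_lyrics = lyrics_clf.predict_proba(X_lyrics)[:, 1]
fused = (p_audio + p_lyrics) / 2.0
prediction = (fused >= 0.5).astype(int)
print("Fused accuracy on the training data:", (prediction == y).mean())
```
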
In a nutshell, detecting emotion in music is feasible if we consider simple categories or dimensions. The available results are encouraging enough to continue along this line and perhaps to address more complex representations and models of emotions.

Conclusions

Even if we can predict some aspects of the emotion in a musical piece, the level of analysis can be made more precise. In addition, there are some important aspects that should be taken into account, like the effect of the singer's voice, which theoretically contains much emotional information that is not considered by the existing techniques. Moreover, the degree of emotional detail is limited to simple categories or to a few dimensions. Finally, we do not examine the temporal development of the emotions but average musical features over the entire piece, which is certainly a simplification of the rich emotional tapestry that certain musical pieces can weave. Even though our initial results have been encouraging, there is room for many improvements. As explained previously, the current state of the art in automatic detection is limited to a rather simplistic view. Some effort should be made towards designing systems with a better music understanding and towards allowing a process of user modeling. Currently we average the perceived emotion among people to obtain a general prediction, and the predictive models are universal (i.e., the same for all users), but we should also seek to yield predictions at the user level. This would make possible the development of personal music assistants.

Section 3. From Music Information Retrieval to personalized emotion-based music assistants

People voluntarily use music as a mood regulator, trying to induce emotional reactions (Sloboda, 1999). For instance, after a heartbreak or a sad event, someone may prefer to listen to sad songs, either to give some solemnity to the moment or to find solace and consolation (Sacks, 2007); on the contrary, one may want to feel better by playing happy songs. Music can be employed to emphasize the current mood or to decrease the intensity of certain emotions (Levitin, 2007): someone feeling nervous could relax by listening to calm music. There is also evidence that experiencing musical emotions leads to physiological and cognitive changes. People can intentionally play with this phenomenon to influence their own state, but also to communicate with other persons. A typical example would be a teenager listening to loud heavy metal or hardcore techno music (or any aggressive alternative) to express his anger and rebellion against his parents; it might make less sense to listen to this music if his parents are not around to receive the message (North, Hargreaves and O'Neill, 2000).

Nowadays, personal electronic devices are ubiquitous. Almost everyone has at least a cell phone or a music player, and these kinds of mobile devices now have huge capabilities. They can already play more music than one has time to listen to, they can store hours of video and thousands of pictures, they are capable of taking pictures, and they can be used as a notebook and agenda.

They enable one to trace listening habits, to geographically locate where users are, and to detect subtle movements with accelerometers, and soon they could use all this information, together with additional physiological data, to contextualize any listening experience. The aim of an emotion-based music assistant would be to exploit these types of devices and the techniques mentioned in the previous section to automatically and intelligently recommend musical pieces. Based on one's current feeling or on a targeted mood, the machine could choose the appropriate music. MIR techniques help to extract information from musical content and, in our case, to automatically detect emotions from audio content; the technical issues have been explained in section 2. Basically, by learning from examples, an automatic system is able to retrieve songs with a similar mood or emotion from a large collection. This means that such a trained system is able to detect whether a song is sad or happy, and can even estimate its degree of happiness. Possible applications of this technology are numerous. For instance, a device can play music according to one's mood and make the listener feel better (or intentionally worse). By manually selecting the current mood or a targeted one, the machine can choose or even create music accordingly. Many different factors are important to detect the emotion induced by music, but in our case, although we can grasp social data from the user, we mostly concentrate on the audio level. Even though this does not cover all the processes involved, it is already enough for many applications. The system can provide the music corresponding to one's demand in terms of mood. Skowronek et al. (2007) and Laurier and Herrera (2008) demonstrated prototypes that extract the emotion from the audio content and visualize the prediction of the automatic classifiers and the intensity of the predicted category. For instance, in Mood Cloud, one can see the estimated amount of happiness or sadness of a song evolving while it is being played (see figure 5).

Figure 4. Mr. Emo (Yang et al., 2008b), reprinted with the permission of the authors.

Figure 5. Mood Cloud interface (Laurier and Herrera, 2008).

This information can be directly used to provide the listener with songs conveying the targeted emotion. Moreover, using the probability of each emotion given by the automatic classifier, we can estimate how clearly an emotion is present in the music. This also allows defining mood trajectories, in practice by creating musical playlists: for instance, one can select to start with relaxing music and progressively move to happy music (see figure 4). This can also be used to produce music or to help a composer in


More information

Dimensional Music Emotion Recognition: Combining Standard and Melodic Audio Features

Dimensional Music Emotion Recognition: Combining Standard and Melodic Audio Features Dimensional Music Emotion Recognition: Combining Standard and Melodic Audio Features R. Panda 1, B. Rocha 1 and R. P. Paiva 1, 1 CISUC Centre for Informatics and Systems of the University of Coimbra, Portugal

More information

MELODY ANALYSIS FOR PREDICTION OF THE EMOTIONS CONVEYED BY SINHALA SONGS

MELODY ANALYSIS FOR PREDICTION OF THE EMOTIONS CONVEYED BY SINHALA SONGS MELODY ANALYSIS FOR PREDICTION OF THE EMOTIONS CONVEYED BY SINHALA SONGS M.G.W. Lakshitha, K.L. Jayaratne University of Colombo School of Computing, Sri Lanka. ABSTRACT: This paper describes our attempt

More information

Quality of Music Classification Systems: How to build the Reference?

Quality of Music Classification Systems: How to build the Reference? Quality of Music Classification Systems: How to build the Reference? Janto Skowronek, Martin F. McKinney Digital Signal Processing Philips Research Laboratories Eindhoven {janto.skowronek,martin.mckinney}@philips.com

More information

MODELING MUSICAL MOOD FROM AUDIO FEATURES AND LISTENING CONTEXT ON AN IN-SITU DATA SET

MODELING MUSICAL MOOD FROM AUDIO FEATURES AND LISTENING CONTEXT ON AN IN-SITU DATA SET MODELING MUSICAL MOOD FROM AUDIO FEATURES AND LISTENING CONTEXT ON AN IN-SITU DATA SET Diane Watson University of Saskatchewan diane.watson@usask.ca Regan L. Mandryk University of Saskatchewan regan.mandryk@usask.ca

More information

Automatic Rhythmic Notation from Single Voice Audio Sources

Automatic Rhythmic Notation from Single Voice Audio Sources Automatic Rhythmic Notation from Single Voice Audio Sources Jack O Reilly, Shashwat Udit Introduction In this project we used machine learning technique to make estimations of rhythmic notation of a sung

More information

However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene

However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene Beat Extraction from Expressive Musical Performances Simon Dixon, Werner Goebl and Emilios Cambouropoulos Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 Vienna, Austria.

More information

WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG?

WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? NICHOLAS BORG AND GEORGE HOKKANEN Abstract. The possibility of a hit song prediction algorithm is both academically interesting and industry motivated.

More information

SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS

SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS Areti Andreopoulou Music and Audio Research Laboratory New York University, New York, USA aa1510@nyu.edu Morwaread Farbood

More information

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM A QUER B EAMPLE MUSIC RETRIEVAL ALGORITHM H. HARB AND L. CHEN Maths-Info department, Ecole Centrale de Lyon. 36, av. Guy de Collongue, 69134, Ecully, France, EUROPE E-mail: {hadi.harb, liming.chen}@ec-lyon.fr

More information

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)

Computational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,

More information

Modeling memory for melodies

Modeling memory for melodies Modeling memory for melodies Daniel Müllensiefen 1 and Christian Hennig 2 1 Musikwissenschaftliches Institut, Universität Hamburg, 20354 Hamburg, Germany 2 Department of Statistical Science, University

More information

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Montserrat Puiggròs, Emilia Gómez, Rafael Ramírez, Xavier Serra Music technology Group Universitat Pompeu Fabra

More information

MUSICAL MOODS: A MASS PARTICIPATION EXPERIMENT FOR AFFECTIVE CLASSIFICATION OF MUSIC

MUSICAL MOODS: A MASS PARTICIPATION EXPERIMENT FOR AFFECTIVE CLASSIFICATION OF MUSIC 12th International Society for Music Information Retrieval Conference (ISMIR 2011) MUSICAL MOODS: A MASS PARTICIPATION EXPERIMENT FOR AFFECTIVE CLASSIFICATION OF MUSIC Sam Davies, Penelope Allen, Mark

More information

Audio Feature Extraction for Corpus Analysis

Audio Feature Extraction for Corpus Analysis Audio Feature Extraction for Corpus Analysis Anja Volk Sound and Music Technology 5 Dec 2017 1 Corpus analysis What is corpus analysis study a large corpus of music for gaining insights on general trends

More information

Compose yourself: The Emotional Influence of Music

Compose yourself: The Emotional Influence of Music 1 Dr Hauke Egermann Director of York Music Psychology Group (YMPG) Music Science and Technology Research Cluster University of York hauke.egermann@york.ac.uk www.mstrcyork.org/ympg Compose yourself: The

More information

Topic 10. Multi-pitch Analysis

Topic 10. Multi-pitch Analysis Topic 10 Multi-pitch Analysis What is pitch? Common elements of music are pitch, rhythm, dynamics, and the sonic qualities of timbre and texture. An auditory perceptual attribute in terms of which sounds

More information

The song remains the same: identifying versions of the same piece using tonal descriptors

The song remains the same: identifying versions of the same piece using tonal descriptors The song remains the same: identifying versions of the same piece using tonal descriptors Emilia Gómez Music Technology Group, Universitat Pompeu Fabra Ocata, 83, Barcelona emilia.gomez@iua.upf.edu Abstract

More information

Music Recommendation from Song Sets

Music Recommendation from Song Sets Music Recommendation from Song Sets Beth Logan Cambridge Research Laboratory HP Laboratories Cambridge HPL-2004-148 August 30, 2004* E-mail: Beth.Logan@hp.com music analysis, information retrieval, multimedia

More information

International Journal of Advance Engineering and Research Development MUSICAL INSTRUMENT IDENTIFICATION AND STATUS FINDING WITH MFCC

International Journal of Advance Engineering and Research Development MUSICAL INSTRUMENT IDENTIFICATION AND STATUS FINDING WITH MFCC Scientific Journal of Impact Factor (SJIF): 5.71 International Journal of Advance Engineering and Research Development Volume 5, Issue 04, April -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 MUSICAL

More information

Multi-Modal Music Emotion Recognition: A New Dataset, Methodology and Comparative Analysis

Multi-Modal Music Emotion Recognition: A New Dataset, Methodology and Comparative Analysis Multi-Modal Music Emotion Recognition: A New Dataset, Methodology and Comparative Analysis R. Panda 1, R. Malheiro 1, B. Rocha 1, A. Oliveira 1 and R. P. Paiva 1, 1 CISUC Centre for Informatics and Systems

More information

The Role of Time in Music Emotion Recognition

The Role of Time in Music Emotion Recognition The Role of Time in Music Emotion Recognition Marcelo Caetano 1 and Frans Wiering 2 1 Institute of Computer Science, Foundation for Research and Technology - Hellas FORTH-ICS, Heraklion, Crete, Greece

More information

Mood Tracking of Radio Station Broadcasts

Mood Tracking of Radio Station Broadcasts Mood Tracking of Radio Station Broadcasts Jacek Grekow Faculty of Computer Science, Bialystok University of Technology, Wiejska 45A, Bialystok 15-351, Poland j.grekow@pb.edu.pl Abstract. This paper presents

More information

TOWARDS AFFECTIVE ALGORITHMIC COMPOSITION

TOWARDS AFFECTIVE ALGORITHMIC COMPOSITION TOWARDS AFFECTIVE ALGORITHMIC COMPOSITION Duncan Williams *, Alexis Kirke *, Eduardo Reck Miranda *, Etienne B. Roesch, Slawomir J. Nasuto * Interdisciplinary Centre for Computer Music Research, Plymouth

More information

Research & Development. White Paper WHP 228. Musical Moods: A Mass Participation Experiment for the Affective Classification of Music

Research & Development. White Paper WHP 228. Musical Moods: A Mass Participation Experiment for the Affective Classification of Music Research & Development White Paper WHP 228 May 2012 Musical Moods: A Mass Participation Experiment for the Affective Classification of Music Sam Davies (BBC) Penelope Allen (BBC) Mark Mann (BBC) Trevor

More information

Computational Modelling of Harmony

Computational Modelling of Harmony Computational Modelling of Harmony Simon Dixon Centre for Digital Music, Queen Mary University of London, Mile End Rd, London E1 4NS, UK simon.dixon@elec.qmul.ac.uk http://www.elec.qmul.ac.uk/people/simond

More information

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications Matthias Mauch Chris Cannam György Fazekas! 1 Matthias Mauch, Chris Cannam, George Fazekas Problem Intonation in Unaccompanied

More information

Predicting Time-Varying Musical Emotion Distributions from Multi-Track Audio

Predicting Time-Varying Musical Emotion Distributions from Multi-Track Audio Predicting Time-Varying Musical Emotion Distributions from Multi-Track Audio Jeffrey Scott, Erik M. Schmidt, Matthew Prockup, Brandon Morton, and Youngmoo E. Kim Music and Entertainment Technology Laboratory

More information

About Giovanni De Poli. What is Model. Introduction. di Poli: Methodologies for Expressive Modeling of/for Music Performance

About Giovanni De Poli. What is Model. Introduction. di Poli: Methodologies for Expressive Modeling of/for Music Performance Methodologies for Expressiveness Modeling of and for Music Performance by Giovanni De Poli Center of Computational Sonology, Department of Information Engineering, University of Padova, Padova, Italy About

More information

Singer Recognition and Modeling Singer Error

Singer Recognition and Modeling Singer Error Singer Recognition and Modeling Singer Error Johan Ismael Stanford University jismael@stanford.edu Nicholas McGee Stanford University ndmcgee@stanford.edu 1. Abstract We propose a system for recognizing

More information

INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION

INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION ULAŞ BAĞCI AND ENGIN ERZIN arxiv:0907.3220v1 [cs.sd] 18 Jul 2009 ABSTRACT. Music genre classification is an essential tool for

More information

1. BACKGROUND AND AIMS

1. BACKGROUND AND AIMS THE EFFECT OF TEMPO ON PERCEIVED EMOTION Stefanie Acevedo, Christopher Lettie, Greta Parnes, Andrew Schartmann Yale University, Cognition of Musical Rhythm, Virtual Lab 1. BACKGROUND AND AIMS 1.1 Introduction

More information

Subjective Similarity of Music: Data Collection for Individuality Analysis

Subjective Similarity of Music: Data Collection for Individuality Analysis Subjective Similarity of Music: Data Collection for Individuality Analysis Shota Kawabuchi and Chiyomi Miyajima and Norihide Kitaoka and Kazuya Takeda Nagoya University, Nagoya, Japan E-mail: shota.kawabuchi@g.sp.m.is.nagoya-u.ac.jp

More information

& Ψ. study guide. Music Psychology ... A guide for preparing to take the qualifying examination in music psychology.

& Ψ. study guide. Music Psychology ... A guide for preparing to take the qualifying examination in music psychology. & Ψ study guide Music Psychology.......... A guide for preparing to take the qualifying examination in music psychology. Music Psychology Study Guide In preparation for the qualifying examination in music

More information

The relationship between properties of music and elicited emotions

The relationship between properties of music and elicited emotions The relationship between properties of music and elicited emotions Agnieszka Mensfelt Institute of Computing Science Poznan University of Technology, Poland December 5, 2017 1 / 19 Outline 1 Music and

More information

HIT SONG SCIENCE IS NOT YET A SCIENCE

HIT SONG SCIENCE IS NOT YET A SCIENCE HIT SONG SCIENCE IS NOT YET A SCIENCE François Pachet Sony CSL pachet@csl.sony.fr Pierre Roy Sony CSL roy@csl.sony.fr ABSTRACT We describe a large-scale experiment aiming at validating the hypothesis that

More information

Electronic Musicological Review

Electronic Musicological Review Electronic Musicological Review Volume IX - October 2005 home. about. editors. issues. submissions. pdf version The facial and vocal expression in singers: a cognitive feedback study for improving emotional

More information

Automatic Music Clustering using Audio Attributes

Automatic Music Clustering using Audio Attributes Automatic Music Clustering using Audio Attributes Abhishek Sen BTech (Electronics) Veermata Jijabai Technological Institute (VJTI), Mumbai, India abhishekpsen@gmail.com Abstract Music brings people together,

More information

Expressive performance in music: Mapping acoustic cues onto facial expressions

Expressive performance in music: Mapping acoustic cues onto facial expressions International Symposium on Performance Science ISBN 978-94-90306-02-1 The Author 2011, Published by the AEC All rights reserved Expressive performance in music: Mapping acoustic cues onto facial expressions

More information

Perceptual dimensions of short audio clips and corresponding timbre features

Perceptual dimensions of short audio clips and corresponding timbre features Perceptual dimensions of short audio clips and corresponding timbre features Jason Musil, Budr El-Nusairi, Daniel Müllensiefen Department of Psychology, Goldsmiths, University of London Question How do

More information

An ecological approach to multimodal subjective music similarity perception

An ecological approach to multimodal subjective music similarity perception An ecological approach to multimodal subjective music similarity perception Stephan Baumann German Research Center for AI, Germany www.dfki.uni-kl.de/~baumann John Halloran Interact Lab, Department of

More information

Acoustic and musical foundations of the speech/song illusion

Acoustic and musical foundations of the speech/song illusion Acoustic and musical foundations of the speech/song illusion Adam Tierney, *1 Aniruddh Patel #2, Mara Breen^3 * Department of Psychological Sciences, Birkbeck, University of London, United Kingdom # Department

More information

Improving Music Mood Annotation Using Polygonal Circular Regression. Isabelle Dufour B.Sc., University of Victoria, 2013

Improving Music Mood Annotation Using Polygonal Circular Regression. Isabelle Dufour B.Sc., University of Victoria, 2013 Improving Music Mood Annotation Using Polygonal Circular Regression by Isabelle Dufour B.Sc., University of Victoria, 2013 A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

More information

Music Information Retrieval with Temporal Features and Timbre

Music Information Retrieval with Temporal Features and Timbre Music Information Retrieval with Temporal Features and Timbre Angelina A. Tzacheva and Keith J. Bell University of South Carolina Upstate, Department of Informatics 800 University Way, Spartanburg, SC

More information

WHEN LYRICS OUTPERFORM AUDIO FOR MUSIC MOOD CLASSIFICATION: A FEATURE ANALYSIS

WHEN LYRICS OUTPERFORM AUDIO FOR MUSIC MOOD CLASSIFICATION: A FEATURE ANALYSIS WHEN LYRICS OUTPERFORM AUDIO FOR MUSIC MOOD CLASSIFICATION: A FEATURE ANALYSIS Xiao Hu J. Stephen Downie Graduate School of Library and Information Science University of Illinois at Urbana-Champaign xiaohu@illinois.edu

More information

The Effects of Web Site Aesthetics and Shopping Task on Consumer Online Purchasing Behavior

The Effects of Web Site Aesthetics and Shopping Task on Consumer Online Purchasing Behavior The Effects of Web Site Aesthetics and Shopping Task on Consumer Online Purchasing Behavior Cai, Shun The Logistics Institute - Asia Pacific E3A, Level 3, 7 Engineering Drive 1, Singapore 117574 tlics@nus.edu.sg

More information

Influence of timbre, presence/absence of tonal hierarchy and musical training on the perception of musical tension and relaxation schemas

Influence of timbre, presence/absence of tonal hierarchy and musical training on the perception of musical tension and relaxation schemas Influence of timbre, presence/absence of tonal hierarchy and musical training on the perception of musical and schemas Stella Paraskeva (,) Stephen McAdams (,) () Institut de Recherche et de Coordination

More information

A User-Oriented Approach to Music Information Retrieval.

A User-Oriented Approach to Music Information Retrieval. A User-Oriented Approach to Music Information Retrieval. Micheline Lesaffre 1, Marc Leman 1, Jean-Pierre Martens 2, 1 IPEM, Institute for Psychoacoustics and Electronic Music, Department of Musicology,

More information

Lyric-Based Music Mood Recognition

Lyric-Based Music Mood Recognition Lyric-Based Music Mood Recognition Emil Ian V. Ascalon, Rafael Cabredo De La Salle University Manila, Philippines emil.ascalon@yahoo.com, rafael.cabredo@dlsu.edu.ph Abstract: In psychology, emotion is

More information

INFLUENCE OF MUSICAL CONTEXT ON THE PERCEPTION OF EMOTIONAL EXPRESSION OF MUSIC

INFLUENCE OF MUSICAL CONTEXT ON THE PERCEPTION OF EMOTIONAL EXPRESSION OF MUSIC INFLUENCE OF MUSICAL CONTEXT ON THE PERCEPTION OF EMOTIONAL EXPRESSION OF MUSIC Michal Zagrodzki Interdepartmental Chair of Music Psychology, Fryderyk Chopin University of Music, Warsaw, Poland mzagrodzki@chopin.edu.pl

More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

Creating a Feature Vector to Identify Similarity between MIDI Files

Creating a Feature Vector to Identify Similarity between MIDI Files Creating a Feature Vector to Identify Similarity between MIDI Files Joseph Stroud 2017 Honors Thesis Advised by Sergio Alvarez Computer Science Department, Boston College 1 Abstract Today there are many

More information

Construction of a harmonic phrase

Construction of a harmonic phrase Alma Mater Studiorum of Bologna, August 22-26 2006 Construction of a harmonic phrase Ziv, N. Behavioral Sciences Max Stern Academic College Emek Yizre'el, Israel naomiziv@013.net Storino, M. Dept. of Music

More information

Content-based music retrieval

Content-based music retrieval Music retrieval 1 Music retrieval 2 Content-based music retrieval Music information retrieval (MIR) is currently an active research area See proceedings of ISMIR conference and annual MIREX evaluations

More information

POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS

POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS Andrew N. Robertson, Mark D. Plumbley Centre for Digital Music

More information

A prototype system for rule-based expressive modifications of audio recordings

A prototype system for rule-based expressive modifications of audio recordings International Symposium on Performance Science ISBN 0-00-000000-0 / 000-0-00-000000-0 The Author 2007, Published by the AEC All rights reserved A prototype system for rule-based expressive modifications

More information

Singer Traits Identification using Deep Neural Network

Singer Traits Identification using Deep Neural Network Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic

More information

Effects of acoustic degradations on cover song recognition

Effects of acoustic degradations on cover song recognition Signal Processing in Acoustics: Paper 68 Effects of acoustic degradations on cover song recognition Julien Osmalskyj (a), Jean-Jacques Embrechts (b) (a) University of Liège, Belgium, josmalsky@ulg.ac.be

More information

Music Similarity and Cover Song Identification: The Case of Jazz

Music Similarity and Cover Song Identification: The Case of Jazz Music Similarity and Cover Song Identification: The Case of Jazz Simon Dixon and Peter Foster s.e.dixon@qmul.ac.uk Centre for Digital Music School of Electronic Engineering and Computer Science Queen Mary

More information

EE391 Special Report (Spring 2005) Automatic Chord Recognition Using A Summary Autocorrelation Function

EE391 Special Report (Spring 2005) Automatic Chord Recognition Using A Summary Autocorrelation Function EE391 Special Report (Spring 25) Automatic Chord Recognition Using A Summary Autocorrelation Function Advisor: Professor Julius Smith Kyogu Lee Center for Computer Research in Music and Acoustics (CCRMA)

More information

MOTIVATION AGENDA MUSIC, EMOTION, AND TIMBRE CHARACTERIZING THE EMOTION OF INDIVIDUAL PIANO AND OTHER MUSICAL INSTRUMENT SOUNDS

MOTIVATION AGENDA MUSIC, EMOTION, AND TIMBRE CHARACTERIZING THE EMOTION OF INDIVIDUAL PIANO AND OTHER MUSICAL INSTRUMENT SOUNDS MOTIVATION Thank you YouTube! Why do composers spend tremendous effort for the right combination of musical instruments? CHARACTERIZING THE EMOTION OF INDIVIDUAL PIANO AND OTHER MUSICAL INSTRUMENT SOUNDS

More information

Music Information Retrieval

Music Information Retrieval CTP 431 Music and Audio Computing Music Information Retrieval Graduate School of Culture Technology (GSCT) Juhan Nam 1 Introduction ü Instrument: Piano ü Composer: Chopin ü Key: E-minor ü Melody - ELO

More information

Embodied music cognition and mediation technology

Embodied music cognition and mediation technology Embodied music cognition and mediation technology Briefly, what it is all about: Embodied music cognition = Experiencing music in relation to our bodies, specifically in relation to body movements, both

More information

Psychophysiological measures of emotional response to Romantic orchestral music and their musical and acoustic correlates

Psychophysiological measures of emotional response to Romantic orchestral music and their musical and acoustic correlates Psychophysiological measures of emotional response to Romantic orchestral music and their musical and acoustic correlates Konstantinos Trochidis, David Sears, Dieu-Ly Tran, Stephen McAdams CIRMMT, Department

More information

Music Genre Classification

Music Genre Classification Music Genre Classification chunya25 Fall 2017 1 Introduction A genre is defined as a category of artistic composition, characterized by similarities in form, style, or subject matter. [1] Some researchers

More information

Speech Recognition and Signal Processing for Broadcast News Transcription

Speech Recognition and Signal Processing for Broadcast News Transcription 2.2.1 Speech Recognition and Signal Processing for Broadcast News Transcription Continued research and development of a broadcast news speech transcription system has been promoted. Universities and researchers

More information

Analysis of local and global timing and pitch change in ordinary

Analysis of local and global timing and pitch change in ordinary Alma Mater Studiorum University of Bologna, August -6 6 Analysis of local and global timing and pitch change in ordinary melodies Roger Watt Dept. of Psychology, University of Stirling, Scotland r.j.watt@stirling.ac.uk

More information

Improving Frame Based Automatic Laughter Detection

Improving Frame Based Automatic Laughter Detection Improving Frame Based Automatic Laughter Detection Mary Knox EE225D Class Project knoxm@eecs.berkeley.edu December 13, 2007 Abstract Laughter recognition is an underexplored area of research. My goal for

More information

MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC

MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC Lena Quinto, William Forde Thompson, Felicity Louise Keating Psychology, Macquarie University, Australia lena.quinto@mq.edu.au Abstract Many

More information

Classification of Timbre Similarity

Classification of Timbre Similarity Classification of Timbre Similarity Corey Kereliuk McGill University March 15, 2007 1 / 16 1 Definition of Timbre What Timbre is Not What Timbre is A 2-dimensional Timbre Space 2 3 Considerations Common

More information

PHI 3240: Philosophy of Art

PHI 3240: Philosophy of Art PHI 3240: Philosophy of Art Session 17 November 9 th, 2015 Jerome Robbins ballet The Concert Robinson on Emotion in Music Ø How is it that a pattern of tones & rhythms which is nothing like a person can

More information

Detecting Musical Key with Supervised Learning

Detecting Musical Key with Supervised Learning Detecting Musical Key with Supervised Learning Robert Mahieu Department of Electrical Engineering Stanford University rmahieu@stanford.edu Abstract This paper proposes and tests performance of two different

More information

Speech To Song Classification

Speech To Song Classification Speech To Song Classification Emily Graber Center for Computer Research in Music and Acoustics, Department of Music, Stanford University Abstract The speech to song illusion is a perceptual phenomenon

More information

Melody Retrieval On The Web

Melody Retrieval On The Web Melody Retrieval On The Web Thesis proposal for the degree of Master of Science at the Massachusetts Institute of Technology M.I.T Media Laboratory Fall 2000 Thesis supervisor: Barry Vercoe Professor,

More information

Composer Style Attribution

Composer Style Attribution Composer Style Attribution Jacqueline Speiser, Vishesh Gupta Introduction Josquin des Prez (1450 1521) is one of the most famous composers of the Renaissance. Despite his fame, there exists a significant

More information

PRESCOTT UNIFIED SCHOOL DISTRICT District Instructional Guide January 2016

PRESCOTT UNIFIED SCHOOL DISTRICT District Instructional Guide January 2016 Grade Level: 9 12 Subject: Jazz Ensemble Time: School Year as listed Core Text: Time Unit/Topic Standards Assessments 1st Quarter Arrange a melody Creating #2A Select and develop arrangements, sections,

More information

MINING THE CORRELATION BETWEEN LYRICAL AND AUDIO FEATURES AND THE EMERGENCE OF MOOD

MINING THE CORRELATION BETWEEN LYRICAL AND AUDIO FEATURES AND THE EMERGENCE OF MOOD AROUSAL 12th International Society for Music Information Retrieval Conference (ISMIR 2011) MINING THE CORRELATION BETWEEN LYRICAL AND AUDIO FEATURES AND THE EMERGENCE OF MOOD Matt McVicar Intelligent Systems

More information

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC Vishweshwara Rao, Sachin Pant, Madhumita Bhaskar and Preeti Rao Department of Electrical Engineering, IIT Bombay {vishu, sachinp,

More information

Automatic Laughter Detection

Automatic Laughter Detection Automatic Laughter Detection Mary Knox Final Project (EECS 94) knoxm@eecs.berkeley.edu December 1, 006 1 Introduction Laughter is a powerful cue in communication. It communicates to listeners the emotional

More information

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods

Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods Kazuyoshi Yoshii, Masataka Goto and Hiroshi G. Okuno Department of Intelligence Science and Technology National

More information

CS229 Project Report Polyphonic Piano Transcription

CS229 Project Report Polyphonic Piano Transcription CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project

More information