Automatic Mood Detection of Music Audio Signals: An Overview


Sonal P. Sumare¹, Mr. D. G. Bhalke²
¹ PG Student, Department of Electronics and Telecommunication, Rajarshi Shahu College of Engineering, Pune
² Faculty, Department of Electronics and Telecommunication, Rajarshi Shahu College of Engineering, Pune

ABSTRACT: Music mood describes the inherent emotional expression of a music clip. It is helpful in music understanding, music retrieval, and other music-related applications. Over the past decade, a great deal of research has been done in audio content analysis for extracting various kinds of information from an audio signal, especially the mood it conveys, because music expresses emotion in a concise, succinct, and effective way. People select music according to their moods and emotions, which creates the need to classify music by mood. In this paper, different mood models are described, along with methods for detecting mood, including hierarchical and nonhierarchical frameworks based on GMMs and SVMs.

Keywords: Hierarchical framework, mood detection, mood tracking, music emotion, music information retrieval, music mood, GMM, SVM.

I. INTRODUCTION

Most people enjoy music in their leisure time. At present there is more and more music on personal computers, in music libraries, and on the Internet. Music is considered one of the best forms of emotional expression, and the music that people listen to is governed by the mood they are in. The characteristics of music, such as rhythm, melody, harmony, pitch, and timbre, play a significant role in human physiological and psychological functions, thus altering mood. For example, when an individual comes back home from work, he may want to listen to some relaxing light music, while at the gymnasium he may choose some exciting music with a strong beat and fast tempo. Music is not merely a form of entertainment but also an easy way of communicating among people, a medium to share emotions, and a place to keep emotions and memories. With the boom of Internet technology, there is more and more music on personal computers, in music libraries, and on the Internet. Therefore, automatic music analysis systems, such as music classification, music browsing, and playlist generation systems, are urgently required for music management. Because listening objectives vary with time and context, music classification and retrieval based on perceived emotion can be more powerful than tagging by artist, album, tempo, or genre. Some music genres, such as classical music, usually contain more than one musical mood, and distinct musical features create different musical moods. For better accuracy, various low-level musical features are used and musical mood changes are detected based on them. For this purpose, music clips are first divided into segments based on such musical features and clustered into groups with similar features. Beat and tempo detection and genre classification have been developed in a few research works, using different features and different models. It is noted that, in most psychology textbooks, emotion usually means a short but strong experience, while mood is a longer but less intense experience. Therefore, we mainly use the word mood in this paper. However, the words affect, emotion, and emotional expression are still used in order to keep the same usage as in the references.

II. LITERATURE SURVEY

Over the years, considerable work has been done in music mood detection.
A literature survey covering the last 20 years has been carried out; representative works are summarized below.

A. S. Bhat, Amith V. S., Namrata S. Prasad, and Murali Mohan D. [1] describe an efficient classification algorithm for music mood detection in Western and Hindi music using audio feature extraction. The paper proposes an automated and efficient method to perceive the mood of any given music piece, or the emotions related to it. Features such as rhythm, harmony, and spectral features are studied in order to classify songs according to mood, based on Thayer's model. All the music signals used were sampled at 44100 Hz and 16-bit quantized. The accuracy of mood classification is as high as 94.44%.

Lie Lu, Dan Liu, and Hong-Jiang Zhang [2] describe automatic mood detection and tracking of music audio signals. A hierarchical framework is presented to automate the task of mood detection from acoustic music data. Music features such as intensity, timbre, and rhythm are extracted to represent the characteristics of a music clip, and the approach is extended from mood detection to mood tracking across a music piece. Thayer's model of mood is adopted, which comprises four music moods: Contentment, Depression, Exuberance, and Anxious/Frantic. The average accuracy of mood detection is up to 86.3%.

Mark D. Korhonen, David A. Clausi, and M. Ed Jernigan [3] propose modeling the emotional content of music using system identification. The paper develops a methodology to model the emotional content of music, with system-identification techniques used to create the models. Emotion Space Lab is used to quantify emotions along the dimensions valence and arousal, collecting emotional appraisal data at 1 Hz. Results show that system identification can model the emotional content for a genre of music.

Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen [4] describe a regression approach to music emotion recognition, proposed for recognizing the emotional content of music signals.

Music emotion recognition (MER) is formulated as a regression problem to predict the arousal and valence values (AV values). To improve performance, principal component analysis is used to reduce the correlation between arousal and valence. The best performance is 58.3% for arousal and 28.1% for valence, obtained by employing a support vector machine as the regressor.

George Tzanetakis and Perry Cook [5] explain musical genre classification of audio signals. Musical genres are categories created by humans to characterize pieces of music, and their characteristics are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. Three feature sets are proposed in this paper: timbral texture, rhythmic content, and pitch content. Statistical pattern recognition classifiers are trained to evaluate the proposed features. Using these feature sets, 61% classification accuracy over ten musical genres is achieved.

Jong In Lee, Dong Gyu Yeo, Byeong Man Kim, and Hae-Yeoun Lee [6] introduce automatic music mood detection through musical structure analysis. Mood variation within a piece of music makes such applications more difficult; to cope with this problem, the authors present an automatic method to classify music mood. A modified Thayer 2-dimensional mood model with arousal-valence (AV) values is used to detect the mood.

Ei Ei Pe Myint and Moe Pwint [7] propose an approach for multi-label music mood classification. The paper presents self-colored music mood segmentation and a hierarchical framework based on a new mood taxonomy model, which combines Thayer's 2-dimensional (2D) model and Schubert's Updated Hevner adjective Model (UHM). The fuzzy SVM (FSVM) shows superior accuracy compared with the standard SVM.

III. THEORETICAL BACKGROUND

A person's mood can be reflected in the music they listen to. The characteristics of music, such as rhythm, melody, harmony, pitch, and timbre, play a significant role in human physiological and psychological functions, altering mood. Based on these characteristics, music mood can be divided into different types, such as happy, exuberant, energetic, depressed, frantic, sad, calm, and contented [2]. There are a number of musical features; some acoustic features, such as intensity, timbre, pitch, and rhythm, are described below.

1.1 Intensity Features
Intensity is an essential feature in music mood detection. It gives an indication of the degree of loudness or calmness of the music. For example, the intensity of Contentment and Depression is usually low, while that of Exuberance and Anxious/Frantic is usually high.

1.2 Timbre Features
In music, timbre, also known as tone color or tone quality, is the quality of a musical note, sound, or tone that distinguishes different types of sound production. For example, the brightness of Exuberance music is usually higher than that of Depression.

1.3 Pitch Features
The pitch of a sound depends on the frequency of vibration and the size of the vibrating object. This feature corresponds to the relative lowness or highness that can be heard in a song.

1.4 Rhythm Features
In music, rhythm refers to the placement of sounds in time. The sounds, along with the silences between them, create a pattern; when these patterns are repeated, they form rhythm. In general, three aspects of rhythm are related to people's mood response: rhythm strength, rhythm regularity, and tempo.
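To make these acoustic features concrete, the following is a minimal sketch of how such descriptors could be computed with the open-source librosa library. The specific feature choices (RMS energy for intensity, spectral centroid and MFCCs for timbre, an estimated tempo for rhythm) and the file name are illustrative assumptions, not the exact feature sets used in the surveyed papers.

```python
# Illustrative sketch: approximate intensity, timbre, and rhythm descriptors
# with librosa. The chosen features are stand-ins for the feature sets
# described in the cited works.
import numpy as np
import librosa

def extract_mood_features(path):
    y, sr = librosa.load(path, sr=22050, mono=True)

    # Intensity: frame-wise RMS energy (loud vs. calm)
    rms = librosa.feature.rms(y=y)[0]

    # Timbre: spectral centroid ("brightness") and MFCCs
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

    # Rhythm: global tempo estimate (beats per minute)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

    # Summarize frame-wise features by mean and standard deviation
    return np.hstack([
        rms.mean(), rms.std(),
        centroid.mean(), centroid.std(),
        mfcc.mean(axis=1), mfcc.std(axis=1),
        tempo,
    ])

# Example usage (hypothetical file name):
# features = extract_mood_features("clip.wav")
```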
IV. MOOD MODELS

Psychologists have done a great deal of work and proposed a number of models of human emotion.

4.1 Hevner's experiment
In music psychology, the earliest and best-known systematic attempt at creating a music mood taxonomy was by Kate Hevner. Hevner examined the affective value of six musical features, namely tempo, mode, rhythm, pitch, harmony, and melody, and studied how they relate to mood. Based on this study, 67 adjectives were categorized into eight groups of similar emotions.

4.2 Russell's model
Both Ekman's and Hevner's models belong to the categorical approach, because their mood spaces consist of a set of discrete mood categories. In contrast, James Russell proposed a circumplex model of emotion, arranging 28 adjectives in a circle on a two-dimensional bipolar space (arousal-valence). This model helps separate opposite emotions and keep them far apart.

4.3 Thayer's model
Another well-known dimensional model was proposed by Thayer. It describes mood with two factors, a stress dimension (happy/anxious) and an energy dimension (calm/energetic), and divides music mood into four clusters according to the four quadrants of the two-dimensional space: Contentment, Depression, Exuberance, and Anxious (Frantic), as shown in Fig. 1 [1][2].

Fig. 1: Thayer's mood model. The vertical energy axis (high/low) and the horizontal stress axis (positive/negative) define four quadrants: Exuberance and Anxious/Frantic at high energy, Contentment and Depression at low energy.
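As a rough illustration of how Thayer's two dimensions map onto the four clusters, the sketch below assigns an (energy, stress) coordinate pair to one of the four moods; the zero thresholds for the neutral point are an arbitrary assumption made only for illustration.

```python
# Illustrative mapping from Thayer's two dimensions to the four mood clusters.
# The zero thresholds for "high/low" are an arbitrary choice for illustration.
def thayer_quadrant(energy: float, stress: float) -> str:
    if energy >= 0.0:
        return "Anxious/Frantic" if stress >= 0.0 else "Exuberance"
    else:
        return "Depression" if stress >= 0.0 else "Contentment"

# Example: a low-energy, low-stress clip falls in the Contentment quadrant.
print(thayer_quadrant(-0.5, -0.3))  # -> "Contentment"
```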

V. MOOD DETECTION FRAMEWORK

5.1 Hierarchical Mood Detection Framework Using GMM

Fig. 2: Hierarchical mood detection framework [2]. In Layer 1, the intensity features of a music clip X select Group 1 (Contentment, Depression) or Group 2 (Exuberance, Anxious/Frantic); in Layers 2 and 3, timbre and rhythm features decide the exact mood within the group.

Based on Thayer's model of mood, a hierarchical framework is proposed for mood detection, as illustrated in Fig. 2. The intensity features are first used to classify a music clip into one of two mood groups [2]. The basic rule is: if its energy is low, the music clip is classified into Group 1 (Contentment and Depression); otherwise, it is classified into Group 2 (Exuberance and Anxious/Frantic). Subsequently, the remaining features, including timbre and rhythm, are used to determine the exact mood of the music clip. With the obtained GMM models, the detailed mood classification can be performed in the following two steps. In the first step, a music clip is classified into one of the mood groups, i.e. Group 1 (Contentment and Depression) or Group 2 (Exuberance and Anxious/Frantic), by employing a simple hypothesis test with the intensity features:

ρ = p(I | G1) / p(I | G2)    (1)

where ρ is the likelihood ratio, Gi denotes the ith mood group, and I is the intensity feature set; the clip is assigned to Group 1 if ρ > 1 and to Group 2 otherwise. In the second step, a clip in Group 1 is classified into Contentment or Depression, while a clip in Group 2 is classified into Exuberance or Anxious/Frantic, based on the timbre and rhythm features. In each group, the probability of the test clip belonging to an exact mood can be calculated as

p(Mi,j | T, R) = α · p(T | Mi,j) + β · p(R | Mi,j)    (2)

where Mi,j is the jth mood cluster in the ith mood group, T and R represent the timbre and rhythm features respectively, and α and β are two weighting factors representing the different importance of the timbre and rhythm features [2].

5.2 Nonhierarchical Mood Detection Framework Using GMM

Fig. 3: Nonhierarchical mood detection framework [2]. The intensity, timbre, and rhythm features of a music clip X are fed directly to GMMs over the four moods Contentment, Depression, Exuberance, and Anxious/Frantic.

The nonhierarchical framework is shown in Fig. 3. Compared with the nonhierarchical framework, the hierarchical framework can make better use of sparse training data, which is especially important when the training data is limited. In the framework, a Gaussian mixture model (GMM) with 16 mixtures is used to model each feature set for each mood cluster (group). In constructing each GMM, the Expectation-Maximization (EM) algorithm is used to estimate the parameters of the Gaussian components and the mixture weights, and K-means clustering is employed for initialization [2].

5.3 SVM

Support vector machines (SVMs) are based on the principle of structural risk minimization, which goes beyond simply minimizing the error on the training data (empirical risk minimization). For linearly separable data, an SVM finds a separating hyperplane that separates the data with the largest margin. For data that are not linearly separable, it maps the input pattern space X to a high-dimensional feature space Z using a nonlinear function, and then finds the optimal hyperplane as the decision surface separating the examples of the two classes in that feature space. The SVM in particular defines the criterion to be a decision surface that is maximally far away from any data point [10]. The distance from the decision surface to the closest data points determines the margin of the classifier. This method of construction necessarily means that the decision function for an SVM is fully specified by a (usually small) subset of the data which defines the position of the separator.
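As a concrete illustration of the two-step procedure of Section 5.1, the sketch below uses scikit-learn's GaussianMixture, which fits its parameters with EM and initializes with k-means, mirroring the GMM setup described in [2]. The training feature matrices, the diagonal covariance type, and the equal weighting of timbre and rhythm log-likelihoods (a log-domain variant of the weighting in Eq. (2)) are assumptions made for illustration only.

```python
# Sketch of the two-layer hierarchical mood classifier of Section 5.1.
# GaussianMixture fits its parameters with EM and uses k-means initialization.
# Feature extraction and the per-group/per-mood training matrices are assumed
# to be provided by the caller.
from sklearn.mixture import GaussianMixture

MOODS = {1: ["Contentment", "Depression"], 2: ["Exuberance", "Anxious/Frantic"]}

def fit_gmm(X, n_components=16):
    return GaussianMixture(n_components=n_components, covariance_type="diag").fit(X)

def train(group_intensity, mood_timbre, mood_rhythm):
    # group_intensity[g]: intensity feature matrix for group g (g = 1 or 2)
    # mood_timbre[m] / mood_rhythm[m]: feature matrices for each mood name m
    gmm_I = {g: fit_gmm(X) for g, X in group_intensity.items()}
    gmm_T = {m: fit_gmm(X) for m, X in mood_timbre.items()}
    gmm_R = {m: fit_gmm(X) for m, X in mood_rhythm.items()}
    return gmm_I, gmm_T, gmm_R

def classify(I, T, R, gmm_I, gmm_T, gmm_R, alpha=0.5, beta=0.5):
    # Step 1: likelihood-ratio test on intensity features, as in Eq. (1),
    # computed here as a difference of average log-likelihoods.
    rho = gmm_I[1].score(I) - gmm_I[2].score(I)
    group = 1 if rho > 0 else 2
    # Step 2: weighted timbre/rhythm log-likelihoods within the group, cf. Eq. (2).
    scores = {m: alpha * gmm_T[m].score(T) + beta * gmm_R[m].score(R)
              for m in MOODS[group]}
    return max(scores, key=scores.get)
```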
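Similarly, for the SVM classifier of Section 5.3 operating on MFCC features, a minimal sketch with librosa and scikit-learn might look like the following; the dataset layout, label set, summary statistics, and RBF kernel choice are assumptions for illustration rather than the exact setup used in the cited work.

```python
# Minimal sketch of an SVM mood classifier over MFCC statistics.
# File paths, labels, and the RBF kernel/C value are illustrative assumptions.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mfcc_vector(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Summarize frame-wise MFCCs by their mean and standard deviation.
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_svm(paths, labels):
    X = np.vstack([mfcc_vector(p) for p in paths])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    return clf.fit(X, labels)

# Hypothetical usage:
# clf = train_svm(["happy1.wav", "sad1.wav"], ["Exuberance", "Depression"])
# print(clf.predict([mfcc_vector("new_clip.wav")]))
```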
VI. CONCLUSION

This paper presents an approach to mood detection for acoustic recordings of music. A hierarchical framework is used to detect the mood of a music clip; in this framework, intensity, timbre, and rhythm features are extracted. The hierarchical framework can utilize the most suitable features for different tasks and can perform better than its nonhierarchical counterpart. For the SVM-based approach, Mel-frequency cepstral coefficients (MFCCs) are extracted as features from the collected data. The SVM classifier performs well and offers an efficient way of solving the problem.

ACKNOWLEDGMENTS

Any research or project is never an individual effort but the contribution of many hands and brains. With great pleasure I express my gratitude to our Principal, Prof. Dr. D. S. Bormane, and Head of Department, Mr. D. G. Bhalke. I would also like to thank all the faculty members of the Electronics and Telecommunication Department. At critical occasions, their affectionate and helpful attitude helped me rectify my mistakes and proved to be a source of unending inspiration, for which I am grateful to them. Their timely suggestions helped me complete this research work in time.

REFERENCES

[1] A. S. Bhat, V. S. Amith, N. S. Prasad, and D. M. Mohan, "An efficient classification algorithm for music mood detection in Western and Hindi music using audio feature extraction," in Proc. 5th Int. Conf. Signal and Image Processing (ICSIP), Jan. 2014, pp. 359-364.
[2] L. Lu, D. Liu, and H.-J. Zhang, "Automatic mood detection and tracking of music audio signals," IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 1, pp. 5-18, Jan. 2006.
[3] M. D. Korhonen, D. A. Clausi, and M. E. Jernigan, "Modeling emotional content of music using system identification," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, no. 3, pp. 588-599, June 2006.
[4] Y.-H. Yang, Y.-C. Lin, Y.-F. Su, and H. H. Chen, "A regression approach to music emotion recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 16, no. 2, pp. 448-457, Feb. 2008.
[5] G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech and Audio Processing, vol. 10, no. 5, pp. 293-302, July 2002.
[6] J. I. Lee, D.-G. Yeo, B. M. Kim, and H.-Y. Lee, "Automatic music mood detection through musical structure analysis," in Proc. 2nd Int. Conf. Computer Science and its Applications (CSA '09), Dec. 2009, pp. 1-6.
[7] E. E. P. Myint and M. Pwint, "An approach for multi-label music mood classification," in Proc. 2nd Int. Conf. Signal Processing Systems (ICSPS), July 2010, vol. 1, pp. V1-290 to V1-294.
[8] M. Miyoshi, S. Tsuge, T. Oyama, M. Ito, and M. Fukumi, "Feature selection method for music mood score detection," in Proc. 4th Int. Conf. Modeling, Simulation and Applied Optimization (ICMSAO), Apr. 2011, pp. 1-6.
[9] M. Bartoszewski, H. Kwasnicka, U. Markowska-Kaczmar, and P. B. Myszkowski, "Extraction of emotional content from music data," in Proc. 7th Int. Conf. Computer Information Systems and Industrial Management Applications (CISIM '08), June 2008, pp. 293-299.
[10] E. Vijayavani, P. Suganya, S. Lavanya, and E. Elakiya, "Emotion recognition based on MFCC features using SVM," International Journal of Advance Research in Computer Science and Management Studies, vol. 2, no. 4, Apr. 2014.
[11] A. McCallum et al., "Improving text classification by shrinkage in a hierarchy of classes," in Proc. Int. Conf. Machine Learning, 1998, pp. 359-367.
[12] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York: Springer-Verlag, 2001.