Music Mood Classification - an SVM-based approach
Sebastian Napiorkowski
Topics on Computer Music (Seminar Report)
HPAC - RWTH - SS2015

Contents
1. Motivation
2. Quantification and Definition of Mood
3. How mood classification is done
4. Example: Mood and Theme Classification based on a Support Vector Machine approach

Motivation
Imagine you could search songs based on their mood:
- create playlists that follow a mood
- create playlists that follow a theme (e.g. party time)
Users are already trying [1]: of music-related searches, about 15% are mood-related and about 30% are theme-related.

Contents
1. Motivation
2. Quantification and Definition of Mood
   1. Perception and Definition
   2. MIREX mood clusters
   3. Russell/Thayer's Valence-Arousal model
3. How mood classification is done
4. Example: Mood and Theme Classification based on a Support Vector Machine approach

Perception and Definition
Emotions can be [2]:
- expressed by music: feelings that are intrinsic to a given track (we focus on these)
- induced by music: feelings that the listener associates with a given track
Music can have a [4]:
- Mood: the state and/or quality of a particular feeling associated with the track (e.g. happy, sad, aggressive)
- Theme: the context or situations which fit best when listening to the track (e.g. party time, Christmas, at the beach)

MIREX mood clusters
MIREX (Music Information Retrieval Evaluation eXchange) ran its first mood task in 2007. The mutually exclusive clusters were derived by clustering a co-occurrence matrix of mood labels for popular music from the AllMusic.com guide [5]:
Cluster 1: passionate, rousing, confident, boisterous, rowdy
Cluster 2: rollicking, cheerful, fun, sweet, amiable/good-natured
Cluster 3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster 4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster 5: aggressive, fiery, tense/anxious, intense, volatile, visceral

Russell/Thayer's Valence-Arousal model
The most noted dimensional model [3]: emotions exist on a plane spanned by two independent axes:
- Arousal (intensity), from high to low
- Valence (appraisal of polarity), from positive to negative
[Figure: emotion terms placed on the valence-arousal plane, e.g. angry, annoyed, frustrated, miserable, alarmed on the negative-valence side; astonished, aroused, delighted, glad, happy, pleased on the positive-valence/high-arousal side; bored, tired, content, satisfied, calm at low arousal.]
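As a toy illustration (not from the talk; the coordinate range and quadrant labels are my assumptions), a point on this plane can be mapped to a coarse mood by its quadrant:

# Hypothetical sketch: classify a (valence, arousal) point, both assumed
# normalized to [-1, 1], into one of the four quadrants of the plane.
def quadrant(valence: float, arousal: float) -> str:
    if arousal >= 0:
        return "happy/excited" if valence >= 0 else "angry/alarmed"
    return "calm/content" if valence >= 0 else "sad/bored"

print(quadrant(0.7, 0.6))    # happy/excited
print(quadrant(-0.5, -0.3))  # sad/bored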

Contents
1. Motivation
2. Quantification and Definition of Mood
3. How mood classification is done
   1. Content-based Audio Analysis
4. Example: Mood and Theme Classification based on a Support Vector Machine approach

How mood classification is done (or tried, at least) [3]
- Contextual text information: mining web documents, social tags, emotion recognition from lyrics
- Content-based audio analysis (we focus on this)
- Hybrid approaches

Content-based Audio Analysis
There is much prior work on audio features in Music-IR; they serve as a black-box toolset for audio classification. Overview of the acoustic features most commonly used for mood recognition:
- Dynamics: RMS energy (roughly, the "AC power" of the signal)
- Timbre (tone color): Mel-frequency cepstral coefficients (MFCCs, "like JPEG for sound"), spectral shape, spectral contrast
- Harmony: roughness (how pleasant tone combinations sound to the ear), harmonic changes, key clarity, majorness
- Register: chromagram (the spectrum projected onto 12 bins forming one octave), chroma centroid and deviation
- Rhythm: rhythm strength, regularity, tempo, beat histograms
- Articulation: event density, attack slope, attack time (the time until a tone reaches its loudest part)
(figure taken from http://www.pampalk.at/ma/documentation.html)
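A minimal sketch of extracting these feature types for one track, assuming the librosa library (the file name, feature selection, and mean/std summarization are my assumptions, not the talk's):

import numpy as np
import librosa

# Load a track (hypothetical file name).
y, sr = librosa.load("track.mp3")

rms      = librosa.feature.rms(y=y)                       # dynamics
mfcc     = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # timbre
contrast = librosa.feature.spectral_contrast(y=y, sr=sr)  # timbre
chroma   = librosa.feature.chroma_stft(y=y, sr=sr)        # register
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)            # rhythm

# Summarize the frame-wise features into one fixed-length vector per
# track, here by mean and standard deviation over time.
feats = np.concatenate(
    [m for f in (rms, mfcc, contrast, chroma)
       for m in (f.mean(axis=1), f.std(axis=1))]
    + [np.atleast_1d(tempo)]
)
print(feats.shape)  # one feature vector for the whole track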

Contents
1. Motivation
2. Quantification and Definition of Mood
3. How mood classification is done
4. Example: Mood and Theme Classification based on a Support Vector Machine approach
   1. Datasets
   2. Audio Features - SVM learning
   3. Social Tags - Naive Bayes classifier

4. Example: Mood and Theme Classification based on a Support Vector Machine approach
Based on: "Music Mood and Theme Classification - a hybrid approach", which works on the MIREX mood clusters [5].
Kerstin Bischoff, Claudiu S. Firan, Raluca Paiu, Wolfgang Nejdl (L3S Research Center, Appelstr. 4, Hannover, Germany); Cyril Laurier, Mohamed Sordo (Music Technology Group, Universitat Pompeu Fabra)

Datasets: the truth, the whole truth, and nothing but the truth
The goal is to find a ground-truth dataset for training; "ground truth" refers to the correctness of the training labels.
AllMusic.com (founded 1995): the data is created by music experts and therefore makes a good ground-truth corpus:
- 178 different moods and 73 themes found
- 5,770 tracks with moods assigned
- 8,158 track-mood assignments (avg. 1.73 moods per track, max. 12)
- 1,218 track-theme assignments (avg. 1.21 themes per track, max. 6)

Dataset: Social Tags
Last.fm (founded 2002): a popular UK-based Internet radio and music community website.
- Obtain Last.fm tags for the AllMusic.com tracks
- Not all 5,770 tracks have user tags
- The dataset is reduced to 4,737 tracks

Dataset: Prepare for multiclass classifier (1/2)
We use the MIREX mood clusters: five to seven AllMusic.com mood labels together define one MIREX mood cluster. Since the mood clusters are mutually exclusive, we restrict the dataset to tracks with a 1-to-1 mood-track relation; the dataset is thereby reduced to 1,192 distinct tracks.
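A minimal sketch of this restriction (the data layout is assumed; the cluster word sets come from the table above):

# Hypothetical sketch: keep only tracks whose AllMusic mood labels fall
# into exactly one MIREX cluster, so every track gets a single class.
MIREX_CLUSTERS = {
    1: {"passionate", "rousing", "confident", "boisterous", "rowdy"},
    2: {"rollicking", "cheerful", "fun", "sweet", "amiable/good-natured"},
    3: {"literate", "poignant", "wistful", "bittersweet", "autumnal", "brooding"},
    4: {"humorous", "silly", "campy", "quirky", "whimsical", "witty", "wry"},
    5: {"aggressive", "fiery", "tense/anxious", "intense", "volatile", "visceral"},
}

def single_cluster(track_moods: set[str]) -> int | None:
    """Return the one matching cluster id, or None if none/several match."""
    hits = [cid for cid, words in MIREX_CLUSTERS.items() if track_moods & words]
    return hits[0] if len(hits) == 1 else None

# tracks: hypothetical mapping {track_id: set of AllMusic mood labels}
tracks = {"t1": {"cheerful", "fun"}, "t2": {"fun", "aggressive"}}
labeled = {tid: c for tid, m in tracks.items() if (c := single_cluster(m))}
print(labeled)  # {'t1': 2} - t2 is dropped as ambiguous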

Dataset: Prepare for multiclass classifier (2/2)
To get a balanced training set for the classifier, each cluster is reduced to 200 tracks; with 5 clusters this yields 1,000 tracks for machine learning.

Support Vector Machine Learning
1. Dataset: 1,000 tracks.
2. Classify on 200 ms frames: frame-based timbral, tonal, and rhythmic features are extracted, including MFCCs, chroma features, spectral centroid, and BPM.
3. Assign each track its mood from the ground-truth set.
4. Calculate the support vectors that maximize the margin between the classes.
5. A Radial Basis Function (RBF) kernel performed best.
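A minimal sketch of this training step, assuming scikit-learn (the random stand-in arrays merely mark where the real 1,000 feature vectors and cluster labels would go):

# Hypothetical sketch: an RBF-kernel SVM on per-track feature vectors,
# evaluated with 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 60))    # stand-in: 1,000 track feature vectors
y = rng.integers(1, 6, size=1000)  # stand-in: mood cluster ids 1..5

clf = make_pipeline(
    StandardScaler(),              # SVMs are sensitive to feature scaling
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
print(cross_val_score(clf, X, y, cv=10).mean())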

Results and Evaluation
The audio features were classified by an SVM; in addition, the social tags were used to classify a track with a Naive Bayes classifier (calculating likelihoods). The algorithm is the same as in another paper submitted to MIREX, but the results differ: there they obtained 60.5% accuracy, while here the audio-only SVM reached just 45.0% on the MIREX clusters.

Mood MIREX:       SVM (audio) 0.450 | NB (tags) 0.565 | Combined 0.575
Mood THAYER:      SVM (audio) 0.517 | NB (tags) 0.539 | Combined 0.596
Themes clustered: SVM (audio) 0.527 | NB (tags) 0.595 | Combined 0.625
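One plausible reading of "Combined" (the slides do not spell out the fusion rule, so this weighting scheme is an assumption) is late fusion of the two classifiers' class probabilities:

# Hypothetical sketch: average the audio SVM's and the tag-based Naive
# Bayes' probability matrices and pick the most likely cluster per track.
import numpy as np

def combine(p_svm: np.ndarray, p_nb: np.ndarray, w: float = 0.5) -> np.ndarray:
    """p_svm, p_nb: (n_tracks, n_classes) probabilities; returns class indices."""
    return (w * p_svm + (1.0 - w) * p_nb).argmax(axis=1)

p_svm = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
p_nb  = np.array([[0.2, 0.7, 0.1], [0.1, 0.8, 0.1]])
print(combine(p_svm, p_nb))  # [1 1] with equal weights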

Evaluation
(result tables as on the previous slide)
- Classifiers relying only on audio features perform worse than purely tag-based ones, but the combination improves the overall results.
- The ground-truth set used was not as good as expected.
- Possible improvement: filter training and test instances using listeners who focus on the audio only.

Conclusion
- Emotions are fuzzy, and it is not trivial to define them.
- Machine learning depends highly on the quality of the training data.
- It is hard to find a high-quality ground-truth dataset that is large enough.
- Since 2007 the results seem disillusioning: mood classification is hard to do.

Best mood classification accuracy per MIREX year [6]:
2007: 0.6150 | 2008: 0.6367 | 2009: 0.6567 | 2010: 0.6417 | 2011: 0.6950 | 2012: 0.6783 | 2013: 0.6833 | 2014: 0.6633

References
1. K. Bischoff, C. S. Firan, W. Nejdl, and R. Paiu. "Can all tags be used for search?" CIKM, pp. 193-202, 2008.
2. P. Juslin and P. Laukka. "Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening." Journal of New Music Research, vol. 33, no. 3, p. 217, 2004.
3. Y. E. Kim et al. "Music emotion recognition: A state of the art review." Proc. ISMIR, 2010.
4. K. Bischoff et al. "Music Mood and Theme Classification - a Hybrid Approach." ISMIR, 2009.
5. X. Hu, J. S. Downie, C. Laurier, M. Bay, and A. F. Ehmann. "The 2007 MIREX audio mood classification task: Lessons learned." Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR 2008), 2008.
6. http://www.music-ir.org/mirex/wiki/mirex_home