Learning about Music Cognition by Asking MIR Questions
Sebastian Stober
CogMIR, New York City, August 12, 2016
sstober@uni-potsdam.de
http://www.uni-potsdam.de/mlcog/
Machine Learning in Cognitive Science Lab

From MIR to MIIR
Music Imagery Information Retrieval (MIIR) = retrieving music information from brain signals

12 audio stimuli from 8 music pieces:
- 4 songs, each recorded with and without lyrics
- 4 instrumental pieces
- complete musical phrases, lengths between 6.9 s and 16 s (mean 10.5 s)
Dataset: https://github.com/sstober/openmiir

Experiment Setup
[Diagram: sound booth with presentation system (screen & speakers) and feedback keyboard; video/audio events sent as markers via a StimTracker and optical receiver to the recording system; EEG amp on battery.]
Biosemi ActiveTwo, 64 EEG + 4 EOG channels @ 512 Hz

The 12 Music Stimuli

Songs (with / without lyrics):
  #  title                        meter  tempo (BPM)  length (s)
  1  Chim Chim Cheree              3/4      210       13.3 / 13.5
  2  Take Me Out to the Ballgame   3/4      186        7.7 /  7.7
  3  Jingle Bells                  4/4      200        9.7 /  9.0
  4  Mary Had a Little Lamb        4/4      160       11.6 / 12.2

Instrumental pieces:
  #  title                             meter  tempo (BPM)  length (s)
  1  Emperor Waltz                      3/4      178          8.3
  2  Harry Potter Theme                 3/4      166         16.0
  3  Imperial March (Star Wars Theme)   4/4      104          9.2
  4  Eine Kleine Nachtmusik             4/4      140          6.9

MIIR Questions
- audio reconstruction: failed (non-sparsity)
- stimulus identification
- beat and tempo tracking
- meter classification
- lyrics / non-lyrics / instrumental classification

Stimulus Identification: Pre-Training Method
Learning distinguishing features using similarity-constraint encoders

Similarity-Constraint Encoder
- exploit the synchronization between trials: similar temporal patterns are expected for the same stimulus
- goal: improve the signal-to-noise ratio
- learn signal filters that lead to distinguishing (temporal) patterns for the different classes

Similarity-Constraint Encoder
Motivated by the relative constraints used for metric learning: for all paired trials (A, B) of the same stimulus plus any trial C from another class, require sim(a, b) > sim(a, c). The many possible combinations of (A, B) and C favor features that are representative and allow the classes to be distinguished.

Similarity-Constraint Encoder (virtual network structure)
Input triplet (reference, paired trial, other-class trial) → feature extraction (signal filter; encoder weights shared across the three inputs) → pairwise similarity (dot product) → prediction (softmax probabilities). Training minimizes constraint violations.
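The triplet objective above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the actual deepthought implementation: the linear spatial filter, dot-product similarity, and softmax over the two similarities follow the slides, while all shapes and function names are assumptions.

```python
import numpy as np

def encode(trial, w):
    # Spatial filter: weighted sum over EEG channels -> one time course.
    # trial: (channels, samples), w: (channels,)
    return w @ trial

def triplet_loss(a, b, c, w):
    """Similarity-constraint loss for one triplet (a, b, c).

    a and b are trials of the SAME stimulus, c comes from another class.
    A softmax over the dot-product similarities should put its mass on b,
    which enforces sim(a, b) > sim(a, c).
    """
    fa, fb, fc = encode(a, w), encode(b, w), encode(c, w)
    sims = np.array([fa @ fb, fa @ fc])
    p = np.exp(sims - sims.max())   # numerically stable softmax
    p /= p.sum()
    return -np.log(p[0])            # cross-entropy, target = paired trial
```

Because the same filter w encodes all three inputs (shared weights), minimizing this loss over many triplets pushes w towards filters whose outputs are reproducible within a stimulus and different across stimuli.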

Stimulus Identification
12-class single-trial classification

Nested Cross-Validation
9-fold subject cross-validation: train on data from 8 subjects (8x5x12 = 480 trials), test on the remaining subject (1x5x12 = 60 trials).

Pre-training:
- 5-fold trial-block cross-validation: 8x4x12 = 384 training trials from 4 trial blocks, 8x1x12 = 96 validation trials from the remaining trial block
- train on 50,688 triplets from the 384 training trials
- model selection (early stopping) based on 21,120 validation triplets (a, b, c) with a from the 96 validation trials and b, c from the 480 training and validation trials
- encoder layer (L1) = average over folds

Supervised training:
- 5-fold trial-block cross-validation with the same training/validation splits as in the pre-training phase
- SVC: select the best value for C (grid search) based on the highest mean validation accuracy, then train with the selected C on the 480 training trials
- neural network: select each fold's model (early stopping) based on the highest validation accuracy; classifier layer (L2) = average over folds
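The nesting logic alone, independent of the models, can be summarized in a short skeleton. Here `train_fn`, `eval_fn`, and the parameter-averaging step are hypothetical placeholders standing in for the pre-training and supervised stages described above:

```python
import numpy as np

def nested_cv(subjects, blocks, train_fn, eval_fn):
    """Leave-one-subject-out outer loop with trial-block inner folds.

    subjects, blocks: per-trial arrays of subject and trial-block IDs.
    train_fn(train_idx, val_idx) -> model parameters (array);
    eval_fn(params, test_idx) -> accuracy on the held-out subject.
    """
    accuracies = []
    for held_out in np.unique(subjects):           # outer folds (subjects)
        test_idx = np.flatnonzero(subjects == held_out)
        rest = np.flatnonzero(subjects != held_out)
        fold_params = []
        for val_block in np.unique(blocks[rest]):  # inner folds (trial blocks)
            val_idx = rest[blocks[rest] == val_block]
            train_idx = rest[blocks[rest] != val_block]
            fold_params.append(train_fn(train_idx, val_idx))
        params = np.mean(fold_params, axis=0)      # average over inner folds
        accuracies.append(eval_fn(params, test_idx))
    return float(np.mean(accuracies))
```

Averaging the per-fold parameters mirrors the "encoder layer (L1) / classifier layer (L2) = average over folds" steps on the slide.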

Resulting Spatial Filter
- only within-subject trial triplets are used
- trained on 8 of 9 subjects → 9 versions
- just 1 filter per encoder!

Stimulus Classification (9-fold cross-subject validation)

Classifier     Features                 Accuracy
SVC            raw EEG                  18.52%
SVC            raw EEG channel mean     12.41%
End-to-end NN  raw EEG                  16.30%
SVC            12-class encoder output  27.22%
Neural Net     12-class encoder output  26.67%

Significant improvement over the baseline (McNemar's test with n = 540, p < 0.0002).
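The significance claim rests on McNemar's test over the 540 paired test decisions. For reference, an exact two-sided version can be computed from the discordant pair counts alone; only n = 540 and p < 0.0002 come from the slide, so any counts fed to this function are hypothetical:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test.

    b: trials the new model classified correctly while the baseline got
    them wrong; c: the reverse. Under H0 the discordant pairs split
    50/50, so this is a two-sided binomial test on (b, c).
    """
    n = b + c
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)   # double one tail, clip at 1

# e.g. mcnemar_exact(60, 25) for hypothetical discordant counts
```

Only the discordant pairs matter; trials where both classifiers agree carry no information about which one is better.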

Stimulus Classification (9-fold cross-subject validation)
[Confusion matrix over the 12 stimuli: Chim Chim Cheree, Take Me Out to the Ballgame, Jingle Bells, and Mary Had a Little Lamb (each with and without lyrics), Emperor Waltz, Hedwig's Theme (Harry Potter), Imperial March (Star Wars Theme), and Eine Kleine Nachtmusik.]

Mean NN Parameters
- very simple model
- similar patterns for lyrics / non-lyrics pairs


Classifying Imagination
- using the same pre-training technique: hardly above random accuracy, most likely due to poor timing / synchronization
- using the same pre-trained filter: same problem, since temporal patterns are hard to learn
=> experiment redesign / different encoder needed

Tempo Extraction [ISMIR'16]

Tempo Extraction [ISMIR'16]
[Figure: tempograms (tempo in BPM over time in seconds); panels (a)-(b) audio with an extracted tempo of 159 BPM, panels (c)-(d) EEG with an extracted tempo of 158 BPM.]

Tempo Extraction: Results
Tempo error (%) by error tolerance δ (in BPM) for n = 1, 2, 3 considered peaks, comparing (a) single-trial, (b) Fusion I, and (c) Fusion II:

n = 1:
 δ   (a)  (b)  (c)
 0    98   97   83
 3    84   80   58
 5    78   75   50
 7    75   72   42

n = 2:
 δ   (a)  (b)  (c)
 0    96   97   83
 3    79   67   42
 5    71   57   33
 7    65   52   25

n = 3:
 δ   (a)  (b)  (c)
 0    96   97   83
 3    73   60   42
 5    62   47   25
 7    54   40   25

[Accompanying plots show the absolute BPM error per stimulus ID and per participant ID.]
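At its core, tempo extraction means picking periodicities in a plausible tempo range. The following is a minimal stand-in, not the ISMIR'16 method (which works on aggregated trials and considers the n strongest peaks, as in the table above): it takes the magnitude spectrum of a single 1-D time course, e.g. the spatially filtered, trial-averaged EEG, and returns the strongest frequency in the tempo range as BPM.

```python
import numpy as np

def estimate_tempo(signal, fs, bpm_range=(60, 220)):
    """Estimate tempo (BPM) of a 1-D signal from its magnitude spectrum.

    signal: 1-D time course (e.g. filtered, trial-averaged EEG);
    fs: sampling rate in Hz; bpm_range: plausible tempo range.
    """
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)  # Hz
    bpm = freqs * 60.0
    mask = (bpm >= bpm_range[0]) & (bpm <= bpm_range[1])
    return bpm[mask][np.argmax(spectrum[mask])]       # strongest peak -> BPM
```

Restricting the search to a tempo range (here the assumed 60-220 BPM) is what keeps slow drifts and high-frequency noise from dominating the estimate; the frequency resolution, and hence the BPM resolution, is limited by the signal length.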

Meter Classification
3/4 vs. 4/4

Meter Classification (9-fold cross-subject validation)

Classifier     Features                       Accuracy
SVC            raw EEG                        62.04%
SVC            raw EEG channel mean           58.52%
End-to-end NN  raw EEG                        60.56%
Dummy          output of 12-class classifier  59.63%
SVC            12-class encoder output        69.44%
Neural Net     12-class encoder output        67.77%
SVC            meter-class encoder output     60.19%
Neural Net     meter-class encoder output     58.88%

Meter Classification (9-fold cross-subject validation using the spatial filter from stimulus recognition)
[Figure: spatial filter, SVC confusion matrix, NN confusion matrix, and NN temporal patterns for 3/4 vs. 4/4 over time (samples).]

Group Classification
lyrics / non-lyrics / instrumental

Group Classification (9-fold cross-subject validation)

Classifier     Features                       Accuracy
SVC            raw EEG                        40.37%
SVC            raw EEG channel mean           38.70%
End-to-end NN  raw EEG                        37.40%
Dummy          output of 12-class classifier  38.89%
SVC            12-class encoder output        48.88%
Neural Net     12-class encoder output        48.88%
SVC            group-class encoder output     35.37%
Neural Net     group-class encoder output     34.63%

Group Classification (9-fold cross-subject validation using the spatial filter from stimulus recognition)
[Figure: spatial filter, SVC confusion matrix, NN confusion matrix, and NN temporal patterns for the classes 0x, 1x, and 2x over time (samples).]

Conclusions

MIIR Questions (revisited)
- audio reconstruction: failed (non-sparsity)
- stimulus identification
- beat and tempo tracking
- meter classification
- lyrics / non-lyrics / instrumental classification

Proposed MIIR Approach
- for different music features, attempt classification / regression (derived from typical MIR tasks)
- use the similarity-constraint encoder for contrasting, i.e. learn features (from data) that differ most between the classes
- hypothesis-driven encoder design (assumptions about brain activity / features)
- limits: the number of trials; subject / stimulus bias

New Questions
1. How can the spatial filter be interpreted? Recall that it produces distinguishable waveforms; forward modeling (regression).
2. Which cognitive process results in the prominent signal peak at the 3rd downbeat?
=> learn more about music cognition

Thank You!
Avital Sternin, Jessica A. Grahn, Adrian M. Owen, Thomas Prätzlich, Meinard Müller
Contact: sstober@uni-potsdam.de, www.uni-potsdam.de/mlcog/
Code: https://github.com/sstober/deepthought (update coming!)
Dataset: https://github.com/sstober/openmiir