Audio Feature Extraction for Corpus Analysis

Anja Volk, Sound and Music Technology, 5 Dec 2017

Corpus analysis
What is corpus analysis?
- Studying a large corpus of music with computational methods to gain insights into general trends
Why?
- MIR: provide access to large corpora of music
- Musicology: research music from a data-rich perspective
- Test musicological hypotheses
Today: corpus analysis of audio features for
- choruses
- hooks

Recap
- Automatic segmentation of music: applications?
  - E.g. games, indexing for search in large collections, most salient part
- Automatic segmentation of music: what cues do humans use?
  - Gaps/changes in musical features
  - Repetition
  - Closure
- Computational approaches to segmentation
  - Local gaps: local boundary detection (LBDM)
  - Expectation: information-theoretic approaches
  - Rule-based vs. data-driven models

Today: Corpus analysis
- In language studies: corpus linguistics
- In musicology: statistical musicology, data-driven musicology, empirical musicology
- Examples:
  - Syncopation patterns in ragtime (see the lecture on Rhythm and Meter in the SMT course)
  - Huron: the melodic arch
  - Rodriguez-Zivic: perception and musical style

Corpus analysis: David Huron (1995), The melodic arch in Western folksongs
- Corpus: 6251 folk songs from the Essen Folksong Collection
- Features: melodic pitch height, contour
- Hypothesis (from music theorists): melodic passages tend to exhibit an arch shape, where the overall pitch contour rises and then falls over the course of a phrase or an entire melody
- Findings: the tendency towards arch-shaped melodic contours is confirmed

Corpus analysis: Rodriguez-Zivic et al. (2011), Perceptual basis of evolving Western musical styles
- Corpus: Peachnote corpus of classical music, http://www.peachnote.com/info.html
- Features: melodic pitch intervals, paired into bigrams and clustered into 5 factors
- Findings:
  - Baroque-period music follows the diatonic scale closely ("white keys on the piano")
  - Classical-period works rely heavily on unison (repetition)
  - Romantic and post-Romantic music expand this vocabulary of intervals

Corpus analysis: Rodriguez-Zivic et al., method
- A dictionary based on pairs of melodic intervals is used
- Each 5-year period between 1730 and 1930 is represented as a single, compact distribution
- k = 5 factors are then identified using k-means clustering
- Four factors coincide with the historic periods of Baroque, Classical, Romantic and post-Romantic music:
  - Baroque: use of the diatonic scale
  - Classical: repeated notes
  - Romantic: wide harmonic intervals
  - Post-Romantic: chromatic tonality
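The "pairs of melodic intervals" representation can be illustrated with a small sketch. This is not the authors' code: the function names and the toy melody are hypothetical, and pitches are assumed to be given as MIDI note numbers.

```python
from collections import Counter

def interval_bigrams(pitches):
    """Count pairs of successive melodic intervals (in semitones)."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return Counter(zip(intervals, intervals[1:]))

def normalize(counts):
    """Turn raw bigram counts into a relative-frequency distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}

# Hypothetical toy melody as MIDI pitches (C D E D C C D E)
melody = [60, 62, 64, 62, 60, 60, 62, 64]
bigram_counts = interval_bigrams(melody)
bigram_dist = normalize(bigram_counts)
```

Pooling such distributions over all pieces from one 5-year period would give one distribution per period, which is the kind of input a clustering step could then work on.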

Corpus analysis: many more studies using symbolic data
- Chords:
  - De Clercq and Temperley (2011): 99 rock songs, 20 for every decade, 1950-2000; analysis of chord root transitions and co-occurrence over time. Result: strong (but decreasing) prominence of the IV chord and the IV-I progression
  - Burgoyne (2013): analysis of 1379 songs from the Billboard dataset. Result: a trend towards minor tonalities, a decrease in the use of dominant chords, and a positive effect of non-core roots (roots other than I, V, and IV) on popularity
- Rhythmic motives: Mauch et al. (2012), Volk & De Haas (2013)
Today's typology of corpus studies:
- Hypothesis-driven vs. discovery-driven
- Symbolic data vs. audio data

Audio features for corpus analysis
Main selection criteria for audio features:
- Features must have a clear natural-language interpretation, so that results in the feature domain can be translated back into natural language
- Features can only be used if they can be reliably computed
Two example feature sets:
- Psycho-acoustic features
- Corpus-relative features
Source: Jan van Balen, Audio Description and Corpus Analysis of Popular Music, PhD thesis, Utrecht University, 2016

Psycho-acoustic features
Signal measurements that correspond to human ratings of an attribute of sound, tested in a laboratory environment:
- Loudness
- Sharpness
- Roughness

Psycho-acoustic features: loudness
Loudness depends on:
- Intensity (in dB)
- Frequency content

Psycho-acoustic features: loudness
- Units: sone and phon
- 1 sone = the loudness of a 1000 Hz tone at 40 dB (= 40 phons)
- The sone is the basis of the ISO standard loudness scale
- The sone scale is linear; the phon scale is logarithmic
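The relation between the two units can be made concrete with the standard approximation that loudness in sones doubles for every 10-phon increase above 40 phons. The sketch below assumes that approximation, which is only meant for levels of 40 phons and above; the function names are illustrative.

```python
import math

def phon_to_sone(phon):
    """Convert loudness level (phon) to loudness (sone), for phon >= 40."""
    if phon < 40:
        raise ValueError("approximation is only intended for 40 phons and above")
    return 2 ** ((phon - 40) / 10)

def sone_to_phon(sone):
    """Inverse conversion, for loudness of 1 sone and above."""
    if sone < 1:
        raise ValueError("approximation is only intended for 1 sone and above")
    return 40 + 10 * math.log2(sone)
```

Under this approximation, a sound at 50 phons is rated twice as loud as one at 40 phons, and a sound at 60 phons four times as loud.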

Psycho-acoustic features: sharpness
- Relates to high-frequency content
- Computed as a weighted sum of the specific loudness levels in various bands

Psycho-acoustic features: roughness
- Quantifies the subjective perception of rapid amplitude modulation of a sound

Psycho-acoustic features: roughness, background
- Critical bandwidth: filtering of frequencies within the cochlea
  - Only if two frequency components are different enough do we perceive two different tones
  - If two frequency components are within the same critical bandwidth, we perceive them as one tone
- The perceptual roughness of a complex sound (comprising many partials or pure-tone components) depends on the distance between the partials, measured in critical bandwidths
- A simultaneous pair of partials of about the same amplitude that are less than a critical bandwidth apart produces roughness, associated with the inability of the basilar membrane to separate them clearly
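To give a feel for the size of a critical bandwidth, the sketch below uses the Zwicker and Terhardt (1980) analytic approximation. This formula is not taken from the slides, and the check is only a rough illustration of when two partials share a critical band; it is not a roughness model.

```python
def critical_bandwidth_hz(center_hz):
    """Approximate critical bandwidth (Hz) around a centre frequency (Hz)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (center_hz / 1000.0) ** 2) ** 0.69

def within_critical_band(f1_hz, f2_hz):
    """True if two partials are closer than one critical bandwidth.

    The bandwidth is evaluated at the mean of the two frequencies,
    which is a simplifying assumption of this sketch.
    """
    center = (f1_hz + f2_hz) / 2.0
    return abs(f1_hz - f2_hz) < critical_bandwidth_hz(center)
```

With this approximation the critical bandwidth around 1000 Hz is roughly 160 Hz, so partials at 1000 Hz and 1050 Hz fall within one band, while partials at 1000 Hz and 1500 Hz do not.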

Summary: psycho-acoustic features
- Loudness, sharpness, roughness
- Empirically established attributes of sound
- Attributes also used in natural-language descriptions of sound

The "loudness war": Loudness and Dynamics

Loudness and Dynamics: Deruty & Tardieu (2014), Dynamic processing in mainstream music
- Corpus: 4500 tracks released between 1967 and 2011 (100 per year)
- Features: RMS, EBU loudness, EBU loudness range, peak-to-RMS factors

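Dynamic processing in mainstream music: RMS
- RMS (root mean square): the square root of the arithmetic mean of the squared sample values
- Used as an average loudness value during a certain time frame
- For the $N$ samples $x_n$ in a frame: $x_{\mathrm{RMS}} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} x_n^2}$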

Dynamic processing in mainstream music: EBU loudness range
- EBU: European Broadcasting Union
- Loudness range: the difference between the 10th and 95th percentiles of the distribution of 3-second loudness averages, computed with 1 second overlap
- Measures the variation of loudness on a macroscopic time scale
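The percentile definition above can be sketched as follows. This is a simplified illustration, not the EBU algorithm: it assumes the 3-second loudness averages have already been computed and omits the gating steps that the EBU specification applies before taking percentiles. The input values are hypothetical.

```python
def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def loudness_range(short_term_loudness):
    """Difference between the 95th and 10th percentiles of loudness values."""
    return percentile(short_term_loudness, 95) - percentile(short_term_loudness, 10)

# Hypothetical 3-second loudness averages, from -30 to -10 in steps of 1
levels = [-30 + i for i in range(21)]
lra = loudness_range(levels)
```

Using the 10th and 95th percentiles rather than the minimum and maximum keeps short silences and isolated loud peaks from dominating the measure.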

Dynamic processing in mainstream music: peak-to-RMS factors
- Measure micro-dynamics
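As a minimal sketch of the two sample-level measures, the code below computes the RMS of a frame and its peak-to-RMS factor (crest factor) in dB. The test signals are hypothetical, and the study's actual framing and windowing choices are not reproduced here.

```python
import math

def rms(samples):
    """Root mean square: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def peak_to_rms_db(samples):
    """Peak-to-RMS factor (crest factor) in dB; undefined for silence."""
    level = rms(samples)
    if level == 0:
        raise ValueError("peak-to-RMS factor is undefined for a silent frame")
    peak = max(abs(x) for x in samples)
    return 20 * math.log10(peak / level)

# Hypothetical test frames: one period of a sine wave and of a square wave
sine = [math.sin(2 * math.pi * k / 100) for k in range(100)]
square = [1.0 if k < 50 else -1.0 for k in range(100)]
```

A sine wave has a crest factor of about 3 dB, while a square wave, whose level never drops below its peak, has 0 dB. Heavy compression and limiting push music towards the square-wave end, which is why a falling peak-to-RMS factor indicates reduced micro-dynamics.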

Loudness and Dynamics: Deruty & Tardieu (2014), Dynamic processing in mainstream music
- Corpus: 4500 tracks released between 1967 and 2011 (100 per year)
- Features: RMS, EBU loudness, EBU loudness range
- Findings:
  - Loudness and RMS increase, with a peak around 2007
  - Micro-dynamics have decreased as loudness went up
  - Macro-dynamics (loudness range) have not decreased

Application of psycho-acoustic features to chorus analysis

Chorus analysis: Van Balen, Burgoyne, Wiering, Veltkamp (2013), An analysis of chorus features in popular song
- Corpus: Billboard dataset, ±7000 song sections, 1958-1992
- Features: loudness, loudness range, sharpness, roughness, plus a few others relating to pitch height and timbre variance
- Question: what makes a chorus distinct from other sections in a song?

Why chorus analysis?
- Choruses are more prominent, more catchy and more memorable than other sections in a song
- MIR: chorus detection is primarily based on identifying the most-repeated section in a song
- Chorus detection is tied to audio thumbnailing, music summarization and structural segmentation
- Question: can we use computational methods to improve our understanding of choruses?

Chorus analysis: analysis method
- Learning a probabilistic graphical model, based on 11 perceptual features and chorusness variables

Corpus analysis: Where to look for the hook
A study of catchiness in popular songs:
- What parts of songs are easily remembered? What is the hook?
- How important is repetition? Striking moment vs. recurring riff
- What role does expectation play? Surprise vs. cliché

Where to look for the hook: Hooked!
- A game with a purpose to study catchiness
- Players get 15 s to recognize a song
- If they recognize it, the song mutes for 4 seconds
- When it comes back, does it come back in the right place?

Hook analysis: Van Balen, Burgoyne, Bountouridis, Müllensiefen, Veltkamp (2015), Corpus Analysis Tools for Computational Hook Discovery
- Corpus: Hooked! data, 1750 song segments from 321 songs and 973 players
- Features: chorus features, plus melody and harmony features, plus corpus-relative features based on the above

Where to look for the hook: corpus-relative features
- Second-order features
- Symbolic (e.g. FANTASTIC toolbox): discrete numbers (countable)
- Audio: continuous, uninterrupted signals; features measured over short windows represent continuous, uncountable quantities

Where to look for the hook: corpus-relative features
- Features in their raw form are not always informative
- Therefore: convert a feature to a scale of common vs. uncommon
- For 1-dimensional features (e.g. loudness), with f(x) a probability density estimate and N the size of the reference corpus: a non-parametric scaling of a feature value's frequency
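The exact formula from this slide is not preserved in the transcript. As one possible illustration of scaling a 1-dimensional feature by how common its value is, the sketch below estimates f(x) with a Gaussian kernel density estimate over a reference corpus of N values. The kernel, the bandwidth and the data are assumptions of this sketch, not the authors' stated choices.

```python
import math

def gaussian_kde(reference, bandwidth):
    """Return a density estimate f(x) built from N reference values."""
    n = len(reference)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - r) / bandwidth) ** 2) for r in reference
        )

    return density

# Hypothetical loudness values for the sections of a reference corpus
corpus_loudness = [10.0, 11.0, 11.5, 12.0, 12.5, 13.0, 14.0]
f = gaussian_kde(corpus_loudness, bandwidth=1.0)

common = f(12.0)    # a value near the bulk of the corpus: high density
uncommon = f(20.0)  # a value far from the corpus: density close to zero
```

Swapping the reference list for values taken from sections of the same song would turn the same computation into a measure of how typical a section is within its own song.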

Where to look for the hook: corpus-relative features
- Features in their raw form are not always informative
- Therefore: convert a feature to a scale of common vs. uncommon
- For n-dimensional features (e.g. pitch class distribution): information, i.e. how much information an observed distribution provides compared to a corpus average (a measure of unexpectedness)
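The formula for the n-dimensional case is likewise missing from the transcript. A standard way to quantify how much an observed distribution deviates from a corpus average is the Kullback-Leibler divergence, sketched below. It is swapped in here as an assumption and may differ from the measure the authors used; the distributions are hypothetical.

```python
import math

def information_vs_corpus(observed, corpus_average, eps=1e-12):
    """KL divergence D(observed || corpus_average), in bits.

    Both arguments are probability distributions of equal length.
    eps guards against division by zero in the corpus average.
    """
    if len(observed) != len(corpus_average):
        raise ValueError("distributions must have the same length")
    return sum(
        p * math.log2(p / max(q, eps))
        for p, q in zip(observed, corpus_average)
        if p > 0
    )

# Hypothetical 12-bin pitch class distributions
uniform = [1 / 12] * 12
peaked = [0.5, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0]

expected_case = information_vs_corpus(uniform, uniform)   # identical: 0 bits
surprising_case = information_vs_corpus(peaked, uniform)  # log2(6) bits
```

A section whose pitch class distribution matches the corpus average scores zero, and the score grows as the distribution becomes more unexpected relative to that average.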

Where to look for the hook: corpus-relative features
- Features in their raw form are not always informative
- Therefore: convert a feature to a scale of common vs. uncommon
- The reference corpus can be varied:
  - Large corpus as reference → the feature measures conventionality
  - Sections from the same song as reference → the feature measures recurrence

Hook analysis: findings
- 8 components correlate significantly

Hook analysis: Van Balen, Burgoyne, Müllensiefen, Veltkamp (in review), Corpus Analysis Tools for Hook Discovery
- Corpus: Hooked! data, 1750 song segments from 321 songs and 973 players
- Features: chorus features, plus melody and harmony features, plus corpus-relative features based on the above
- Findings:
  - Features correlated with vocals predict hooks best
  - Conventionality dominates the remainder of the results
  - Recurrence also contributes

Conclusions
- The quality of corpus studies also depends on the choice of data and analysis method
- In general, good features have a clear natural-language interpretation, so that results in the feature domain can be translated back into natural language, and can be reliably computed
- Two types of feature that address these criteria: psycho-acoustic features and corpus-relative features

Summary
- Use of audio features for characterizing corpora
- Features for characterizing evolution
- Very important for classification of styles
- Games and catchy music

References
- David Huron (1995). The melodic arch in Western folksongs. Computing in Musicology, Vol. 10, pp. 3-23.
- John Ashley Burgoyne, Jonathan Wild, and Ichiro Fujinaga (2013). Compositional Data Analysis of Harmonic Structures in Popular Music. Mathematics and Computation in Music, pp. 52-63.
- Trevor de Clercq and David Temperley (2011). A corpus analysis of rock harmony. Popular Music, 30(01), pp. 47-70.
- Rodriguez-Zivic, Shifres & Cecchi (2011). Perceptual basis of evolving Western musical styles. Proceedings of the National Academy of Sciences, Vol. 110, pp. 10034-10038.
- Deruty & Tardieu (2014). Dynamic processing in mainstream music. Journal of the Audio Engineering Society, Vol. 62, pp. 42-55.
- Van Balen, Burgoyne, Wiering, Veltkamp (2013). An analysis of chorus features in popular song. Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR).
- Van Balen, Burgoyne, Bountouridis, Müllensiefen, Veltkamp (2015). Corpus Analysis Tools for Computational Hook Discovery. ISMIR proceedings.