Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies
Janssen, B.D.; Burgoyne, J.A.; Honing, H.J.


UvA-DARE (Digital Academic Repository)

Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies
Janssen, B.D.; Burgoyne, J.A.; Honing, H.J.

Published in: Frontiers in Psychology
DOI: 10.3389/fpsyg.2017.00621

Citation for published version (APA): Janssen, B., Burgoyne, J. A., & Honing, H. (2017). Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies. Frontiers in Psychology, 8, [621]. DOI: 10.3389/fpsyg.2017.00621

Supplementary Material: Memorability of folk songs: evidence from corpus analysis

Berit Janssen*, John Ashley Burgoyne and Henkjan Honing
*Correspondence: Berit Janssen: berit.janssen@meertens.knaw.nl

1 DETAILS ON DETECTING PHRASE OCCURRENCES

We use a combination of three pattern matching methods, which have been shown to agree best with human judgements of phrase occurrences (Janssen et al., under revision): city-block distance (Steinbeck, 1982), local alignment (Smith and Waterman, 1981) and structure induction (Meredith, 2006).

1.1 Music representations

For city-block distance and local alignment, melodies are represented as pitch sequences. Pitches (the heights of the melody notes in the human hearing range) are represented by integers derived from their MIDI note numbers. The notes in the pitch sequences are weighted by their duration, i.e., a given pitch is repeated depending on the length of the note: we represent a crotchet (quarter note) by 16 pitch values, a quaver (eighth note) by 8 pitch values, and so on. Note onsets of small duration units, especially triplets, may fall between these sampling points, which shifts their onsets slightly in the representation. Structure induction uses (onset, pitch) pairs to represent the notes of a melody.

In order to deal with transposition differences between folk songs, Van Kranenburg et al. (2013) transpose melodies to the same key using pitch histogram intersection. We take a similar approach. For each melody, a pitch histogram is computed with MIDI note numbers as bins, the count of each note number weighted by its total duration in the melody. The pitch histogram intersection of two histograms h_s and h_t, with shift \sigma, is defined as

    PHI(h_s, h_t, \sigma) = \sum_{k=1}^{r} \min(h_{s,k+\sigma}, h_{t,k}),    (S1)

where k denotes the index of the bin and r the total number of bins. We define a non-existing bin to have value zero. For each tune family, we randomly pick one melody; for every other melody in the tune family, we compute the \sigma that yields the maximum value of the histogram intersection and transpose that melody by \sigma semitones. This process results in pitch-adjusted sequences.

To deal with different notations for the durations of notes, we perform a similar correction for the durations of notes. Analogous to Equation S1, we define a duration histogram intersection of two duration histograms h_t and h_s, of which the \sigma maximizing DHI is chosen as the designated shift:

    DHI(h_t, h_s, \sigma) = \sum_{k=1}^{r} \min(h_{t,k+\sigma}, h_{s,k})    (S2)

This \sigma is then used to calculate the multiplier of the onsets of melody t with relation to melody s, before transforming the pitch and duration values of melody t into a duration-weighted pitch sequence:

    Mult(t, s) = 2^{\sigma}    (S3)
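To make the representation concrete, here is a minimal Python sketch (ours, not the authors' code) of the duration-weighted pitch sequence and of the transposition step based on Equation S1. Notes are assumed to be (MIDI pitch, duration in quarter notes) pairs; names such as best_shift are our own.

```python
from collections import Counter

SAMPLES_PER_QUARTER = 16  # a crotchet is represented by 16 pitch values


def weighted_pitch_sequence(notes):
    """Expand (midi_pitch, duration_in_quarters) pairs into the
    duration-weighted pitch sequence of Section 1.1."""
    sequence = []
    for pitch, duration in notes:
        # rounding is what shifts small units such as triplets slightly
        sequence.extend([pitch] * round(duration * SAMPLES_PER_QUARTER))
    return sequence


def pitch_histogram(notes):
    """Histogram over MIDI note numbers, each weighted by total duration."""
    histogram = Counter()
    for pitch, duration in notes:
        histogram[pitch] += duration
    return histogram


def phi(h_s, h_t, sigma):
    """Pitch histogram intersection (Equation S1); missing bins count as zero."""
    return sum(min(h_s.get(k + sigma, 0), h_t.get(k, 0)) for k in h_t)


def best_shift(h_s, h_t, max_shift=24):
    """The semitone shift sigma that maximizes PHI."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda sigma: phi(h_s, h_t, sigma))
```

Transposing each melody of a tune family by best_shift relative to the randomly chosen reference melody yields the pitch-adjusted sequences used by local alignment below.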

1.2 City-block distance

For city-block distance, the query sequence q, with pitch values q_i, is compared with every sequence p of the same length, with pitch values p_i, from the melody being searched for matches. If many pitch values are identical, the city-block distance is small:

    dist(q, p) = \sum_{i=1}^{n} |q_i - p_i|    (S4)

From each melody, we choose the pitch sequences p which have the lowest city-block distance to the query sequence and determine their positions in the melody.

1.3 Local alignment

To compute the optimal local alignment, a matrix A is filled recursively according to Equation S5. The matrix is initialized as A(i, 0) = 0 for i \in {0, ..., n} and A(0, j) = 0 for j \in {0, ..., m}. W_{insertion} and W_{deletion} define the weights for inserting an element from melody s into segment q, and for deleting an element from segment q, respectively; subs(q_i, s_j) is the substitution function, which gives a weight depending on the similarity of the notes q_i and s_j.

    A(i, j) = \max \begin{cases} A(i-1, j-1) + subs(q_i, s_j) \\ A(i, j-1) + W_{insertion} \\ A(i-1, j) + W_{deletion} \\ 0 \end{cases}    (S5)

We apply local alignment to pitch-adjusted sequences. In this representation, local alignment is not affected by transposition differences, and it should be robust with respect to time dilation. For the insertion and deletion weights, we use W_{insertion} = W_{deletion} = -0.5, and we define the substitution score as

    subs(q_i, s_j) = \begin{cases} 1 & \text{if } q_i = s_j \\ -1 & \text{otherwise} \end{cases}    (S6)

We normalize the maximal alignment score by the number of notes n in the query segment to obtain the similarity of the found match with the query segment. The position of the pitch sequence associated with the maximal alignment score is determined through backtracing.

    sim(q, s) = \frac{1}{n} \max_{i,j} A(i, j)    (S7)
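The recursion of Equations S5-S7 translates directly into a dynamic-programming routine. The sketch below is a plain Smith-Waterman-style implementation with the weights given above; we normalize by the length of the query sequence in this representation, which is our reading of n in Equation S7.

```python
def local_alignment_similarity(q, s, w_insertion=-0.5, w_deletion=-0.5):
    """Local alignment similarity (Equations S5-S7).

    q, s: duration-weighted, pitch-adjusted pitch sequences.
    Returns the maximal alignment score normalized by len(q).
    """
    n, m = len(q), len(s)
    # A(i, 0) = A(0, j) = 0: the first row and column stay zero
    A = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            subs = 1.0 if q[i - 1] == s[j - 1] else -1.0  # Equation S6
            A[i][j] = max(A[i - 1][j - 1] + subs,  # match / substitution
                          A[i][j - 1] + w_insertion,
                          A[i - 1][j] + w_deletion,
                          0.0)  # a local alignment may start anywhere
            best = max(best, A[i][j])
    return best / n  # Equation S7
```

Backtracing from the cell holding the maximal score recovers the position of the match in the melody; we omit it here for brevity.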

1.4 Structure induction

Structure induction measures the difference between melodic segments through so-called translation vectors. The translation vector T between points in two melodic segments can be seen as the difference between the points q_i and s_j in (onset, pitch) space:

    T = \begin{pmatrix} q_{i,onset} \\ q_{i,pitch} \end{pmatrix} - \begin{pmatrix} s_{j,onset} \\ s_{j,pitch} \end{pmatrix}    (S8)

The maximally translatable pattern (MTP) of a translation vector T for two melodies q and s is then defined as the set of melody points q_i which can be transformed to melody points s_j with the translation vector T:

    MTP(q, s, T) = \{ q_i \mid q_i \in q \wedge q_i + T \in s \}    (S9)

We use the pattern matching method SIAM, defining the similarity of two melodies as the largest set match achievable through translation with any vector, normalized by the length n of the query melody:

    sim(q, s) = \frac{1}{n} \max_T |MTP(q, s, T)|    (S10)

The maximally translatable patterns leading to the highest similarity are selected as matches, and their positions are determined by checking the onsets of the first and last notes of the MTPs.

1.5 Combination of the measures

The similarity thresholds which result in the best agreement between human annotations of phrase occurrences and algorithmically determined matches were found through optimization on the training corpus. For a given query phrase, all similarity measures were used to determine whether or not a match was found in a given melody. For city-block distance, matches with dist(q, p) ≤ 0.9792 are retained; for local alignment, matches with sim(q, s) ≥ 0.5508; and for structure induction, matches with sim(q, s) ≥ 0.5833. Only if at least two of the three similarity measures retain matches is the melody in question considered to contain an occurrence.
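As an illustration of SIAM, the sketch below enumerates every translation vector between query and melody points and keeps the most frequent one; Meredith's (2006) point-set algorithms achieve the same result far more efficiently. The two-of-three vote at the end uses the thresholds from Section 1.5; all function names are our own.

```python
from collections import Counter


def siam_similarity(query, melody):
    """SIAM similarity (Equations S9-S10).

    query, melody: lists of (onset, pitch) pairs. For each translation
    vector T, the number of query points q_i with q_i + T in the melody
    is the size of the maximally translatable pattern MTP(q, s, T).
    """
    melody_points = set(melody)
    vectors = Counter((m_onset - q_onset, m_pitch - q_pitch)
                      for q_onset, q_pitch in query
                      for m_onset, m_pitch in melody_points)
    return max(vectors.values(), default=0) / len(query)


def contains_occurrence(dist_cityblock, sim_alignment, sim_siam):
    """Two-of-three combination of the measures (Section 1.5)."""
    votes = [dist_cityblock <= 0.9792,
             sim_alignment >= 0.5508,
             sim_siam >= 0.5833]
    return sum(votes) >= 2
```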

2 DETAILS ON THE FORMALIZATION OF HYPOTHESES

2.1 Pitch reversal

Pitch reversal is the linear combination of two other principles, registral direction and registral return. The principle of registral direction states that after large implicative intervals, a change of direction is more expected than a continuation in the same direction. The tritone, or six semitones, is not defined in this principle:

    PitchRev_{dir}(s_j) = \begin{cases} 0 & \text{if } |pint(s_{j-1})| < 6 \\ 1 & \text{if } 6 < |pint(s_{j-1})| < 12 \text{ and } pint(s_j) \cdot pint(s_{j-1}) < 0 \\ -1 & \text{if } 6 < |pint(s_{j-1})| < 12 \text{ and } pint(s_j) \cdot pint(s_{j-1}) > 0 \\ \text{undefined} & \text{otherwise} \end{cases}    (S11)

The other component of pitch reversal, registral return, states that if the realized interval has a different direction than the implicative interval, the sizes of the two intervals are expected to be similar, i.e., they should not differ by more than two semitones. If the implicative interval describes a tone repetition, or if the difference between two consecutive pitch intervals of opposite direction is too large, registral return is zero; otherwise it has the value 1.5:

    PitchRev_{ret}(s_j) = \begin{cases} 1.5 & \text{if } |pint(s_{j-1})| > 0 \text{ and } |pint(s_{j-1}) + pint(s_j)| \le 2 \\ 0 & \text{otherwise} \end{cases}    (S12)

Combined, registral direction and registral return form the pitch-reversal principle:

    PitchRev(s_j) = PitchRev_{dir}(s_j) + PitchRev_{ret}(s_j)    (S13)

Figure S1, drawn after a figure by Schellenberg (1997), shows a schematic overview of the values pitch reversal can take under different conditions.
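Assuming pint(s_{j-1}) and pint(s_j) are signed intervals in semitones, the principle can be sketched in Python as follows. The signs of the cases follow our reconstruction of Equations S11-S12 and Figure S1, so the boundary conditions should be checked against Schellenberg (1997) before reuse.

```python
def pitch_reversal(pint_implicative, pint_realized):
    """Pitch reversal (Equations S11-S13).

    Both arguments are signed pitch intervals in semitones. Returns
    None where the principle is undefined (tritone, or an octave and
    larger implicative intervals).
    """
    size = abs(pint_implicative)
    # registral direction (Equation S11)
    if size < 6:
        direction = 0  # small implicative interval: no expectation
    elif 6 < size < 12:
        # after a large interval, a direction change (+1) is expected,
        # a continuation (-1) is not
        direction = 1 if pint_implicative * pint_realized < 0 else -1
    else:
        return None  # tritone, octave and beyond: undefined
    # registral return (Equation S12): the realized interval ends within
    # two semitones of the origin of the implicative interval
    registral_return = (1.5 if size > 0
                        and abs(pint_implicative + pint_realized) <= 2
                        else 0.0)
    return direction + registral_return  # Equation S13
```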

2.2 Motif repetivity

In Figure S2 we show the second (also fourth) and the sixth phrase of the example melody. FANTASTIC (Müllensiefen, 2009) represents the relationship between adjacent notes as follows: pitches can either stay the same (s1) or move up or down by a diatonic pitch interval (e.g., u4, d2). In this representation, it does not matter whether, e.g., a step down (d2) spans one or two semitones. Durations either stay equal (e), get quicker (q) or get longer (l). We refer to the string representations of the motifs, in accordance with Müllensiefen and Halpern (2014), as M-Types.

The second/fourth phrase of the example melody consists of six two-note M-Types, of which one repeats three times, so there are four unique M-Types. This leads to the following entropy of two-note M-Types:

    H(2) = \frac{-\left(\frac{3}{6}\log_2\frac{3}{6} + 3 \cdot \frac{1}{6}\log_2\frac{1}{6}\right)}{-\log_2\frac{1}{6}} = \frac{0.5 + 3 \cdot 0.43}{2.58} = 0.69    (S14)

These two-note M-Types can be combined into the following three-note M-Types: s1q u3e, u3e u3e, u3e d2l, d2l d3q. One M-Type, u3e u3e, is repeated; in total, there are five three-note M-Types in the phrase, of which four are unique. The entropy for length l = 3 for this phrase is therefore

    H(3) = \frac{-\left(\frac{3}{5}\log_2\frac{3}{5} + 3 \cdot \frac{1}{5}\log_2\frac{1}{5}\right)}{-\log_2\frac{1}{5}} = \frac{0.44 + 3 \cdot 0.46}{2.32} = 0.79    (S15)

There are no repeated four-note M-Types, so the entropy of M-Types of length l = 4 for this phrase is maximal, H(4) = 1.0. This means that all longer M-Types also have maximal entropy, and the motif repetivity, the negative average of the entropies over all M-Type lengths, is MR = -(0.69 + 0.79 + 3 \cdot 1.0)/5 = -0.90.

The sixth phrase of the example melody (the second phrase shown in the figure) consists of seven two-note M-Types, of which only one (u4l) appears twice. This leads to the following entropy of two-note M-Types:

    H(2) = \frac{-\left(\frac{2}{7}\log_2\frac{2}{7} + 5 \cdot \frac{1}{7}\log_2\frac{1}{7}\right)}{-\log_2\frac{1}{7}} = \frac{0.52 + 5 \cdot 0.40}{2.81} = 0.89    (S16)

For the longer M-Types, there are no repetitions, hence the entropy is maximal: H(3) = H(4) = H(5) = H(6) = 1.0. This leads to a motif repetivity of MR = -(0.89 + 4 \cdot 1.0)/5 = -0.98.
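The worked examples can be checked mechanically. The sketch below computes the normalized M-Type entropy and the motif repetivity from a phrase's two-note M-Types; it reproduces H(2) = 0.69 for phrase 2/4, although our n-gram counting for the longer M-Types is only our best guess at how FANTASTIC builds them, and all names are our own.

```python
import math
from collections import Counter


def normalized_entropy(mtypes):
    """Entropy of an M-Type distribution, normalized so that a phrase
    without any repeated M-Type has entropy 1.0 (cf. Equations S14-S16)."""
    n = len(mtypes)
    if n <= 1:
        return 1.0
    counts = Counter(mtypes)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / -math.log2(1 / n)  # maximum: all M-Types unique


def motif_repetivity(two_note_mtypes, max_length=6):
    """Negative average of the entropies for M-Type lengths 2..max_length."""
    entropies = []
    for length in range(2, max_length + 1):
        k = length - 1  # an l-note M-Type spans l-1 two-note M-Types
        grams = [" ".join(two_note_mtypes[i:i + k])
                 for i in range(len(two_note_mtypes) - k + 1)]
        entropies.append(normalized_entropy(grams))
    return -sum(entropies) / len(entropies)


# Phrase 2/4 of the example melody (Figure S2)
phrase_2_4 = ["s1e", "u3e", "u3e", "u3e", "d2l", "d3q"]
print(round(normalized_entropy(phrase_2_4), 2))  # 0.69, as in Equation S14
```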

REFERENCES

Janssen, B., van Kranenburg, P., and Volk, A. (under revision). Finding Occurrences of Melodic Segments in Folk Songs: a Comparison of Symbolic Similarity Measures. Journal of New Music Research

Van Kranenburg, P., Volk, A., and Wiering, F. (2013). A Comparison between Global and Local Features for Computational Classification of Folk Song Melodies. Journal of New Music Research 42, 1–18. doi:10.1080/09298215.2012.718790

Meredith, D. (2006). Point-set algorithms for pattern discovery and pattern matching in music. In Content-Based Retrieval, Dagstuhl Seminar Proceedings 06171, eds. T. Crawford and R. C. Veltkamp (Dagstuhl, Germany)

Müllensiefen, D. (2009). FANTASTIC: Feature ANalysis Technology Accessing STatistics (In a Corpus): Technical Report. Tech. rep., Goldsmiths, University of London, UK

Müllensiefen, D. and Halpern, A. R. (2014). The Role of Features and Context in Recognition of Novel Melodies. Music Perception 31, 418–435

Schellenberg, E. G. (1997). Simplifying the Implication-Realization Model of Melodic Expectancy. Music Perception: An Interdisciplinary Journal 14, 295–318

Smith, T. and Waterman, M. (1981). Identification of common molecular subsequences. Journal of Molecular Biology 147, 195–197. doi:10.1016/0022-2836(81)90087-5

Steinbeck, W. (1982). Struktur und Ähnlichkeit. Methoden automatisierter Melodienanalyse (Kassel: Bärenreiter)

3 SUPPLEMENTARY FIGURES

[Figure S1 not reproduced here; its panel maps combinations of implicative and realized intervals to the pitch-reversal values -1, 0, 1, 1.5 and 2.5.]
Figure S1. A visualization of the pitch reversal principle as defined by Schellenberg (1997), drawn after his figure. The vertical axis represents the size of the implicative interval, from 0 to 11 semitones; the horizontal axis represents the size of the realized interval, from 0 to 12 semitones, which can have either the same direction (right side of the panel) or a different direction (left side of the panel).

[Figure S2 not reproduced here; it shows the score of phrases 2/4 and 6 of melody NLB074521_01, annotated with the M-Type symbols s1e u3e u3e u3e d2l d3q (phrase 2/4) and s1e u4l d4e u4e u3e u4l d2q (phrase 6).]
Figure S2. The second (also fourth) and sixth phrase of the example melody, with symbols representing the pitch and duration relationships between adjacent notes. Notes can either stay at the same pitch (s1), or move up or down by a diatonic pitch interval (e.g., u4, d2). Durations can either be equal (e), quicker (q), or longer (l).