Evaluation of Melody Similarity Measures


Evaluation of Melody Similarity Measures

by

Matthew Brian Kelly

A thesis submitted to the School of Computing in conformity with the requirements for the degree of Master of Science

Queen's University
Kingston, Ontario, Canada
August 2012

Copyright © Matthew Brian Kelly, 2012

Abstract

Similarity in music is a concept with significant impact on ethnomusicology studies, music recommendation systems, and music information retrieval systems such as Shazam and SoundHound. Various computer-based melody similarity measures have been proposed, but comparison and evaluation of similarity measures is inherently difficult due to the subjective and application-dependent nature of similarity in music. In this thesis, we address the diversity of the problem by defining a set of music transformations that provide the criteria for comparing and evaluating melody similarity measures. This approach provides a flexible and extensible method for characterizing selected facets of melody similarity, because the set of music transformations can be tailored to the user and to the application. We demonstrate this approach using three music transformations (transposition, tempo rescaling, and selected forms of ornamentation) to compare and evaluate several existing similarity measures, including String Edit Distance measures, Geometric measures, and N-Gram based measures. We also evaluate a newly implemented distance measure, the Beat and Direction Distance Measure, which is designed to have greater awareness of the beat hierarchy and better responsiveness to ornamentation. Training and test data are drawn from music incipits from the RISM A/II collection, and ground truth is taken from the MIREX 2005 Symbolic Melodic Similarity task.

Our test results show that similarity measures that are responsive to music transformations generally have better agreement with human-generated ground truth.

Acknowledgments

I would like to express my sincere thanks to my supervisor, Dr. Dorothea Blostein, for her patient and knowledgeable guidance. Our discussions were always helpful, informative, and full of positive energy. This would not have been possible without your positive spirit and enthusiasm. I would also like to thank everyone at the School of Computing who helped me along the way. I am proud to be a part of such a great group of people. Finally, I'd like to thank my friends and family who supported me along the way. David, Matthew, Sean, Jun-tian, and Brenna: your support meant a lot and helped a great deal. Special thanks and love to my parents, Shane and Sally, and my brother, Adam, for their unconditional support, love, and guidance.

Table of Contents

Abstract i
Acknowledgments iii
Table of Contents iv
List of Tables vi
List of Figures ix

Chapter 1: Introduction
    1.1 Similarity Measures for Melodies
    1.2 Thesis Contributions

Chapter 2: Music Transformations
    2.1 Transposition
    2.2 Tempo Rescaling
    2.3 Ornamentation
    2.4 Composition of Transformations

Chapter 3: Background and Literature Review
    3.1 Symbolic Music Representations
    3.2 Categories of Melody Similarity Measures
    3.3 Evaluations

Chapter 4: Methodology
    4.1 Selected Similarity Measures for Study
    4.2 Test Data for Evaluations
    4.3 Evaluation Criteria

Chapter 5: Experiments and Results
    5.1 Evaluation Process
    5.2 Results
    5.3 Analysis of Results

Chapter 6: Conclusion and Future Work

Bibliography

List of Tables

1.1 Table 1 from [1]: Music Representations in Music Information Retrieval
3.1 Summary of results from previous MIREX Symbolic Melodic Similarity tasks
3.2 Aggregate of categories of techniques used in previous MIREX Symbolic Melodic Similarity tasks
4.1 Possible categories for direction in the Beat and Direction Distance Measure
4.2 Table 4.3 from [2] summarizing the possible structures of passing notes
5.1 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.2 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.3 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value

5.4 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.5 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.6 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.7 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.8 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.9 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.10 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value
5.11 Results table for query. Cells for each group and distance measure pair show the number of correct melodies and the recall value

5.12 Compilation of all results. The values in each cell represent the average dynamic recall value for the distance measure on the specified query melody
5.13 Sorted compilation of average ADR values
5.14 Ranks of compilation of all results from Table 5.12

List of Figures

2.1 Illustration of transposition. The differences between melody (a) and (b) can be entirely modeled as a transposition transformation. The parameters for the transposition state that the entire melody is transposed up a fifth
2.2 An example of tempo rescaling. The difference between melody (a) and (b) can be modeled completely as a tempo rescale transformation. The parameters for the tempo rescaling state that the entire melody is rescaled by a factor of 2, such that the duration of each note is doubled
2.3 An example of pitch-echo ornamentation. The differences between melody (a) and (b) can be modeled as a pitch-echo ornamentation. The parameters of the ornamentation state that the first three notes of (a) are affected, with the ornamentation splitting the original note into two notes that are half as long, raising the pitch of the second note one fifth higher
2.4 An example of beat-hierarchy ornamentation. The difference between melody (a) and (b) can be completely modeled as a beat-hierarchy ornamentation. Each of the four notes that lie on the beat remain the same, but the passing notes change. The parameters for this transformation define each passing note explicitly

2.5 An example of a composition of transformations. The difference between melody (a) and (c) cannot be completely modeled by any one transformation. However, it can be explained by first applying a tempo rescale transformation with a parameter of 2 (shown from (a) to (b)) and then applying a transposition transformation with a parameter of a fifth (as seen from (b) to (c))
3.1 Example of the symbolic music representation ABC, with corresponding western music notation. The ABC notation begins with 5 header lines. The line beginning with X indicates a reference number (for storing multiple melodies in a single file), T gives a unique name for the music fragment, M states the time signature, L indicates the base note length, and K provides the key signature. The header lines are followed by a symbolic encoding of the sequence of notes and rests that constitute this music fragment. In this encoding, E represents a note located on the lowest staff line (E flat above middle C), and e represents a note one octave higher. The 2 represents an eighth note (double the base note length described by L in the header), and 4 represents a quarter note (four times the base note length)
3.2 Example of Edit Distance Insert operation
3.3 Example of Edit Distance Delete operation
3.4 Example of Edit Distance Substitute operation
3.5 Example of Geometric Representation of Melody
4.1 The framework for evaluating melody similarity measures

4.2 An example from the RISM A/II dataset. The melody shown in (a) is used as the query melody. A group of 35 music experts determined that the most similar melody in the collection was (b). The differences between (a) and (b) can be almost completely modeled as a tempo rescaling transformation (doubling the note duration) and a transposition transformation (down a full step)
4.3 A second example from the RISM A/II dataset. The melody shown in (a) is used as the query melody and the melody shown in (b) is the agreed-upon ground truth. In this example, the differences between (a) and (b) are more difficult to model as transformations
4.4 An example of a box whisker plot from a ground truth from the RISM A/II dataset as generated by [3]. This plot shows that the rankings of experts have a median of 4. The mean of the responses is 5.5. The standard deviation of the responses is indicated by the horizontal dotted lines, which mark 1 standard deviation above and below the mean
4.5 An example of a visualization of a Wilcoxon rank sum test from a ground truth from the RISM A/II dataset as generated by [3]
4.6 Figure from data generated in [3], reproduced with permission pending. An example of ground truth from the RISM A/II dataset as generated by [3]. The query melody is shown at the top. This is followed by 10 candidate melodies that experts rank as most similar to the query melody. These 10 candidate melodies are split into 5 groups (a partially ordered list) using the Wilcoxon rank sum test

5.1 Flow chart illustrating the data and algorithms used in the experiment to evaluate selected melody similarity measures
5.2 Class diagram depicting the implementation discussed in this thesis
5.3 Screenshot of the Melody Viewer tab
5.4 Screenshot of the Single Measure Evaluation tab
5.5 Screenshot of the results from running the Single Measure Evaluation tab
5.6 Screenshot of the Single Measure Evaluation versus Ground Truth tab
5.7 Screenshot of the results from running the Single Measure Evaluation versus Ground Truth tab
5.8 Screenshot of the Complete Evaluation versus Ground Truth tab
5.9 Screenshot of the results from running the Complete Measure Evaluation versus Ground Truth tab
5.10 An example query melody from the RISM A/II dataset used to illustrate the effectiveness of the Beat and Direction Distance Measure in recognizing Beat Hierarchy Ornamentation
5.11 An example candidate melody from the RISM A/II dataset used to illustrate the effectiveness of the Beat and Direction Distance Measure in recognizing Beat Hierarchy Ornamentation

Chapter 1

Introduction

The purpose of a music similarity measure is to quantitatively or qualitatively characterize the similarity of two pieces of music. Music similarity is a subjective characteristic that varies depending on the application, and on the judgment and tastes of the user. Studies have shown that similarity judgments differ from person to person [4]. The definition of similarity depends on the context in which the music fragments are being analyzed. For example, the similarity of two music fragments can be based on style or genre (e.g. blues versus jazz) or on the artistic period (e.g. baroque versus medieval). Since the definition of similarity is situation-dependent, it is impossible to define a unique ground truth for music similarity databases. Human judgment can be used as a basis for defining a ground truth [5], as discussed further in Section 3.3.

Music similarity measures are used in a variety of applications. In interactive music-generation systems such as [6], user interactions guide the process of generating music. In such systems, similarity measures can support search operations that help the user find a desired location for editing, or they can be used to analyze the

similarity between generated music fragments, as an aid in determining good candidate fragments. In ethnomusicology, which is "the comparative study of musical systems and cultures" [7], similarity measures can aid in automatically identifying trends and similarities in music composition across periods. In music information retrieval systems such as [8], [9], and in music recommender systems such as [10], [11], similarity measures are used to find music that matches a user's query.

Various representations for music fragments are in common use, including audio recordings, music notation, and symbolic music representations. Table 1.1 categorizes these representations and describes their common uses [1]. Estimation of music similarity can be based on audio recordings, music notation, or symbolic music representations. In this thesis, we consider similarity measures that operate on a symbolic music representation: music fragments are represented as sequences of notes and rests with given pitches and durations. Symbolic music representations are discussed further in Section 3.1.

1.1 Similarity Measures for Melodies

This thesis investigates the definition and evaluation of melody similarity measures, where we define a melody as a symbolic, monophonic fragment of music. Symbolic means that the music fragment is represented as a sequence of notes and rests with given pitches and durations. Monophonic means that a new note does not begin until any current note has finished sounding [12].

In assessing melody similarity, two melodies are compared in their entirety. This is illustrated in Figures 2.1 to 2.5. We do not consider more general formulations of the music similarity problem, in which the goal is to find the best match of a short query

Representation  Description                                  Research
Symbolic        Notation (scores, charts), Event-based       Matching, Theme/Melody Extraction,
                recordings (MIDI), Hybrid Representations    Voice Separation, Musical Analysis
Audio           Recordings, Streaming Audio,                 Sound/Song Spotting, Transcription,
                Instrument Libraries                         Timbre/Genre Classification, Musical
                                                             Analysis, Recommendation Systems
Visual          Scores                                       Score Reading (Optical Music Recognition)
Metadata        Cataloging, Bibliography, Descriptions       Library Testbeds, Traditional IR,
                                                             Interoperability, Recommendation Systems

Table 1.1: Table 1 from [1]: Music Representations in Music Information Retrieval

melody to subsections of a long music fragment, such as in [13]. Even more generally, music similarity can be applied to two music fragments that are both long, aiming to find subsections in one that are a close match to subsections in the other, as in [14].

This task of melody similarity applies directly to applications where the symbolic representation of music is immediately available, such as in music generation and ethnomusicology. However, for applications such as audio-based music information retrieval, the application is indirect: before applying melody similarity measures in these audio tasks, the audio must be transformed into a symbolic representation.

1.2 Thesis Contributions

The objective of this work is to evaluate the performance of melody similarity measures. We define a framework for assessing melody similarity by addressing the following issues.

Define melody similarity. A flexible definition is required because users must be able to adjust the definition of similarity to meet the needs of their application. We define melody similarity using a set of music transformations. A music transformation is a parameterized function that maps one melody to another. In Chapter 2, we define transformations for transposition, tempo rescaling, and two types of ornamentation (pitch-echo ornamentation and beat-hierarchy ornamentation). Music transformations of this type are often used in music composition. We believe that they form a useful basis for defining melody similarity. We propose in Section 4.3 that two melodies are highly similar if the differences between the melodies can be accurately accounted for by the successive application of music transformations. Flexibility is incorporated in this definition: a user can adjust the definition of similarity by adding or omitting music transformations, specifying the number of successive transformations allowed, and adjusting the allowable bounds for the parameters of transformations.

Define the set of similarity measures that will be subject to performance evaluation. Published similarity measures are summarized in Section 3.2. We also implement a recently-proposed similarity measure that is designed to respond better to selected music transformations: a later section describes the Beat and

Direction Measure, which responds well to beat-hierarchy ornamentation. Section 4.1 defines the subset of measures that are evaluated in this thesis. The implementation of these measures is described in Section 5.1.

Define a ground-truth data set for testing the performance of similarity measures. The data set may require manipulation into a consistent format, as discussed in Section 3.1. During testing, performance can be assessed against the ground-truth data set. Alternatively, performance can be assessed by comparing answers to those given by an oracle similarity measure that is taken to be correct. Section 4.2 describes the data that were used in a previous evaluation campaign, for which ground truth was provided in [3].

We use this similarity-testing framework to investigate the following questions:

1. Do any categories of techniques for measuring melody similarity significantly outperform others? Many previously proposed melody similarity measures use or extend techniques developed for other areas of computer science (such as string edit distance). Section 3.2 describes categories of similarity measures such as edit-distance based measures, geometric based measures, and N-gram based measures.

2. Does context sensitivity improve the performance of a melody similarity measure? Many existing melody similarity measures assess the difference between notes without regard to the context in which the notes occur. A context-sensitive measure assesses the difference between two notes by taking surrounding notes into consideration.

3. Does responsiveness to music transformations improve the performance of a

melody similarity measure? If so, which transformations are (most) important? In this thesis we begin investigation of this area by defining similarity measures that are responsive to transposition, tempo rescaling, and two types of ornamentation (pitch-echo ornamentation and beat-hierarchy ornamentation).

The design and implementation of the evaluation framework is a major contribution of this thesis. The following three steps were used.

A. Design and implement software to load symbolic music files into a consistent internal data structure and use a similarity measure to perform an evaluation. Section 4.2 describes the collection and transformation of the necessary data. Implementation of the evaluation engine is described in Section 4.3.

B. Implement a set of representative melody similarity measures from the set of published similarity measures. Among the implemented measures are forms of Geometric Measures, N-Gram based Measures, Edit Distance based Measures, and a newly developed similarity measure that is designed to be responsive to certain music transformations.

C. Evaluate the implemented similarity measures by assessing their performance with respect to the ground truth described in Section 4.2. Interpret the evaluation results to assess the importance of context sensitivity and responsiveness to music transformations.

The next chapters present the details of the evaluation framework. Chapter 2 outlines the selected music transformations for our study. Chapter 3 provides background and a literature review, including existing categories of techniques for assessing melody similarity, and a description of the MIREX symbolic music similarity task. Chapter

4 describes the similarity measures and data processing implemented in this work. Chapter 5 presents the performance evaluation results and analysis, and Chapter 6 concludes with a summary and discussion of future work.

Chapter 2

Music Transformations

This chapter presents our definition of music transformations and a description of the music transformations selected for study. We define a music transformation as a function which creates or models a mapping from one melody to another. Each transformation has a distinct set of P parameters. For example, a transposition transformation requires parameters that define which subsection of the music fragment is transposed, and how many half steps to transpose. In general (Figure 4.1), a set of T transformations is used to capture the type of melody similarity that is of interest in a particular situation. In our experiments (Figure 5.1) we use T = 4. The remainder of this chapter describes this set of transformations in more detail.

2.1 Transposition

Transposition is a transformation that alters the pitch of a series of notes by a fixed number of half steps (a discrete amount of pitch). Figure 2.1 illustrates transposition of an entire music segment. In many applications, it is desired that similarity measures

Figure 2.1: Illustration of transposition. The differences between melody (a) and (b) can be entirely modeled as a transposition transformation. The parameters for the transposition state that the entire melody is transposed up a fifth.

Figure 2.2: An example of tempo rescaling. The difference between melody (a) and (b) can be modeled completely as a tempo rescale transformation. The parameters for the tempo rescaling state that the entire melody is rescaled by a factor of 2, such that the duration of each note is doubled.

produce a very high similarity rating for melodies whose differences can be modeled as a transposition.

2.2 Tempo Rescaling

Tempo rescaling is a transformation that alters the durations of a series of notes by a fixed scale factor. As illustrated in Figure 2.2, the relative note durations are unchanged, but all notes are lengthened or compressed by the scale factor. In some applications, it is desired that similarity measures produce a very high similarity rating for melodies whose differences can be modeled as a tempo rescaling.
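Transposition and tempo rescaling can both be sketched as parameterized functions over a simple note representation. The sketch below is illustrative only: the (midi_pitch, duration) pair representation, the function names, and the start/end parameters are our assumptions, not the thesis implementation.

```python
# Illustrative sketch: transposition and tempo rescaling as parameterized
# transformations. A melody is a list of (midi_pitch, duration_in_beats) pairs.

def transpose(melody, half_steps, start=0, end=None):
    """Shift the pitch of notes in [start, end) by a fixed number of half steps."""
    end = len(melody) if end is None else end
    return [(p + half_steps, d) if start <= i < end else (p, d)
            for i, (p, d) in enumerate(melody)]

def tempo_rescale(melody, factor, start=0, end=None):
    """Multiply the duration of notes in [start, end) by a fixed scale factor."""
    end = len(melody) if end is None else end
    return [(p, d * factor) if start <= i < end else (p, d)
            for i, (p, d) in enumerate(melody)]

melody = [(60, 1.0), (62, 1.0), (64, 2.0)]   # C4, D4, E4
up_a_fifth = transpose(melody, 7)            # as in Figure 2.1: up a fifth (7 half steps)
doubled = tempo_rescale(melody, 2)           # as in Figure 2.2: every duration doubled
print(up_a_fifth)  # [(67, 1.0), (69, 1.0), (71, 2.0)]
print(doubled)     # [(60, 2.0), (62, 2.0), (64, 4.0)]
```

The start/end parameters correspond to the requirement that a transformation's parameters specify which subsection of the music fragment is affected.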

Figure 2.3: An example of pitch-echo ornamentation. The differences between melody (a) and (b) can be modeled as a pitch-echo ornamentation. The parameters of the ornamentation state that the first three notes of (a) are affected, with the ornamentation splitting the original note into two notes that are half as long, raising the pitch of the second note one fifth higher.

2.3 Ornamentation

Composers use many types of ornamentation to add embellishments and variety to the music. Here, we investigate two transformations that model particular types of ornamentation. We call these pitch-echo ornamentation and beat-hierarchy ornamentation, as illustrated in Figures 2.3 and 2.4 respectively.

2.3.1 Pitch-echo Ornamentation

In pitch-echo ornamentation, the target music segment is produced by introducing additional notes between notes in the source music segment. A uniform method is used to base the pitch and duration of added notes on the pitch and duration of the source notes. For example, the pitch-echo ornamentation in Figure 2.3 introduces intermediate notes that are a fifth higher than the original notes, and are half as long as the original notes.

2.3.2 Beat-hierarchy Ornamentation

In beat-hierarchy ornamentation, pitches of notes that occur on strong beats are preserved more than pitches of notes that occur on weak beats. Notes on weak beats

Figure 2.4: An example of beat-hierarchy ornamentation. The difference between melody (a) and (b) can be completely modeled as a beat-hierarchy ornamentation. Each of the four notes that lie on the beat remain the same, but the passing notes change. The parameters for this transformation define each passing note explicitly.

may be altered to add embellishment or variety, as illustrated by the beat-hierarchy ornamentation in Figure 2.4.

2.4 Composition of Transformations

We propose the possibility that the previously mentioned transformations may be used in succession. Figure 2.5 demonstrates an example of one such combination: applying a tempo rescale transformation and then a transposition transformation models the differences between the melodies in Figure 2.5. Also, an example of two similar melodies (as judged by human music experts) is shown in Figure 4.2.

Figure 2.5: An example of a composition of transformations. The difference between melody (a) and (c) cannot be completely modeled by any one transformation. However, it can be explained by first applying a tempo rescale transformation with a parameter of 2 (shown from (a) to (b)) and then applying a transposition transformation with a parameter of a fifth (as seen from (b) to (c)).
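The succession of transformations in Figure 2.5 can be sketched as plain function composition. As a hedged illustration, the (midi_pitch, duration) note representation and all function names below are our own assumptions:

```python
# Illustrative sketch of composing transformations as in Figure 2.5:
# tempo rescale by 2, then transpose up a fifth.

def transpose(melody, half_steps):
    return [(p + half_steps, d) for p, d in melody]

def tempo_rescale(melody, factor):
    return [(p, d * factor) for p, d in melody]

def compose(*transforms):
    """Return a transformation that applies the given transformations in order."""
    def apply(melody):
        for t in transforms:
            melody = t(melody)
        return melody
    return apply

a = [(60, 0.5), (64, 0.5), (67, 1.0)]
a_to_c = compose(lambda m: tempo_rescale(m, 2),  # (a) -> (b)
                 lambda m: transpose(m, 7))      # (b) -> (c)
print(a_to_c(a))  # [(67, 1.0), (71, 1.0), (74, 2.0)]
```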

Chapter 3

Background and Literature Review

This chapter overviews previous work related to melody similarity. A number of symbolic representations exist to encode music in a readable and serializable format. Some examples of these, such as MusicXML [15] and ABC notation [16], are discussed in Section 3.1. Next, classes of published melody similarity measures are defined in Section 3.2. Finally, Section 3.3 describes the methodology used in previously performed evaluations of melody similarity, the results of these evaluations, and difficulties in evaluation.

X: 1
T:
M: 4/4
L: 1/16
K: Eb Major
E2FGABcde4B4 G4z4zEGEAEBE

Figure 3.1: Example of the symbolic music representation ABC, with corresponding western music notation. The ABC notation begins with 5 header lines. The line beginning with X indicates a reference number (for storing multiple melodies in a single file), T gives a unique name for the music fragment, M states the time signature, L indicates the base note length, and K provides the key signature. The header lines are followed by a symbolic encoding of the sequence of notes and rests that constitute this music fragment. In this encoding, E represents a note located on the lowest staff line (E flat above middle C), and e represents a note one octave higher. The 2 represents an eighth note (double the base note length described by L in the header), and 4 represents a quarter note (four times the base note length).

3.1 Symbolic Music Representations

Many formulations of symbolic music representations exist. However, symbolic representations all encode the same information (music voices as sequences of pitch and duration). Examples of symbolic representations include MusicXML [15], ABC notation [16], and Humdrum kern [17]. ABC notation was chosen as the preferred notation for the experimental work done in this thesis, for its low data overhead and for the simplicity of parsing its plain-text encoding. Figure 3.1 shows an example of ABC notation and the corresponding western music notation. Symbolic representations of this kind have been used in systems such as C-Brahms [18] and GUIDO/MIR [19].
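To illustrate why ABC's plain-text encoding is simple to parse, the following sketch converts a note string like the one in Figure 3.1 into (pitch, duration) pairs. It is an assumption-laden toy, not an ABC library: it handles only bare note letters (A-G, a-g), digit duration multipliers, and z rests, and it ignores the key signature (so E is read as E natural, unlike the E flat implied by the K: header in Figure 3.1). The function name parse_abc_notes is hypothetical.

```python
# Illustrative, heavily simplified parser for an ABC note line.
# Returns (semitones_above_C, duration_in_base_lengths) pairs; rests get pitch None.
import re

STEP = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}

def parse_abc_notes(body, base_len=1):
    notes = []
    for letter, digits in re.findall(r'([A-Ga-gz])(\d*)', body):
        dur = int(digits) * base_len if digits else base_len
        if letter == 'z':
            notes.append((None, dur))                 # rest
        else:
            octave = 12 if letter.islower() else 0    # lowercase = one octave up
            notes.append((STEP[letter.upper()] + octave, dur))
    return notes

print(parse_abc_notes("E2FGAB"))  # [(4, 2), (5, 1), (7, 1), (9, 1), (11, 1)]
```

A full parser would also have to resolve the K: and L: headers, accidentals, bar lines, and fractional durations, which is the overhead the thesis avoids by using a dedicated ABC reader.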

Figure 3.2: Example of Edit Distance Insert operation.

Figure 3.3: Example of Edit Distance Delete operation.

3.2 Categories of Melody Similarity Measures

Existing symbolic melody similarity measures can be categorized into various types. The following sections review edit-distance based measures, geometric based measures, and N-gram based measures.

3.2.1 Edit Distance Based Measures for Melody Similarity

Edit distance is a method of quantifying the distance between two strings, or patterns, by counting the minimum number of operations required to transform one string into the other. Commonly, the three types of operations that are used to correct differences in the strings are [20]:

1. Insert: A character is inserted into the target string when there is no corresponding character for the character in the source string.

Figure 3.4: Example of Edit Distance Substitute operation.

2. Delete: A character is deleted from the target string when it does not correspond to any character from the source string.

3. Substitute: A character in the target string is replaced with one from the source string when the corresponding characters do not match.

Melody similarity measures based on edit distance use string matching, or Levenshtein distance, to compute the similarity of two melodies [21], [22], [23], [24], [25], [26]. The distance between two melodies is computed by counting the minimum number of note transformation operations needed to transform the first melody into the second. Common operations include inserting a note (Figure 3.2), removing a note (Figure 3.3), and replacing a note (Figure 3.4). Each operation has a cost parameter. The similarity value is computed as the sum of the costs of all the operations used to transform one melody into the other.

Edit distance has been used widely in previous work with various costs, operations, and representations. Gómez et al. implemented and performed an analysis of the Mongeau-Sankoff algorithm [23], which introduces two additional operations, fragmentation and consolidation [27]. Uitdenbogerd used edit distance with a representation based solely on pitch, ignoring the duration of notes [26]. Frieler and

Müllensiefen applied an edit distance algorithm to a number of simplified representations, including discretizations of pitch, rhythm, and contour [22], [24]. Grachten et al. evaluated an edit distance algorithm which operates on an implication/realization [28] representation of melody [25]. Finally, Ferraro et al. evaluated a string edit distance algorithm which is capable of polyphonic analysis; this algorithm is transposition invariant because it operates on a music representation that stores relative pitch rather than absolute pitch [21].

3.2.2 Geometric Based Measures for Melody Similarity

Geometric similarity measures are based on a two-dimensional geometric representation of melody, where one axis represents time and the other axis represents pitch [22], [24], [29], [30], [31], [32], [33], [34]. This is illustrated in Figure 3.5. This geometric representation allows measures of polygonal similarity to be adapted to the assessment of melody similarity. Depending on how the geometric measure is defined, the measure may be able to model tempo rescaling (scaling along the horizontal axis) and transposition (shifting along the vertical axis).

Frieler and Müllensiefen devised a geometric distance algorithm that was applied to a variety of representations [22], [24]. Typke et al. evaluated the Earth Mover's Distance on melodies [30], [31]. Lemström et al. proposed a geometric distance algorithm applied to a piano-roll representation of melody [29], [33]. Laitinen and Lemström proposed geometric approaches whereby tempo rescaling is handled through time-scaling and time-warping [35]. Finally, Urbano et al. proposed a geometric representation of n-grams and evaluated similarity based upon those [32].
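A minimal note-level Levenshtein distance illustrates the edit-distance family surveyed above. This sketch uses a unit cost for all three operations; published measures differ in their cost functions, representations, and extra operations (e.g. Mongeau-Sankoff fragmentation and consolidation), so treat this as a generic baseline, not any particular published measure.

```python
# Illustrative sketch: Levenshtein distance over melodies represented as
# (pitch, duration) pairs, with unit costs for insert, delete, and substitute.

def note_edit_distance(a, b):
    """Minimum number of insert/delete/substitute operations turning melody a into b."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                                # delete all of a[:i]
    for j in range(n + 1):
        d[0][j] = j                                # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # delete a note
                          d[i][j - 1] + 1,         # insert a note
                          d[i - 1][j - 1] + sub)   # substitute (or match)
    return d[m][n]

x = [(60, 1), (62, 1), (64, 1)]
y = [(60, 1), (62, 1), (65, 1), (64, 1)]   # y is x with one note inserted
print(note_edit_distance(x, y))  # 1
```

Replacing the unit costs with note-dependent costs (e.g. weighting pitch differences by interval size) recovers the "cost parameter" formulation described in the text.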

Figure 3.5: Example of Geometric Representation of Melody (pitch plotted against time).

3.2.3 N-gram Based Measures for Melody Similarity

An n-gram is a contiguous subsequence of n items from a given sequence. The distance between two sequences can be measured by counting the number of matching subsequences of length n that are shared by the sequences. For example, suppose we have a database of sequences and we wish to find the most similar sequence to a query sequence using n = 5. Suppose that the query sequence is (1 2 3 4 5 1), which is represented by the two 5-grams (1 2 3 4 5) and (2 3 4 5 1). Given the following database of sequences, the distances are calculated as follows:

Sequence A: (0 1 2 3 4 5 9 9) is broken into the four 5-grams (0 1 2 3 4), (1 2 3 4 5), (2 3 4 5 9), and (3 4 5 9 9). The second of these 5-grams matches a query 5-gram.

Sequence B: (9 9 9 9 9) is broken into one 5-gram, (9 9 9 9 9).

Sequence C: (2 3 4 5 1 2 3 4 5) is broken into five 5-grams (2 3 4 5 1), (3 4 5 1 2), (4 5 1 2 3), (5 1 2 3 4), and (1 2 3 4 5). The first and last of these 5-grams match a query 5-gram.

Because Sequence C has the most 5-grams matching query 5-grams, it is ranked as the closest. Sequence A is second closest and Sequence B is ranked last.
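This counting scheme is easy to sketch directly. The sequences below are hypothetical stand-ins chosen only to exercise the ranking logic (a query with two 5-grams and three candidates with one, zero, and two matches respectively):

```python
# Illustrative sketch of n-gram match counting for ranking candidate sequences.

def ngrams(seq, n):
    """All contiguous subsequences of length n."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def ngram_matches(query, candidate, n=5):
    """Number of candidate n-grams that also occur among the query's n-grams."""
    q = set(ngrams(query, n))
    return sum(1 for g in ngrams(candidate, n) if g in q)

query = [1, 2, 3, 4, 5, 1]                  # two 5-grams
db = {'A': [0, 1, 2, 3, 4, 5, 9, 9],        # one matching 5-gram
      'B': [9, 9, 9, 9, 9],                 # no matching 5-grams
      'C': [2, 3, 4, 5, 1, 2, 3, 4, 5]}     # two matching 5-grams
ranking = sorted(db, key=lambda k: ngram_matches(query, db[k]), reverse=True)
print(ranking)  # ['C', 'A', 'B']
```

For melodies, the sequence items would be notes (or a derived feature such as the interval between successive notes, which makes the counts transposition invariant).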

In the case of melody similarity, the items in each sequence are notes. Statistics about n-grams can be used to measure similarity, and this is efficient if n is small relative to the length of the music segment. (A melody consisting of K notes contains K − n + 1 n-grams.) To evaluate the similarity of a query to a candidate sequence, the number of matching n-grams is counted, and these counts are compared across all candidate sequences. The first published methods count the number of distinct matching n-grams to determine similarity [36]. Since then, additional representations and methods of comparing sets of n-grams have been developed and applied to the computation of melody similarity [22], [24], [32], [37], [38], [39].

3.3 Evaluations

To improve previous measures and to assess new melody similarity measures, evaluations must be performed. Some such evaluations have been performed in the past in a campaign entitled the Music Information Retrieval Evaluation eXchange (MIREX). MIREX is an annual evaluation campaign for Music Information Retrieval (MIR) algorithms that aids in the evaluation and development of MIR techniques [40]. One of the MIR tasks evaluated at MIREX is the Symbolic Melodic Similarity (SMS) task. The goal of SMS is, given a query, to return a ranked set of the most similar items from a collection of symbolic pieces [41]. Table 3.1 details the results from the SMS task for 2005 and subsequent years. In addition to the category of distance measure used, this table shows which transformations (if any) each similarity measure is tolerant of. This information about music transformations is collected by manually inspecting the implementation details of each entry. Additionally, Table 3.2 aggregates the results (from 2005 onwards) by counting the number of times each

category of similarity measure is used. In the 2005 SMS task, the data used were taken from the RISM A/II collection [42]. To evaluate the results given by each similarity measure, ground truth was established by Typke et al. through the cooperation of 35 music experts in manually judging the similarity of 11 melodies to a set of 50 candidate melodies [3]. Because this process proved to be time-consuming, the data used for the evaluations from 2006 onwards were taken from the Essen Collection [43], and ground truth was established ad hoc by creating error mutations in the following ways [44]:

No errors (i.e. "base")
One note deleted
One note inserted
One interval enlarged
One interval compressed

The MIREX SMS task results form lists of query melodies which map to sets of resulting candidate melodies. The average dynamic recall (ADR) determines the best performing similarity measure(s) [41].
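The exact mutation procedure of [44] is not reproduced here, but one plausible reading of the list above, operating on a pitch sequence, can be sketched as follows (the method names and the shift-based interval change are assumptions for illustration):

```java
public class ErrorMutations {
    // Remove the note at index i.
    static int[] deleteNote(int[] pitches, int i) {
        int[] out = new int[pitches.length - 1];
        for (int j = 0, k = 0; j < pitches.length; j++)
            if (j != i) out[k++] = pitches[j];
        return out;
    }

    // Insert a note with the given pitch at index i.
    static int[] insertNote(int[] pitches, int i, int pitch) {
        int[] out = new int[pitches.length + 1];
        for (int j = 0; j < i; j++) out[j] = pitches[j];
        out[i] = pitch;
        for (int j = i; j < pitches.length; j++) out[j + 1] = pitches[j];
        return out;
    }

    // Enlarge (delta > 0) or compress (delta < 0) the interval entering note i
    // by shifting note i and every later note, so no other interval changes.
    static int[] changeInterval(int[] pitches, int i, int delta) {
        int[] out = pitches.clone();
        for (int j = i; j < pitches.length; j++) out[j] += delta;
        return out;
    }
}
```

Applying one such mutation to a query melody yields a candidate whose intended rank relative to the unmutated "base" version is known in advance, which is what makes the ad-hoc ground truth cheap to generate.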

Year | Source | Polyphonic | Monophonic | Similarity method | Tempo rescaling | Transposition | Ornamentation | Rank (ADR)
2005 | [45] | No | Yes | Combination of Edit Distance, N-Grams, and Geometric | Partial | Yes | No | 7 of
? | [29] | No | Yes | Edit Distance, Geometric Distance | No | Yes | No | 5 and 6 of
? | [30] | No | Yes | Earth Mover's Distance | Yes | Yes | No | N/A
2005 | [30] | No | Yes | Segmented Earth Mover's Distance | Partial | Yes | No | 4 of
? | [36] | No | Yes | N-Gram matching | Yes | Partial | No | 3 of
? | [37] | No | Yes | Multilevel N-Grams of multiple features | Yes | Partial | No | 2 of
? | [25] | No | Yes | Edit Distance of I/R structures optimized by Genetic Algorithm | Yes | Yes | No | 1 of
? | [21] | Yes | Yes | Edit Distance | No | Yes | No | 2 of
? | [46] | Yes | Yes | Geometric Sweep Line | Yes | Partial | No | 7 and 6 of
? | [31] | Yes | Yes | Earth Mover's Distance | No | No | No | 1 of
? | [22] | No | Yes | Edit Distance, N-Grams, Geometric | Partial | Yes | No | 8 of
? | [26] | No | Yes | Edit Distance | No | Yes | No | 4 of
? | [21] | No | Yes | Edit Distance | No | Yes | No | 2 of
? | [23] | No | Yes | Edit Distance | No | Yes | No | 1 and 3 of
? | [47] | No | Yes | Time-Independent Interval Graph | Yes | Yes | No | 7 and 8 of
? | [38] | No | Yes | N-Gram matching | Yes | Yes | No | 4, 5, 6 of
? | [39] | No | Yes | N-Gram pitch and IOI matching | Partial | No | No | 6 and 8 of
? | [48] | No | Yes | Tree Representation with tree similarity | Yes | No | No | 7, 10, 12, and 13 of
? | [35] | No | Yes | Geometric Sweep Line | Yes | Yes | No | 4 and 11 of
? | [32] | No | Yes | N-Grams compared geometrically | Yes | No | No | 1, 2, 3, and 5 of 13

Table 3.1: Summary of results from previous MIREX Symbolic Melodic Similarity tasks. (Some year entries and rank denominators were lost in transcription and are marked "?" or left truncated.)

Method | Count
Edit Distance | 7
N-Grams | 6
Geometric Distance | 3
Earth Mover's Distance | 3
Trees | 2
Fusion/Combination | 2

Table 3.2: Aggregate of categories of techniques used in previous MIREX Symbolic Melodic Similarity tasks

Chapter 4

Methodology

To evaluate the performance of a similarity measure, we use transformations as building blocks to create a formal problem statement. Figure 4.1 illustrates the framework for evaluating melody similarity measures. The three components of the framework are the similarity measures selected for study (Section 4.1), the data sets used during testing (Section 4.2), and the evaluation criteria (Section 4.3). Figure 5.1 shows the instantiation of the framework that is used for the evaluation described in this thesis.

Figure 4.1: The framework for evaluating melody similarity measures.

4.1 Selected Similarity Measures for Study

This section describes the similarity measures that we select to study. The set of selected measures is representative of past work in symbolic melody similarity measures but not completely comprehensive.

4.1.1 Geometric Distance Measure

We implement a geometric similarity measure which operates on a point-set representation of melody, as described in [49]. In order to create a measure that accounts for changes in melody length, we select two data representations to implement:

A. A time-sensitive representation in which the lengths of both the query and candidate melodies are not modified.

B. A time-insensitive representation in which the length of each note is ignored, effectively giving each note the same duration. When the melodies are of different lengths, the shorter of the two melodies is lengthened by enlarging the duration of each note appropriately. For example, when comparing a melody with 10 notes against a melody with 5 notes, each note of the former melody would be half the length of each note of the latter.

The geometric distance measure is implemented using the General Polygon Clipper (GPC) library developed at the University of Manchester [50]. Melodies are first transformed into two-dimensional point sets (of notes) where the first component represents the time passed before the onset of each note and the second component is the pitch.
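This conversion to a point set can be sketched in a few lines; the Note record and its field names below are illustrative assumptions, not the thesis's actual types:

```java
public class PointSet {
    // A note with a pitch in semitones relative to middle C (C4) and a duration.
    public record Note(int semitonesFromC4, double duration) {}

    // Build the 2D point set: the first component of each point is the time
    // elapsed before the note's onset; the second is the pitch, offset by +24
    // so that all values stay positive (the incipits used lie within two
    // octaves of middle C).
    public static double[][] toPointSet(Note[] melody) {
        double[][] points = new double[melody.length][2];
        double onset = 0.0;
        for (int i = 0; i < melody.length; i++) {
            points[i][0] = onset;
            points[i][1] = melody[i].semitonesFromC4() + 24;
            onset += melody[i].duration();
        }
        return points;
    }
}
```

A polygon built from such points can then be compared against a candidate's polygon with the clipping operations described below.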

When using representation A above, we calculate the first component (the time) of each note as the sum of the lengths of all preceding notes. Specifically, if X = (X_1, ..., X_n) represents the lengths of all n notes in a melody, then the first component of note i is calculated as

Σ_{j=1}^{i−1} X_j

When using representation B, we simply use i − 1 as the first component for note i. The second component of each note in a melody is an integer representing the pitch of the note. We calculate this as the number of half steps (semitones) by which the pitch differs from middle C (C4). However, to ensure that all pitches are positive, we add 24 to this value; all incipits used in our experiments are within 2 octaves (24 semitones) of middle C. Finally, a polygon is constructed from the query and candidate point sets using the GPC library. The GPC library's difference(Polygon_1, Polygon_2) function returns a polygon which is the result of subtracting the second polygon from the first. The area of this resulting polygon is used to quantify the regions that the polygons do not share (their differences). The distance between query polygon Q and candidate polygon C is computed as

difference(Q, C) + difference(C, Q)

4.1.2 Edit Distance Measure

The Edit, or Levenshtein, Distance Measure is implemented by representing each melody as a sequence of notes, where each note has both a pitch and a duration. As described in Section 3.2.1, there are a number of operations that can be performed to transform one sequence into another. In our implementation, we use the insert, delete, and substitute operations, and we assign all three operations equal cost.
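The note-level Levenshtein computation can be sketched as follows. The Note record and the duration-sensitivity flag are illustrative (the flag anticipates the two representations described next); this is not the thesis's exact code:

```java
public class EditDistance {
    public record Note(int pitch, double duration) {}

    // Standard dynamic-programming Levenshtein distance over note sequences,
    // with unit-cost insert, delete, and substitute. When durationSensitive
    // is true, two notes match only if pitch AND duration are exactly equal;
    // otherwise pitch equality alone suffices.
    public static int distance(Note[] a, Note[] b, boolean durationSensitive) {
        int[][] d = new int[a.length + 1][b.length + 1];
        for (int i = 0; i <= a.length; i++) d[i][0] = i;
        for (int j = 0; j <= b.length; j++) d[0][j] = j;
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                boolean match = a[i - 1].pitch() == b[j - 1].pitch()
                        && (!durationSensitive
                            || a[i - 1].duration() == b[j - 1].duration());
                int sub = d[i - 1][j - 1] + (match ? 0 : 1); // substitute (or keep)
                d[i][j] = Math.min(sub,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1)); // delete / insert
            }
        }
        return d[a.length][b.length];
    }
}
```

For example, halving one note's duration costs one substitution under the duration-sensitive flag and nothing under the duration-insensitive one.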

Two representations are used in our Edit Distance Measure to aid in measuring its tolerance with respect to the transformations described in Chapter 2:

A. A duration-sensitive approach which considers two notes to match only if both the pitch and the duration are exactly equal.

B. A duration-insensitive approach which requires only the pitches of two notes to be equal for them to be considered a match.

By contrasting the similarity ratings resulting from these two representations, we can observe whether the edit distance measure's sensitivity to transformations relates to human judgments of similarity.

4.1.3 N-Gram Distance Measure

We implement an N-Gram Distance Measure, the operation of which is detailed in Section 3.2.3. For our experiments, we chose N = 5. As for the Edit Distance Measure, we implement two types of matching: one duration-sensitive and one duration-insensitive. In addition, we implement the following two methods of calculating distance from matches:

A. Total match counting, where the distance is equal to the inverse of the number of matches. For example, consider two melodies of size M = 7; each melody contains 3 separate 5-grams. We assign a distance of 1 in the case that there are no matches; this is the farthest possible distance. In the case where all 3 5-grams from the query melody match all 3 5-grams from the candidate melody, we have

a total of 9 matches, which yields the closest possible distance. In this situation, we assign a distance of 1/9.

B. Distinct match counting, where the distance is again defined as the inverse of the number of matches, but each n-gram from the query melody may be matched to only one n-gram from the candidate melody. Considering the same melodies with M = 7 as in the previous method, there would be only 3 distinct matching 5-grams, and the distance would be calculated as 1/3.

4.1.4 Beat and Direction Distance Measure

We also implement a measure called the Beat and Direction Distance Measure [2]. This measure was developed with the goal of incorporating musical knowledge into a melody similarity measure while keeping the measure applicable to many types of melodies [2]. Recall from Chapter 1 that the correct answer in a melody similarity task is relative to many factors, among them the type of music being analyzed. The Beat and Direction Distance Measure operates on an abstraction of the query and candidate melodies by using the time signature to analyze the notes that fall on the beats of each melody. Each note that falls on a beat of a melody is called a beat note, whereas every other note is called a passing note [2]. As the name implies, the Beat and Direction Distance Measure considers the pitch direction of the melody as well as the beat notes. The measure performs a type of contourization by considering only the local extremes within the passing notes instead of analyzing each passing note. To calculate distance, the Beat and Direction Distance Measure uses a weighted

sum of three components: the similarity between beat notes, the similarity of directions between beat notes, and the similarity of structures between beats [2]. In our implementation these three components are calculated as follows:

1. The similarity between each pair of corresponding beat notes is calculated as the absolute difference between the pitches of the query and candidate beat notes.

2. To calculate the similarity of directions between beat notes, the pitch direction is first determined for both the query and the candidate melodies. The three categories (Increasing, Decreasing, and Unchanged) are defined in Table 4.1. If the corresponding beats of the query and candidate melodies fall in the same category, the value for this component is 0; otherwise, it is 1.

Condition | Category
pitch(beatNote_2) − pitch(beatNote_1) > 0 | Increasing
pitch(beatNote_2) − pitch(beatNote_1) < 0 | Decreasing
pitch(beatNote_2) − pitch(beatNote_1) = 0 | Unchanged

Table 4.1: Possible categories for direction in the Beat and Direction Distance Measure

Structure | Definition
Steep | abs(pitch(beatNote_1) − pitch(beatNote_2)) ≥ 4 and, for all passing notes, pitch(passingNote) lies between pitch(beatNote_1) and pitch(beatNote_2)
Gradual | abs(pitch(beatNote_1) − pitch(beatNote_2)) < 4 and, for all passing notes, pitch(passingNote) lies between pitch(beatNote_1) and pitch(beatNote_2)
Upward bend | pitch(passingNote) > pitch(beatNote_1), pitch(beatNote_2)
Downward bend | pitch(passingNote) < pitch(beatNote_1), pitch(beatNote_2)
Oscillation | pitch(passingNote_1) > pitch(beatNote_1), pitch(beatNote_2) and pitch(passingNote_2) < pitch(beatNote_1), pitch(beatNote_2)

Table 4.2: Summary of the possible structures of passing notes (reproducing Table 4.3 from [2])

3. Five structures of passing notes are used to determine the similarity between beats; these structures are summarized in Table 4.2. When corresponding beats have matching structures they are given a value of 0; otherwise they are given a value of 1.

Finally, the overall distance between the query and candidate melodies is calculated as

distance = ω_1 + α ω_2 + β ω_3

where:

ω_1 = the total score for similarity of beat notes
ω_2 = the total score for similarity of directions of beats
ω_3 = the total score for similarity of passing-note structures
α, β = scalar weights

In our experiments, we use α = 0.75 and β = 0.

4.2 Test Data for Evaluations

Figure 4.2: An example from the RISM A/II dataset. The melody shown in (a) is used as the query melody. A group of 35 music experts determined that the most similar melody in the collection was (b). The differences between (a) and (b) can be almost completely modeled as a tempo rescaling transformation (doubling the note durations) and a transposition transformation (down a whole step).

As discussed in Section 3.3, the 2005 MIREX SMS task used data taken from the RISM A/II collection. The collection includes more than 657,000 pieces originating

from over 22,000 composers from 32 countries [42]. This collection is useful due to its size and because it is made up of music written by real human composers. The collection is also in a digital format, which makes it available for computational MIR tasks. Furthermore, expert-defined ground truth is provided for the similarity of 11 query melodies. This is sufficient reason to select this data for our evaluations.

Figure 4.3: A second example from the RISM A/II dataset. The melody shown in (a) is used as the query melody and the melody shown in (b) is the agreed-upon ground truth. In this example, the differences between (a) and (b) are more difficult to model as transformations.

The generation of ground truth for the RISM A/II collection was done in a mostly manual process by Typke et al. [3]. The set of 11 query melodies is first selected from the entire data set. To allow for manual judgment by human experts, the remainder of the data set must then be filtered to contain only relevant incipits. The filtering is done by first calculating certain features for the query melodies and then issuing SQL statements against the remainder of the database based upon these features. Some of these features (from [3]) include:

Pitch range: the difference between the pitch of the highest note and the lowest note in the incipit.

Duration ratio: the duration of the shortest note divided by the duration of the

longest note.

Maximum interval: the largest interval between the onsets of subsequent notes.

Figure 4.4: An example of a box-and-whisker plot from a ground truth from the RISM A/II dataset as generated by [3]. This plot shows that the rankings of the experts have a median of 4 and a mean of 5.5. The standard deviation of the responses is shown by the horizontal dotted line, which spans 1 standard deviation above and below the mean.

Each query melody requires a different set of filtering steps to arrive at a manageable number of candidate melodies. Therefore, Typke et al. manually apply filtering steps until the number of candidate melodies is under 300 [3]. This number of candidate melodies is sufficiently low that the candidates may be inspected manually, and sufficiently high that candidates similar to the query melody are not accidentally excluded. To further ensure that possible matches are not excluded, the metadata available through the RISM A/II collection is used to find candidate incipits with similar titles. Typke et al. state, "For example, for Roslin Castle, ... we made sure that every incipit whose title contains the word Roslin was included." [3]

Next, the ground truth is established by asking a group of 35 musical experts to produce an ordered similarity ranking of the candidate melodies. Figures 4.2 and 4.3 illustrate two queries and the best-matching candidate melodies as decided upon by the musical experts. Candidate melodies which do not seem similar to the query melody are given no rank. After collection of the data, the ranks are analyzed for statistical significance using the Wilcoxon rank sum test to generate a partially ordered list of rankings.
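The filtering features listed above can be computed directly from a note sequence. In this sketch the Note record is an assumption, and "maximum interval" is read as the largest pitch step between consecutive notes, which is one plausible interpretation of the description in [3]:

```java
public class IncipitFeatures {
    public record Note(int pitch, double duration) {}

    // Pitch range: highest pitch minus lowest pitch in the incipit.
    public static int pitchRange(Note[] m) {
        int lo = Integer.MAX_VALUE, hi = Integer.MIN_VALUE;
        for (Note n : m) {
            lo = Math.min(lo, n.pitch());
            hi = Math.max(hi, n.pitch());
        }
        return hi - lo;
    }

    // Duration ratio: shortest note duration divided by longest note duration.
    public static double durationRatio(Note[] m) {
        double lo = Double.MAX_VALUE, hi = 0.0;
        for (Note n : m) {
            lo = Math.min(lo, n.duration());
            hi = Math.max(hi, n.duration());
        }
        return lo / hi;
    }

    // Maximum interval, read here as the largest absolute pitch difference
    // between consecutive notes (an assumed interpretation).
    public static int maxInterval(Note[] m) {
        int max = 0;
        for (int i = 1; i < m.length; i++)
            max = Math.max(max, Math.abs(m[i].pitch() - m[i - 1].pitch()));
        return max;
    }
}
```

Features of this kind make coarse SQL pre-filtering cheap: candidates whose feature values differ wildly from the query's can be excluded before any manual inspection.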

Figure 4.5: An example of a visualization of a Wilcoxon rank sum test from a ground truth from the RISM A/II dataset as generated by [3].

An example of the complete ground truth for the melody in Figure 4.3 is shown in Figure 4.6, which consists of the music notation representation of the query melody followed by the candidate melodies. To the right of each music notation representation is a box-and-whisker plot representing the rankings given by the 35 experts. Figure 4.4 depicts an example of one of these box-and-whisker plots. The box in this plot represents the responses from the first to the third quartile, and the whiskers represent the top and bottom 10 percent. Each red dot in the plot represents a response, the large dot represents the median of the responses, and the blue vertical line shows the mean. The horizontal dashed blue line shows the span of one standard deviation above and below the mean. Additionally, Figure 4.5 displays an example of the visualization of a Wilcoxon rank sum test. The size of the plot represents the number of incipits that are ranked higher than the current one. The colour of the plot indicates the p-value of the Wilcoxon rank sum test: specifically, the proportion of the plot shown in the darker red colour is the p-value. Whenever the p-value is below a certain threshold (set to 0.25 in [3]), there is sufficient evidence to say that the rankings of the preceding melodies are not coincidental. Therefore, whenever the p-value falls below 0.25, a separation is introduced to create a partially ordered list.
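The grouping step, given precomputed p-values, can be sketched as follows; this is one plausible reading of the procedure (computing the Wilcoxon rank sum test itself is out of scope here, and the method names are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

public class PartialOrderGrouping {
    // Split a ranked list of candidate indices into groups: whenever the
    // p-value associated with a candidate falls below the threshold (0.25
    // in [3]), the ranking accumulated so far is considered significant and
    // a new group is started. pValues[i] is the precomputed Wilcoxon p-value
    // for candidate i.
    public static List<List<Integer>> group(double[] pValues, double threshold) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < pValues.length; i++) {
            current.add(i);
            if (pValues[i] < threshold) { // significant separation after candidate i
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) groups.add(current); // trailing, unseparated group
        return groups;
    }
}
```

For example, p-values (0.1, 0.5, 0.05, 0.3) with threshold 0.25 yield the partial order [[0], [1, 2], [3]].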

4.3 Evaluation Criteria

We evaluate our selected similarity measures using two criteria. First, for each query melody selected in Section 4.2, we compute a list of candidate melodies ranked by similarity, and we use this list to compute the Average Dynamic Recall (ADR) for each measure. Second, the relationship between each ground-truth query and candidate pair is manually characterized with respect to the music transformations selected in Chapter 2. We then combine the results from these steps in order to determine two things:

The effectiveness of each selected measure in recognizing the music transformations

The relationship between the human judgment of melody similarity (as defined by the ground truth) and the music transformations

Each similarity measure returns a ranked list of N candidates, where N is the number of candidates with agreed-upon rankings in the ground truth. The Average Dynamic Recall (ADR) is calculated for each similarity measure's ranked list of candidates as follows:

ADR = (1/j) Σ_{i=1}^{j} rank(i)

where j is the number of groups in the ground truth and rank(i) is the number of correct candidates in the current group divided by the total number in the current group.

Figure 4.2 illustrates the second step. Manual analysis of Figure 4.2 shows that the differences between the query and candidate melodies can be explained well

as a set of transformations applied sequentially: applying a tempo rescaling transformation and then a transposition transformation to the query melody allows us to almost fully transform the query melody into the candidate.
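Under the formulation above, the ADR computation reduces to a few lines; the array names here are illustrative:

```java
public class AverageDynamicRecall {
    // ADR = (1/j) * sum over groups of rank(i), where rank(i) is the number
    // of correct candidates retrieved for group i divided by the group's
    // total size. groupSizes[i] and correctInGroup[i] describe group i.
    public static double adr(int[] groupSizes, int[] correctInGroup) {
        double sum = 0.0;
        int j = groupSizes.length;
        for (int i = 0; i < j; i++) {
            sum += (double) correctInGroup[i] / groupSizes[i];
        }
        return sum / j;
    }
}
```

For instance, with cumulative group sizes 1, 4, and 9, retrieving 1, 3, and 9 correct candidates gives ADR = (1/1 + 3/4 + 9/9) / 3 ≈ 0.917.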

Figure 4.6: An example of ground truth from the RISM A/II dataset as generated by [3] (figure reproduced with permission pending). The query melody is shown at the top, followed by the 10 candidate melodies that experts rank as most similar to the query melody. These 10 candidate melodies are split into 5 groups (a partially ordered list) using the Wilcoxon rank sum test.

Chapter 5

Experiments and Results

We now describe the implementation details of our experiments and report on their results. Figure 5.1 illustrates the specific details of the experimental implementation of our approach in this thesis. This figure depicts the selections of transformations made in Chapter 2 as well as the selections of data and similarity measures made in Sections 4.2 and 4.1. We describe the evaluation implementation process in Section 5.1, present our results in Section 5.2, and provide an informal analysis of these results in Section 5.3.

Figure 5.1: Flow chart illustrating the data and algorithms used in the experiment to evaluate selected melody similarity measures.

5.1 Evaluation Process

The evaluation framework discussed in this thesis is implemented in the Java programming language. Java is chosen for its portability between platforms and its object-oriented nature, which allows the implementation of the framework to be extensible. Figure 5.2 details the implementation of the underlying search and similarity measure classes. Implementing the interface ISimilarityMeasure allows a user to create their own melody similarity measure to be evaluated alongside the currently implemented measures. The evaluation implementation uses the similarity measures to perform searches on the selected dataset described in Section 4.2.
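The extension point can be sketched as follows; the thesis's actual interface signature is not reproduced in the text, so the method, the Melody record, and the example measure below are assumptions for illustration:

```java
// Hypothetical sketch of the ISimilarityMeasure extension point.
public interface ISimilarityMeasure {
    /** Smaller return values indicate that the candidate is closer to the query. */
    double distance(Melody query, Melody candidate);

    // A minimal melody representation (member types of an interface are
    // implicitly public and static).
    record Melody(int[] pitches, double[] durations) {}

    // Example: a trivial user-defined measure plugged in via the interface.
    class NoteCountDifference implements ISimilarityMeasure {
        @Override
        public double distance(Melody query, Melody candidate) {
            return Math.abs(query.pitches().length - candidate.pitches().length);
        }
    }
}
```

Any class implementing the interface can then be handed to the search code unchanged, which is the extensibility property the object-oriented design is chosen for.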

Figure 5.2: Class diagram depicting the implementation discussed in this thesis.

In addition to these framework classes, a Graphical User Interface (GUI) is developed to make the software easier to use. The GUI uses Java's Swing/AWT framework to display pertinent information and controls to the user, and provides four separate tabs of functionality for different tasks. These tasks are as follows:

1. Melody Viewer: This tab (Figure 5.3) allows the user to inspect the incipits from the RISM A/II collection, shown in a rectilinear polygon representation. This polygonal representation is similar to a piano-roll representation, which is named for the rolls on which music was recorded for player pianos to reproduce.

Figure 5.3: Screenshot of the Melody Viewer tab.

2. Single Measure Evaluation: Figure 5.4 shows this tab. The topmost list allows the user to select a query melody from the RISM A/II collection; a polygonal representation of the selected melody is displayed. The bottom list allows the user to select a similarity measure. Finally, clicking the button performs an evaluation with the selected melody as the query and the remainder of the dataset as candidates. This evaluation returns the most similar melody found in the dataset according to the selected similarity measure, as shown in Figure 5.5.

Figure 5.4: Screenshot of the Single Measure Evaluation tab.

Figure 5.5: Screenshot of the results from running the Single Measure Evaluation tab.

3. Single Measure Evaluation versus Ground Truth: This tab allows the user to perform an evaluation as described in Section 4.3. The functionality and look of this tab mirror those of the Single Measure Evaluation tab, except that the list of available query melodies contains only the query melodies which have ground truth associated with them from [3]. Additionally, upon execution, the evaluation presents a different type of result: instead of displaying the most similar melody to the selected query, the results window displays the recall ability of the selected similarity measure with respect to the partially ordered list provided by the ground truth. Figure 5.7 illustrates an example of the results produced by this tab.

Figure 5.6: Screenshot of the Single Measure Evaluation versus Ground Truth tab.

4. Complete Evaluation versus Ground Truth: This tab asks the user to select a query melody to use in an evaluation; Figure 5.8 displays an example of this tab. Upon execution, the software evaluates all available similarity measures using the selected melody as the query. Figure 5.9 illustrates an example of the results obtained by comparing each similarity measure's results against the ground truth partially ordered list.

Figure 5.7: Screenshot of the results from running the Single Measure Evaluation versus Ground Truth tab.

Figure 5.8: Screenshot of the Complete Evaluation versus Ground Truth tab.

5.2 Results

The following tables present the results of the evaluation carried out during this thesis research. The top row of each table describes the size of each group (partial ordering) in the ground truth. Each group comprises the set of all melodies that are judged to be more similar to the query melody than those in later groups; that is, if the first group has 1 melody, the second 3, and the third 5, then the group sizes will be 1, 4, and 9 respectively. Each remaining row shows the results for the query melody in question for a single similarity measure: the first column names the similarity measure, and the subsequent columns list the number of correct candidate melodies selected by that measure along with the recall value for each group.

Figure 5.9: Screenshot of the results from running the Complete Evaluation versus Ground Truth tab.


However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene

However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene Beat Extraction from Expressive Musical Performances Simon Dixon, Werner Goebl and Emilios Cambouropoulos Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 Vienna, Austria.

More information

Perceptual Evaluation of Automatically Extracted Musical Motives

Perceptual Evaluation of Automatically Extracted Musical Motives Perceptual Evaluation of Automatically Extracted Musical Motives Oriol Nieto 1, Morwaread M. Farbood 2 Dept. of Music and Performing Arts Professions, New York University, USA 1 oriol@nyu.edu, 2 mfarbood@nyu.edu

More information

WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG?

WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? NICHOLAS BORG AND GEORGE HOKKANEN Abstract. The possibility of a hit song prediction algorithm is both academically interesting and industry motivated.

More information

Evaluating Melodic Encodings for Use in Cover Song Identification

Evaluating Melodic Encodings for Use in Cover Song Identification Evaluating Melodic Encodings for Use in Cover Song Identification David D. Wickland wickland@uoguelph.ca David A. Calvert dcalvert@uoguelph.ca James Harley jharley@uoguelph.ca ABSTRACT Cover song identification

More information

A Case Based Approach to Expressivity-aware Tempo Transformation

A Case Based Approach to Expressivity-aware Tempo Transformation A Case Based Approach to Expressivity-aware Tempo Transformation Maarten Grachten, Josep-Lluís Arcos and Ramon López de Mántaras IIIA-CSIC - Artificial Intelligence Research Institute CSIC - Spanish Council

More information

Music Similarity and Cover Song Identification: The Case of Jazz

Music Similarity and Cover Song Identification: The Case of Jazz Music Similarity and Cover Song Identification: The Case of Jazz Simon Dixon and Peter Foster s.e.dixon@qmul.ac.uk Centre for Digital Music School of Electronic Engineering and Computer Science Queen Mary

More information

2 2. Melody description The MPEG-7 standard distinguishes three types of attributes related to melody: the fundamental frequency LLD associated to a t

2 2. Melody description The MPEG-7 standard distinguishes three types of attributes related to melody: the fundamental frequency LLD associated to a t MPEG-7 FOR CONTENT-BASED MUSIC PROCESSING Λ Emilia GÓMEZ, Fabien GOUYON, Perfecto HERRERA and Xavier AMATRIAIN Music Technology Group, Universitat Pompeu Fabra, Barcelona, SPAIN http://www.iua.upf.es/mtg

More information

Analysis and Clustering of Musical Compositions using Melody-based Features

Analysis and Clustering of Musical Compositions using Melody-based Features Analysis and Clustering of Musical Compositions using Melody-based Features Isaac Caswell Erika Ji December 13, 2013 Abstract This paper demonstrates that melodic structure fundamentally differentiates

More information

A Comparison of Different Approaches to Melodic Similarity

A Comparison of Different Approaches to Melodic Similarity A Comparison of Different Approaches to Melodic Similarity Maarten Grachten, Josep-Lluís Arcos, and Ramon López de Mántaras IIIA-CSIC - Artificial Intelligence Research Institute CSIC - Spanish Council

More information

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Montserrat Puiggròs, Emilia Gómez, Rafael Ramírez, Xavier Serra Music technology Group Universitat Pompeu Fabra

More information

MELODIC SIMILARITY: LOOKING FOR A GOOD ABSTRACTION LEVEL

MELODIC SIMILARITY: LOOKING FOR A GOOD ABSTRACTION LEVEL MELODIC SIMILARITY: LOOKING FOR A GOOD ABSTRACTION LEVEL Maarten Grachten and Josep-Lluís Arcos and Ramon López de Mántaras IIIA-CSIC - Artificial Intelligence Research Institute CSIC - Spanish Council

More information

Computational Modelling of Harmony

Computational Modelling of Harmony Computational Modelling of Harmony Simon Dixon Centre for Digital Music, Queen Mary University of London, Mile End Rd, London E1 4NS, UK simon.dixon@elec.qmul.ac.uk http://www.elec.qmul.ac.uk/people/simond

More information

SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS

SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS SHORT TERM PITCH MEMORY IN WESTERN vs. OTHER EQUAL TEMPERAMENT TUNING SYSTEMS Areti Andreopoulou Music and Audio Research Laboratory New York University, New York, USA aa1510@nyu.edu Morwaread Farbood

More information

Music Information Retrieval Using Audio Input

Music Information Retrieval Using Audio Input Music Information Retrieval Using Audio Input Lloyd A. Smith, Rodger J. McNab and Ian H. Witten Department of Computer Science University of Waikato Private Bag 35 Hamilton, New Zealand {las, rjmcnab,

More information

Analysis of local and global timing and pitch change in ordinary

Analysis of local and global timing and pitch change in ordinary Alma Mater Studiorum University of Bologna, August -6 6 Analysis of local and global timing and pitch change in ordinary melodies Roger Watt Dept. of Psychology, University of Stirling, Scotland r.j.watt@stirling.ac.uk

More information

CSC475 Music Information Retrieval

CSC475 Music Information Retrieval CSC475 Music Information Retrieval Monophonic pitch extraction George Tzanetakis University of Victoria 2014 G. Tzanetakis 1 / 32 Table of Contents I 1 Motivation and Terminology 2 Psychacoustics 3 F0

More information

ANNOTATING MUSICAL SCORES IN ENP

ANNOTATING MUSICAL SCORES IN ENP ANNOTATING MUSICAL SCORES IN ENP Mika Kuuskankare Department of Doctoral Studies in Musical Performance and Research Sibelius Academy Finland mkuuskan@siba.fi Mikael Laurson Centre for Music and Technology

More information

Improvised Duet Interaction: Learning Improvisation Techniques for Automatic Accompaniment

Improvised Duet Interaction: Learning Improvisation Techniques for Automatic Accompaniment Improvised Duet Interaction: Learning Improvisation Techniques for Automatic Accompaniment Gus G. Xia Dartmouth College Neukom Institute Hanover, NH, USA gxia@dartmouth.edu Roger B. Dannenberg Carnegie

More information

A Transformational Grammar Framework for Improvisation

A Transformational Grammar Framework for Improvisation A Transformational Grammar Framework for Improvisation Alexander M. Putman and Robert M. Keller Abstract Jazz improvisations can be constructed from common idioms woven over a chord progression fabric.

More information

arxiv: v1 [cs.ai] 2 Mar 2017

arxiv: v1 [cs.ai] 2 Mar 2017 Sampling Variations of Lead Sheets arxiv:1703.00760v1 [cs.ai] 2 Mar 2017 Pierre Roy, Alexandre Papadopoulos, François Pachet Sony CSL, Paris roypie@gmail.com, pachetcsl@gmail.com, alexandre.papadopoulos@lip6.fr

More information

Doctor of Philosophy

Doctor of Philosophy University of Adelaide Elder Conservatorium of Music Faculty of Humanities and Social Sciences Declarative Computer Music Programming: using Prolog to generate rule-based musical counterpoints by Robert

More information

Study Guide. Solutions to Selected Exercises. Foundations of Music and Musicianship with CD-ROM. 2nd Edition. David Damschroder

Study Guide. Solutions to Selected Exercises. Foundations of Music and Musicianship with CD-ROM. 2nd Edition. David Damschroder Study Guide Solutions to Selected Exercises Foundations of Music and Musicianship with CD-ROM 2nd Edition by David Damschroder Solutions to Selected Exercises 1 CHAPTER 1 P1-4 Do exercises a-c. Remember

More information

Introductions to Music Information Retrieval

Introductions to Music Information Retrieval Introductions to Music Information Retrieval ECE 272/472 Audio Signal Processing Bochen Li University of Rochester Wish List For music learners/performers While I play the piano, turn the page for me Tell

More information

Chapter 40: MIDI Tool

Chapter 40: MIDI Tool MIDI Tool 40-1 40: MIDI Tool MIDI Tool What it does This tool lets you edit the actual MIDI data that Finale stores with your music key velocities (how hard each note was struck), Start and Stop Times

More information

SIMSSA DB: A Database for Computational Musicological Research

SIMSSA DB: A Database for Computational Musicological Research SIMSSA DB: A Database for Computational Musicological Research Cory McKay Marianopolis College 2018 International Association of Music Libraries, Archives and Documentation Centres International Congress,

More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

arxiv: v1 [cs.sd] 8 Jun 2016

arxiv: v1 [cs.sd] 8 Jun 2016 Symbolic Music Data Version 1. arxiv:1.5v1 [cs.sd] 8 Jun 1 Christian Walder CSIRO Data1 7 London Circuit, Canberra,, Australia. christian.walder@data1.csiro.au June 9, 1 Abstract In this document, we introduce

More information

Transcription of the Singing Melody in Polyphonic Music

Transcription of the Singing Melody in Polyphonic Music Transcription of the Singing Melody in Polyphonic Music Matti Ryynänen and Anssi Klapuri Institute of Signal Processing, Tampere University Of Technology P.O.Box 553, FI-33101 Tampere, Finland {matti.ryynanen,

More information

Representing, comparing and evaluating of music files

Representing, comparing and evaluating of music files Representing, comparing and evaluating of music files Nikoleta Hrušková, Juraj Hvolka Abstract: Comparing strings is mostly used in text search and text retrieval. We used comparing of strings for music

More information

Evolutionary Computation Applied to Melody Generation

Evolutionary Computation Applied to Melody Generation Evolutionary Computation Applied to Melody Generation Matt D. Johnson December 5, 2003 Abstract In recent years, the personal computer has become an integral component in the typesetting and management

More information

Take a Break, Bach! Let Machine Learning Harmonize That Chorale For You. Chris Lewis Stanford University

Take a Break, Bach! Let Machine Learning Harmonize That Chorale For You. Chris Lewis Stanford University Take a Break, Bach! Let Machine Learning Harmonize That Chorale For You Chris Lewis Stanford University cmslewis@stanford.edu Abstract In this project, I explore the effectiveness of the Naive Bayes Classifier

More information

Algorithmic Music Composition

Algorithmic Music Composition Algorithmic Music Composition MUS-15 Jan Dreier July 6, 2015 1 Introduction The goal of algorithmic music composition is to automate the process of creating music. One wants to create pleasant music without

More information

6.UAP Project. FunPlayer: A Real-Time Speed-Adjusting Music Accompaniment System. Daryl Neubieser. May 12, 2016

6.UAP Project. FunPlayer: A Real-Time Speed-Adjusting Music Accompaniment System. Daryl Neubieser. May 12, 2016 6.UAP Project FunPlayer: A Real-Time Speed-Adjusting Music Accompaniment System Daryl Neubieser May 12, 2016 Abstract: This paper describes my implementation of a variable-speed accompaniment system that

More information

INTERACTIVE GTTM ANALYZER

INTERACTIVE GTTM ANALYZER 10th International Society for Music Information Retrieval Conference (ISMIR 2009) INTERACTIVE GTTM ANALYZER Masatoshi Hamanaka University of Tsukuba hamanaka@iit.tsukuba.ac.jp Satoshi Tojo Japan Advanced

More information

Analysis of Caprice No. 42. Throughout George Rochberg s Caprice No. 42, I hear a kind of palindrome and inverse

Analysis of Caprice No. 42. Throughout George Rochberg s Caprice No. 42, I hear a kind of palindrome and inverse Mertens 1 Ruth Elisabeth Mertens Dr. Schwarz MUTH 2500.004 6 March 2017 Analysis of Caprice No. 42 Throughout George Rochberg s Caprice No. 42, I hear a kind of palindrome and inverse effect, both in the

More information

A case based approach to expressivity-aware tempo transformation

A case based approach to expressivity-aware tempo transformation Mach Learn (2006) 65:11 37 DOI 10.1007/s1099-006-9025-9 A case based approach to expressivity-aware tempo transformation Maarten Grachten Josep-Lluís Arcos Ramon López de Mántaras Received: 23 September

More information

Bach-Prop: Modeling Bach s Harmonization Style with a Back- Propagation Network

Bach-Prop: Modeling Bach s Harmonization Style with a Back- Propagation Network Indiana Undergraduate Journal of Cognitive Science 1 (2006) 3-14 Copyright 2006 IUJCS. All rights reserved Bach-Prop: Modeling Bach s Harmonization Style with a Back- Propagation Network Rob Meyerson Cognitive

More information

2. AN INTROSPECTION OF THE MORPHING PROCESS

2. AN INTROSPECTION OF THE MORPHING PROCESS 1. INTRODUCTION Voice morphing means the transition of one speech signal into another. Like image morphing, speech morphing aims to preserve the shared characteristics of the starting and final signals,

More information

Melodic String Matching Via Interval Consolidation And Fragmentation

Melodic String Matching Via Interval Consolidation And Fragmentation Melodic String Matching Via Interval Consolidation And Fragmentation Carl Barton 1, Emilios Cambouropoulos 2, Costas S. Iliopoulos 1,3, Zsuzsanna Lipták 4 1 King's College London, Dept. of Computer Science,

More information

PERCEPTUALLY-BASED EVALUATION OF THE ERRORS USUALLY MADE WHEN AUTOMATICALLY TRANSCRIBING MUSIC

PERCEPTUALLY-BASED EVALUATION OF THE ERRORS USUALLY MADE WHEN AUTOMATICALLY TRANSCRIBING MUSIC PERCEPTUALLY-BASED EVALUATION OF THE ERRORS USUALLY MADE WHEN AUTOMATICALLY TRANSCRIBING MUSIC Adrien DANIEL, Valentin EMIYA, Bertrand DAVID TELECOM ParisTech (ENST), CNRS LTCI 46, rue Barrault, 7564 Paris

More information

AP Music Theory. Sample Student Responses and Scoring Commentary. Inside: Free Response Question 1. Scoring Guideline.

AP Music Theory. Sample Student Responses and Scoring Commentary. Inside: Free Response Question 1. Scoring Guideline. 2017 AP Music Theory Sample Student Responses and Scoring Commentary Inside: Free Response Question 1 Scoring Guideline Student Samples Scoring Commentary 2017 The College Board. College Board, Advanced

More information

Jazz Melody Generation from Recurrent Network Learning of Several Human Melodies

Jazz Melody Generation from Recurrent Network Learning of Several Human Melodies Jazz Melody Generation from Recurrent Network Learning of Several Human Melodies Judy Franklin Computer Science Department Smith College Northampton, MA 01063 Abstract Recurrent (neural) networks have

More information

Student Performance Q&A:

Student Performance Q&A: Student Performance Q&A: 2012 AP Music Theory Free-Response Questions The following comments on the 2012 free-response questions for AP Music Theory were written by the Chief Reader, Teresa Reed of the

More information

A wavelet-based approach to the discovery of themes and sections in monophonic melodies Velarde, Gissel; Meredith, David

A wavelet-based approach to the discovery of themes and sections in monophonic melodies Velarde, Gissel; Meredith, David Aalborg Universitet A wavelet-based approach to the discovery of themes and sections in monophonic melodies Velarde, Gissel; Meredith, David Publication date: 2014 Document Version Accepted author manuscript,

More information

AP MUSIC THEORY 2016 SCORING GUIDELINES

AP MUSIC THEORY 2016 SCORING GUIDELINES AP MUSIC THEORY 2016 SCORING GUIDELINES Question 1 0---9 points Always begin with the regular scoring guide. Try an alternate scoring guide only if necessary. (See I.D.) I. Regular Scoring Guide A. Award

More information

What is Statistics? 13.1 What is Statistics? Statistics

What is Statistics? 13.1 What is Statistics? Statistics 13.1 What is Statistics? What is Statistics? The collection of all outcomes, responses, measurements, or counts that are of interest. A portion or subset of the population. Statistics Is the science of

More information

SAMPLE ASSESSMENT TASKS MUSIC JAZZ ATAR YEAR 11

SAMPLE ASSESSMENT TASKS MUSIC JAZZ ATAR YEAR 11 SAMPLE ASSESSMENT TASKS MUSIC JAZZ ATAR YEAR 11 Copyright School Curriculum and Standards Authority, 2014 This document apart from any third party copyright material contained in it may be freely copied,

More information

Automatic Music Clustering using Audio Attributes

Automatic Music Clustering using Audio Attributes Automatic Music Clustering using Audio Attributes Abhishek Sen BTech (Electronics) Veermata Jijabai Technological Institute (VJTI), Mumbai, India abhishekpsen@gmail.com Abstract Music brings people together,

More information

Pitch Spelling Algorithms

Pitch Spelling Algorithms Pitch Spelling Algorithms David Meredith Centre for Computational Creativity Department of Computing City University, London dave@titanmusic.com www.titanmusic.com MaMuX Seminar IRCAM, Centre G. Pompidou,

More information

CS229 Project Report Polyphonic Piano Transcription

CS229 Project Report Polyphonic Piano Transcription CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project

More information

The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval

The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval IPEM, Dept. of musicology, Ghent University, Belgium Outline About the MAMI project Aim of the

More information

Processes for the Intersection

Processes for the Intersection 7 Timing Processes for the Intersection In Chapter 6, you studied the operation of one intersection approach and determined the value of the vehicle extension time that would extend the green for as long

More information

Automated extraction of motivic patterns and application to the analysis of Debussy s Syrinx

Automated extraction of motivic patterns and application to the analysis of Debussy s Syrinx Automated extraction of motivic patterns and application to the analysis of Debussy s Syrinx Olivier Lartillot University of Jyväskylä, Finland lartillo@campus.jyu.fi 1. General Framework 1.1. Motivic

More information

Notes on David Temperley s What s Key for Key? The Krumhansl-Schmuckler Key-Finding Algorithm Reconsidered By Carley Tanoue

Notes on David Temperley s What s Key for Key? The Krumhansl-Schmuckler Key-Finding Algorithm Reconsidered By Carley Tanoue Notes on David Temperley s What s Key for Key? The Krumhansl-Schmuckler Key-Finding Algorithm Reconsidered By Carley Tanoue I. Intro A. Key is an essential aspect of Western music. 1. Key provides the

More information

Popular Music Theory Syllabus Guide

Popular Music Theory Syllabus Guide Popular Music Theory Syllabus Guide 2015-2018 www.rockschool.co.uk v1.0 Table of Contents 3 Introduction 6 Debut 9 Grade 1 12 Grade 2 15 Grade 3 18 Grade 4 21 Grade 5 24 Grade 6 27 Grade 7 30 Grade 8 33

More information

Measuring Musical Rhythm Similarity: Further Experiments with the Many-to-Many Minimum-Weight Matching Distance

Measuring Musical Rhythm Similarity: Further Experiments with the Many-to-Many Minimum-Weight Matching Distance Journal of Computer and Communications, 2016, 4, 117-125 http://www.scirp.org/journal/jcc ISSN Online: 2327-5227 ISSN Print: 2327-5219 Measuring Musical Rhythm Similarity: Further Experiments with the

More information

Chords not required: Incorporating horizontal and vertical aspects independently in a computer improvisation algorithm

Chords not required: Incorporating horizontal and vertical aspects independently in a computer improvisation algorithm Georgia State University ScholarWorks @ Georgia State University Music Faculty Publications School of Music 2013 Chords not required: Incorporating horizontal and vertical aspects independently in a computer

More information

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM A QUER B EAMPLE MUSIC RETRIEVAL ALGORITHM H. HARB AND L. CHEN Maths-Info department, Ecole Centrale de Lyon. 36, av. Guy de Collongue, 69134, Ecully, France, EUROPE E-mail: {hadi.harb, liming.chen}@ec-lyon.fr

More information

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC Vishweshwara Rao, Sachin Pant, Madhumita Bhaskar and Preeti Rao Department of Electrical Engineering, IIT Bombay {vishu, sachinp,

More information

Building a Better Bach with Markov Chains

Building a Better Bach with Markov Chains Building a Better Bach with Markov Chains CS701 Implementation Project, Timothy Crocker December 18, 2015 1 Abstract For my implementation project, I explored the field of algorithmic music composition

More information

Algorithmic Composition: The Music of Mathematics

Algorithmic Composition: The Music of Mathematics Algorithmic Composition: The Music of Mathematics Carlo J. Anselmo 18 and Marcus Pendergrass Department of Mathematics, Hampden-Sydney College, Hampden-Sydney, VA 23943 ABSTRACT We report on several techniques

More information

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca

More information

Melody Retrieval On The Web

Melody Retrieval On The Web Melody Retrieval On The Web Thesis proposal for the degree of Master of Science at the Massachusetts Institute of Technology M.I.T Media Laboratory Fall 2000 Thesis supervisor: Barry Vercoe Professor,

More information

Jazz Melody Generation and Recognition

Jazz Melody Generation and Recognition Jazz Melody Generation and Recognition Joseph Victor December 14, 2012 Introduction In this project, we attempt to use machine learning methods to study jazz solos. The reason we study jazz in particular

More information

Pitch correction on the human voice

Pitch correction on the human voice University of Arkansas, Fayetteville ScholarWorks@UARK Computer Science and Computer Engineering Undergraduate Honors Theses Computer Science and Computer Engineering 5-2008 Pitch correction on the human

More information

Music and Mathematics: On Symmetry

Music and Mathematics: On Symmetry Music and Mathematics: On Symmetry Monday, February 11th, 2019 Introduction What role does symmetry play in aesthetics? Is symmetrical art more beautiful than asymmetrical art? Is music that contains symmetries

More information

Searching digital music libraries

Searching digital music libraries Searching digital music libraries David Bainbridge, Michael Dewsnip, and Ian Witten Department of Computer Science University of Waikato Hamilton New Zealand Abstract. There has been a recent explosion

More information

METHOD TO DETECT GTTM LOCAL GROUPING BOUNDARIES BASED ON CLUSTERING AND STATISTICAL LEARNING

METHOD TO DETECT GTTM LOCAL GROUPING BOUNDARIES BASED ON CLUSTERING AND STATISTICAL LEARNING Proceedings ICMC SMC 24 4-2 September 24, Athens, Greece METHOD TO DETECT GTTM LOCAL GROUPING BOUNDARIES BASED ON CLUSTERING AND STATISTICAL LEARNING Kouhei Kanamori Masatoshi Hamanaka Junichi Hoshino

More information

Computer Coordination With Popular Music: A New Research Agenda 1

Computer Coordination With Popular Music: A New Research Agenda 1 Computer Coordination With Popular Music: A New Research Agenda 1 Roger B. Dannenberg roger.dannenberg@cs.cmu.edu http://www.cs.cmu.edu/~rbd School of Computer Science Carnegie Mellon University Pittsburgh,

More information

Course Report Level National 5

Course Report Level National 5 Course Report 2018 Subject Music Level National 5 This report provides information on the performance of candidates. Teachers, lecturers and assessors may find it useful when preparing candidates for future

More information

Singer Traits Identification using Deep Neural Network

Singer Traits Identification using Deep Neural Network Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic

More information

The dangers of parsimony in query-by-humming applications

The dangers of parsimony in query-by-humming applications The dangers of parsimony in query-by-humming applications Colin Meek University of Michigan Beal Avenue Ann Arbor MI 489 USA meek@umich.edu William P. Birmingham University of Michigan Beal Avenue Ann

More information

Music Segmentation Using Markov Chain Methods

Music Segmentation Using Markov Chain Methods Music Segmentation Using Markov Chain Methods Paul Finkelstein March 8, 2011 Abstract This paper will present just how far the use of Markov Chains has spread in the 21 st century. We will explain some

More information

Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies Janssen, B.D.; Burgoyne, J.A.; Honing, H.J.

Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies Janssen, B.D.; Burgoyne, J.A.; Honing, H.J. UvA-DARE (Digital Academic Repository) Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies Janssen, B.D.; Burgoyne, J.A.; Honing, H.J. Published in: Frontiers in

More information

GENERAL WRITING FORMAT

GENERAL WRITING FORMAT GENERAL WRITING FORMAT The doctoral dissertation should be written in a uniform and coherent manner. Below is the guideline for the standard format of a doctoral research paper: I. General Presentation

More information

Semi-supervised Musical Instrument Recognition

Semi-supervised Musical Instrument Recognition Semi-supervised Musical Instrument Recognition Master s Thesis Presentation Aleksandr Diment 1 1 Tampere niversity of Technology, Finland Supervisors: Adj.Prof. Tuomas Virtanen, MSc Toni Heittola 17 May

More information

Hidden Markov Model based dance recognition

Hidden Markov Model based dance recognition Hidden Markov Model based dance recognition Dragutin Hrenek, Nenad Mikša, Robert Perica, Pavle Prentašić and Boris Trubić University of Zagreb, Faculty of Electrical Engineering and Computing Unska 3,

More information

DAT335 Music Perception and Cognition Cogswell Polytechnical College Spring Week 6 Class Notes

DAT335 Music Perception and Cognition Cogswell Polytechnical College Spring Week 6 Class Notes DAT335 Music Perception and Cognition Cogswell Polytechnical College Spring 2009 Week 6 Class Notes Pitch Perception Introduction Pitch may be described as that attribute of auditory sensation in terms

More information

Baylor College of Medicine The Graduate School of Biomedical Sciences

Baylor College of Medicine The Graduate School of Biomedical Sciences Baylor College of Medicine The Graduate School of Biomedical Sciences Instructions for Formatting and Submitting the M.S. Thesis 1. The best guide for formatting your thesis is a journal to which the work

More information

An Approach to Classifying Four-Part Music

An Approach to Classifying Four-Part Music An Approach to Classifying Four-Part Music Gregory Doerfler, Robert Beck Department of Computing Sciences Villanova University, Villanova PA 19085 gdoerf01@villanova.edu Abstract - Four-Part Classifier

More information

Music Representations

Music Representations Lecture Music Processing Music Representations Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Book: Fundamentals of Music Processing Meinard Müller Fundamentals

More information

SAMPLE ASSESSMENT TASKS MUSIC GENERAL YEAR 12

SAMPLE ASSESSMENT TASKS MUSIC GENERAL YEAR 12 SAMPLE ASSESSMENT TASKS MUSIC GENERAL YEAR 12 Copyright School Curriculum and Standards Authority, 2015 This document apart from any third party copyright material contained in it may be freely copied,

More information

Speaking in Minor and Major Keys

Speaking in Minor and Major Keys Chapter 5 Speaking in Minor and Major Keys 5.1. Introduction 28 The prosodic phenomena discussed in the foregoing chapters were all instances of linguistic prosody. Prosody, however, also involves extra-linguistic

More information

AP Music Theory 1999 Scoring Guidelines

AP Music Theory 1999 Scoring Guidelines AP Music Theory 1999 Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought

More information

A Comparison of Methods to Construct an Optimal Membership Function in a Fuzzy Database System

A Comparison of Methods to Construct an Optimal Membership Function in a Fuzzy Database System Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2006 A Comparison of Methods to Construct an Optimal Membership Function in a Fuzzy Database System Joanne

More information

Introduction to capella 8

Introduction to capella 8 Introduction to capella 8 p Dear user, in eleven steps the following course makes you familiar with the basic functions of capella 8. This introduction addresses users who now start to work with capella

More information

Characterization and improvement of unpatterned wafer defect review on SEMs

Characterization and improvement of unpatterned wafer defect review on SEMs Characterization and improvement of unpatterned wafer defect review on SEMs Alan S. Parkes *, Zane Marek ** JEOL USA, Inc. 11 Dearborn Road, Peabody, MA 01960 ABSTRACT Defect Scatter Analysis (DSA) provides

More information