arxiv: v1 [cs.sd] 8 Jun 2016


Symbolic Music Data Version 1.0

Christian Walder
CSIRO Data61
7 London Circuit, Canberra, Australia
christian.walder@data61.csiro.au

June 9, 2016

Abstract

In this document, we introduce a new dataset designed for training machine learning models of symbolic music data. Five datasets are provided, one of which is from a newly collected corpus of midi files. We describe our preprocessing and cleaning pipeline, which includes the exclusion of a number of files based on scores from a previously developed probabilistic machine learning model. We also define training, testing and validation splits for the new dataset, based on a clustering scheme which we also describe. Some simple histograms are included.

1 Introduction

In this appendix we provide an overview of the symbolic music datasets we offer in pre-processed form¹. Note that the source of four out of five of these datasets is the same set of midi files used in [BLBV1], which also provides pre-processed data. That work provided piano roll representations, which essentially consist of a regular temporal grid (of period one eighth note) of on/off indicators for each midi note number. While the piano roll is an excellent simplified music format for early investigations into symbolic music modelling, it does have several limitations, as discussed in previous work [Wal1]. To name one such limitation, the piano roll format does not explicitly represent note endings, and therefore cannot differentiate between, say, two successive eighth notes and a single quarter note.

To address these limitations, we have extracted additional information from the same set of midi files. Our goal is to represent the performance (or sounding) of notes by when they begin and end, rather than by whether they are sounding or not at each time on a regular temporal grid.
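The piano roll limitation named above can be illustrated with a small sketch. This is not the code used to build the dataset: the helper `to_piano_roll` and the `(midi_note_number, start, end)` tuples are hypothetical, and times are counted in eighth notes to match the grid period mentioned in the text.

```python
from typing import List, Set, Tuple

# Hypothetical note type for illustration only:
# (midi_note_number, start, end), with times measured in eighth notes.
Note = Tuple[int, int, int]


def to_piano_roll(notes: List[Note], n_steps: int) -> List[Set[int]]:
    """Return, for each grid step, the set of midi note numbers sounding."""
    roll: List[Set[int]] = [set() for _ in range(n_steps)]
    for pitch, start, end in notes:
        for step in range(start, min(end, n_steps)):
            roll[step].add(pitch)
    return roll


two_eighths = [(60, 0, 1), (60, 1, 2)]  # two successive eighth notes
one_quarter = [(60, 0, 2)]              # a single quarter note

# The on/off grids are identical, while the begin/end events differ.
same_roll = to_piano_roll(two_eighths, 2) == to_piano_roll(one_quarter, 2)
same_events = two_eighths == one_quarter
print(same_roll, same_events)  # prints: True False
```

Both inputs switch the same grid cells on, so only a representation that records note beginnings and endings keeps them apart.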
The representation we adopt consists of sets of five-tuples of integers representing the:

- piece number (corresponding to a midi file),
- track (or part) number, defined by the midi channel in which the note event occurs,
- midi note number, ranging 0-127 according to the midi standard, and 1-11 inclusive for the data we consider here,
- note start time, in ticks (1 beat = one quarter note),
- note end time, also in ticks.

¹ The data is available for download here: http://bit.ly/1pqntj

Dataset  Long Name                Source          Total Pieces  Midi Resolution
PMD      piano-midi.de            [PE7, BLBV1]    1             8
JSB      J.S. Bach Chorales       [AW5, BLBV1]    38            1
MUS      MuseData                 [mus, BLBV1]    3
NOT      Nottingham               [not, BLBV1]    137           8
CMA      Classical Midi Archives  [cla] (new)     197           variable

Table 1: A summary of the datasets used in this study.

This document provides some background on the data, with a special emphasis on our new, relatively large dataset, which we derived from an archive kindly provided to us by Pierre Schwob of http://www.classicalarchives.com. We are permitted to release this data in the form we provide, but not to provide the original midi files. Please refer to the data archive itself¹ for a detailed description of the format. A summary of the five datasets is provided in Table 1.

2 Preprocessing

We applied the following processing steps and filters to the raw midi data.

- Combination of piano sustain pedal signals with key press information to obtain equivalent individual note on/off events.
- Removal of duplicate/overlapping notes which occur on the same midi channel (while not technically allowed, this still occurs in real midi data due to the permissive nature of the midi file format). Unfortunately, this step is ill posed, and different midi software packages handle it differently. Our approach involves processing notes sequentially in order of start time, and ignoring those note events that overlap a previously added note event.
- Removal of midi channels with fewer than two note events (these occurred in the MUS and CMA datasets, and were always information tracks containing authorship information, acknowledgements, etc.).

- Removal of percussion tracks. These occurred in some of the Haydn symphonies and Bach Cantatas contained in the MUS dataset, as well as in the CMA dataset. It is important to filter these as the percussion instruments are not necessarily pitched, and hence the note numbers in these tracks are not comparable with those of pitched instruments, which we aim to model.
- Re-sampling of the timing information to a common resolution in ticks per quarter note, chosen as the lowest common multiple of the original midi file resolutions (see Table 1) for the four datasets considered in [BLBV1]. We accept some quantization error for some of the CMA files, although this resolution is already an unusually fine grained midi quantization (cf. the resolutions of the other datasets, in Table 1).

For our new CMA dataset, we also removed a number of the midi files due to their suspect nature. We did this by assigning a heuristic score to each file and ranking the files by that score. The score was computed by first training our model [Wal1] on the union of the four (transposed) datasets, JSB, PMD, NOT and MUS. We then computed the negative log-probability of each midi note number in the raw CMA data under the aforementioned model. Finally, we defined our heuristic score as, for each file, the mean of these negative log probabilities plus the standard error. The scores we obtained in this way are depicted in Figure 1. A listening test on the best and worst files verified the effectiveness of this measure. In any case, some degree of noise is to be expected in a data set of this size, and should be handled by subsequent modelling efforts.

3 Splits

The four datasets used in [BLBV1] retain the original training, testing, and validation splits used in that work. For CMA, we took a careful approach to data splitting. The main issue was data duplicates, since the midi archive we were provided contained multiple versions of several pieces, each encoded slightly differently by a different transcriber.
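For reference, the heuristic filtering score described in the preprocessing section (the per-file mean of the negative log probabilities plus the standard error) might be computed along the following lines. This is a minimal sketch, not the author's code: the function name `quality_score` and the example values are made up, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of notes, since the text does not specify the estimator.

```python
import math
from typing import Sequence


def quality_score(neg_log_probs: Sequence[float]) -> float:
    """Mean of per-note negative log probabilities plus the standard error.

    Assumption: standard error = sample standard deviation (n - 1
    denominator) divided by sqrt(n).
    """
    n = len(neg_log_probs)
    if n < 2:
        raise ValueError("need at least two notes to estimate a standard error")
    mean = sum(neg_log_probs) / n
    variance = sum((x - mean) ** 2 for x in neg_log_probs) / (n - 1)
    return mean + math.sqrt(variance / n)


# Made-up per-note values for two hypothetical files.
scores = {
    "file_a": quality_score([2.0, 2.0, 2.0, 2.0]),
    "file_b": quality_score([1.0, 3.0, 1.0, 3.0]),
}

# Rank files from lowest to highest score; high scores are the suspect ones.
ranked = sorted(scores, key=scores.get)
```

Under this reading, two files with the same mean are separated by how variable their per-note values are, so the more erratic file ranks as more suspect.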
To reduce the statistical dependence between the train/test/validation splits of the CMA set, we used the following methodology:

1. We computed a simple signature vector for each file, which consisted of the concatenation of two vectors. The first was the normalised histogram of midi note numbers in the file. For the second vector, we quantized the event durations into a set of sensible bins, and computed a normalised histogram of the resulting quantised durations.

2. Given the signature vectors associated with each file, we performed hierarchical clustering using the function scipy.cluster.hierarchy.dendrogram from the python scipy library². We then ordered the files by traversing the resulting hierarchy in a depth first fashion.

² https://www.scipy.org

3. Given the above ordering, we took contiguous chunks of 15,7, 1,97 and 1,97 files for the train, test, validation sets, respectively. This leads to a similar ratio of split sizes as in [BLBV1].

Figure 1: Our filtering score for the original midi files provided by the website http://www.classicalarchives.com. We kept the top 19,7, discarding files with a score greater than 3.9. (Plot title: Sorted Midi File Quality Score; x-axis: file index; y-axis: score = mean + standard error of per-note negative log probability.)

4 Basic Exploratory Plots

We provide some basic exploratory plots in Figures 2 to 6. The Note Distribution and Number of Notes Per Piece plots are self explanatory. Note that the Number of Parts Per Piece (lower left sub-figure) is fixed at one for the entire JSB dataset. This is due to an unfortunate lack of midi track information in those files, many of which are in fact four part harmonies. The pieces in the NOT dataset feature either one part (in the case of pure melodies) or two (in the case of melodies with associated chord accompaniments). The PMD dataset features up to six parts (for a three-part Bach fugue in which left and right hands are tracked separately). MUS features up to 7 parts (for Bach's St. Matthew's Passion). The CMA data features two pieces with the largest number of parts: Ravel's Valses Nobles et Sentimentales, and Venus, by Gustav Holst.

The least obvious sub-figures are those on the lower right, labeled Peak Polyphonicity Per Piece. Polyphonicity simply refers to the number of simultaneously sounding notes, and this number can be rather high. For the PMD data, this is mainly attributable to musical runs which are performed with the piano sustain pedal depressed, for example in some of the Liszt pieces.

For the MUS data, this is mainly due to the inclusion of large orchestral works which feature many instruments. The CMA data, of course, contains both of the aforementioned sources of high levels of polyphonicity.

Acknowledgements

Special thanks to Pierre Schwob of http://www.classicalarchives.com, who permitted us to release the data in the form we describe.

References

[AW5] M. Allan and C. K. I. Williams, Harmonising Chorales by Probabilistic Inference, Advances in Neural Information Processing Systems 17 (2005).

[BLBV1] N. Boulanger-Lewandowski, Y. Bengio and P. Vincent, Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription, in Proceedings of the Twenty-ninth International Conference on Machine Learning (ICML 2012), ACM, 2012.

[cla] www.classicalarchives.com.

[mus] www.musedata.org.

[not] ifdo.ca/~seymour/nottingham/nottingham.html.

[PE7] G. E. Poliner and D. P. W. Ellis, A Discriminative Model for Polyphonic Piano Transcription, EURASIP J. Adv. Sig. Proc. 2007 (2007).

[Wal1] C. Walder, Modelling Symbolic Music: Beyond the Piano Roll, arxiv (2016), 1.138.

Figure 2: Summary of the PMD dataset. (Panels: Note Distribution; Number of Notes Per Piece; Number of Parts Per Piece; Peak Polyphonicity Per Piece.)

Figure 3: Summary of the JSB dataset. (Panels: Note Distribution; Number of Notes Per Piece; Number of Parts Per Piece; Peak Polyphonicity Per Piece.)

Figure 4: Summary of the NOT dataset. (Panels: Note Distribution; Number of Notes Per Piece; Number of Parts Per Piece; Peak Polyphonicity Per Piece.)

Figure 5: Summary of the MUS dataset. (Panels: Note Distribution; Number of Notes Per Piece; Number of Parts Per Piece; Peak Polyphonicity Per Piece.)

Figure 6: Summary of the CMA dataset. Note the log scale on three of the plots. (Panels: Note Distribution (Entire Dataset); Number of Notes by Piece; Number of Parts by Piece; Peak Polyphonicity by Piece.)
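The peak polyphonicity statistic plotted in the figure summaries (the largest number of simultaneously sounding notes in a piece) can be obtained from note start and end times with a simple sweep. The sketch below is illustrative only: the function name and the example notes are hypothetical, and it assumes that a note ending at a given tick does not overlap a note starting at that same tick.

```python
from typing import Iterable, List, Tuple


def peak_polyphonicity(notes: Iterable[Tuple[int, int]]) -> int:
    """Largest number of notes sounding at the same time.

    Each note is a (start_tick, end_tick) pair. Assumption: a note that
    ends at tick t does not overlap a note that starts at tick t.
    """
    events: List[Tuple[int, int]] = []
    for start, end in notes:
        events.append((start, 1))   # a note begins
        events.append((end, -1))    # a note ends
    # Tuple sorting places -1 before +1 at equal ticks, so endings are
    # processed before beginnings that share the same tick.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# A three-note chord followed by a single note.
chord_then_note = [(0, 10), (0, 10), (0, 10), (10, 20)]
# A four-note run in which every note is held to the same end time,
# as happens when the sustain pedal is depressed.
sustained_run = [(0, 40), (10, 40), (20, 40), (30, 40)]

print(peak_polyphonicity(chord_then_note))  # prints: 3
print(peak_polyphonicity(sustained_run))    # prints: 4
```

The second example mirrors the pedal effect described for the PMD data: the notes of a run are played one at a time, yet all of them are sounding together by the end.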