CADENCE DETECTION IN WESTERN TRADITIONAL STANZAIC SONGS USING MELODIC AND TEXTUAL FEATURES

Peter van Kranenburg, Folgert Karsdorp
Meertens Institute, Amsterdam, Netherlands
{peter.van.kranenburg,folgert.karsdorp}@meertens.knaw.nl

ABSTRACT

Many Western songs are hierarchically structured in stanzas and phrases. The melody of the song is repeated for each stanza, while the lyrics vary. Each stanza is subdivided into phrases. It is to be expected that melodic and textual formulas at the end of the phrases offer intrinsic clues of closure to a listener or singer. In the current paper we aim at a method to detect such cadences in symbolically encoded folk songs. We take a trigram approach in which we classify trigrams of notes and pitches as cadential or as non-cadential. We use pitch, contour, rhythmic, textual, and contextual features, and a group of features based on the conditions of closure as stated by Narmour [11]. We employ a random forest classification algorithm. The precision of the classifier is considerably improved by taking the class labels of adjacent trigrams into account. An ablation study shows that none of the kinds of features is sufficient to account for good classification, while some of the groups perform moderately well on their own.

1. INTRODUCTION

This paper presents both a method to detect cadences in Western folk songs, particularly in folk songs from Dutch oral tradition, and a study of the importance of various musical parameters for cadence detection.

There are various reasons to focus specifically on cadence patterns. The concept of cadence has played a major role in the study of Western folk songs. In several of the most important folk song classification systems, cadence tones are among the primary features that are used to put the melodies into a linear ordering. In one of the earliest classification systems, devised by Ilmari Krohn [10], melodies are firstly ordered according to the number of phrases, and secondly according to the sequence of cadence tones. This method was adapted for Hungarian melodies by Bartók and Kodály [16], and later on for German folk songs by Suppan and Stief [17] in their monumental Melodietypen des Deutschen Volksgesanges.
Bronson [3] introduced a number of features for the study of Anglo-American folk song melodies, of which final cadence and mid-cadence are the most prominent ones. One of the underlying assumptions is that the sequence of cadence tones is relatively stable in the process of oral transmission. Thus, variants of the same melody are expected to end up near to each other in the resulting ordering.

Peter van Kranenburg, Folgert Karsdorp. Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Attribution: Peter van Kranenburg, Folgert Karsdorp. "Cadence Detection in Western Traditional Stanzaic Songs using Melodic and Textual Features", 15th International Society for Music Information Retrieval Conference, 2014.

From a cognitive point of view, the perception of closure is of fundamental importance for a listener or singer to understand a melody. In terms of expectation [8, 11], a final cadence implies no continuation at all. It is to be expected that specific features of the songs that are related to closure show different values for cadential patterns as compared to non-cadential patterns. We include a subset of features that are based on the conditions of closure as stated by Narmour [11, p.11].

Cadence detection is related to the problem of segmentation, which is relevant for Music Information Retrieval [21]. Most segmentation methods for symbolically represented melodies are either based on pre-defined rules [4, 18] or on statistical learning [1, 9, 12]. In the current paper, we focus on the musical properties of cadence formulas rather than on the task of segmentation as such.

Taking Dutch folk songs as case study, we investigate whether it is possible to derive a general model of the melodic patterns or formulas that specifically indicate melodic cadences using both melodic and textual features. To address this question, we take a computational approach by employing a random forest classifier (Sections 5 and 6).

To investigate which musical parameters are of importance for cadence detection, we perform an ablation study in which we remove certain types of features in turn in order to evaluate the importance of the various kinds of features (Section 7).

2. DATA

We perform all our experiments on the folk song collection from the Meertens Tune Collections (MTC-FS, version 1.0; available from http://www.liederenbank.nl/mtc), which is a set of 4,120 symbolically encoded Dutch folk songs. Roughly half of it consists of transcriptions from field recordings that were made in the Netherlands during the 20th century. The other half is taken from song books that contain repertoire that is directly related to the recordings. Thus, we have a coherent collection of songs that reflects Dutch everyday song culture in the early 20th century. Virtually all of these songs have a stanzaic structure. Each stanza repeats the melody, and each stanza consists of a number of phrases. Both in the transcriptions and in the song books, phrase endings are indicated. Figure 1 shows a typical song from the collection.

The language of the songs is standard Dutch with occasionally some dialect words or nonsense syllables. All songs were digitally encoded by hand at the Meertens Institute (Amsterdam) and are available in Humdrum **kern format. The phrase endings were encoded as well and are available for computational analysis and modeling.

3. OUR APPROACH

Our general approach is to isolate trigrams from the melodies and to label those as either cadential or non-cadential. A cadential trigram is the last trigram in a phrase. We compare two kinds of trigrams: trigrams of successive notes (note-trigrams), and trigrams of successive pitches (pitch-trigrams), considering repeated pitches as one event. In the case of pitch-trigrams, a cadence pattern always consists of the three last unique pitches of the phrase. There are two reasons for including pitch-trigrams. First, pitch repetition is often caused by the need to place the right number of syllables to the melody. It occurs that a quarter note in one stanza corresponds to two eighth notes in another stanza because there is an extra syllable at that spot in the song text. Second, in models of closure in melody [11, 15] successions of pitches are of primary importance.

Figure 1 depicts all pitch-trigrams in the presented melody. The trigram that ends on the final note of a phrase is a cadential trigram. These are indicated in bold. Some cadential trigrams cross a phrase boundary when the next phrase starts with the same pitch.

From each trigram we extract a number of feature values that reflect both melodic and textual properties. We then perform a classification experiment using a Random Forest Classifier [2]. This approach can be regarded as a bag-of-trigrams approach, where each prediction is done independently of the others, i.e. all sequential information is lost. Therefore, as a next step we take the labels of the direct neighboring trigrams into account as well. The final classification is then based on a majority vote of the predicted labels of adjacent trigrams.
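As a minimal sketch of the trigram extraction described above, the following Python fragment builds note-trigrams and pitch-trigrams from a melody given as a list of phrases, and marks a trigram as cadential when it ends on a phrase-final note. The input encoding (one list of MIDI pitch numbers per phrase) and the function names are assumptions made for illustration; they are not taken from the authors' music21-based code.

```python
from typing import List, Tuple

Phrase = List[int]  # hypothetical encoding: one MIDI pitch number per note
Labeled = Tuple[Tuple[int, int, int], bool]  # (trigram, is_cadential)


def note_trigrams(phrases: List[Phrase]) -> List[Labeled]:
    """Trigrams of successive notes; cadential if ending on a phrase-final note."""
    notes = [pitch for phrase in phrases for pitch in phrase]
    # Positions of phrase-final notes in the flattened melody.
    ends, pos = set(), -1
    for phrase in phrases:
        pos += len(phrase)
        ends.add(pos)
    return [
        ((notes[i], notes[i + 1], notes[i + 2]), (i + 2) in ends)
        for i in range(len(notes) - 2)
    ]


def pitch_trigrams(phrases: List[Phrase]) -> List[Labeled]:
    """Trigrams of successive pitches, with repeated pitches merged into one event."""
    events = []  # each event: [pitch, contains_a_phrase_final_note]
    for phrase in phrases:
        for k, pitch in enumerate(phrase):
            is_end = k == len(phrase) - 1
            if events and events[-1][0] == pitch:
                # Same pitch as the previous note: extend the current event,
                # also across a phrase boundary.
                events[-1][1] = events[-1][1] or is_end
            else:
                events.append([pitch, is_end])
    return [
        ((events[i][0], events[i + 1][0], events[i + 2][0]), events[i + 2][1])
        for i in range(len(events) - 2)
    ]


# Two short phrases; the second starts on the pitch that ends the first.
melody = [[67, 60, 64, 64], [64, 62, 60]]
note_result = note_trigrams(melody)
pitch_result = pitch_trigrams(melody)
```

In this toy melody the pitch event 64 spans the phrase boundary, so the pitch-trigram (67, 60, 64) is labeled cadential even though its last event continues into the next phrase, which mirrors the boundary-crossing case mentioned in the text.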
These steps will be explained in detail in the next sections.

Figure 1. Examples of pitch-trigrams. The cadential trigrams are indicated in bold.

4. FEATURES

We represent each trigram as a vector of feature values. We measure several basic properties of the individual pitches and of the pattern as a whole. The code to automatically extract the feature values was written in Python, using the music21 toolbox [5]. The features are divided into groups that are related to distinct properties of the songs. Some features occur in more than one group. The following overview shows all features and, in parentheses, the value for the first trigram in Figure 1. Detailed explanations are provided in sections 4.1 and 4.2.

Pitch Features
  Scale degree: Scale degrees of the first, second, and third item (5, 1, 3).
  Range: Difference between highest and lowest pitch (4).
  Has contrast third: Whether there are both even and odd scale degrees in the trigram (False).

Contour Features
  Contains leap: Whether there is a leap in the trigram (True).
  Is ascending: Whether the first interval, the second interval, and both intervals are ascending (False, True, False).
  Is descending: Whether the first interval, the second interval, and both intervals are descending (True, False, False).
  Large-small: Whether the first interval is large and the second is small (True).
  Registral change: Whether there is a change in direction between the first and the second interval (True).

Rhythmic Features
  Beat strength: The metric weights of the first, second and third item (0.25, 1.0, 0.25).
  Min beat strength: The smallest metric weight (0.25).
  Next is rest: Whether a rest follows the first, second and third item (False, False, False).
  Short-long: Whether the second item is longer than the first, and the third is longer than the second (False, False).
  Meter: The meter at the beginning of the trigram ("6/8").

Textual Features
  Rhymes: Whether a rhyme word ends at the first, second and third item (False, False, False).
  Word stress: Whether a stressed syllable is at the first, second and third item (True, True, True).
  Distance to last rhyme: Number of notes between the first, second and third item and the last rhyme word or beginning of the melody (0, 1, 2).

Narmour Closure Features
  Beat strength: The metric weights of the first, second and third item (0.25, 1.0, 0.25).
  Next is rest: Whether a rest follows the first, second and third item (False, False, False).
  Short-long: Whether the second item is longer than the first, and the third is longer than the second (False, False).
  Large-small: Whether the first interval is large (≥ fifth) and the second is small (≤ third) (True).
  Registral change: Whether there is a change in direction between the first and the second interval (True).

Contextual Features
  Next is rest third: Whether a rest or end of melody follows the third item (False).
  Distance to last rhyme: Number of notes between the first, second and third item and the last rhyme word or beginning of the melody (0, 1, 2).

4.1 Melodic Features

Several of the features need some explanation. In this section we describe the melodic features, while in the next section, we explain how we extracted the textual features.

HasContrastThird is based on the theory of Jos Smits van Waesberghe [15], the core idea of which is that a melody gets its tension and interest by alternating between pitches with even and uneven scale degrees, which are two contrasting series of thirds.

The metric weight in the Rhythmic features is the beat strength as implemented in music21's meter model.

The Narmour features are based on the six (preliminary) conditions of closure that Narmour states at the beginning of his first book on the Implication-Realisation theory [11, p.11]:

  "[...] melodic closure on some level occurs when
  1. a rest, an onset of another structure, or a repetition interrupts an implied pattern;
  2. metric emphasis is strong;
  3. consonance resolves dissonance;
  4. duration moves cumulatively (short note to long note);
  5. intervallic motion moves from large interval to small interval;
  6. registral direction changes (up to down, down to up, lateral to up, lateral to down, up to lateral, or down to lateral).
  Of course, these six may appear in any combination."

Because the melodies are monophonic, condition 3 has no counterpart in our feature set.

The contextual features are not features of the trigram in isolation, but are related to the position in the melody. In an initial experiment we found that the distance between the first note of the trigram and the last cadence is an important predictor for the next cadence. Since this is based on the ground-truth label, we cannot include it directly into our feature set. Since we expect rhyme in the text to have a strong relation with cadence in the melody, we include the distance to the last rhyme word in number of notes.

Figure 2. Rhyme as detected by our method. The first line shows the original text after removing non-content words. The second line shows the phonological representations of the words (in SAMPA notation). The third line shows whether rhyme is detected ("True" if a rhyme word ends at the corresponding note).

4.2 Textual Features

In many poetical texts, phrase boundaries are determined by sequences of rhyme. These establish a structure in a text, both for aesthetic pleasure and memory aid [14].
In folk music, phrasal boundaries established by sequences of rhyme are likely to relate to phrases in the melody.

We developed a rhyme detection system which allows us to extract these sequences of rhyming lyrics. Because of orthographical ambiguities (e.g. cruise, where /u:/ is represented by ui whereas in muse it is represented by u), it is not as straightforward to perform rhyme detection on orthographical representations of words. Therefore, we transform each word into its phonological representation (e.g. cruise becomes /kru:z/ and bike /baik/).

Figure 3. Example sliding window for phoneme classification.

We approach the problem of phonemicization as a supervised classification task, where we try to predict for each character in a given word its corresponding phoneme. We take a sliding window-based approach where for each focus character (i.e. the character for which we want to predict its phonemic representation) we extract as features n characters to the left of the focus character, n characters to the right, and the focus character itself. Figure 3 provides a graphical representation of the feature vectors extracted for the word cruise. The fourth column represents the focus character with a context of three characters before and three after the focus character. The last column represents the target phonemes which we would like to predict. Note that the first target phoneme in Figure 3 is preceded by an apostrophe ('k), which represents the stress position on the first (and only) syllable in cruise. This symbolic notation of stress in combination with phonology allows us to simultaneously extract a phonological representation of the input words as well as their stress patterns.

For all words in the lyrics in the dataset we apply our sliding window approach with n = 5, which serves as input for the supervised classifier. In this paper we make use of a k = 1 Nearest Neighbor Classifier as implemented by [6] using default settings, which was trained on the data of the e-Lex database (see footnote 2). In the running text of our lyrics, 89.5% of the words have a direct hit in the instance base, and for the remaining words in many cases suitable nearest neighbors were found.
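The sliding-window instances and the k = 1 lookup can be sketched as follows. The padding symbol, the toy instance base (six aligned character/phoneme pairs for cruise, with "0" standing for a character that maps to no phoneme) and the unweighted overlap distance are assumptions for illustration only; the paper itself relies on the TiMBL implementation [6] with default settings, trained on e-Lex.

```python
from typing import List, Sequence, Tuple

Window = Tuple[str, ...]


def window_features(word: str, n: int = 3, pad: str = "_") -> List[Window]:
    """One instance per character: n characters left, the focus, n characters right."""
    padded = pad * n + word + pad * n
    return [tuple(padded[i:i + 2 * n + 1]) for i in range(len(word))]


def overlap_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of window positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_1nn(instance: Window, instance_base: List[Tuple[Window, str]]) -> str:
    """Return the phoneme stored with the nearest instance (k = 1)."""
    return min(instance_base, key=lambda ex: overlap_distance(instance, ex[0]))[1]


# Toy instance base: assumed alignment of "cruise" with /'kru:z/.
targets = ["'k", "r", "u:", "0", "z", "0"]
instance_base = list(zip(window_features("cruise"), targets))

# A word that is already in the instance base is a direct hit (distance 0).
phonemes = [predict_1nn(w, instance_base) for w in window_features("cruise")]
print("".join(p for p in phonemes if p != "0"))  # prints 'kru:z
```

With n = 5, as used for the lyrics, each instance would hold eleven characters instead of seven; the rest of the sketch is unchanged.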
Therefore, we consider the phonemicization sufficiently reliable.

We assume that only content words (nouns, adjectives, verbs and adverbials) are possible candidate rhyme words. This assumption follows linguistic knowledge, as phrases typically do not end with function words such as determiners, prepositions, etcetera. Function words are part of a closed category in Dutch. We extract all function words from the lexical database e-Lex and mark for each word in each lyric whether it is a function word.

We implemented rhyme detection according to the rules for Dutch rhyme as stated in [19]. The algorithm is straightforward. We compare the phoneme representations of two words backwards, starting at the last phoneme, until we reach the first vowel, excluding schwas. If all phonemes and the vowel are exactly the same, the two words rhyme.

As an example we take kinderen ("children") and hinderen ("to hinder"). The phoneme representations as produced by our method are /kInd@r@/ and /hInd@r@/. The first vowel starting from the back of the word, excluding the schwas (/@/), is /I/. Starting from this vowel, the phoneme representations of both words are identical (/Ind@r@/). Therefore these words rhyme.

We also consider literal repetition of a word as rhyme, but not if a sequence of words is repeated literally, such as in the example in Figure 1. Such repetition of entire phrases occurs in many songs. Labeling all words as rhyme words would weaken the relation with cadence or end-of-sentence. We only label the last word of repeated phrases as a rhyme word. Figure 2 shows an example.

Footnote 2: http://tst-centrale.org/en/producten/lexica/e-lex/7-25

Table 1. Results for single labels.

Trigram type     Class       precision  recall  F1    σ(F1)  support
note-trigrams    cadence     0.84       0.72    0.78  0.01    23,925
note-trigrams    nocadence   0.96       0.98    0.97  0.01   183,780
pitch-trigrams   cadence     0.85       0.69    0.76  0.01    23,838
pitch-trigrams   nocadence   0.95       0.98    0.96  0.00   130,992

Table 2. Results for classification with label trigrams.

Trigram type     Class       precision  recall  F1    σ(F1)  support
note-trigrams    cadence     0.89       0.72    0.80  0.01    23,925
note-trigrams    nocadence   0.96       0.99    0.98  0.00   183,780
pitch-trigrams   cadence     0.89       0.71    0.79  0.01    23,838
pitch-trigrams   nocadence   0.95       0.98    0.97  0.01   130,992

5. CLASSIFICATION WITH SINGLE LABELS

As a first approach we consider the trigrams independently. A melody is represented as a bag-of-trigrams. Each trigram has a ground-truth label that is either cadence or no cadence, as depicted in Figure 1 for pitch-trigrams.

We employ a Random Forest classifier [2] as implemented in the Python library scikit-learn [13]. This classifier combines n decision trees (predictors) that are trained on random samples extracted from the data (with replacement). The final classification is a majority vote of the predictions of the individual trees. This procedure has proven to perform more robustly than a single decision tree and is less prone to over-fitting the data. Given the relatively large size of our data set, we set the number of predictors to 50 instead of the default 10. For the other parameters, we keep the default values.
The evaluation is performed by 10-fold cross-validation. One non-trivial aspect of our procedure is that we construct the folds at the level of the songs, rather than at that of individual trigrams. Since it is quite common for folk songs to have phrases that are literally repeated, folding at the level of trigrams could result in identical trigrams in the train and test subsets, which could lead to an overfitted classifier. By ensuring that all trigrams from a song are either in the test or in the train subset, we expect better generalization. This procedure is applied throughout this paper.

The results are shown in Table 1. For both classes, averages of the values for the precision, the recall and the F1-measure over the folds are included, as well as the standard deviation of the F1-measure, which indicates the variation over the folds. The number of items in both classes (support) shows that cadences are clearly a minority class.

We observe that the note-trigrams lead to slightly better cadence detection as compared to pitch-trigrams. Apparently, the repetition of pitches does not harm the discriminability. Furthermore, there is an imbalance between the precision and the recall of the cadence-trigrams. The precision is rather high, while the recall is moderate.

6. CLASSIFICATION WITH LABEL TRIGRAMS

When our cadence detection system predicts the class of a new trigram, it is oblivious of the decisions made for earlier predictions. One particularly negative effect of this nearsightedness is that the classifier frequently predicts two (or even more) cadences in a row, which, given our training material, is extremely unlikely. We attempt to circumvent this defect using a method, developed by [20], that predicts trigrams of class labels instead of single, binary labels.

Figure 4 depicts the standard single class classification setting, where each trigram is predicted independent of all other predictions. In the label trigram setting (see Figure 5), the original class labels are replaced with the class label of the previous trigram, the class label of the current trigram and the label of the next trigram. The learning problem is transformed into a sequential learning problem with two stages.
In the first stage we predict for each trigram a label trigram $y^{(t)} = (y_1, y_2, y_3)$, where each $y_i \in \{0, 1\}$. To arrive at the final single class predictions (i.e. is it a cadence or not), in the second stage we take the majority vote over the predictions of the focus trigram and those of its immediate left and right neighboring trigrams. Take $t_4$ in Figure 5 as an example. It predicts that the current trigram is a cadence. The next trigram and the previous trigram also predict it to be a cadence, and based on this majority vote, the final prediction is that $t_4$ is a cadence. Should $t_3$ and $t_5$ both have predicted the zero class (e.g. $y^{(t_3)} = (0, 0, 0)$ and $y^{(t_5)} = (0, 1, 0)$), the majority vote would be 0. The advantage of this method is that, given the negligible number of neighboring cadences in our training data, we can virtually rule out the possibility to erroneously predict two or more cadences in a row.

Table 2 shows the performance of the label-trigram classifier for both classes and both for pitch and note trigrams. The values show an important improvement for the precision of cadence detection and a slight improvement of the recall. The lower number of false positives is what we expected by observing the classification of adjacent trigrams as cadence in the case of the single-label classifier.
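The two stages can be sketched in a few lines of Python. The fragment below converts a sequence of binary labels into label trigrams (the training targets of the first stage) and then reduces predicted label trigrams to single labels by majority vote (the second stage). Padding the sequence borders with 0 and resolving the two-vote ties at the borders as non-cadence are assumptions of this sketch; the function names are hypothetical.

```python
from typing import List, Tuple

LabelTrigram = Tuple[int, int, int]  # (previous, current, next)


def to_label_trigrams(labels: List[int]) -> List[LabelTrigram]:
    """Replace each binary label by (previous, current, next); borders padded with 0."""
    padded = [0] + list(labels) + [0]
    return [tuple(padded[i:i + 3]) for i in range(len(labels))]


def majority_vote(predicted: List[LabelTrigram]) -> List[int]:
    """Combine each trigram's own vote with the votes of its direct neighbours."""
    final = []
    last = len(predicted) - 1
    for i, (_, own, _) in enumerate(predicted):
        votes = [own]
        if i > 0:
            votes.append(predicted[i - 1][2])  # left neighbour's "next" slot
        if i < last:
            votes.append(predicted[i + 1][0])  # right neighbour's "previous" slot
        # Strict majority; a 1-1 tie at a border counts as non-cadence.
        final.append(1 if 2 * sum(votes) > len(votes) else 0)
    return final


# The sequence of Figures 4 and 5: only t4 is cadential.
gold = to_label_trigrams([0, 0, 0, 1, 0])
agreeing = majority_vote(gold)

# The counter-example from the text: t3 and t5 vote against t4.
disputed = majority_vote([(0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 0)])
```

Because an isolated "cadence" vote is outvoted by its two neighbours, two adjacent trigrams can only both be labeled cadential if their neighbours predict that as well, which is rare given the training data.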

  label:     0      0      0      1      0
  trigram:   $t_1$  $t_2$  $t_3$  $t_4$  $t_5$

Figure 4. Short example sequence of trigrams. Each trigram $t_i$ has a binary label indicating whether the trigram is cadential (1) or non-cadential (0).

  label trigram:  000    000    001    010    100
  trigram:        $t_1$  $t_2$  $t_3$  $t_4$  $t_5$

Figure 5. Label-trigrams for the same sequence as in Figure 4, where $t_4$ has label 1 and the other trigrams have label 0. Each trigram $t_i$ gets a compound label consisting of its own label and the labels of the direct neighboring trigrams.

7. ABLATION STUDY

To study the importance of the various kinds of features, we perform an ablation study. We successively remove each of the groups of features as defined in section 4 from the full set and do a classification experiment with the remaining features. Subsequently, we perform a similar series of classification experiments, but now with each single group of features. The first series shows the importance of the individual groups of features, and the second series shows the predictive power for each of the groups. Because the groups are assembled according to distinct properties of music and text, this will give insight into the importance of various musical and textual parameters for cadence detection.

We use the label-trigram classifier with the note-trigrams, which performed best on the full set.

We expect occurrence of rests to be a very strong predictor, because according to our definition a rest always follows after the final cadence, and we know that in our corpus rests almost exclusively occur between phrases. Therefore, we also take the three features that indicate whether a rest occurs in the trigram or directly after it as a separate group. The performance when leaving these three features out will show whether they are crucial for cadence detection.

Table 3 shows the evaluation measures for each of the feature subsets. Precision, recall and F1 for class cadence are reported. Again, the values are averaged over 10 folds.

We see that none of the single groups of features is crucial for the performance that was achieved with the complete set of features. The basic melodic features (F_pitch, F_contour, and F_rhythmic) all perform very badly on their own, showing low to extremely low recall values. The contour features do not even contribute at all.
Only the rhythmic features yield some performance. The features on rest are included in the set of rhythmic features. The classification with just the features on rest, F_rest, shows very high precision and low recall. Still, the recall with all rhythmic features is higher than only using the rest-features. Since rests are so tightly related to cadences in our corpus, the high precision for F_rest is what we expected. If we exclude the rest-features, the precision stays at the same level as for the entire feature set and the recall drops with 0.06, which shows that only a minority of the cadences exclusively relies on rest-features to be detected.

Table 3. Results for various feature subsets for class cadence.

Subset                  precision  recall  F1    σ(F1)
F_all                   0.89       0.72    0.80  0.01
F_all \ F_pitch         0.88       0.72    0.79  0.01
F_pitch                 0.84       0.04    0.08  0.01
F_all \ F_contour       0.88       0.73    0.80  0.01
F_contour               0.00       0.00    0.00  0.00
F_all \ F_rhythmic      0.79       0.49    0.60  0.01
F_rhythmic              0.90       0.35    0.50  0.01
F_all \ F_textual       0.85       0.58    0.69  0.02
F_textual               0.70       0.40    0.51  0.01
F_all \ F_narmour       0.83       0.55    0.66  0.01
F_narmour               0.95       0.30    0.45  0.01
F_all \ F_contextual    0.87       0.67    0.76  0.01
F_contextual            0.71       0.45    0.56  0.01
F_all \ F_rest          0.87       0.67    0.76  0.01
F_rest                  0.97       0.27    0.43  0.02

The set of features that is based on the conditions of closure as formulated by Narmour shows high precision and low recall. Especially the high precision is interesting, because this confirms Narmour's conditions of closure. Apparently, most patterns that are classified as cadence based on this subset of features are cadences indeed. Still, the low recall indicates that there are many cadences that are left undetected. One cause could be that the set of conditions as stated by Narmour is not complete; another cause could be the discrepancy between our features and Narmour's conditions. Further investigation would be necessary to shed light on this. Removing the Narmour-based features from the full feature set does not have a big impact. The other features have enough predictive power.

The textual features on their own show moderate precision and very moderate recall. They are able to discern certain kinds of cadences to a certain extent, while missing most of the other cadences.
The drop of 0.14 in recall for F_all \ F_textual as compared to the full set shows that text features are crucial for a considerable number of cadences to be detected. The same applies to a somewhat lesser extent to contextual features. Removing the contextual features from the full set causes a drop of 0.05 in the recall, which is considerable but not extreme. It appears that the group of cadence trigrams for which the contextual features are crucial is not very big.
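The design of the ablation study, one run without each feature group and one run with each group alone, can be sketched as a simple enumeration of feature subsets. The feature identifiers below are hypothetical names paraphrasing the overview in Section 4, with vector-valued features collapsed into a single name, and reading F_all \ F_group as a plain set difference (so that a feature shared by two groups is removed with either of them) is an assumption of this sketch.

```python
from typing import Dict, Set

# Hypothetical feature names, grouped as in the overview of Section 4.
GROUPS: Dict[str, Set[str]] = {
    "pitch": {"scale_degree", "range", "has_contrast_third"},
    "contour": {"contains_leap", "is_ascending", "is_descending",
                "large_small", "registral_change"},
    "rhythmic": {"beat_strength", "min_beat_strength", "next_is_rest",
                 "short_long", "meter"},
    "textual": {"rhymes", "word_stress", "distance_to_last_rhyme"},
    "narmour": {"beat_strength", "next_is_rest", "short_long",
                "large_small", "registral_change"},
    "contextual": {"next_is_rest_third", "distance_to_last_rhyme"},
    "rest": {"next_is_rest"},
}


def ablation_subsets(groups: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return the full set, every leave-one-group-out set, and every single group."""
    full = set().union(*groups.values())
    subsets = {"all": full}
    for name, features in groups.items():
        subsets[f"all_minus_{name}"] = full - features  # first series
        subsets[name] = set(features)                   # second series
    return subsets


subsets = ablation_subsets(GROUPS)
for name in sorted(subsets):
    print(f"{name}: {len(subsets[name])} features")
```

Each subset would then be passed to the same cross-validated label-trigram classifier, which yields one row of Table 3 per subset.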

8. CONCLUSION AND FUTURE WORK

In this paper we developed a system to detect cadences in Western folk songs. The system makes use of a Random Forest Classifier that, on the basis of a number of hand-crafted features (both musical and textual), is able to accurately locate cadences in running melodies. In a follow-up experiment we employ a method, originally developed for textual sequences, that predicts label-trigrams instead of the binary labels cadence or non-cadence. We show that incorporating the predictions of neighboring instances into the final prediction has a strong positive effect on precision without a loss in recall.

In the ablation study we found that all groups of features, except for the contour features, contribute to the overall classification, while none of the groups is crucial for the majority of the cadences to be detected. This indicates that cadence detection is a multi-dimensional problem for which various properties of melody and text are necessary.

The current results give rise to various follow-up studies. A deeper study of the kinds of errors of our system will lead to improved features and increased knowledge about cadences. Those that were detected exclusively by textual features form a particularly interesting case, possibly giving rise to new melodic features. Next, n-grams other than trigrams as well as skip-grams [7] could be used, we will compare the performance of our method with existing symbolic segmentation algorithms, and we want to make use of other features of the text such as correspondence between syntactic units in the text and melodic units in the melody.

9. REFERENCES

[1] Rens Bod. Probabilistic grammars for music. In Proceedings of BNAIC 2001, 2001.
[2] L. Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.
[3] Bertrand H. Bronson. Some observations about melodic variation in British-American folk tunes. Journal of the American Musicological Society, 3:120-134, 1950.
[4] Emilios Cambouropoulos. The local boundary detection model (LBDM) and its application in the study of expressive timing. In Proc. of the Intl. Computer Music Conf., 2001.
[5] Michael Scott Cuthbert and Christopher Ariza. Music21: A toolkit for computer-aided musicology and symbolic music data. In Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR 2010), pages 637-642, 2010.
[6] Walter Daelemans, Jakub Zavrel, Ko Van der Sloot, and Antal Van den Bosch. TiMBL: Tilburg Memory Based Learner, version 6.3, Reference Guide, 2010.
[7] David Guthrie, Ben Allison, W. Liu, Louise Guthrie, and Yorick Wilks. A closer look at skip-gram modelling. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC-2006), 2006.
[8] David Huron. Sweet Anticipation. MIT Press, Cambridge, Mass., 2006.
[9] Zoltán Juhász. Segmentation of Hungarian folk songs using an entropy-based learning system. Journal of New Music Research, 33(1):5-15, 2004.
[10] Ilmari Krohn. Welche ist die beste Methode, um Volks- und volksmässige Lieder nach ihrer melodischen (nicht textlichen) Beschaffenheit lexikalisch zu ordnen? Sammelbände der internationalen Musikgesellschaft, 4(4):643-60, 1903.
[11] Eugene Narmour. The Analysis and Cognition of Basic Melodic Structures - The Implication-Realization Model. The University of Chicago Press, Chicago and London, 1990.
[12] Marcus Pearce, Daniel Müllensiefen, and Geraint Wiggins. The role of expectation and probabilistic learning in auditory boundary perception: A model comparison. Perception, 39(10):1365-1389, 2010.
[13] F. Pedregosa et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.
[14] David C. Rubin. Memory in Oral Traditions. Oxford University Press, New York, 1995.
[15] Jos Smits van Waesberghe. A Textbook of Melody: A course in functional melodic analysis. American Institute of Musicology, 1955.
[16] B. Suchoff. Preface, pages ix-lv. State University of New York Press, Albany, 1981.
[17] W. Suppan and W. Stief, editors. Melodietypen des Deutschen Volksgesanges. Hans Schneider, Tutzing, 1976.
[18] David Temperley. A probabilistic model of melody perception. In Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), Victoria, BC, 2006.
[19] Erica van Boven and Gillis Dorleijn. Literair Mechaniek. Coutinho, Bussum, 2003.
[20] Antal Van den Bosch and Walter Daelemans. Improving sequence segmentation learning by predicting trigrams. In Proceedings of the Ninth Conference on Natural Language Learning, CoNLL-2005, pages 80-87, Ann Arbor, MI, 2005.
[21] Frans Wiering and Hermi J.M. Tabachneck-Schijf. Cognition-based segmentation for music information retrieval systems. Journal of New Music Research, 38(2):137-154, 2009.