Using Natural Language Processing Techniques for Musical Parsing


RENS BOD
School of Computing, University of Leeds, Leeds LS2 9JT, UK, and Department of Computational Linguistics, University of Amsterdam, Spuistraat 134, 1012 VB Amsterdam, Holland

Abstract. We investigate whether probabilistic parsing techniques from Natural Language Processing (NLP) can be used for musical parsing. As in natural language, a piece of music can be segmented into groups or phrases which can be conveniently represented by a phrase-structure tree (Longuet-Higgins 1976; Tenney & Polansky 1980; Lerdahl & Jackendoff 1983). One of the main challenges for musical parsers is the problem of ambiguity: several different phrase structures may be compatible with a given musical sequence while a listener typically hears only one structure. In this paper we will consider three parsing techniques from the NLP literature that use a probabilistic heuristic to solve ambiguity. We present a new parser which combines two of these techniques, and which can correctly predict up to 85.9% of the phrases for a test set of 1,000 folksongs from the Essen Folksong Collection (Schaffrath 1995). To the best of our knowledge, this work presents the first parsing experiments with the Essen Folksong Collection, which we hope may be used as a baseline for other approaches. Our parser may also be used to speed up the time-consuming annotation of newly collected folksongs, thereby contributing to the creation of larger musical databases in computer-assisted musicology.

Keywords. computer-assisted musicology, natural language processing, music perception, probabilistic grammars, musical databases

1. Introduction

We investigate whether probabilistic parsing techniques from Natural Language Processing (NLP) can be used for musical parsing.
As in natural language, a listener segments a sequence of notes into groups or phrases that form a grouping structure for the whole piece (Longuet-Higgins 1976; Tenney & Polansky 1980; Lerdahl & Jackendoff 1983). For example, according to Lerdahl & Jackendoff (1983: 37) a listener hears the following grouping structure for the first few bars of melody in the Mozart G Minor Symphony, K. 550.

Figure 1. Grouping structure for the opening theme of Mozart's G Minor Symphony

Each group is represented by a slur beneath the musical notation. A slur enclosed within a slur indicates that a group is heard as part of a larger group. This hierarchical structure of melody can, without loss of generality, also be represented by a phrase structure tree, as in figure 2.

Figure 2. Tree structure for the grouping structure in figure 1

Although the two representations in figures 1 and 2 are visually quite different, it is easy to see that they are mathematically equivalent. Note the analogy with phrase structure trees in linguistics: a tree describes how parts of the input combine into groups or constituents and how these constituents combine into a representation for the whole input. Apart from this analogy, there is also an important difference: while the nodes in a linguistic tree structure are typically labeled with syntactic categories such as S, NP, VP etc., musical tree structures are unlabeled. This is because in language there are syntactic constraints on how words can be combined into larger constituents (e.g. in English a determiner can be combined with a noun only if it precedes that noun, which is expressed by the rule NP -> Det N), while in music there are no such restrictions: in principle any note may be combined with any other note. This makes the problem of ambiguity in music much harder than in language. Longuet-Higgins and Lee (1987) note that "Any given sequence of note values is in principle infinitely ambiguous, but this ambiguity is seldom apparent to the listener."

To give an example of this ambiguity, the first few bars of Mozart's G Minor Symphony could also be assigned the following, alternative grouping structure (among the many other possible structures):

Figure 3. Alternative grouping structure for the opening theme of Mozart's G Minor Symphony

While this alternative structure is possible in that it can be perceived, it does not correspond to the structure that is actually perceived by a human listener. There is thus an important research question as to how to select the perceived tree structure from the total, possibly infinite set of tree structures of a musical input.

In the field of natural language processing (NLP), the use of probabilistic corpus-based parsing techniques has become increasingly influential for solving ambiguity (see Charniak 1997 or Manning & Schütze 1999 for an overview). Instead of using a predefined set of rules, a probabilistic corpus-based parser learns how to parse new input by generalizing from examples of previously annotated data; in case of ambiguity, such a parser computes the most probable phrase structure for a given input. State-of-the-art probabilistic parsers, which use the Wall Street Journal corpus in the Penn Treebank (Marcus et al. 1993) as a test domain, obtain around 90% correctly predicted phrases (e.g. Collins 2000; Charniak 2000; Bod 2001a). With the current availability of large annotated musical corpora, such as the Essen Folksong Collection (Schaffrath 1995), we may wonder whether such probabilistic corpus-based parsing techniques carry over to musical parsing.

In this paper we will test the usefulness of three probabilistic parsing techniques for music: the Treebank grammar technique of Charniak (1996), the Markov grammar technique of Collins (1999), and the Data-Oriented Parsing (DOP) technique of Bod (1998).
We develop a new parser which combines two of these techniques, and which correctly predicts up to 85.9% of the phrases for a held-out test set of 1,000 folksongs from the Essen Folksong Collection (Schaffrath 1995). To the best of our knowledge, this paper contains

the first parsing experiments on the Essen Folksong Collection; moreover, it also contains the first parsing experiments on a musical test set of non-trivial size. In the following we first describe the Essen Folksong Collection, after which we test a number of probabilistic parsing models on this collection. Since no other parsing results on the Essen Folksong Collection are available, we will only informally compare our technique with other approaches that aim at solving ambiguity in music.

2. The Essen Folksong Collection

The Essen Folksong Collection provides a large sample of (mostly) European folksongs that have been collected and encoded under the supervision of Helmut Schaffrath at the University of Essen (see Schaffrath 1993, 1995; Selfridge-Field 1995). Each of the 6,251 folksongs in the Essen Folksong Collection is annotated with the Essen Associative Code (ESAC) which includes pitch and duration information, meter signatures and explicit phrase markers. The presence of phrase markers makes the Essen Folksong Collection a unique test case for musical parsers.

The pitch encodings in the Essen Folksong Collection resemble "solfege": scale degree numbers are used to replace the movable syllables "do", "re", "mi", etc. Thus 1 corresponds to "do", 2 corresponds to "re", etc. Chromatic alterations are represented by adding either a "#" or a "b" after the number. The plus ("+") and minus ("-") signs are added before the number if a note falls above or below the principal octave, respectively (thus -1, 1 and +1 all refer to "do", but on different octaves). Duration is represented by adding a period or an underscore after the number. A period (".") increases duration by 50% and an underscore ("_") increases duration by 100%; more than one underscore may be added after each number. If a number has no duration indicator, its duration corresponds to the smallest value. A pause is represented by 0, possibly followed by duration indicators.
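The ESAC conventions just described can be sketched as a small tokenizer. This is only an illustrative reading of the description above, not the encoding software used for the collection: the function name `parse_esac` and the tuple layout are hypothetical, and the code assumes that each underscore doubles the duration and each period multiplies it by 1.5, with the undecorated note taken as one unit. The sample string is the ESAC encoding of the Mozart theme used in this paper.

```python
import re
from fractions import Fraction

# One ESAC note: optional octave signs, a scale degree 0-7 (0 = pause),
# an optional accidental, then any duration marks ('_' and '.').
NOTE_RE = re.compile(r"([+-]*)([0-7])([#b]?)([_.]*)")


def parse_esac(melody):
    """Return a list of (octave, degree, accidental, duration) tuples.

    octave: +1 / -1 per '+' / '-' sign relative to the principal octave.
    duration: in units of the smallest note value (assumption: '_' doubles,
    '.' multiplies by 3/2).
    """
    notes = []
    pos = 0
    while pos < len(melody):
        if melody[pos].isspace():
            pos += 1
            continue
        match = NOTE_RE.match(melody, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        signs, degree, accidental, marks = match.groups()
        octave = signs.count("+") - signs.count("-")
        duration = Fraction(1)
        for mark in marks:
            duration *= 2 if mark == "_" else Fraction(3, 2)
        notes.append((octave, int(degree), accidental, duration))
        pos = match.end()
    return notes


mozart = parse_esac("6b55_6b55_6b55_+3b_0_")
```

Exact fractions are used for durations so that dotted values such as 3/2 stay exact when they are compared or summed.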
No loudness or timbre indicators are used in ESAC. Thus, the opening theme of Mozart's G Minor Symphony in figure 1 can be encoded in ESAC as follows (since the piece is in G Minor, all notes are related to G which corresponds to the number 1).

6b55_6b55_6b55_+3b_0_

Figure 4. ESAC encoding for the opening theme of Mozart's G Minor Symphony

ESAC uses hard returns to indicate a phrase boundary. To make the Essen annotations readable for our probabilistic parsers, we automatically convert ESAC's phrase boundary indications into bracket representations, where "(" indicates the start of a phrase and ")" the

end of a phrase. The phrase structures in figures 1 and 2 would thus correspond to the following bracket representation.

( ( (6b55_) (6b55_) ) (6b55_+3b_0_) )

Figure 5. Bracket representation for the phrase structures in figures 1 and 2 of the opening theme of Mozart's G Minor Symphony

The following figure gives an example of an encoding of an actual folksong from the Essen Folksong Collection ("Schlaf Kindlein feste") converted to our bracket representation:

(3_221_-5)( _-5)( )( _)(3_221_-5_)

Figure 6. Bracket representation for folksong K0029, "Schlaf Kindlein feste"

It is important to note that the annotations in the Essen Folksong Collection do not contain hierarchical or nested structures. Differently from the examples in figures 1, 2 and 3, the Essen Folksong annotations represent the basic phrases (or "segmentations") only and neglect any phrase-internal or phrase-external structure (such as motives, periods and sections). Although this results in rather simple annotations, we will see that the Essen Collection is still a very tough test case for our parsers. The following example shows that many phrases in the Essen Folksong Collection could have been further analyzed in terms of subphrases (e.g. the fifth phrase into three very similar subphrases).

(3 2 1_1_-5_)(-5_3_3_2_2_1_1_-5_)(-5_1_2_3_1_4 2_)(1_-7_1_2_-5_3 1_)
(3_1-5_3_1_1_-5_3_1-5_)(-5_1_2_3_1_4_3_223_1 1_0_)

Figure 7. Bracket representation for folksong K0690, "Ruru Rinneken"

And a more extreme case of the shallowness of the Essen Folksong Collection is provided by folksong Z0147 ("Besenbinders Tochter und kachelmachers Sohn"):

(5_4#_5_3_1 1_3_2_1#_2_-7_-5.)(3_5_4#_5_3_1 1_3_ 221#_2_-7_-5.)
(-5_-5_-5_-5-5-5_4 4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_-7_1_332_1#_2_3_1 0 )
(-5_-5_-5_-5_444_4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1.)
(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1 1_)(3_5_4#_5_3_1_-7_1_3_2_1#_2_3_1 1_0_)
(-5_-5_-5_-5_444_4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1 )

Figure 8. Bracket representation for folksong Z0147, "Besenbinders Tochter und kachelmachers Sohn"

We believe that every phrase in this folksong could have been further analyzed into subphrases. Yet, the annotation in figure 8 is not wrong; it represents only the most basic phrase structure of the piece. We want to emphasize that for our experiments in section 3 we did not add (or modify) any structure in the Essen annotations. As we will see, despite (or perhaps due to) its shallow annotations, the Essen Folksong Collection is quite an interesting test case.

This brings us to the problem of evaluation. To evaluate our probabilistic parsers for music, we employed the blind testing method which has been widely used in evaluating natural language parsers (see Manning & Schütze 1999). This method dictates that a collection of annotated data is randomly divided into a training set and a test set, where the annotations in the training set are used to "train" the parser, while the unannotated strings in the test set are used as input to test the parser. The degree to which the predicted structures for the test set strings match the correct structures in the test set is a measure of the accuracy of the parser. For our experiments in section 3, we randomly divided the Essen Folksong Collection into a training set of 5,251 folksongs and a test set of 1,000 folksongs.

There is an important question as to what kind of evaluation measure is most appropriate to compare the phrase structures proposed by the parser with the correct phrase structures in the test set. A widely used evaluation scheme in natural language parsing is the PARSEVAL scheme, which is based on the notions of precision and recall (see Black et al. 1991).
PARSEVAL compares a proposed parse P with the corresponding test set parse T as follows:

\[ \text{Precision} = \frac{\#\,\text{correct phrases in } P}{\#\,\text{phrases in } P} \qquad \text{Recall} = \frac{\#\,\text{correct phrases in } P}{\#\,\text{phrases in } T} \]

A phrase is correct if both the start and the end of the phrase are correctly predicted. Note that these measures "punish" a parser which assigns too many phrases to a folksong: for example, an extremely overgenerating parser which assigns phrases to any combination of notes would trivially include all correct phrases, resulting in an excellent recall, but its precision would be very low. On the other hand, a very conservative parser which predicts

very few, though correct, phrases will receive a high precision, but its recall will be low. A good parser will thus need to obtain both a high precision and a high recall. (It probably goes without saying that for computing the precision and recall for all test set strings, one needs to divide the total number of correctly predicted phrases in all proposed parses P by the total number of phrases in respectively all parses P and T.) The precision and recall scores are often combined into a single measure of performance, known as the F-score (see Manning & Schütze 1999):

\[ \text{F-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]

We will use these three measures of Precision, Recall and F-score to quantitatively evaluate our probabilistic parsing models for music.

As a final pre-processing step, we (automatically) added to each phrase in the folksong the label "P" and to each whole song the label "S", so as to obtain conventional parse trees. Thus the annotation in figure 6 becomes:

S( P(3_221_-5) P( _-5) P( ) P( _) P(3_221_-5_) )

Figure 9. Labeled-bracketing annotation for the structure in figure 6

Note that this labeled-bracketing annotation is equivalent to a visual tree representation in which the root node S has the five phrase nodes P as its daughters.

Figure 10. Visual tree structure for the labeled-bracketing annotation in figure 9
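The precision, recall and F-score measures defined in this section can be sketched as follows. The sketch is illustrative only: the function names (`phrase_spans`, `f_score`, `evaluate`) and the representation of a segmentation as a list of phrases, each a list of notes, are assumptions made for the example. As described above, the counts are summed over all songs before dividing.

```python
def phrase_spans(phrases):
    """Turn a flat segmentation into a set of (start, end) note indices."""
    spans, start = set(), 0
    for phrase in phrases:
        end = start + len(phrase)
        spans.add((start, end))
        start = end
    return spans


def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(proposed, gold):
    """proposed, gold: parallel lists of segmentations, one per song.

    A proposed phrase counts as correct only if both its start and its
    end coincide with a phrase in the gold segmentation.
    """
    correct = n_proposed = n_gold = 0
    for p, t in zip(proposed, gold):
        sp, st = phrase_spans(p), phrase_spans(t)
        correct += len(sp & st)
        n_proposed += len(sp)
        n_gold += len(st)
    precision = correct / n_proposed if n_proposed else 0.0
    recall = correct / n_gold if n_gold else 0.0
    return precision, recall, f_score(precision, recall)


# Toy song of eight notes: the gold annotation has three phrases,
# the proposed parse has two, of which only the first is correct.
gold = [[list("abc"), list("de"), list("fgh")]]
proposed = [[list("abc"), list("defgh")]]
precision, recall, f = evaluate(proposed, gold)
```

In this toy case the parser proposes fewer phrases than the annotation contains, so precision (1/2) is higher than recall (1/3).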

The advantage of labeled-bracketing annotations is that we can now directly apply existing probabilistic parsing models to the Essen Folksong Collection.

3. Parsing the Essen Folksong Collection

In this section, we test three probabilistic parsing models from the literature on the Essen Folksong Collection: the Treebank grammar technique of Charniak (1996), the Markov grammar technique of Seneff (1992) and Collins (1999), and the Data-Oriented Parsing (DOP) technique of Bod (1993, 1998). Unless stated differently, we used the same random split of the Essen Folksong Collection into a training set of 5,251 folksongs and a test set of 1,000 folksongs.

3.1 The Treebank Grammar Technique

The Treebank grammar technique is an extremely simple learning technique: it reads all context-free rewrite rules from the training set structures, and assigns each rule a probability proportional to its frequency in the training set. For example, the following context-free rules can be extracted from the structure in figure 9:

S -> PPPPP
P -> 3_221_-5
P -> _-5
P ->
P -> _
P -> 3_221_-5_

Next, each rewrite rule is assigned a probability by dividing the number of occurrences of a particular rule in the training set by the total number of occurrences of rules that expand the same nonterminal as the particular rule. For instance, if we take the folksong in figure 9 as our only training data, then the probability of the rule P -> 3_221_-5 is equal to 1/5 since this rule occurs once among a total of 5 rules that expand the nonterminal P. A Treebank grammar extracted in this way from the training set corresponds to a so-called Probabilistic Context-Free Grammar or PCFG (Booth 1969). A crucial assumption underlying PCFGs is that the context-free rules are statistically independent. Thus, given the probabilities of the individual rules, we can calculate the probability of a parse tree by taking the product of the probabilities of each rule used therein.
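The rule extraction and probability assignment described above can be sketched in a few lines. This is a minimal illustration without smoothing, not the parser used in the experiments; the function names are hypothetical, and the three middle phrases of the toy song are invented placeholders (only the first and last phrase strings are taken from figure 9).

```python
from collections import Counter
from math import prod


def extract_rules(songs):
    """Count context-free rules; each song is a list of phrase strings."""
    counts = Counter()
    for phrases in songs:
        counts[("S", "P" * len(phrases))] += 1   # e.g. S -> PPPPP
        for phrase in phrases:
            counts[("P", phrase)] += 1           # e.g. P -> 3_221_-5
    return counts


def rule_probabilities(counts):
    """Relative frequency of each rule among rules with the same left-hand side."""
    totals = Counter()
    for (lhs, _), c in counts.items():
        totals[lhs] += c
    return {rule: c / totals[rule[0]] for rule, c in counts.items()}


def tree_probability(phrases, probs):
    """Product of the rule probabilities used in a two-level tree.

    A rule that was never seen in training gets probability 0 here,
    which is exactly the weakness of an unsmoothed Treebank grammar.
    """
    rules = [("S", "P" * len(phrases))] + [("P", p) for p in phrases]
    return prod(probs.get(rule, 0.0) for rule in rules)


# Toy training set of one song with five distinct phrases
# (the middle three strings are hypothetical).
song = ["3_221_-5", "1_2_3_", "5_5_", "2_1_", "3_221_-5_"]
probs = rule_probabilities(extract_rules([song]))
```

With this single training song, each P-rule has probability 1/5 and the S-rule has probability 1, matching the worked example in the text.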
PCFGs have been extensively studied in the literature (cf. Wetherell 1980; Charniak 1993), and the efficient parsing algorithms that exist for Context-Free Grammars carry over to PCFGs (see Charniak 1993 or Manning & Schütze 1999 for the relevant algorithms).

Any probabilistic grammar extracted from a training set faces the problem of data-sparseness: many of the rules in the training set are so infrequent that their observed probabilities are very bad estimates of their true probabilities. A widely used method to cope with this problem is the Good-Turing method (Good 1953). In general, Good-Turing estimates the expected population frequency $f^*$ of a type by adjusting its observed sample frequency $f$. In order to estimate $f^*$, Good-Turing uses an additional notion, $n_f$, which is defined as the number of types which occur $f$ times in an observed sample. Thus, $n_f$ can be understood as the frequency of frequency $f$. The Good-Turing estimator uses this extra information for computing the adjusted frequency $f^*$ as

\[ f^* = (f + 1)\, \frac{n_{f+1}}{n_f} \]

We thus compute the probabilities of our context-free rules in the Treebank grammar from their adjusted frequencies rather than from their raw observed frequencies. For an instructive paper on Good-Turing, together with a proof of the formula, see Church & Gale (1991).

The Treebank grammar that was obtained in this way from the 5,251 training folksongs was used to parse the 1,000 folksongs in the test set. We computed for each test folksong the most probable parse using a standard best-first parsing algorithm based on Viterbi optimization (see Charniak 1993; Manning & Schütze 1999). Although we may already foresee that a Treebank grammar is doomed to misparse folksongs if it does not find the correct rule in the training set, it will serve as the basis for our more sophisticated parsing techniques in the following sections. Using the evaluation measures given in section 2, our Treebank grammar obtained a precision of 68.7%, a recall of 3.4%, and an F-score of 6.5%.
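The Good-Turing adjustment given above can be illustrated with a short sketch. The function name is hypothetical, and the fallback for the case where no type occurs f + 1 times (keeping the raw frequency) is an assumption of this example; the text does not say how that case was handled in the experiments.

```python
from collections import Counter


def good_turing_adjusted(counts):
    """Map each type to its adjusted frequency f* = (f + 1) * n_{f+1} / n_f.

    counts: dict from type (e.g. a rewrite rule) to observed frequency f.
    """
    n = Counter(counts.values())   # n[f] = number of types seen exactly f times
    adjusted = {}
    for item, f in counts.items():
        if n[f + 1] > 0:
            adjusted[item] = (f + 1) * n[f + 1] / n[f]
        else:
            # Assumption: with n_{f+1} = 0 the formula would give 0,
            # so this sketch keeps the raw frequency instead.
            adjusted[item] = float(f)
    return adjusted


# Toy counts: three types seen once, two seen twice, one seen three times.
toy_counts = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
adjusted = good_turing_adjusted(toy_counts)
```

Here the types seen once are adjusted to 2 * 2/3 and those seen twice to 3 * 1/2, while the most frequent type keeps its raw count because no type occurs four times.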
Although the precision score may seem reasonable, the recall score is extremely low, which indicates that the Treebank grammar technique is a very conservative learner: it predicts very few phrases from the total number of phrases in the Essen Folksong Collection, resulting in a very low F-score. As noted, one of the problems with the Treebank grammar technique is that it only learns those context-free rules that literally occur in the training set, which is evidently not a very robust technique for musical parsing (while it has been shown to perform quite well in natural language parsing -- see Charniak 1996). We will see, however, that the results improve significantly if we slightly loosen the way of extracting rules from the training set. 3.2 The Markov Grammar Technique A technique which overcomes the conservativity of Treebank grammars is the Markov grammar technique (Seneff 1992; Collins 1999). While a Treebank grammar can only assign probabilities to context-free rules that have been seen in the training set, a Markov

grammar can in principle assign a probability to any possible context-free rule, thus resulting in a more robust model. This is accomplished by decomposing a rule and its probability by a Markov process (see Collins 1999: 44-48). For example, a third-order Markov process estimates the probability p of a rule P -> 12345 by:

\[ p(P \rightarrow 12345) = p(1)\; p(2 \mid 1)\; p(3 \mid 1, 2)\; p(4 \mid 1, 2, 3)\; p(5 \mid 2, 3, 4)\; p(\text{end} \mid 3, 4, 5) \]

The conditional probability $p(\text{end} \mid 3, 4, 5)$ encodes the probability that a rule ends after the notes 3, 4, 5. Thus even if the rule P -> 12345 does not literally occur in the training set, we can still estimate its probability by using a Markov history of three notes. The extension to larger Markov histories follows from obvious generalization of the above example.

However, a Markov grammar also suffers from data-sparseness: we may get low counts, including zero counts, for many Markov histories. Zero counts are especially problematic: if one of the conditional probabilities in the formula above has a zero occurrence in the training set, then the whole rule is assigned a zero probability. A widely used technique to solve the data-sparseness problem in Markov models is the linear interpolation technique (see Manning & Schütze 1999). This technique smooths a Markov history by taking into account its shorter histories. Let $n_1$, $n_2$ and $n_3$ denote three notes, then the conditional probability $p(n_1 \mid n_2, n_3)$ is smoothed ("interpolated") as

\[ p(n_1 \mid n_2, n_3) = \lambda_1\, p(n_1) + \lambda_2\, p(n_1 \mid n_2) + \lambda_3\, p(n_1 \mid n_2, n_3) \]

where $0 \le \lambda_i \le 1$ and $\lambda_1 + \lambda_2 + \lambda_3 = 1$. These λ-weights may be set by hand, but in general one wants to find the combination of weights $\lambda_i$ which works best. A simple algorithm that finds the optimal weights is Powell's algorithm (see Press et al. 1988), which is also discussed in Manning & Schütze (1999: 218).
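The Markov decomposition and the linear interpolation formula above can be sketched together. This is a simplified illustration under several assumptions that are not taken from the paper: it uses a history of two notes (so that it matches the three-term interpolation formula), hand-set λ-weights rather than weights found by Powell's algorithm, raw relative frequencies rather than Good-Turing estimates, and two padding symbols at the start of each phrase so that every note has a full history. The class and method names are hypothetical.

```python
from collections import Counter

START, END = "<s>", "end"


class InterpolatedMarkov:
    """Two-note-history Markov model over phrases with linear interpolation."""

    def __init__(self, phrases, lambdas=(0.2, 0.3, 0.5)):
        self.l1, self.l2, self.l3 = lambdas      # hand-set, must sum to 1
        self.uni, self.bi, self.tri = Counter(), Counter(), Counter()
        self.ctx1, self.ctx2 = Counter(), Counter()
        self.total = 0
        for phrase in phrases:
            seq = [START, START] + list(phrase) + [END]
            for i in range(2, len(seq)):
                a, b, n = seq[i - 2], seq[i - 1], seq[i]
                self.uni[n] += 1
                self.total += 1
                self.bi[(b, n)] += 1
                self.ctx1[b] += 1
                self.tri[(a, b, n)] += 1
                self.ctx2[(a, b)] += 1

    @staticmethod
    def _ratio(num, den):
        return num / den if den else 0.0

    def prob(self, n, a, b):
        """Interpolated probability of note n after the history a, b."""
        p1 = self._ratio(self.uni[n], self.total)
        p2 = self._ratio(self.bi[(b, n)], self.ctx1[b])
        p3 = self._ratio(self.tri[(a, b, n)], self.ctx2[(a, b)])
        return self.l1 * p1 + self.l2 * p2 + self.l3 * p3

    def rule_probability(self, phrase):
        """Probability of the rule P -> phrase, including the end event."""
        seq = [START, START] + list(phrase) + [END]
        p = 1.0
        for i in range(2, len(seq)):
            p *= self.prob(seq[i], seq[i - 2], seq[i - 1])
        return p


model = InterpolatedMarkov([["1", "2", "3"], ["1", "2", "5"]])
seen = model.rule_probability(["1", "2", "3"])
unseen = model.rule_probability(["3", "2", "1"])
```

The unseen phrase still receives a small non-zero probability because the unigram term never vanishes for notes that occur somewhere in the training data, which is the point of interpolating with shorter histories.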
We used this algorithm to assign weights to the lambdas in the linear interpolation technique, which in turn was used to estimate the conditional probabilities in the Markov grammar technique. Furthermore, each of the probabilities $p(n_1)$, $p(n_1 \mid n_2)$ and $p(n_1 \mid n_2, n_3)$ was not directly estimated from its observed relative frequency in the training set, but was adjusted by the Good-Turing method, just as with Treebank grammars (section 3.1). Note that the extension to larger Markov histories follows from obvious generalization of the formulas above. The probability of a parse tree of a musical piece is computed by the product of the probabilities of the rules that partake in the parse tree, just as with Treebank grammars. For our experiments, we used a Markov grammar with a history of four notes. This grammar obtained a precision of 63.1%, a recall of 80.2%, and an F-score of 70.6%. These results are to some extent complementary to the Treebank grammar: although the precision is

somewhat lower, the recall is (much) higher than for the Treebank grammar. Thus, while the Treebank grammar predicts too few phrases, the Markov grammar predicts (a bit) too many phrases. The combined F-score of 70.6% shows an immense improvement over the Treebank grammar technique. Experiments with higher or lower order Markov models diminished our results.

3.3 Extending the Markov Grammar Technique with the DOP Technique

Although the Markov grammar technique obtained considerably better scores than the Treebank grammar technique, it does not take into account any global context in computing the probability of a parse tree. Knowledge of global context, such as the number of phrases that occur in a folksong, is likely to be important for predicting the correct segmentations for new folksongs. In order to include global context, we conditioned over the S-rule higher in the structure in computing the probability of a P-rule. This approach corresponds to the Data-Oriented Parsing (DOP) technique (Bod 1993, 1998) which can condition over any higher or lower rule in a tree, and which has recently been integrated with the Markov grammar technique (Sima'an 2000). In the original DOP technique, any fragment seen in the training set, regardless of size, is used as a productive unit. But in the Essen Folksong Collection we have only two levels of constituent structure in each tree, which results in a much simpler probabilistic model. As an example take again the rule P -> 12345 and a higher S-rule such as S -> PPPP; a DOP-Markov model based on a history of three notes computes the (conditional) probability of this rule as:

\[ p(P \rightarrow 12345 \mid S \rightarrow PPPP) = p(1 \mid S \rightarrow PPPP)\; p(2 \mid S \rightarrow PPPP, 1)\; p(3 \mid S \rightarrow PPPP, 1, 2)\; p(4 \mid S \rightarrow PPPP, 1, 2, 3)\; p(5 \mid S \rightarrow PPPP, 2, 3, 4)\; p(\text{end} \mid S \rightarrow PPPP, 3, 4, 5) \]

The extension to larger histories follows from obvious generalization of the above example.
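The conditioning on the higher S-rule in the formula above can be sketched as follows. The sketch is deliberately unsmoothed (no linear interpolation or Good-Turing adjustment), so it only shows how the S-rule enters every conditional probability; the function names and the representation of a song as a list of phrases are assumptions made for the example.

```python
from collections import Counter

END = "end"


def train_dop_markov(songs, order=3):
    """Count note events conditioned on the S-rule and the note history.

    songs: list of songs, each a list of phrases (lists of note symbols).
    """
    next_counts, ctx_counts = Counter(), Counter()
    for phrases in songs:
        s_rule = "S -> " + "P" * len(phrases)
        for phrase in phrases:
            seq = list(phrase) + [END]
            for i, note in enumerate(seq):
                history = tuple(seq[max(0, i - order):i])
                next_counts[(s_rule, history, note)] += 1
                ctx_counts[(s_rule, history)] += 1
    return next_counts, ctx_counts


def p_rule_given_s(phrase, s_rule, model, order=3):
    """Unsmoothed p(P -> phrase | s_rule) as a product of conditionals."""
    next_counts, ctx_counts = model
    seq = list(phrase) + [END]
    p = 1.0
    for i, note in enumerate(seq):
        history = tuple(seq[max(0, i - order):i])
        den = ctx_counts[(s_rule, history)]
        p *= next_counts[(s_rule, history, note)] / den if den else 0.0
    return p


# Toy corpus: one song with two phrases, one song with three phrases.
toy_songs = [
    [["1", "2"], ["3", "4"]],
    [["1", "2"], ["1", "2"], ["5"]],
]
toy_model = train_dop_markov(toy_songs)
p_in_two = p_rule_given_s(["1", "2"], "S -> PP", toy_model)
p_in_three = p_rule_given_s(["1", "2"], "S -> PPP", toy_model)
```

The same phrase receives different probabilities under the two S-rules, which is how the number of phrases in a song can influence the choice of phrase boundaries.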
For our experiments, we used a history of four notes, extended with the same smoothing techniques as in section 3.2 (i.e. linear interpolation combined with Good-Turing). The most probable parse of a folksong is again computed by maximizing the product of the rule probabilities that generate the folksong. Using the same training/test set division as before, this DOP-Markov parser obtained a precision of 76.6%, a recall of 85.9%, and an F-score of 81.0%. The F-score is an improvement of 10.4 percentage points over the Markov parser. Note that the DOP-Markov parser is relatively well-balanced: it is neither terribly conservative nor does it predict too many redundant phrases -- keeping in mind the idiosyncrasy of the Essen Folksong annotations. While there is no reason to expect a near to 100% accuracy for the shallowly annotated Essen Folksong Collection, our results show the importance of including global context in

computing the probability of a parse. We also checked the statistical significance of our results, by testing on 9 additional random splits of the Essen Folksong Collection (into training sets of 5,251 folksongs and test sets of 1,000 folksongs). On these splits, the DOP-Markov parser obtained an average F-score of 80.7% with a standard deviation of 1.9%, while the Markov parser obtained an average F-score of 70.8% with a standard deviation of 2.2%. These differences were statistically significant according to paired t-testing.

Finally, we were interested in testing the impact of the training size on the F-score. In the following experiments we started with an initial training set of only 500 folksongs (randomly chosen from the full training set of 5,251 folksongs). We then increased the size of this initial training set with 500 folksongs each time (randomly chosen from the full training set). The test set was kept constant at 1,000 folksongs. The results are shown in table 1.

Table 1. F-score as a function of training set size (columns: Training, F-score)

The table shows that the F-score rapidly increases when the size of the training set is enlarged from 500 to 2,000 folksongs. The accuracy continues to increase at a lower rate if the training set is further enlarged. We may thus expect that the accuracy of our parser further increases if we have access to larger musical corpora. This is important if we want to use our parser for the semi-automatic annotation of musical databases. Starting with an initial, relatively small set of hand-annotated pieces, our parser can use these annotations as its training set on the basis of which the annotations for a new set of musical pieces can be predicted. The predicted annotations will need to be corrected by hand, but once we have added these corrected annotations to the training set, our parser will more accurately predict

the annotations for fresh folksongs. Table 1 suggests that the amount of human correction decreases if more training data becomes available. We thus expect that our parser can be used to speed up the time-consuming annotation of musical pieces, thereby contributing to the creation of larger databases in computer-assisted musicology.

4. Other approaches to musical parsing

There exists an extensive literature in the field of computational models of music analysis (see Cambouropoulos 1998, or Cambouropoulos et al. for an overview). Most if not all approaches to musical parsing are non-probabilistic and are based on the assumption that the perceived phrase structure of a musical piece can be predicted on the basis of a combination of low-level phenomena, such as the Gestalt phenomena of proximity and similarity, and higher-level phenomena, such as melodic parallelism and internal harmony. For example, Tenney & Polansky (1980), Lerdahl & Jackendoff (1983), Handel (1989) and Cambouropoulos (1996, 1997) use the Gestalt rules of Wertheimer (1923) to predict the low-level grouping structure of a piece: phrase boundaries preferably fall on larger time intervals, larger pitch intervals, etc. While most models also incorporate higher-level phenomena, such as melodic parallelism and harmony, these phenomena often remain unformalized. For example, Lerdahl & Jackendoff (1983) do not provide any systematic description of higher-level musical parallelism, and Narmour's Implication-Realization model (Narmour 1990, 1992) relies on factors such as meter, harmony and similarity which are not fully described by the model. As a result, these models have not been evaluated against test sets of non-trivial size, such as the Essen Folksong Collection. Only very few, hand-selected passages are typically used to evaluate these models, which calls into question the objectivity of the results.
More importantly, perhaps, is the fact that the Gestalt principles, which were originally proposed for visual perception (Wertheimer 1923), do not straightforwardly carry over to music perception. Elsewhere (Bod 2001b), we have shown that more than 15% of the phrase boundaries in the Essen Folksong Collection fall before or after large pitch/time intervals (as in the folksong of figure 7), rather than on such intervals, and that phrase boundaries even appear between identical notes. This goes against the predictions of any Gestalt-based parser, which assigns phrase boundaries exactly on large intervals rather than before or after them. Moreover, we have shown in Bod (2001b) that higher-level phenomena, such as melodic parallelism and internal harmony, are not of any help for predicting the correct phrase boundaries for these 15% "exceptional" phrases. On the contrary, for almost all these phrases (98.7%), melodic parallelism and internal harmony reinforced the incorrect predictions made by the Gestalt principles. It is noteworthy that our DOP-Markov parser, on the other hand, performed equally well on both "exceptional" phrases and "normal" phrases

(where boundaries do fall on large pitch/time intervals). While our parser is still far from perfect, we believe that a probabilistic, corpus-based approach is better suited to musical parsing as it considers counts of any note sequence that has been observed with a certain structure, thereby taking into account the entire continuum between "exceptional" and "normal" phrases, rather than trying to capture this gradiency by a few formal rules. We fully admit that a fair comparison between our parser and a Gestalt-based/parallelism-based parser should await further experimental evaluation, but we hope to have made clear that musical parsing models should be tested on large corpora of musical annotations such as the Essen Folksong Collection (otherwise "exceptional" phrases may easily remain unnoticed).

If we wish to propose a corpus-based approach to musical parsing as a serious alternative to a Gestalt-based approach, we should address the question of how any structure can be acquired if we do not have any structured pieces in our corpus to start with. With an already analyzed corpus, we can at best simulate adult music perception -- as with an analyzed corpus of natural language (see Bod 1998). We conjecture that the acquisition of a structured corpus may be the result of a bootstrapping process where the discovery of recurrent patterns and distributional regularities plays an important role. As soon as a sequence of notes appears more than once, it may be hypothesized as a group, and may be used as a productive unit to analyze new pieces. The frequency with which a pattern occurs is used to decide between conflicting groups. Much research in unsupervised language learning is concerned with bootstrapping syntactic structure on the basis of pattern similarity and statistics from large (unannotated) language corpora (e.g. Finch & Chater 1994; Brent and Cartwright 1996; van Zaanen 2000).
One of our future goals is to investigate whether such unsupervised learning techniques carry over to bootstrapping musical structure, and whether the learned structure corresponds to the structure as perceived by human listeners. On the other hand, there is already a considerable amount of work on unsupervised musical pattern induction (e.g. Cope 1990; Crawford et al. 1998; Rolland & Ganascia 2000). We hope to assess these models, along with unsupervised models of natural language learning, for the task of bootstrapping structure in a large musical corpus. Once an initial corpus of musical patterns has been bootstrapped, these patterns can be used by our probabilistic models to more efficiently parse new pieces. Unsupervised methods would then need to be invoked only for completely new sequences of notes that have never appeared before. The exact interplay between unsupervised and supervised (or memory-based) aspects of musical parsing awaits further investigation.

5. Conclusion

We have shown that probabilistic parsing models from Natural Language Processing can be successfully applied to musical parsing. We have tested three models that parse musical pieces by combining fragments from structures of previously encountered pieces. In cases of ambiguity, these models compute the analysis that can be considered the most probable one on the basis of the occurrence frequencies of the fragments. We developed a new parser which combines two of these techniques (i.e. the Markov grammar technique and the DOP technique), and which can correctly predict up to 85.9% of the phrases for a test set of 1,000 folksongs from the Essen Folksong Collection. We hope that our results may serve as a baseline for other computational models of music analysis. Our parser may also be used to speed up the time-consuming annotation of newly collected folksongs, thereby contributing to the creation of larger musical databases in computer-assisted musicology.

References

E. Black, S. Abney, D. Flickinger, C. Gdaniec, R. Grishman, P. Harrison, D. Hindle, R. Ingria, F. Jelinek, J. Klavans, M. Liberman, M. Marcus, S. Roukos, B. Santorini and T. Strzalkowski, A Procedure for Quantitatively Comparing the Syntactic Coverage of English, Proceedings DARPA Speech and Natural Language Workshop, Pacific Grove, Morgan Kaufmann.
R. Bod, Using an Annotated Language Corpus as a Virtual Stochastic Grammar. Proceedings AAAI'93, Morgan Kaufmann, Menlo Park.
R. Bod, Beyond Grammar: An Experience-Based Theory of Language, Stanford, CSLI Publications (distributed by Cambridge University Press).
R. Bod, 2001a. What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy? Proceedings ACL'2001, Toulouse, France.
R. Bod, 2001b. Evidence against the Gestalt Principles in Music. Proceedings International Computer Music Conference 2001 (ICMC'2001), Havana, Cuba. (to appear in September 2001)
T. Booth, Probabilistic Representation of Formal Languages, Tenth Annual IEEE Symposium on Switching and Automata Theory.
M. Brent and T. Cartwright, Distributional Regularity and Phonotactic Constraints are Useful for Segmentation, Cognition 61.
E. Cambouropoulos, A Formal Theory for the Discovery of Local Boundaries in a Melodic Surface. Proceedings of the Troisièmes Journées d'Informatique Musicale (JIM-96), Caen, France.
E. Cambouropoulos, Musical Rhythm: A Formal Model for Determining Local Boundaries, Accents and Meter in a Melodic Surface, in M. Leman (ed.), Music, Gestalt and Computing - Studies in Systematic and Cognitive Musicology, Berlin, Springer-Verlag.
E. Cambouropoulos, Towards a General Computational Theory of Musical Structure, Ph.D. thesis, University of Edinburgh, UK.
E. Cambouropoulos, T. Crawford and C. Iliopoulos, Pattern Processing in Melodic Sequences: Challenges, Caveats and Prospects. Computers and the Humanities 35.
E. Charniak, Statistical Language Learning, Cambridge, The MIT Press.
E. Charniak, Tree-bank Grammars, Proceedings AAAI-96, Menlo Park, Ca.
E. Charniak, Statistical Techniques for Natural Language Parsing, AI Magazine, Winter 1997.
E. Charniak, A Maximum-Entropy-Inspired Parser. Proceedings ANLP-NAACL'2000, Seattle, Washington.
K. Church and W. Gale, A Comparison of the Enhanced Good-Turing and Deleted Estimation Methods for Estimating Probabilities of English Bigrams, Computer Speech and Language 5.
M. Collins, Head-Driven Statistical Models for Natural Language Parsing, Ph.D. thesis, University of Pennsylvania, PA.
M. Collins, Discriminative Reranking for Natural Language Parsing, Proceedings ICML-2000, Stanford, Ca.
D. Cope, Pattern-Matching as an Engine for the Computer Simulation of Musical Style, Proceedings ICMC'1990, Glasgow, UK.
R. Crawford, C. Iliopoulos, and R. Raman, String Matching Techniques for Musical Similarity and Melodic Recognition, Computing in Musicology 11.
S. Finch and N. Chater, Distributional Bootstrapping: From Word Class to Proto-Sentence, Proceedings 16th Annual Cognitive Science Society, Hillsdale, Lawrence Erlbaum.
I. Good, The Population Frequencies of Species and the Estimation of Population Parameters, Biometrika 40.
S. Handel, Listening. An Introduction to the Perception of Auditory Events. Cambridge, The MIT Press.
F. Lerdahl and R. Jackendoff, A Generative Theory of Tonal Music. Cambridge, The MIT Press.
H. Longuet-Higgins, Perception of Melodies. Nature 263, October 21.
H. Longuet-Higgins and C. Lee, The Rhythmic Interpretation of Monophonic Music. In: Mental Processes: Studies in Cognitive Science, Cambridge, The MIT Press.
C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. Cambridge, The MIT Press.
M. Marcus, B. Santorini and M. Marcinkiewicz, Building a Large Annotated Corpus of English: the Penn Treebank, Computational Linguistics 19(2).
E. Narmour, The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model, The University of Chicago Press, Chicago.
E. Narmour, The Analysis and Cognition of Melodic Complexity, The University of Chicago Press, Chicago.
W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C. Cambridge University Press.
P. Rolland and J. Ganascia, Musical Pattern Extraction and Similarity Assessment, in E. Miranda (ed.), Readings in Music and Artificial Intelligence, Harwood Academic Publishers.
H. Schaffrath, Repräsentation einstimmiger Melodien: computerunterstützte Analyse und Musikdatenbanken. In B. Enders and S. Hanheide (eds.), Neue Musiktechnologie, Mainz, B. Schott's Söhne.
H. Schaffrath, The Essen Folksong Collection in the Humdrum Kern Format. D. Huron (ed.). Menlo Park, CA: Center for Computer Assisted Research in the Humanities.
E. Selfridge-Field, The Essen Musical Data Package. Menlo Park, California: Center for Computer Assisted Research in the Humanities (CCARH).
S. Seneff, TINA: A Natural Language System for Spoken Language Applications. Computational Linguistics 18(1).
K. Sima'an, Tree-gram Parsing: Lexical Dependencies and Structural Relations, Proceedings ACL'2000, Hong Kong, China.
J. Tenney and L. Polansky, Temporal Gestalt Perception in Music, Journal of Music Theory 24.
M. Wertheimer, Untersuchungen zur Lehre von der Gestalt. Psychologische Forschung 4.
C. Wetherell, Probabilistic Languages: A Review and Some Open Questions, Computing Surveys 12(4).
M. van Zaanen, Bootstrapping Structure and Recursion Using Alignment-Based Learning, Proceedings International Conference on Machine Learning (ICML'2000), Stanford, California.


More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

AutoChorale An Automatic Music Generator. Jack Mi, Zhengtao Jin

AutoChorale An Automatic Music Generator. Jack Mi, Zhengtao Jin AutoChorale An Automatic Music Generator Jack Mi, Zhengtao Jin 1 Introduction Music is a fascinating form of human expression based on a complex system. Being able to automatically compose music that both

More information

Structure and Interpretation of Rhythm and Timing 1

Structure and Interpretation of Rhythm and Timing 1 henkjan honing Structure and Interpretation of Rhythm and Timing Rhythm, as it is performed and perceived, is only sparingly addressed in music theory. Eisting theories of rhythmic structure are often

More information

Transition Networks. Chapter 5

Transition Networks. Chapter 5 Chapter 5 Transition Networks Transition networks (TN) are made up of a set of finite automata and represented within a graph system. The edges indicate transitions and the nodes the states of the single

More information

Probabilist modeling of musical chord sequences for music analysis

Probabilist modeling of musical chord sequences for music analysis Probabilist modeling of musical chord sequences for music analysis Christophe Hauser January 29, 2009 1 INTRODUCTION Computer and network technologies have improved consequently over the last years. Technology

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;

More information

CPU Bach: An Automatic Chorale Harmonization System

CPU Bach: An Automatic Chorale Harmonization System CPU Bach: An Automatic Chorale Harmonization System Matt Hanlon mhanlon@fas Tim Ledlie ledlie@fas January 15, 2002 Abstract We present an automated system for the harmonization of fourpart chorales in

More information

TOWARDS STRUCTURAL ALIGNMENT OF FOLK SONGS

TOWARDS STRUCTURAL ALIGNMENT OF FOLK SONGS TOWARDS STRUCTURAL ALIGNMENT OF FOLK SONGS Jörg Garbers and Frans Wiering Utrecht University Department of Information and Computing Sciences {garbers,frans.wiering}@cs.uu.nl ABSTRACT We describe an alignment-based

More information

MUSICAL STRUCTURAL ANALYSIS DATABASE BASED ON GTTM

MUSICAL STRUCTURAL ANALYSIS DATABASE BASED ON GTTM MUSICAL STRUCTURAL ANALYSIS DATABASE BASED ON GTTM Masatoshi Hamanaka Keiji Hirata Satoshi Tojo Kyoto University Future University Hakodate JAIST masatosh@kuhp.kyoto-u.ac.jp hirata@fun.ac.jp tojo@jaist.ac.jp

More information

Quarterly Progress and Status Report. Perception of just noticeable time displacement of a tone presented in a metrical sequence at different tempos

Quarterly Progress and Status Report. Perception of just noticeable time displacement of a tone presented in a metrical sequence at different tempos Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Perception of just noticeable time displacement of a tone presented in a metrical sequence at different tempos Friberg, A. and Sundberg,

More information

MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations

MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations MELONET I: Neural Nets for Inventing Baroque-Style Chorale Variations Dominik Hornel dominik@ira.uka.de Institut fur Logik, Komplexitat und Deduktionssysteme Universitat Fridericiana Karlsruhe (TH) Am

More information

EIGENVECTOR-BASED RELATIONAL MOTIF DISCOVERY

EIGENVECTOR-BASED RELATIONAL MOTIF DISCOVERY EIGENVECTOR-BASED RELATIONAL MOTIF DISCOVERY Alberto Pinto Università degli Studi di Milano Dipartimento di Informatica e Comunicazione Via Comelico 39/41, I-20135 Milano, Italy pinto@dico.unimi.it ABSTRACT

More information

Measurement of overtone frequencies of a toy piano and perception of its pitch

Measurement of overtone frequencies of a toy piano and perception of its pitch Measurement of overtone frequencies of a toy piano and perception of its pitch PACS: 43.75.Mn ABSTRACT Akira Nishimura Department of Media and Cultural Studies, Tokyo University of Information Sciences,

More information

Evolutionary jazz improvisation and harmony system: A new jazz improvisation and harmony system

Evolutionary jazz improvisation and harmony system: A new jazz improvisation and harmony system Performa 9 Conference on Performance Studies University of Aveiro, May 29 Evolutionary jazz improvisation and harmony system: A new jazz improvisation and harmony system Kjell Bäckman, IT University, Art

More information

Efficient Processing the Braille Music Notation

Efficient Processing the Braille Music Notation Efficient Processing the Braille Music Notation Tomasz Sitarek and Wladyslaw Homenda Faculty of Mathematics and Information Science Warsaw University of Technology Plac Politechniki 1, 00-660 Warsaw, Poland

More information

Characterization and improvement of unpatterned wafer defect review on SEMs

Characterization and improvement of unpatterned wafer defect review on SEMs Characterization and improvement of unpatterned wafer defect review on SEMs Alan S. Parkes *, Zane Marek ** JEOL USA, Inc. 11 Dearborn Road, Peabody, MA 01960 ABSTRACT Defect Scatter Analysis (DSA) provides

More information

Early Applications of Information Theory to Music

Early Applications of Information Theory to Music Early Applications of Information Theory to Music Marcus T. Pearce Centre for Cognition, Computation and Culture, Goldsmiths College, University of London, New Cross, London SE14 6NW m.pearce@gold.ac.uk

More information

The Human, the Mechanical, and the Spaces in between: Explorations in Human-Robotic Musical Improvisation

The Human, the Mechanical, and the Spaces in between: Explorations in Human-Robotic Musical Improvisation Musical Metacreation: Papers from the 2013 AIIDE Workshop (WS-13-22) The Human, the Mechanical, and the Spaces in between: Explorations in Human-Robotic Musical Improvisation Scott Barton Worcester Polytechnic

More information

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM

A QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM A QUER B EAMPLE MUSIC RETRIEVAL ALGORITHM H. HARB AND L. CHEN Maths-Info department, Ecole Centrale de Lyon. 36, av. Guy de Collongue, 69134, Ecully, France, EUROPE E-mail: {hadi.harb, liming.chen}@ec-lyon.fr

More information

Toward an analysis of polyphonic music in the textual symbolic segmentation

Toward an analysis of polyphonic music in the textual symbolic segmentation Toward an analysis of polyphonic music in the textual symbolic segmentation MICHELE DELLA VENTURA Department of Technology Music Academy Studio Musica Via Terraglio, 81 TREVISO (TV) 31100 Italy dellaventura.michele@tin.it

More information

A Case Based Approach to the Generation of Musical Expression

A Case Based Approach to the Generation of Musical Expression A Case Based Approach to the Generation of Musical Expression Taizan Suzuki Takenobu Tokunaga Hozumi Tanaka Department of Computer Science Tokyo Institute of Technology 2-12-1, Oookayama, Meguro, Tokyo

More information

On Interpreting Bach. Purpose. Assumptions. Results

On Interpreting Bach. Purpose. Assumptions. Results Purpose On Interpreting Bach H. C. Longuet-Higgins M. J. Steedman To develop a formally precise model of the cognitive processes involved in the comprehension of classical melodies To devise a set of rules

More information

AUTOMATIC MELODIC REDUCTION USING A SUPERVISED PROBABILISTIC CONTEXT-FREE GRAMMAR

AUTOMATIC MELODIC REDUCTION USING A SUPERVISED PROBABILISTIC CONTEXT-FREE GRAMMAR AUTOMATIC MELODIC REDUCTION USING A SUPERVISED PROBABILISTIC CONTEXT-FREE GRAMMAR Ryan Groves groves.ryan@gmail.com ABSTRACT This research explores a Natural Language Processing technique utilized for

More information