Using Natural Language Processing Techniques for Musical Parsing
RENS BOD
School of Computing, University of Leeds, Leeds LS2 9JT, UK, and Department of Computational Linguistics, University of Amsterdam, Spuistraat 134, 1012 VB Amsterdam, Holland

Abstract. We investigate whether probabilistic parsing techniques from Natural Language Processing (NLP) can be used for musical parsing. As in natural language, a piece of music can be segmented into groups or phrases which can be conveniently represented by a phrase-structure tree (Longuet-Higgins 1976; Tenney & Polansky 1980; Lerdahl & Jackendoff 1983). One of the main challenges for musical parsers is the problem of ambiguity: several different phrase structures may be compatible with a given musical sequence, while a listener typically hears only one structure. In this paper we consider three parsing techniques from the NLP literature that use a probabilistic heuristic to solve ambiguity. We present a new parser which combines two of these techniques, and which can correctly predict up to 85.9% of the phrases for a test set of 1,000 folksongs from the Essen Folksong Collection (Schaffrath 1995). To the best of our knowledge, this work presents the first parsing experiments with the Essen Folksong Collection, which we hope may be used as a baseline for other approaches. Our parser may also be used to speed up the time-consuming annotation of newly collected folksongs, thereby contributing to the creation of larger musical databases in computer-assisted musicology.

Keywords. computer-assisted musicology, natural language processing, music perception, probabilistic grammars, musical databases

1. Introduction

We investigate whether probabilistic parsing techniques from Natural Language Processing (NLP) can be used for musical parsing.
As in natural language, a listener segments a sequence of notes into groups or phrases that form a grouping structure for the whole piece (Longuet-Higgins 1976; Tenney & Polansky 1980; Lerdahl & Jackendoff 1983). For example, according to Lerdahl & Jackendoff (1983: 37) a listener hears the following grouping structure for the first few bars of melody in the Mozart G Minor Symphony, K. 550.
Figure 1. Grouping structure for the opening theme of Mozart's G Minor Symphony

Each group is represented by a slur beneath the musical notation. A slur enclosed within a slur indicates that a group is heard as part of a larger group. This hierarchical structure of melody can, without loss of generality, also be represented by a phrase-structure tree, as in figure 2.

Figure 2. Tree structure for the grouping structure in figure 1

Although visually quite different, it is easy to see that the two representations in figures 1 and 2 are mathematically equivalent. Note the analogy with phrase-structure trees in linguistics: a tree describes how parts of the input combine into groups or constituents and how these constituents combine into a representation for the whole input. Apart from this analogy, there is also an important difference: while the nodes in a linguistic tree structure are typically labeled with syntactic categories such as S, NP, VP etc., musical tree structures are unlabeled. This is because in language there are syntactic constraints on how words can be combined into larger constituents (e.g. in English a determiner can be combined with a noun only if it precedes that noun, which is expressed by the rule NP -> Det N), while in music there are no such restrictions: in principle any note may be combined with any other note. This makes the problem of ambiguity in music much harder than in language. Longuet-Higgins and Lee (1987) note that "Any given sequence of note values is in principle infinitely ambiguous, but this ambiguity is seldom apparent to the listener."
To give an example of this ambiguity, the first few bars of Mozart's G Minor Symphony could also be assigned the following, alternative grouping structure (among the many other possible structures):

Figure 3. Alternative grouping structure for the opening theme of Mozart's G Minor Symphony

While this alternative structure is possible in the sense that it can be perceived, it does not correspond to the structure that is actually perceived by a human listener. There is thus an important research question as to how to select the perceived tree structure from the total, possibly infinite set of tree structures of a musical input.

In the field of natural language processing (NLP), the use of probabilistic corpus-based parsing techniques has become increasingly influential for solving ambiguity (see Charniak 1997 or Manning & Schütze 1999 for an overview). Instead of using a predefined set of rules, a probabilistic corpus-based parser learns how to parse new input by generalizing from examples of previously annotated data; in case of ambiguity, such a parser computes the most probable phrase structure for a given input. State-of-the-art probabilistic parsers, which use the Wall Street Journal corpus in the Penn Treebank (Marcus et al. 1993) as a test domain, obtain around 90% correctly predicted phrases (e.g. Collins 2000; Charniak 2000; Bod 2001a). With the current availability of large annotated musical corpora, such as the Essen Folksong Collection (Schaffrath 1995), we may wonder whether such probabilistic corpus-based parsing techniques carry over to musical parsing. In this paper we will test the usefulness of three probabilistic parsing techniques for music: the Treebank grammar technique of Charniak (1996), the Markov grammar technique of Collins (1999), and the Data-Oriented Parsing (DOP) technique of Bod (1998).
We develop a new parser which combines two of these techniques, and which correctly predicts up to 85.9% of the phrases for a held-out test set of 1,000 folksongs from the Essen Folksong Collection (Schaffrath 1995). To the best of our knowledge, this paper contains
the first parsing experiments on the Essen Folksong Collection; moreover, it also contains the first parsing experiments on a musical test set of non-trivial size. In the following we first describe the Essen Folksong Collection, after which we test a number of probabilistic parsing models on this collection. Since no other parsing results on the Essen Folksong Collection are available, we will only informally compare our technique with other approaches that aim at solving ambiguity in music.

2. The Essen Folksong Collection

The Essen Folksong Collection provides a large sample of (mostly) European folksongs that have been collected and encoded under the supervision of Helmut Schaffrath at the University of Essen (see Schaffrath 1993, 1995; Selfridge-Field 1995). Each of the 6,251 folksongs in the Essen Folksong Collection is annotated with the Essen Associative Code (ESAC), which includes pitch and duration information, meter signatures and explicit phrase markers. The presence of phrase markers makes the Essen Folksong Collection a unique test case for musical parsers.

The pitch encodings in the Essen Folksong Collection resemble "solfege": scale degree numbers are used to replace the movable syllables "do", "re", "mi", etc. Thus 1 corresponds to "do", 2 corresponds to "re", etc. Chromatic alterations are represented by adding either a "#" or a "b" after the number. The plus ("+") and minus ("-") signs are added before the number if a note falls respectively above or below the principal octave (thus -1, 1 and +1 all refer to "do", but in different octaves). Duration is represented by adding a period or an underscore after the number. A period (".") increases duration by 50% and an underscore ("_") increases duration by 100%; more than one underscore may be added after each number. If a number has no duration indicator, its duration corresponds to the smallest value. A pause is represented by 0, possibly followed by duration indicators.
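The duration conventions can be made concrete with a small sketch in Python. The function name is our own, and we read stacked modifiers as applying in sequence (each "." multiplying the current duration by 1.5, each "_" doubling it), which is our assumption for the rare multi-modifier tokens:

```python
def esac_duration(token, base=1.0):
    """Duration of a single ESAC note token such as '1', '3.' or '-5__'.

    base is the smallest duration value.  A period adds 50% and an
    underscore adds 100%; stacked modifiers are read as applying in
    sequence (our own assumption).  Pitch characters ('#', 'b', '+',
    '-', digits) carry no duration information and are ignored.
    """
    dur = base
    for ch in token:
        if ch == '.':
            dur *= 1.5  # period: increase duration by 50%
        elif ch == '_':
            dur *= 2.0  # underscore: increase duration by 100%
    return dur

print(esac_duration('2'))     # no indicator: the smallest value, 1.0
print(esac_duration('3.'))    # 1.5
print(esac_duration('-5__'))  # 4.0 under our stacked-underscore reading
```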
No loudness or timbre indicators are used in ESAC. Thus, the opening theme of Mozart's G Minor Symphony in figure 1 can be encoded in ESAC as follows (since the piece is in G Minor, all notes are related to G, which corresponds to the number 1).

6b55_6b55_6b55_+3b_0_

Figure 4. ESAC encoding for the opening theme of Mozart's G Minor Symphony

ESAC uses hard returns to indicate a phrase boundary. To make the Essen annotations readable for our probabilistic parsers, we automatically convert ESAC's phrase boundary indications into bracket representations, where "(" indicates the start of a phrase and ")" the
end of a phrase. The phrase structures in figures 1 and 2 would thus correspond to the following bracket representation.

( ( (6b55_) (6b55_) ) (6b55_+3b_0_) )

Figure 5. Bracket representation for the phrase structures in figures 1 and 2 of the opening theme of Mozart's G Minor Symphony

The following figure gives an example of an encoding of an actual folksong from the Essen Folksong Collection ("Schlaf Kindlein feste") converted to our bracket representation:

(3_221_-5)( _-5)( )( _)(3_221_-5_)

Figure 6. Bracket representation for folksong K0029, "Schlaf Kindlein feste"

It is important to note that the annotations in the Essen Folksong Collection do not contain hierarchical or nested structures. Unlike the examples in figures 1, 2 and 3, the Essen Folksong annotations represent the basic phrases (or "segmentations") only, and neglect any phrase-internal or phrase-external structure (such as motives, periods and sections). Although this results in rather simple annotations, we will see that the Essen Collection is still a very tough test case for our parsers. The following example shows that many phrases in the Essen Folksong Collection could have been further analyzed in terms of subphrases (e.g. the fifth phrase into three very similar subphrases).

(3 2 1_1_-5_)(-5_3_3_2_2_1_1_-5_)(-5_1_2_3_1_4 2_)(1_-7_1_2_-5_3 1_)
(3_1-5_3_1_1_-5_3_1-5_)(-5_1_2_3_1_4_3_223_1 1_0_)

Figure 7. Bracket representation for folksong K0690, "Ruru Rinneken"

A more extreme case of the shallowness of the Essen Folksong Collection is provided by folksong Z0147 ("Besenbinders Tochter und Kachelmachers Sohn"):

(5_4#_5_3_1 1_3_2_1#_2_-7_-5.)(3_5_4#_5_3_1 1_3_ 221#_2_-7_-5.)
(-5_-5_-5_-5-5-5_4 4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_-7_1_332_1#_2_3_1 0 )
(-5_-5_-5_-5_444_4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1.)
(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1 1_)(3_5_4#_5_3_1_-7_1_3_2_1#_2_3_1 1_0_)
(-5_-5_-5_-5_444_4_3_2_2_3_4_5 +1_)(3_5_4#_5_3_1_1_1_3_2_1#_2_3_1 )
Figure 8. Bracket representation for folksong Z0147, "Besenbinders Tochter und Kachelmachers Sohn"

We believe that every phrase in this folksong could have been further analyzed into subphrases. Yet, the annotation in figure 8 is not wrong; it simply represents only the most basic phrase structure of the piece. We want to emphasize that for our experiments in section 3 we did not add (or modify) any structure in the Essen annotations. As we will see, despite (or perhaps due to) its shallow annotations, the Essen Folksong Collection is quite an interesting test case.

This brings us to the problem of evaluation. To evaluate our probabilistic parsers for music, we employed the blind testing method which has been widely used in evaluating natural language parsers (see Manning & Schütze 1999). This method dictates that a collection of annotated data is randomly divided into a training set and a test set, where the annotations in the training set are used to "train" the parser, while the unannotated strings in the test set are used as input to test the parser. The degree to which the predicted structures for the test set strings match the correct structures in the test set is a measure of the accuracy of the parser. For our experiments in section 3, we randomly divided the Essen Folksong Collection into a training set of 5,251 folksongs and a test set of 1,000 folksongs.

There is an important question as to what kind of evaluation measure is most appropriate to compare the phrase structures proposed by the parser with the correct phrase structures in the test set. A widely used evaluation scheme in natural language parsing is the PARSEVAL scheme, which is based on the notions of precision and recall (see Black et al. 1991).
PARSEVAL compares a proposed parse P with the corresponding test set parse T as follows:

Precision = (# correct phrases in P) / (# phrases in P)

Recall = (# correct phrases in P) / (# phrases in T)

A phrase is correct if both the start and the end of the phrase are correctly predicted. Note that these measures "punish" a parser which assigns too many phrases to a folksong: for example, an extremely overgenerating parser which assigns phrases to any combination of notes would trivially include all correct phrases, resulting in an excellent recall, but its precision would be very low. On the other hand, a very conservative parser which predicts
very few, though correct, phrases will receive a high precision, but its recall will be low. A good parser will thus need to obtain both a high precision and a high recall. (It probably goes without saying that for computing the precision and recall over all test set strings, one needs to divide the total number of correctly predicted phrases in all proposed parses P by the total number of phrases in, respectively, all parses P and T.) The precision and recall scores are often combined into a single measure of performance, known as the F-score (see Manning & Schütze 1999):

F-score = (2 × Precision × Recall) / (Precision + Recall)

We will use these three measures of Precision, Recall and F-score to quantitatively evaluate our probabilistic parsing models for music.

As a final pre-processing step, we (automatically) added to each phrase in the folksong the label "P" and to each whole song the label "S", so as to obtain conventional parse trees. Thus the annotation in figure 6 becomes:

S( P(3_221_-5) P( _-5) P( ) P( _) P(3_221_-5_) )

Figure 9. Labeled-bracketing annotation for the structure in figure 6

Note that this labeled-bracketing annotation is equivalent to the following visual tree representation.

S
|-- P: 3_221_-5
|-- P: _-5
|-- P:
|-- P: _
`-- P: 3_221_-5_

Figure 10. Visual tree structure for the labeled-bracketing annotation in figure 9
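The Precision, Recall and F-score measures defined above can be sketched in a few lines of Python; representing each parse as a set of (start, end) phrase spans, one set per folksong, is our own illustrative choice:

```python
def parseval(proposed, gold):
    """PARSEVAL precision, recall and F-score over a test set.

    proposed and gold are lists (one entry per folksong) of sets of
    (start, end) phrase spans; a phrase is correct when both its start
    and its end match a gold phrase.  As described in the text, the
    counts are totalled over all songs before dividing.
    """
    correct = sum(len(p & g) for p, g in zip(proposed, gold))
    n_proposed = sum(len(p) for p in proposed)
    n_gold = sum(len(g) for g in gold)
    precision = correct / n_proposed
    recall = correct / n_gold
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Toy test set of one song: the parser proposes three phrases,
# two of which match the gold annotation.
p, r, f = parseval([{(0, 3), (4, 7), (2, 5)}], [{(0, 3), (4, 7)}])
print(p, r, f)  # precision 2/3, recall 1.0, F-score 0.8
```

Note how the overgenerating third span (2, 5) lowers precision but leaves recall untouched, mirroring the trade-off discussed above.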
The advantage of labeled-bracketing annotations is that we can now directly apply existing probabilistic parsing models to the Essen Folksong Collection.

3. Parsing the Essen Folksong Collection

In this section, we test three probabilistic parsing models from the literature on the Essen Folksong Collection: the Treebank grammar technique of Charniak (1996), the Markov grammar technique of Seneff (1992) and Collins (1999), and the Data-Oriented Parsing (DOP) technique of Bod (1993, 1998). Unless stated differently, we used the same random split of the Essen Folksong Collection into a training set of 5,251 folksongs and a test set of 1,000 folksongs.

3.1 The Treebank Grammar Technique

The Treebank grammar technique is an extremely simple learning technique: it reads all context-free rewrite rules from the training set structures, and assigns each rule a probability proportional to its frequency in the training set. For example, the following context-free rules can be extracted from the structure in figure 9:

S -> PPPPP
P -> 3_221_-5
P -> _-5
P ->
P -> _
P -> 3_221_-5_

Next, each rewrite rule is assigned a probability by dividing the number of occurrences of a particular rule in the training set by the total number of occurrences of rules that expand the same nonterminal as the particular rule. For instance, if we take the folksong in figure 9 as our only training data, then the probability of the rule P -> 3_221_-5 is equal to 1/5, since this rule occurs once among a total of 5 rules that expand the nonterminal P. A Treebank grammar extracted in this way from the training set corresponds to a so-called Probabilistic Context-Free Grammar or PCFG (Booth 1969). A crucial assumption underlying PCFGs is that the context-free rules are statistically independent. Thus, given the probabilities of the individual rules, we can calculate the probability of a parse tree by taking the product of the probabilities of each rule used therein.
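The rule extraction and relative-frequency estimation just described can be sketched as follows; the tree encoding (nested (label, children) pairs) and the note strings in the example are our own illustrative choices:

```python
from collections import Counter

def treebank_grammar(trees):
    """Extract context-free rules and their relative-frequency
    probabilities from a set of parse trees.  A tree is a
    (label, children) pair whose children are subtrees or note strings.
    """
    rule_counts = Counter()

    def collect(node):
        label, children = node
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(label, rhs)] += 1
        for child in children:
            if not isinstance(child, str):
                collect(child)

    for tree in trees:
        collect(tree)

    # Probability of a rule = its count divided by the total count of
    # rules expanding the same nonterminal.
    lhs_totals = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}

# A figure-9-style tree: one S over five P-phrases (hypothetical notes).
tree = ('S', [('P', ['3_221_-5']), ('P', ['1_2_3']), ('P', ['5_5_1']),
              ('P', ['2_2']), ('P', ['3_221_-5_'])])
grammar = treebank_grammar([tree])
print(grammar[('P', ('3_221_-5',))])              # 1/5, as in the text
print(grammar[('S', ('P', 'P', 'P', 'P', 'P'))])  # 1.0
```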
PCFGs have been extensively studied in the literature (cf. Wetherell 1980; Charniak 1993), and the efficient parsing algorithms that exist for Context-Free Grammars carry over to PCFGs (see Charniak 1993 or Manning & Schütze 1999 for the relevant algorithms).
Any probabilistic grammar extracted from a training set faces the problem of data-sparseness: many of the rules in the training set are so infrequent that their observed probabilities are very bad estimates of their true probabilities. A widely used method to cope with this problem is the Good-Turing method (Good 1953). In general, Good-Turing estimates the expected population frequency f* of a type by adjusting its observed sample frequency f. In order to estimate f*, Good-Turing uses an additional notion, n_f, defined as the number of types which occur f times in an observed sample. Thus, n_f can be understood as the frequency of frequency f. The Good-Turing estimator uses this extra information for computing the adjusted frequency f* as

f* = (f + 1) n_{f+1} / n_f

We thus compute the probabilities of our context-free rules in the Treebank grammar from their adjusted frequencies rather than from their raw observed frequencies. For an instructive paper on Good-Turing, together with a proof of the formula, see Church & Gale (1991).

The Treebank grammar that was obtained in this way from the 5,251 training folksongs was used to parse the 1,000 folksongs in the test set. We computed for each test folksong the most probable parse using a standard best-first parsing algorithm based on Viterbi optimization (see Charniak 1993; Manning & Schütze 1999). Although we may already foresee that a Treebank grammar is doomed to misparse folksongs if it does not find the correct rule in the training set, it will serve as the basis for our more sophisticated parsing techniques in the following sections. Using the evaluation measures given in section 2, our Treebank grammar obtained a precision of 68.7%, a recall of 3.4%, and an F-score of 6.5%.
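The Good-Turing adjustment described above can be sketched as follows. This is a minimal, unsmoothed version; practical implementations smooth the n_f counts themselves (see Church & Gale 1991):

```python
from collections import Counter

def good_turing(rule_freqs):
    """Adjusted frequencies f* = (f + 1) * n_{f+1} / n_f, where n_f is
    the number of rule types observed f times.  In this naive form the
    highest observed frequency is adjusted to zero (since n_{f+1} = 0),
    one of the problems that smoothed implementations correct.
    """
    n = Counter(rule_freqs.values())  # n[f] = number of types seen f times
    return {rule: (f + 1) * n.get(f + 1, 0) / n[f]
            for rule, f in rule_freqs.items()}

# Toy rule counts: three rules seen once, one rule seen twice.
adjusted = good_turing({'r1': 1, 'r2': 1, 'r3': 1, 'r4': 2})
print(adjusted['r1'])  # (1+1) * n_2 / n_1 = 2 * 1 / 3
```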
Although the precision score may seem reasonable, the recall score is extremely low, which indicates that the Treebank grammar technique is a very conservative learner: it predicts very few phrases from the total number of phrases in the Essen Folksong Collection, resulting in a very low F-score. As noted, one of the problems with the Treebank grammar technique is that it only learns those context-free rules that literally occur in the training set, which is evidently not a very robust technique for musical parsing (while it has been shown to perform quite well in natural language parsing -- see Charniak 1996). We will see, however, that the results improve significantly if we slightly loosen the way of extracting rules from the training set.

3.2 The Markov Grammar Technique

A technique which overcomes the conservativity of Treebank grammars is the Markov grammar technique (Seneff 1992; Collins 1999). While a Treebank grammar can only assign probabilities to context-free rules that have been seen in the training set, a Markov
grammar can in principle assign a probability to any possible context-free rule, thus resulting in a more robust model. This is accomplished by decomposing a rule and its probability by a Markov process (see Collins 1999: 44-48). For example, a third-order Markov process estimates the probability p of a rule P -> 1 2 3 4 5 by:

p(P -> 1 2 3 4 5) = p(1) p(2 | 1) p(3 | 1, 2) p(4 | 1, 2, 3) p(5 | 2, 3, 4) p(end | 3, 4, 5)

The conditional probability p(end | 3, 4, 5) encodes the probability that a rule ends after the notes 3, 4, 5. Thus even if the rule P -> 1 2 3 4 5 does not literally occur in the training set, we can still estimate its probability by using a Markov history of three notes. The extension to larger Markov histories follows from an obvious generalization of the above example.

However, a Markov grammar also suffers from data-sparseness: we may get low counts, including zero counts, for many Markov histories. Zero counts are especially problematic: if one of the conditional probabilities in the formula above has a zero occurrence in the training set, then the whole rule is assigned a zero probability. A widely used technique to solve the data-sparseness problem in Markov models is the linear interpolation technique (see Manning & Schütze 1999). This technique smooths a Markov history by taking into account its shorter histories. Let n1, n2 and n3 denote three notes; then the conditional probability p(n1 | n2, n3) is smoothed ("interpolated") as

p(n1 | n2, n3) = λ1 p(n1) + λ2 p(n1 | n2) + λ3 p(n1 | n2, n3)

where 0 ≤ λi ≤ 1 and λ1 + λ2 + λ3 = 1. These λ-weights may be set by hand, but in general one wants to find the combination of weights λi which works best. A simple algorithm that finds the optimal weights is Powell's algorithm (see Press et al. 1988), which is also discussed in Manning & Schütze (1999: 218).
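The interpolation formula translates directly into code. In the sketch below, the unigram, bigram and trigram tables are hypothetical dicts holding relative frequencies estimated from a training set:

```python
def interpolate(n1, n2, n3, p_uni, p_bi, p_tri, lambdas):
    """Smoothed conditional probability
    p(n1 | n2, n3) = l1*p(n1) + l2*p(n1 | n2) + l3*p(n1 | n2, n3),
    with unseen events contributing probability zero at their level.
    """
    l1, l2, l3 = lambdas
    return (l1 * p_uni.get(n1, 0.0)
            + l2 * p_bi.get((n1, n2), 0.0)
            + l3 * p_tri.get((n1, n2, n3), 0.0))

# Even with a zero trigram count, the smoothed estimate stays nonzero:
p = interpolate('1', '2', '3',
                p_uni={'1': 0.5},
                p_bi={('1', '2'): 0.4},
                p_tri={},                 # this trigram was never seen
                lambdas=(0.2, 0.3, 0.5))
print(p)  # 0.2*0.5 + 0.3*0.4 + 0.5*0.0 = 0.22
```

This illustrates why zero trigram counts no longer zero out a whole rule: the shorter histories carry the estimate.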
We used this algorithm to assign weights to the lambdas in the linear interpolation technique, which in turn was used to estimate the conditional probabilities in the Markov grammar technique. Furthermore, each of the probabilities p(n1), p(n1 | n2) and p(n1 | n2, n3) was not estimated directly from its observed relative frequency in the training set, but was adjusted by the Good-Turing method, just as with Treebank grammars (section 3.1). Note that the extension to larger Markov histories follows from an obvious generalization of the formulas above. The probability of a parse tree of a musical piece is computed as the product of the probabilities of the rules that partake in the parse tree, just as with Treebank grammars.

For our experiments, we used a Markov grammar with a history of four notes. This grammar obtained a precision of 63.1%, a recall of 80.2%, and an F-score of 70.6%. These results are to some extent complementary to the Treebank grammar: although the precision is
somewhat lower, the recall is (much) higher than for the Treebank grammar. Thus, while the Treebank grammar predicts too few phrases, the Markov grammar predicts (a bit) too many phrases. The combined F-score of 70.6% shows an immense improvement over the Treebank grammar technique. Experiments with higher or lower order Markov models diminished our results.

3.3 Extending the Markov Grammar Technique with the DOP Technique

Although the Markov grammar technique obtained considerably better scores than the Treebank grammar technique, it does not take into account any global context in computing the probability of a parse tree. Knowledge of global context, such as the number of phrases that occur in a folksong, is likely to be important for predicting the correct segmentations for new folksongs. In order to include global context, we conditioned on the S-rule higher in the structure in computing the probability of a P-rule. This approach corresponds to the Data-Oriented Parsing (DOP) technique (Bod 1993, 1998), which can condition on any higher or lower rule in a tree, and which has recently been integrated with the Markov grammar technique (Sima'an 2000). In the original DOP technique, any fragment seen in the training set, regardless of size, is used as a productive unit. But in the Essen Folksong Collection we have only two levels of constituent structure in each tree, which results in a much simpler probabilistic model. As an example, take again the rule P -> 1 2 3 4 5 and a higher S-rule such as S -> PPPP; a DOP-Markov model based on a history of three notes computes the (conditional) probability of this rule as:

p(P -> 1 2 3 4 5 | S -> PPPP) = p(1 | S -> PPPP) p(2 | S -> PPPP, 1) p(3 | S -> PPPP, 1, 2) p(4 | S -> PPPP, 1, 2, 3) p(5 | S -> PPPP, 2, 3, 4) p(end | S -> PPPP, 3, 4, 5)

The extension to larger histories follows from an obvious generalization of the above example.
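The DOP-Markov decomposition can be sketched generically. Here cond is a hypothetical estimator returning p(note | S-rule, history), which in practice would itself be smoothed by interpolation and Good-Turing as described above:

```python
def dop_markov_rule_prob(notes, s_rule, cond, history=3):
    """Probability of P -> notes given the S-rule above it: a product of
    one conditional per note, each conditioned on the S-rule and on up
    to `history` preceding notes, plus a final 'end' factor.
    """
    seq = list(notes) + ['end']
    prob = 1.0
    for i, symbol in enumerate(seq):
        hist = tuple(seq[max(0, i - history):i])
        prob *= cond(symbol, s_rule, hist)
    return prob

# With a constant estimator, a five-note rule yields six factors
# (five notes plus the 'end' symbol): 0.5 ** 6 = 0.015625.
prob = dop_markov_rule_prob(['1', '2', '3', '4', '5'], 'S -> PPPP',
                            lambda n, s, h: 0.5)
print(prob)
```

Note that for the 'end' symbol the history is the last three notes (3, 4, 5), matching the final factor of the formula above.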
For our experiments, we used a history of four notes, extended with the same smoothing techniques as in section 3.2 (i.e. linear interpolation combined with Good-Turing). The most probable parse of a folksong is again computed by maximizing the product of the rule probabilities that generate the folksong. Using the same training/test set division as before, this DOP-Markov parser obtained a precision of 76.6%, a recall of 85.9%, and an F-score of 81.0%. The F-score is an improvement of 10.4% over the Markov parser. Note that the DOP-Markov parser is relatively well-balanced: it is neither terribly conservative nor does it predict too many redundant phrases -- keeping in mind the idiosyncrasy of the Essen Folksong annotations.

While there is no reason to expect an accuracy close to 100% for the shallowly annotated Essen Folksong Collection, our results show the importance of including global context in
computing the probability of a parse. We also checked the statistical significance of our results by testing on 9 additional random splits of the Essen Folksong Collection (into training sets of 5,251 folksongs and test sets of 1,000 folksongs). On these splits, the DOP-Markov parser obtained an average F-score of 80.7% with a standard deviation of 1.9%, while the Markov parser obtained an average F-score of 70.8% with a standard deviation of 2.2%. These differences were statistically significant according to paired t-testing.

Finally, we were interested in testing the impact of the training set size on the F-score. In the following experiments we started with an initial training set of only 500 folksongs (randomly chosen from the full training set of 5,251 folksongs). We then increased the size of this initial training set by 500 folksongs each time (randomly chosen from the full training set). The test set was kept constant at 1,000 folksongs. The results are shown in table 1.

[Table 1. F-score as a function of training set size; one row per training set size, from 500 up to 5,251 folksongs in increments of 500, each with its F-score (%)]

The table shows that the F-score rapidly increases when the size of the training set is enlarged from 500 to 2,000 folksongs. The accuracy continues to increase at a lower rate if the training set is further enlarged. We may thus expect that the accuracy of our parser will further increase if we have access to larger musical corpora. This is important if we want to use our parser for the semi-automatic annotation of musical databases. Starting with an initial, relatively small set of hand-annotated pieces, our parser can use these annotations as its training set, on the basis of which the annotations for a new set of musical pieces can be predicted. The predicted annotations will need to be corrected by hand, but once we have added these corrected annotations to the training set, our parser will more accurately predict
the annotations for fresh folksongs. Table 1 suggests that the amount of human correction decreases as more training data becomes available. We thus expect that our parser can be used to speed up the time-consuming annotation of musical pieces, thereby contributing to the creation of larger databases in computer-assisted musicology.

4. Other approaches to musical parsing

There exists an extensive literature in the field of computational models of music analysis (see Cambouropoulos 1998, or Cambouropoulos et al. for an overview). Most if not all approaches to musical parsing are non-probabilistic and are based on the assumption that the perceived phrase structure of a musical piece can be predicted on the basis of a combination of low-level phenomena, such as the Gestalt phenomena of proximity and similarity, and higher-level phenomena, such as melodic parallelism and internal harmony. For example, Tenney & Polansky (1980), Lerdahl & Jackendoff (1983), Handel (1989) and Cambouropoulos (1996, 1997) use the Gestalt rules of Wertheimer (1923) to predict the low-level grouping structure of a piece: phrase boundaries preferably fall on larger time intervals, larger pitch intervals, etc. While most models also incorporate higher-level phenomena, such as melodic parallelism and harmony, these phenomena often remain unformalized. For example, Lerdahl & Jackendoff (1983) do not provide any systematic description of higher-level musical parallelism, and Narmour's Implication-Realization model (Narmour 1990, 1992) relies on factors such as meter, harmony and similarity which are not fully described by the model. As a result, these models have not been evaluated against test sets of non-trivial size, such as the Essen Folksong Collection. Typically, only very few, hand-selected passages are used to evaluate these models, which calls into question the objectivity of the results.
More important, perhaps, is the fact that the Gestalt principles, which were originally proposed for visual perception (Wertheimer 1923), do not straightforwardly carry over to music perception. Elsewhere (Bod 2001b), we have shown that more than 15% of the phrase boundaries in the Essen Folksong Collection fall before or after large pitch/time intervals (as in the folksong of figure 7), rather than on such intervals, and that phrase boundaries even appear between identical notes. This goes against the predictions of any Gestalt-based parser, which assigns phrase boundaries exactly on large intervals rather than before or after them. Moreover, we have shown in Bod (2001b) that higher-level phenomena, such as melodic parallelism and internal harmony, are not of any help in predicting the correct phrase boundaries for these 15% "exceptional" phrases. On the contrary, for almost all of these phrases (98.7%), melodic parallelism and internal harmony reinforced the incorrect predictions made by the Gestalt principles. It is noteworthy that our DOP-Markov parser, on the other hand, performed equally well on both "exceptional" phrases and "normal" phrases
(where boundaries do fall on large pitch/time intervals). While our parser is still far from perfect, we believe that a probabilistic, corpus-based approach is better suited to musical parsing, as it considers counts of any note sequence that has been observed with a certain structure, thereby taking into account the entire continuum between "exceptional" and "normal" phrases, rather than trying to capture this gradience by a few formal rules. We fully admit that a fair comparison between our parser and a Gestalt-based/parallelism-based parser must await further experimental evaluation, but we hope to have made clear that musical parsing models should be tested on large corpora of musical annotations such as the Essen Folksong Collection (otherwise "exceptional" phrases may easily remain unnoticed).

If we wish to propose a corpus-based approach to musical parsing as a serious alternative to a Gestalt-based approach, we should address the question of how any structure can be acquired if we do not have any structured pieces in our corpus to start with. With an already analyzed corpus, we can at best simulate adult music perception -- as with an analyzed corpus of natural language (see Bod 1998). We conjecture that the acquisition of a structured corpus may be the result of a bootstrapping process in which the discovery of recurrent patterns and distributional regularities plays an important role. As soon as a sequence of notes appears more than once, it may be hypothesized as a group, and may be used as a productive unit to analyze new pieces. The frequency with which a pattern occurs is used to decide between conflicting groups. Much research in unsupervised language learning is concerned with bootstrapping syntactic structure on the basis of pattern similarity and statistics from large (unannotated) language corpora (e.g. Finch & Chater 1994; Brent and Cartwright 1996; van Zaanen 2000).
One of our future goals is to investigate whether such unsupervised learning techniques carry over to bootstrapping musical structure, and whether the learned structure corresponds to the structure as perceived by human listeners. On the other hand, there is already a considerable amount of work on unsupervised musical pattern induction (e.g. Cope 1990; Crawford et al. 1998; Rolland & Ganascia 2000). We hope to assess these models, along with unsupervised models of natural language learning, for the task of bootstrapping structure in a large musical corpus. Once an initial corpus of musical patterns has been bootstrapped, these patterns can be used by our probabilistic models to parse new pieces more efficiently. Only for completely new sequences of notes that have never appeared before do unsupervised methods still need to be invoked. The exact interplay between unsupervised and supervised (or memory-based) aspects of musical parsing must await further investigation.

5. Conclusion
We have shown that probabilistic parsing models from Natural Language Processing can be successfully applied to musical parsing. We tested three models that parse musical pieces by combining fragments from the structures of previously encountered pieces. In cases of ambiguity, these models compute the analysis that is most probable on the basis of the occurrence frequencies of the fragments. We developed a new parser which combines two of these techniques (the Markov grammar technique and the DOP technique), and which correctly predicts up to 85.9% of the phrases for a test set of 1,000 folksongs from the Essen Folksong Collection. We hope that our results may serve as a baseline for other computational models of music analysis. Our parser may also be used to speed up the time-consuming annotation of newly collected folksongs, thereby contributing to the creation of larger musical databases in computer-assisted musicology.

References

E. Black, S. Abney, D. Flickinger, C. Gdaniec, R. Grishman, P. Harrison, D. Hindle, R. Ingria, F. Jelinek, J. Klavans, M. Liberman, M. Marcus, S. Roukos, B. Santorini and T. Strzalkowski, 1991. A Procedure for Quantitatively Comparing the Syntactic Coverage of English. Proceedings DARPA Speech and Natural Language Workshop, Pacific Grove, Morgan Kaufmann.
R. Bod, 1993. Using an Annotated Language Corpus as a Virtual Stochastic Grammar. Proceedings AAAI'93, Morgan Kaufmann, Menlo Park.
R. Bod, 1998. Beyond Grammar: An Experience-Based Theory of Language. Stanford, CSLI Publications (distributed by Cambridge University Press).
R. Bod, 2001a. What is the Minimal Set of Fragments that Achieves Maximal Parse Accuracy? Proceedings ACL'2001, Toulouse, France.
R. Bod, 2001b. Evidence against the Gestalt Principles in Music. Proceedings International Computer Music Conference 2001 (ICMC'2001), Havana, Cuba. (to appear in September 2001)
T. Booth, 1969. Probabilistic Representation of Formal Languages. Tenth Annual IEEE Symposium on Switching and Automata Theory.
M. Brent and T. Cartwright, 1996. Distributional Regularity and Phonotactic Constraints are Useful for Segmentation. Cognition, 61.
E. Cambouropoulos, 1996. A Formal Theory for the Discovery of Local Boundaries in a Melodic Surface. Proceedings of the Troisièmes Journées d'Informatique Musicale (JIM-96), Caen, France.
E. Cambouropoulos, 1997. Musical Rhythm: A Formal Model for Determining Local Boundaries, Accents and Meter in a Melodic Surface. In M. Leman (ed.), Music, Gestalt and Computing: Studies in Systematic and Cognitive Musicology, Berlin, Springer-Verlag.
E. Cambouropoulos, 1998. Towards a General Computational Theory of Musical Structure. Ph.D. thesis, University of Edinburgh, UK.
E. Cambouropoulos, T. Crawford and C. Iliopoulos, 2001. Pattern Processing in Melodic Sequences: Challenges, Caveats and Prospects. Computers and the Humanities 35.
E. Charniak, 1993. Statistical Language Learning. Cambridge, The MIT Press.
E. Charniak, 1996. Tree-bank Grammars. Proceedings AAAI-96, Menlo Park, Ca.
E. Charniak, 1997. Statistical Techniques for Natural Language Parsing. AI Magazine, Winter 1997.
E. Charniak, 2000. A Maximum-Entropy-Inspired Parser. Proceedings ANLP-NAACL'2000, Seattle, Washington.
K. Church and W. Gale, 1991. A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language 5.
M. Collins, 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, PA.
M. Collins, 2000. Discriminative Reranking for Natural Language Parsing. Proceedings ICML-2000, Stanford, Ca.
D. Cope, 1990. Pattern-Matching as an Engine for the Computer Simulation of Musical Style. Proceedings ICMC'1990, Glasgow, UK.
T. Crawford, C. Iliopoulos, and R. Raman, 1998. String Matching Techniques for Musical Similarity and Melodic Recognition. Computing in Musicology 11.
S. Finch and N. Chater, 1994. Distributional Bootstrapping: From Word Class to Proto-Sentence. Proceedings 16th Annual Cognitive Science Society, Hillsdale, Lawrence Erlbaum.
I. Good, 1953. The Population Frequencies of Species and the Estimation of Population Parameters. Biometrika 40.
S. Handel, 1989. Listening: An Introduction to the Perception of Auditory Events. Cambridge, The MIT Press.
F. Lerdahl and R. Jackendoff, 1983. A Generative Theory of Tonal Music. Cambridge, The MIT Press.
H. Longuet-Higgins, 1976. Perception of Melodies. Nature 263, October 21.
H. Longuet-Higgins and C. Lee, 1987. The Rhythmic Interpretation of Monophonic Music. In Mental Processes: Studies in Cognitive Science, Cambridge, The MIT Press.
C. Manning and H. Schütze, 1999. Foundations of Statistical Natural Language Processing. Cambridge, The MIT Press.
M. Marcus, B. Santorini and M. Marcinkiewicz, 1993. Building a Large Annotated Corpus of English: the Penn Treebank. Computational Linguistics 19(2).
E. Narmour, 1990. The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. Chicago, The University of Chicago Press.
E. Narmour, 1992. The Analysis and Cognition of Melodic Complexity. Chicago, The University of Chicago Press.
W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C. Cambridge University Press.
P. Rolland and J. Ganascia, 2000. Musical Pattern Extraction and Similarity Assessment. In E. Miranda (ed.), Readings in Music and Artificial Intelligence, Harwood Academic Publishers.
H. Schaffrath, Repräsentation einstimmiger Melodien: computerunterstützte Analyse und Musikdatenbanken. In B. Enders and S. Hanheide (eds.), Neue Musiktechnologie, Mainz, B. Schott's Söhne.
H. Schaffrath, 1995. The Essen Folksong Collection in the Humdrum Kern Format. D. Huron (ed.). Menlo Park, CA: Center for Computer Assisted Research in the Humanities.
E. Selfridge-Field, The Essen Musical Data Package. Menlo Park, California: Center for Computer Assisted Research in the Humanities (CCARH).
S. Seneff, 1992. TINA: A Natural Language System for Spoken Language Applications. Computational Linguistics 18(1).
K. Sima'an, 2000. Tree-gram Parsing: Lexical Dependencies and Structural Relations. Proceedings ACL'2000, Hong Kong, China.
J. Tenney and L. Polansky, 1980. Temporal Gestalt Perception in Music. Journal of Music Theory, 24.
M. Wertheimer, 1923. Untersuchungen zur Lehre von der Gestalt. Psychologische Forschung 4.
C. Wetherell, 1980. Probabilistic Languages: A Review and Some Open Questions. Computing Surveys, 12(4).
M. van Zaanen, 2000. Bootstrapping Structure and Recursion Using Alignment-Based Learning. Proceedings International Conference on Machine Learning (ICML'2000), Stanford, California.
A QUER B EAMPLE MUSIC RETRIEVAL ALGORITHM H. HARB AND L. CHEN Maths-Info department, Ecole Centrale de Lyon. 36, av. Guy de Collongue, 69134, Ecully, France, EUROPE E-mail: {hadi.harb, liming.chen}@ec-lyon.fr
More informationToward an analysis of polyphonic music in the textual symbolic segmentation
Toward an analysis of polyphonic music in the textual symbolic segmentation MICHELE DELLA VENTURA Department of Technology Music Academy Studio Musica Via Terraglio, 81 TREVISO (TV) 31100 Italy dellaventura.michele@tin.it
More informationA Case Based Approach to the Generation of Musical Expression
A Case Based Approach to the Generation of Musical Expression Taizan Suzuki Takenobu Tokunaga Hozumi Tanaka Department of Computer Science Tokyo Institute of Technology 2-12-1, Oookayama, Meguro, Tokyo
More informationOn Interpreting Bach. Purpose. Assumptions. Results
Purpose On Interpreting Bach H. C. Longuet-Higgins M. J. Steedman To develop a formally precise model of the cognitive processes involved in the comprehension of classical melodies To devise a set of rules
More informationAUTOMATIC MELODIC REDUCTION USING A SUPERVISED PROBABILISTIC CONTEXT-FREE GRAMMAR
AUTOMATIC MELODIC REDUCTION USING A SUPERVISED PROBABILISTIC CONTEXT-FREE GRAMMAR Ryan Groves groves.ryan@gmail.com ABSTRACT This research explores a Natural Language Processing technique utilized for
More information