THE INTERSECTION OF COMPUTATIONAL ANALYSIS AND MUSIC MANUSCRIPTS: A NEW MODEL FOR BACH SOURCE STUDIES OF THE 21ST CENTURY

Masahiro Niitsuma, Tsutomu Fujinami, Yo Tomita
School of Music and Sonic Arts, Queen's University, Belfast
School of Knowledge Science, Japan Advanced Institute of Science and Technology (JAIST)
niizuma@nak.ics.keio.ac.jp, fuji@jaist.ac.jp, y.tomita@qub.ac.uk

ABSTRACT
This paper addresses the intersection of computational analysis and musicological source studies. In musicology, scholars often find that their methodologies are inadequate to achieve their goals. Their problems appear to be twofold: (1) the lack of scientific objectivity and (2) the over-reliance on new source discoveries. We propose three stages to resolve these problems and show a preliminary result. A successful outcome of this work could benefit not only musicology but also other subjects that depend on the study of historical sources.

1. INTRODUCTION
Recent developments in computer and information technology have brought significant changes to the ways in which we conduct research in a wide range of domains, and musicology is no exception. Yet in historical musicology the majority of scholars still conduct their research without making full use of these technological advances, which leaves considerable room for future progress. By nature, their research methods are less scientific: they tend not to disclose, or find it impossible to disclose, all the information they used in order to arrive at their conclusions. Hence it is often difficult to verify their findings, regardless of whether or not there are elements of subjective judgment in them. There is a separate problem in musicology in that the majority of source-based studies rely heavily on the discovery of new sources.¹ Thus, if a new source is not found, there is often little discussion to challenge the existing interpretation offered by scholars in the past.
Is there really no way of improving the theories unless a new source is discovered? How can a computer assist musicologists in analysing the information contained in the known sources?

The main objective of this study is to solve such problems in historical musicology by addressing the following questions:
1. Can computational analysis offer the same conclusions as those arrived at by historical musicologists?
2. Are there any oversights in the musicologists' analysis of the sources?

To achieve our objectives, it is necessary to address the following issues:
1. How to define a data structure for storing Bach's manuscripts in digital format;
2. How to extract information from the digitised manuscripts;
3. How to analyse the extracted information.

This paper is structured as follows: Section 2 describes the relationship between the proposed methods and existing scholarly debates in the field; Section 3 discusses the research methods to be employed; Section 4 shows a preliminary result of the proposed method; Section 5 illustrates the contribution that the proposed research will make; and Section 6 offers concluding remarks.

¹ Sources refer to manuscript sources, that is, scores written by hand. Before the invention of printing, music was preserved either by oral transmission or by manuscript (MS) copies.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2009 International Society for Music Information Retrieval.

2. PREVIOUS RESEARCH
There are numerous research projects dealing with computation in musicology, and different kinds of data formats have been proposed to encode musical data [1–3].
However, all of them deal with limited musical information such as pitch or rhythm derived from printed scores, and the majority of previous research on computational music analysis [4–9] is based on those data formats. There is also a certain amount of research on automatic music analysis that applies signal-processing techniques to acoustic sources [10–14], which record musical performances from published scores. But if we investigate only published scores, rather than the original manuscripts, we miss important information that has been lost in the process of creating an edition.

Poster Session 3

Recent journal articles and ISMIR proceedings [15–17] include a considerable amount of research on Optical Music Recognition (OMR). Most of it deals with staff removal algorithms, a preprocessing step that eases later tasks on the digitised images of the manuscripts, such as music symbol recognition. With regard to research on manuscript analysis, Tomita developed a database of variants and errors which aims to list all the extant manuscripts and early prints of the Well-Tempered Clavier II, a work well known for its complex history of compilation, revision and transmission [18]. The database contains all kinds of information extracted from manuscripts (not only musical variants but also notational errors and variants that may have been inherited from a source's model, or that may cause errors when fresh copies are made from it), giving us many insights into how a future database should be developed.

3. METHODOLOGY
There are three stages in this project:
1. To define a data structure for storing Bach's manuscripts in digital format;
2. To develop a methodology to automatically extract data from the digitised images of music manuscripts;
3. To develop a methodology to analyse these data to find significant information for musicological study.
In the first instance, a data structure that is suitable for analysis by computers needs to be defined. This data structure should be designed in such a way that it can encode all the information extracted from manuscripts: not only musical aspects such as pitch or rhythm, but also the physical aspects of the manuscript which may account for the scribe's unintentional omissions, misplacements and superfluous symbols that were somehow caused by the appearance of its exemplar. This has been investigated in collaboration with musicologists. Secondly, a method will be developed to harvest the information useful for research from the digitised images of the manuscripts.
At the moment, we consider primarily visible information such as the direction of stems or the position of note-heads. The first task is the recognition of each music symbol, such as staff lines, bar lines, note stems, note-heads and clefs. The Gamera framework [19] will be used for this task. Finally, a method to analyse the data will be proposed. To achieve this, ensemble machine learning methods such as bagging [20], boosting [21], and random forests [22] will be adopted. Figure 1 illustrates how the proposed method operates. First, a digitised image file is created by physically scanning the manuscripts. Secondly, symbolic data is extracted from the digitised image file. Thirdly, computational analysis is carried out using the symbolic data.

Figure 1. Flowchart of the proposed method: Start → physical manuscript data → scanning → digitised image file → data extraction → symbolic data → computational analysis → End.

4. PRELIMINARY EXPERIMENT
4.1 An overview of the preliminary experiment
This section presents a preliminary result of the third stage described in Section 3 (Methodology). Currently, the first and second stages are carried out manually, while a program has been developed for the third stage. To demonstrate the performance of the latter, the simplest example is to examine the origin and authenticity of variants. Because WTC II was so popular among Bach's pupils and admirers during and after his lifetime, numerous manuscripts were made, copied and edited, which not only increased the number of errors and variant readings, but also introduced contamination into the texts of some sources [23, 24]. The program produces a source affiliation diagram showing how closely these sources are related, taking into account the differences that may have been introduced either by accident or on purpose during copying.
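
The kind of computation behind such a diagram can be sketched roughly as follows: a weighted count of differing readings between two sources (formalised in Section 4.2 as Equation (1)), followed by complete-linkage agglomerative clustering. This is only an illustrative sketch, not the program used for the experiment. The source names `A` to `D`, the readings, the examination-point types and the helper names are all hypothetical, and the linkage is computed directly as the maximum pairwise distance rather than through the Lance-Williams update, which yields the same merges for complete linkage.

```python
from itertools import combinations

# Hypothetical toy data: source -> reading at each examination point (S/N).
READINGS = {
    "A": {1: "up", 2: "c#", 3: "tie"},
    "B": {1: "up", 2: "c#", 3: "tie"},
    "C": {1: "down", 2: "c", 3: "tie"},
    "D": {1: "down", 2: "c", 3: "none"},
}
# Hypothetical type of each examination point, and one weight per type.
POINT_TYPE = {1: "notational", 2: "musical", 3: "musical"}
EQUAL_WEIGHTS = {"notational": 1.0, "musical": 1.0}  # all weights equalised


def distance(ms1, ms2, weights=EQUAL_WEIGHTS):
    """Weighted count of examination points whose readings differ."""
    return sum(
        weights[POINT_TYPE[sn]]
        for sn in POINT_TYPE
        if READINGS[ms1][sn] != READINGS[ms2][sn]
    )


def complete_linkage(names):
    """Repeatedly merge the two closest clusters; return the merge log."""

    def linkage(pair):
        # Complete linkage: distance between the two farthest members.
        a, b = pair
        return max(distance(x, y) for x in a for y in b)

    clusters = [frozenset([n]) for n in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(a), sorted(b), linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


MERGES = complete_linkage(sorted(READINGS))
for left, right, height in MERGES:
    print(left, "+", right, "at height", height)
```

Passing a different `weights` mapping to `distance` (for example a lower weight for notational points) shows how the weighting discussed in Section 4.2 would change the distances; the merge log corresponds to the heights at which branches join in a dendrogram.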
In this paper, we focus on the sources of Viennese origin, which are considered to have originated from a copy that was brought from Berlin to Vienna in 1777 by Gottfried van Swieten (1734-1803). How the unique text of the Viennese sources evolved has been of principal interest to musicologists, for this was the state of the musical text that Mozart learned in 1782. In [25], Tomita investigated the Viennese sources and proposed a source affiliation diagram for them, an excerpt of which is shown in Figure 2.

Figure 2. Score affiliation diagram of the Well-tempered Clavier Book II, generated by human analysis (excerpted from [18]).

4.2 Preliminary result
We describe one approach to this task using the database developed by Tomita [24], an excerpt of which is shown in Figure 3, where S/N is the serial number given to each examination point; Bar indicates in which measure(s) the elements are examined; V, bt and pos stand for Voice, Beat and Position, respectively; Element specifies the target of enquiry; Spec. Loc gives a graphic representation of the information under examination; and Classified suggests its text-critical significance.

Figure 3. Database used for the experiment (excerpted from [18]).

Firstly, the distance between two manuscripts should be defined. The simplest way is to count the number of factors that differ between two manuscripts. In Figure 3, Q11731 has no factors that differ from those of one of the other sources shown, so the distance between the two is 0. On the other hand, Q11731 has three factors that differ from those of another source, so the distance between those two is 3. However, such a count does not reflect reality sufficiently. To improve its accuracy, we should consider how easily each factor can change. For instance, notational factors such as the direction of a stem or the position of a note-head are more likely to change than musical factors such as pitch or duration. Taking this into consideration, the genealogical distance is defined by the following equation:

$$D(MSS_1, MSS_2) = \sum_{i=1}^{SN} \alpha_{Type_i} \, I(MSS_1[i], MSS_2[i]) \qquad (1)$$

where $MSS_1$ and $MSS_2$ denote two different manuscripts, $MSS[i]$ denotes the $i$-th content of $MSS$, $\alpha_{Type_i}$ is the weight reflecting the fluidity of each type of content, and $I(x, y)$ is the indicator function, which returns 0 if $x = y$ and 1 otherwise. In this paper, all $\alpha_{Type_i}$ were set equal, leaving the adjustment of $\alpha_{Type_i}$ as a future task.

Secondly, manuscripts are clustered by a hierarchical cluster analysis using a set of dissimilarities calculated on the basis of Equation (1). Initially, each manuscript is assigned to its own cluster, and the algorithm then proceeds iteratively, at each stage joining the two most similar clusters, until there is just a single cluster. At each stage, distances between clusters are recomputed by the Lance-Williams dissimilarity update formula according to the complete linkage method.

Figure 4. Score affiliation diagram of Fugue No.22 in B-flat minor from the Well-tempered Clavier Book II, generated by computational analysis (cluster dendrogram, complete linkage; vertical axis: Height).

Figure 4 illustrates an example of a source affiliation diagram automatically generated by the proposed algorithm. Manuscripts of Fugues 10, 12 and 14 were used to calculate the distance between each pair of manuscripts. This result is almost consistent with that of human analysis, although the position of the Berea source is considered to be different. This result indicates that the database is sufficient to achieve a rough classification; but to achieve a more reliable classification, or for further analysis, it is necessary to develop a new data structure that is suitable for a more detailed computational analysis. The manual weighting of $\alpha_{Type_i}$ can reflect the expert knowledge of musicologists; however, it could also reflect their own subjectivity. To exclude it, a method for automatic weighting of these factors should be investigated.

There are numerous possibilities for using these databases for analysis, and the potential is far-reaching. Figure 5 shows a biplot of the result of the principal component analysis. This reveals that there exists a large gap between Add.35021 (Bach's autograph manuscript) and the Viennese sources. Figure 6 shows the result of the variable importance estimation for the classification of the manuscripts of Fugue 23 by random forest, where the y-axis corresponds to the S/N of the text-critical database. This indicates that S/N 475 and 136 are important for the computer to classify them. These analyses using appropriate databases are expected to bring objectivity and new findings to historical musicology.

Figure 5. Biplot produced from the output of the principal component analysis of the text-critical database of Fugue 23 (axes: PC1, PC2).

Figure 6. Result of variable importance estimation for the classification of Viennese sources by a random forest (panels: MeanDecreaseAccuracy, MeanDecreaseGini), where the y-axis corresponds to the S/N of Fugue 23 shown in [18]: for example, V475 is a notational difference of a rest in bar 89; V136 is the existence of an accidental in bar 32.

Another area of investigation is automatic handwriting analysis. The method for identifying handwriting in noisy document images [26] cannot be applied directly to music manuscripts. This is because handwriting identification needs not only visual information such as curvature (which represents the shape of the curves or the bending angle) but also multifaceted information such as the purpose for which a manuscript was written, the scribe's habits, the conditions under which the manuscript was made, and so on.
The proposed method is expected to overcome such difficulties by taking this multifaceted information into account with an appropriate database for computational analysis.

5. CONTRIBUTION
This research makes its main contributions in the following areas:
1. The proposed method will provide a way to verify previous research in historical musicology;
2. It will be possible to obtain new information from already known sources;
3. The proposed method can serve as a prototype of an empirical research method.
The results of the proposed research have good potential to become a road map for the musicological research of the future, and an empirical research method would offer an alternative to previous research methods, which are often criticised for their inherent subjectivism. Consequently, it is hoped that much previous research may be re-examined using the proposed methods. In this process, new discoveries can still be made that shed new light on the musical works concerned without requiring the discovery of new sources. Moreover, the results of the proposed research may also serve as a prototype in other areas of research, such as archaeology, historical literature or other social science subjects that involve the study of historical sources.

6. CONCLUSION
In this paper, we have shown the necessity of a computational approach in source studies. We also addressed the problems of subjectivity and over-reliance on new source discoveries in traditional research methods in musicology. Three stages that may resolve these problems have been discussed. The outcome of this work should affect not only musicology but also a wide range of subjects.

7. REFERENCES
[1] Content-based unified interfaces and descriptors for audio/music databases available. http://www.cuidado.mu.
[2] Online music recognition and searching. http://www.elec.qmul.ac.uk/research/projects/ nsf 9905842 omras.html.
[3] Web delivering of music. http://www.wedelmusic.org.
[4] Greg Aloupis, Thomas Fevens, Stefan Langerman, Tomomi Matsui, Antonio Mesa, Yurai Nuñez, David Rappaport, and Godfried Toussaint. Algorithms for computing geometric measures of melodic similarity. Computer Music Journal, 30(3):67–76, 2006.
[5] David Huron. Music information processing using the Humdrum toolkit: Concepts, examples and lessons. Computer Music Journal, 26(2):11–26, 2002.
[6] Steven Jan. Meme hunting with the Humdrum toolkit: Principles, problems and prospects. Computer Music Journal, 28(4):68–84, 2004.
[7] David Meredith. The ps13 pitch spelling algorithm. Journal of New Music Research, 35(2):121–159, 2006.
[8] Wai Man Szeto and Man Hon Wong. A graph-theoretical approach for pattern matching in post-tonal music analysis. Journal of New Music Research, 35(4):304–321, 2006.
[9] Heinrich Taube. Automatic tonal analysis: Toward the implementation of a music theory workbench. Computer Music Journal, 23(4):16–32, 1999.
[10] Kyogu Lee and Malcolm Slaney. Acoustic chord transcription and key extraction from audio using key-dependent HMMs trained on synthesized audio. IEEE Transactions on Audio, Speech, and Language Processing, 16(2):291–301, 2008.
[11] Olivier Lartillot. A musical pattern discovery system founded on a modeling of listening strategies. Computer Music Journal, 28(3):56–67, 2004.
[12] Pierre Leveau, Emmanuel Vincent, and Gaël Richard. Instrument-specific harmonic atoms for mid-level music representation. IEEE Transactions on Audio, Speech, and Language Processing, 16(1):116–128, 2008.
[13] Rui Pedro Paiva, Teresa Mendes, and Amílcar Cardoso. Melody detection in polyphonic musical signals: Exploiting perceptual rules, note salience, and melodic smoothness. Computer Music Journal, 30(4):80–89, 2006.
[14] Yipeng Li and DeLiang Wang. Musical sound separation using pitch-based labeling and binary time-frequency masking. Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pages 173–176, 2008.
[15] C. Dalitz, B. Czerwinski, M. Droettboom, and I. Fujinaga. A comparative study of staff removal algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(5):753–66, 2008.
[16] L. Pugin, J. Hockman, J. A. Burgoyne, and I. Fujinaga. Gamera versus Aruspix: Two optical music recognition approaches. Proceedings of the International Conference on Music Information Retrieval, pages 419–424, 2008.
[17] John Ashley Burgoyne, Johanna Devaney, Laurent Pugin, and Ichiro Fujinaga. Enhanced bleedthrough correction for early music documents with recto-verso registration. Proceedings of the International Conference on Music Information Retrieval, pages 407–412, 2008.
[18] Yo Tomita. J. S. Bach's Das Wohltemperierte Clavier II: A Critical Commentary. Volume II: All the Extant Manuscripts. Leeds: Household World, 1995.
[19] Michael Droettboom, Karl MacMillan, and Ichiro Fujinaga. The Gamera framework for building custom recognition systems. Proceedings of the Symposium on Document Image Understanding Technologies, pages 275–86, 2003.
[20] L. Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
[21] Y. Freund and R. E. Schapire. Experiments with a new boosting algorithm. Proceedings of the Thirteenth International Conference on Machine Learning, pages 148–156, 1996.
[22] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[23] Yo Tomita. Breaking the limits: some consideration on an e-science approach to source studies. Musicology and Globalization. Proceedings of the International Congress of the Musicological Society of Japan, pages 233–237, 2002.
[24] Yo Tomita and Tsutomu Fujinami. Managing a large text-critical database of Johann Sebastian Bach's Well-Tempered Clavier II with XML and relational database. Proceedings of the International Musicological Conference, 2002.
[25] Yo Tomita. The Sources of J.S. Bach's Well-Tempered Clavier II in Vienna, 1777-1801. BACH: Journal of the Riemenschneider Bach Institute, 29(2):8–79, 1998.
[26] Yefeng Zheng, Huiping Li, and David Doermann. Machine printed text and handwriting identification in noisy document images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(3):337–353, 2004.