A Critical Examination of David Epstein's Phase Synchrony Theory of Rubato. Bruno H. Repp, Haskins Laboratories

First complete draft, April 16, 1996

Repp: Epstein critique 2

ABSTRACT

Epstein's phase synchrony theory of rubato, presented in Chapter 11 of his recent book, Shaping Time (1995), postulates two independent timing systems: a rigid ground beat and a flexible pulse that characterizes rubato. According to the theory, rubato is controlled by resolving the dissynchrony between the two timing systems at strategic points, namely phrase boundaries, at which phase synchrony is (re-)established. Epstein reports empirical data suggesting that outstanding pianists' performances exhibit phase synchrony to a large degree. The present article presents a critique of Epstein's theory and methodology, as well as an attempt to replicate his measurements of two performances. The theory is shown to rest on untenable or highly implausible assumptions, the methodology is found to be flawed, and the phase fits are revealed to conform to chance expectations, besides being difficult to replicate due to measurement error.

INTRODUCTION

The present article is concerned with a theory of expressive timing (or rubato) proposed by David Epstein (1995) in his recent book, Shaping Time (ST: Chapter 11), and with the empirical data presented by him in support of the theory. The theory was developed in the context of a broader framework, dealing with the control of tempo and timing in music and with the underlying psychobiological mechanisms, in particular oscillatory processes in the brain. One fundamental distinction drawn by Epstein is that between beat and pulse, more generally between meter and rhythm, and even more generally between clock time and experienced time. Metrical beats are assumed to be strictly regular and periodic, whereas rhythmic pulses are more variable and may expand and contract, precede or lag behind beats. A misalignment between these two levels of temporal organization creates tension which needs to be resolved from time to time, usually at structural boundaries such as the ends of phrases. The resolution consists of a temporary re-establishment of synchrony. This, in essence, is Epstein's phase synchrony theory of rubato, particularly of Romantic rubato in which there is often no regular beat at the acoustic/auditory surface.[1]

Epstein makes two crucial assumptions here: First, he assumes that, despite the absence of metrical regularity at the musical surface, a beat nevertheless ticks away within our system in stable metric fashion (ST: p. 549), "unheard... but still felt" (ST: p. 29). He calls this the ground beat. Second, he assumes that, even though the ground beat is covert, it nevertheless manifests itself briefly, namely at or near the beginning of a phrase. Therefore, the duration of the ground beat is assumed to be measurable, as is the total duration of a phrase.[2] From these assumptions follows the crucial prediction of "integral fit": If there is phase synchrony at the beginning and end of a phrase, then there must be an integral number of ground beats within a phrase.

Three important qualifications are added, however, which bear on the empirical testability of the hypothesis. First, the number of ground beats in a phrase need not be equal to the number of notated beats: Epstein admits any number of ground beats as evidence for phase synchrony. Second, the integral fit need not be exact: Epstein is willing to accept deviations of up to 0.1 from any integral number as supporting his hypothesis. Third, Epstein suggests that phase synchrony may characterize only good, perhaps only outstanding, performances.

In support of his hypothesis, Epstein presents detailed analyses of expressive timing in performances of three compositions (Chopin's Mazurka in A minor, op. 17, No. 4, and Waltz in A-flat major, op. 69, No. 1, and Brahms's Intermezzo, op. 76, No. 4), each by a distinguished pianist (Guiomar Novaes, Dinu Lipatti, and Walter Gieseking, respectively). In each case, and for nearly every phrase, he was able to identify and measure a ground beat at or near the phrase beginning that yielded an integral fit to the phrase duration within the limits of acceptability adopted by him. These results seem to provide strong support for the phase synchrony theory.

The present critique will proceed in three stages. First, the assumptions underlying Epstein's theory will be examined more closely. Second, his empirical methodology will be subjected to similar scrutiny. Third, his data will be compared to the results of reanalyses of two of the performances. A final discussion will summarize the results of these investigations.[3]

CRITIQUE OF EPSTEIN'S THEORY

It must be pointed out, first of all, that the phase synchrony theory is not a complete theory of rubato. In fact, it concerns only one aspect of rubato, namely the timing of the beginnings and ends of phrases. The theory has nothing to say about the expressive timing within a phrase, where most of the action is. Epstein may be assuming implicitly that the goal of achieving phase synchrony at the end of a phrase constrains the rubato within a phrase in some way, but these constraints are not spelled out or tested in ST.[4] In principle, a performer could achieve the desired phase fit (if there is such a goal) by making a small local timing adjustment at the end of a phrase, during the final ritard. Clearly, Epstein is concerned more with the achievement of phase synchrony than with the control of phase dissynchrony (i.e., rubato).

The basic idea underlying the theory (a conflict between two timing systems, one mechanically precise, the other expressively modulated) is intriguing. However, three questions arise in connection with the hypothetical covert ground beat: (1) How is it initiated? (2) How is it maintained without any external support? (3) How is it terminated? None of these fundamental issues is discussed in ST.

The answer to the first question may be that the ground beat is initiated by an overt action of the musician. This would simultaneously rationalize why the ground beat must surface at the beginning of a phrase. This is not what Epstein says, however. On the contrary, he briefly considers the possibility of a totally hidden ground beat that is suggested only, played with, played against, and played around (ST: p. 388).[5] This leaves unanswered the questions of how the ground beat is initiated and why a hidden ground beat should ever be manifest at the surface.

The question about the ground beat's maintenance raises more serious problems. In general, an oscillatory process needs to be entrained and sustained by some external input (Large & Kolen, 1994; Large & Jones, submitted). Without such input, the process will degrade and soon cease to operate.
Since the hypothetical ground beat is not confirmed until the very end of a phrase and, until that point, is in conflict with the performer's overt actions and their acoustic consequences, it is not clear what enables it to persist during the phrase. Epstein seems to assume a totally autonomous process that is unaffected by external events and is maintained without any loss of accuracy. Such processes do occur at various levels of physiological activity (heart beat, brain waves, etc.), but they rest on specialized mechanisms, and their periodicity is not under voluntary control. Epstein's ground beat is determined by the performer, yet it is assumed to have the characteristics of an autonomous process, which seems paradoxical.

If the ground beat persists easily through a whole phrase, then it should also continue through the next phrase (perhaps indefinitely), especially after having been confirmed by the phase synchrony at the phrase boundary. However, Epstein argues on the basis of his data that the ground beat changes from phrase to phrase. Leaving aside for the moment the question of why there should be such a change, it must be asked: What happens to the old ground beat? And how is a new ground beat initiated when the old ground beat is still active? Epstein seems to assume that ground beats can not only be initiated but also terminated instantaneously and at will. These assumptions, together with that of the autonomous maintenance of the ground beat, are oblivious to the time constants and memory-like properties of dynamic oscillatory systems (see Large & Jones, submitted) and therefore are unrealistic. Epstein's ground beat seems to be a purely mental construct without any physical or physiological embodiment, much like the concept of meter in music theory.

Finally, Epstein's admittance of an arbitrary number of ground beats per phrase is counterintuitive, to say the least. It is not easy to understand what it means for, say, 19 ground beats to occur in a phrase containing 24 nominal beats. Surely, in the hierarchical metrical scheme of that phrase (which may be part of musicians' and listeners' cognitive representations), the phrase boundary occurs after 24 beats, not 19. Even though Epstein discusses metrical schemes in considerable detail elsewhere in ST, it seems that the hypothetical ground beat is divorced from nominal meter.
It is particularly strange for Epstein to permit (as he does by virtue of his empirical findings) the number of ground beats to be smaller than the number of nominal beats, in view of his statement that distorted rhythmic timings, caused by rubato, elongate the duration of the performed phrase (ST: p. 375).

Even if Epstein's assumptions are granted, the theory has psychological implications that seem implausible. If there is a covert ground beat, it should be possible to confirm its presence during a phrase by a variety of methods. The most direct method would be to ask a musician to count along with the ground beat or tap it out with the foot while playing rubato. Yet, these activities are likely to be difficult to carry out simultaneously: Regular counting or tapping is likely to interfere with expressive timing, and/or the rubato will make the performer count or tap irregularly with the overt pulse. This may be so because the whole body is engaged in the rhythmic activity of performance. In that case, more indirect methods may be used: The performer could be asked to synchronize a metronome with the ground beat, or to judge whether or not a running metronome matches the ground beat, or to judge whether a single click during a phrase coincides with a ground beat. The same tasks, as well as overt counting or tapping, could be carried out by a listener who is not engaged in the performance. Some of these experiments should perhaps be done, but it seems unlikely that they will provide evidence for a hidden ground beat.

Furthermore, a highly competent listener (such as Epstein himself) should be able to judge whether or not phase synchrony has been achieved at the end of a phrase. Epstein clearly implies this when he says: "This return into phase is one of the high points of rubato playing... Failure to achieve this synchrony could be a source of irritation for the sensitive listener" (ST: p. 375). Yet it seems implausible that such a judgment can actually be made.
As will become clear soon, the hypothetical phase synchrony can be destroyed by minute timing changes inside a phrase that are below the discrimination threshold and presumably leave the quality of the rubato unchanged. Epstein may have exercised keen judgment about the quality of the rubato in the three performances he selected for analysis, but it is doubtful that this judgment was based on phase synchrony as such. It would be easy to demonstrate that any of these performances is indistinguishable from a slightly altered version in which all phase fits have been destroyed by modifying (editing on a computer) one or two interonset intervals in each phrase. By a similar procedure, any execrable performance by a beginning piano student could easily be manipulated to exhibit perfect phase fits, without changing its quality at all. These armchair experiments suggest that phase synchrony is neither necessary nor sufficient for good rubato.

CRITIQUE OF EPSTEIN'S METHODOLOGY

While the foregoing criticisms seem serious and potentially fatal to Epstein's theory, they are open to debate and empirical test. In this section I turn to an examination of the methods Epstein used to collect data in support of his theory. It will be seen that these methods are just as problematic as the theory itself.

The first issue to be discussed concerns measurement accuracy. Epstein's basic data are the intervals between tone onsets in acoustically recorded piano performances. Although digital waveform editors have been available for more than two decades now, Epstein employed instead an antiquated method of analog tape measurement (see ST: p. 160). This involved locating attack points by slowly pulling the tape back and forth over the playback head and making pencil marks on the tape. Later, the distances between the pencil marks were measured with a millimeter ruler and converted into milliseconds. On the basis of the ruler's resolution, Epstein estimated the measurements to be accurate within 5 ms. However, his accuracy of locating attack points is not known, as he did not conduct multiple independent measurements of the same performance. Apart from the (probably small) error involved in marking the tape, larger errors can arise in selecting and locating attack points.
The onsets of low-intensity tones are often very difficult to hear accurately, and this is especially true for low-pitched piano tones, which also have relatively slow amplitude rise times (Repp, 1995a). Also, when several tone onsets occur simultaneously, they are often not in exact synchrony, and a choice must be made: Should the very first onset be measured, or that in a particular hand or voice (soprano melody or bass)? During rubato, some performers introduce rather large asynchronies between the hands, so that differences of tens or even hundreds of milliseconds may arise from changing the measurement criterion. These issues are not discussed by Epstein in ST, and so it is not clear what he actually measured.

Now let us consider the criterion for deciding that an empirically obtained ratio of phrase duration to ground beat duration is integral. Epstein's criterion is ±0.1; for example, any ratio between 18.9 and 19.1 is considered as acceptable evidence for 19 ground beats in a phrase.[6] The criterion is admittedly artificial, but Epstein considers it a "tough standard" (ST: p. 377) in comparison to the psychophysical discrimination threshold for temporal intervals as long as a phrase (i.e., many seconds). This seems the wrong reference interval, however. The appropriate reference interval is the hypothetical ground beat, which Epstein assumes to be maintained during the phrase without loss of accuracy, and which in that case should have a discrimination threshold of about 5%. Actually, however, the issue is not interval discrimination but phase synchrony, and the discrimination threshold for this judgment (effectively a successiveness threshold) may be even smaller. Of course, a performer or listener may decide to find even perceptible asynchronies tolerable. Therefore, the choice of a criterion remains an arbitrary matter.[7]

In fact, the choice of a criterion is far less important than a consideration of the chance probability of obtaining evidence in favor of phase synchrony for any arbitrary criterion, a topic not addressed by Epstein.
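The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `integral_fit` and the example durations are hypothetical and are not taken from ST or from the present measurements.

```python
def integral_fit(phrase_s, beat_s, c=0.1):
    """Check a candidate ground beat against a phrase duration.

    Returns the ratio of phrase duration to beat duration, the nearest
    integer, and whether the ratio lies within +/- c of that integer.
    """
    ratio = phrase_s / beat_s
    nearest = round(ratio)
    return ratio, nearest, abs(ratio - nearest) <= c


# Hypothetical durations in seconds, chosen only for illustration.
print(integral_fit(13.3, 0.7))  # ratio close to 19: accepted
print(integral_fit(13.5, 0.7))  # ratio close to 19.29: rejected
```

Note that the rule accepts any integer, which is why the number of candidate units matters so much for the chance probability discussed next.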
Given that the theory permits any number of ground beats in a phrase, and given any acceptability criterion $c$, so that ratios between $(n - c)$ and $(n + c)$ are accepted as evidence for the integral ratio $n$, the probability that any randomly chosen rational number will yield an integral fit is $2c$. Thus, for a criterion of 0.1 the probability is 0.2, as can easily be proven. The probability that no fit will be obtained is $(1 - 2c)$. Now suppose two numbers are chosen at random. The probability that neither of them will yield a fit is $(1 - 2c)^2$. Therefore, the probability that at least one of them will yield a fit is $[1 - (1 - 2c)^2]$. More generally, then, the probability that at least one of $n$ possible ground beats will fit a phrase integrally is $[1 - (1 - 2c)^n]$. This chance probability grows very rapidly as $n$ increases. For example, for $c = 0.1$ and $n = 2$, it is 0.36; for $n = 4$, it is 0.59; for $n = 6$, it is 0.74. This needs to be taken into account when considering several possible ground beat candidates. The quantity $c$ need not be fixed but can be varied continuously or in steps to derive a probability distribution of phase fits against which obtained data can be compared.

In his performance analyses, Epstein does consider multiple ground beat candidates. For example, for the 15 phrases of the Chopin Mazurka in A minor, op. 17, No. 4, he accepts six different units as "ground beats":[8] the initial beat, the initial measure, the second measure, the first two measures, the third and fourth measures, and the first motive. If these six different candidate units were considered in every phrase, 74% of all phrases (or 11 out of 15) should have yielded a phase fit within 0.1 for at least one of the six units by chance alone. Epstein found phase fits in 14 phrases. This is better than chance expectations but of marginal statistical significance. (The probability of finding at least 14 phase fits is $15(0.26)(0.74)^{14} + (0.74)^{15} = 0.07$.)
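The chance probabilities quoted above can be checked numerically. The sketch below assumes, as the argument in the text does, that candidate units are independent and that each has probability $2c$ of fitting; the function names are illustrative.

```python
from math import comb


def chance_of_fit(c, n):
    """Probability that at least one of n candidate units fits within +/- c."""
    return 1 - (1 - 2 * c) ** n


def prob_at_least(k, trials, p):
    """Binomial upper tail: P(X >= k) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(k, trials + 1)
    )


for n in (2, 4, 6):
    print(n, round(chance_of_fit(0.1, n), 2))  # 0.36, 0.59, 0.74

# Per-phrase chance probability, rounded to 0.74 as in the text.
p = round(chance_of_fit(0.1, 6), 2)
print(round(prob_at_least(14, 15, p), 2))  # 0.07
```

The tail sum has only two terms here (14 and 15 fits), which matches the expression given in the text.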
There is nothing in ST to prevent the reader from inferring that Epstein's strategy was to search for a temporal interval near the beginning of each phrase that would yield an integral phase fit, starting with the most plausible candidates (initial beat, initial measure) and proceeding to less likely ones, and then to provide some post-hoc rationalization of the more unusual ground beats found. Epstein (personal communication, January 1996) has assured the author that this was not what he did. Rather, he considered only musically plausible and perceptually strong candidates, which were more restricted in the earlier than in the later phrases. Thus, the chance probability may have been lower than stated here, but its exact magnitude remains uncertain in the absence of explicit a priori hypotheses about the size and location of the ground beat in each phrase. The point is that Epstein did not say what the ground beat candidates were before he obtained his measurements.

There are additional problems with Epstein's liberal choice of ground beat candidates. First, he ignores the phenomenon of phrase-initial lengthening which is often observed in performance (Todd, 1985, 1995). It is the likely reason for his finding (see below) that the number of ground beats is often smaller than the number of nominal beats, especially when the ground beat is the initial beat or bar. Second, it seems paradoxical to search for an integral fit of beats or bars other than the first with the total phrase duration, because in these cases phase synchrony apparently did not exist at the very beginning of the phrase. Third, the choice of so many different ground beats that change from phrase to phrase is psychologically and musically implausible; they hardly could have been predicted by a theory of rubato.

In summary, Epstein's methodology seems highly problematic. Still, despite the uncertainty about chance probabilities, his data seem to reveal interesting numerical relationships. The following section focuses on the accuracy and replicability of the actual data.

CRITIQUE OF EPSTEIN'S DATA

The first and most important requirement of any set of data is that they be accurate and replicable. Epstein's ear-and-hand tape measurements, despite all good intentions, may have been less accurate than he believed. Such measurements unavoidably contain some human error, and this error is likely to be larger than that in acoustic waveform measurements, where the eye comes to the aid of ear and hand. The investigator's experience in conducting such measurements also plays a role.

Small errors in inter-onset interval measurements can change the supposed phase synchrony substantially, especially when the ground beat is short (i.e., at the beat level). Suppose the true ground beat is 0.7 s long and fits exactly 20 times into a phrase of 14 s duration. If the limits of acceptability are ±0.1, so that empirical phase fits between 19.9 and 20.1 are acceptable, then the acceptable ground beat durations range from 14/20.1 = 0.6965 s to 14/19.9 = 0.7035 s. Thus the allowable measurement error is only ±3.5 ms. This is almost certainly smaller than the actual measurement error in Epstein's data, so that even he himself might not be able to replicate his phase fits in an independent re-measurement.

In an attempt to replicate Epstein's findings, two of the three performances analyzed by him were re-measured using a digital waveform editor. They were Chopin's Mazurka in A minor, op. 17, No. 4, played by Guiomar Novaes (Vox PL 7920 [1961]), and Chopin's Waltz in A-flat major, op. 69, No. 1, played by Dinu Lipatti (Angel 35439 [1950]).[9]

Method

The performances were copied from the long-playing records onto digital audio tape and then were input at a sampling rate of 22.255 kHz to a Macintosh Quadra 660AV computer. SoundEdit 16 software was used to display the digitized waveform, play it back, and label event onsets. The onset times were stored in a text file which then was imported into a spreadsheet program for calculation of various interonset intervals. The relevant tone onsets were labeled by positioning a cursor and typing in a numeric label. The waveform could be displayed at various levels of resolution, together with a spectrogram, if necessary. Tone onsets were located simultaneously by eye and by ear.
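The measurement-tolerance example given earlier in this section (a 0.7 s ground beat fitting 20 times into a 14 s phrase) can be reproduced with a short calculation. The function name `acceptable_beat_range` is an illustrative choice; the numbers are those of the worked example.

```python
def acceptable_beat_range(phrase_s, n_beats, c=0.1):
    """Beat durations (in s) whose ratio to the phrase stays within n_beats +/- c.

    Also returns the allowable error below and above the nominal beat, in ms.
    """
    shortest = phrase_s / (n_beats + c)
    longest = phrase_s / (n_beats - c)
    nominal = phrase_s / n_beats
    return shortest, longest, (nominal - shortest) * 1000, (longest - nominal) * 1000


lo, hi, minus_ms, plus_ms = acceptable_beat_range(14.0, 20)
print(f"{lo:.4f} s to {hi:.4f} s (-{minus_ms:.1f} ms / +{plus_ms:.1f} ms)")
# 0.6965 s to 0.7035 s (-3.5 ms / +3.5 ms)
```

Because the tolerance scales with the beat duration divided by the number of beats, longer phrases or shorter ground beats make the acceptable window even narrower.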
The usual strategy was to select a waveform segment extending from the last labeled onset to beyond the next targeted onset, play it back to verify that the next onset was clearly audible, and then to repeat this sequence while moving the right edge of the segment to the left in small steps until no trace of the targeted sound could be heard. Sometimes it was necessary to repeat the procedure several times to arrive at a decision. There were a number of places in which the right and left hands were not in synchrony. Since Epstein does not discuss what he did in these cases, it was decided to measure both right-hand (melody) and left-hand (accompaniment) onsets whenever they could be distinguished.

The measurements were conducted independently on separate copies of the sound file by the author (BR) and his research assistant (LR), a musicologist who had some previous experience in conducting waveform measurements, though less than the author. BR used a level of resolution of about 50 ms per inch, which made it possible to position the cursor within about 1 ms of the target point. LR chose to use a coarser resolution, with a positioning accuracy of about 3 ms, which is more comparable to Epstein's accuracy. Comparison of BR's and LR's measurements also revealed a difference in criterion: While BR marked the earliest audible and/or visible evidence of tone onsets, LR generally listened for a definite pitch and thus tended to mark onsets somewhat later.

Results

Chopin Mazurka. Before conducting a detailed comparison with Epstein's data, it was necessary to correct for an apparent difference in playback speed: Epstein reports a duration of 228.483 s for the whole piece (not including the four introductory bars), whereas the present measurements showed 230.560 s. To make absolute durations comparable with Epstein's published data, they were multiplied by a correction factor of 0.991. (Ratios, of course, were unaffected by the correction.)
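The playback-speed correction follows directly from the two whole-piece durations. In the sketch below the two totals are taken from the text, while the phrase and beat durations used to show that ratios are unaffected are hypothetical.

```python
from math import isclose

EPSTEIN_TOTAL_S = 228.483  # whole-piece duration reported by Epstein
PRESENT_TOTAL_S = 230.560  # whole-piece duration in the present measurements

correction = EPSTEIN_TOTAL_S / PRESENT_TOTAL_S
print(round(correction, 3))  # 0.991


def to_epstein_scale(duration_s, factor=correction):
    """Rescale a measured duration to Epstein's apparent playback speed."""
    return duration_s * factor


# Hypothetical phrase and ground beat durations (s), for illustration only.
phrase_s, beat_s = 14.0, 0.7
same_ratio = isclose(
    to_epstein_scale(phrase_s) / to_epstein_scale(beat_s), phrase_s / beat_s
)
print(same_ratio)  # True: a common scale factor cancels in any ratio
```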

Table I compares the measured durations of the sixteen 8-bar phrases in the piece.[10] The first two columns show the phrase durations measured by BR and LR, based on right-hand (melody) onset times. The third column shows the differences between these measurements. With two exceptions, they were within 26 ms. The two large differences (phrases 4 and 5) arose because LR, but not BR, marked a substantially delayed right-hand onset at the beginning of phrase 5. (LR's left-hand onset measurement was in close agreement with BR's single measurement.) The fourth column shows Epstein's phrase durations (ST: pp. 384-386), and the following two columns list their differences from the present measurements. There are a number of large discrepancies.

------------------------- Insert Table I here -------------------------

It is possible that Epstein did not consistently measure the distances between right-hand (r) onsets but sometimes considered left-hand (l) onsets or some point between the two in cases of asynchrony. If either the beginning or the end of a phrase shows an asynchrony, then there are two ways of measuring the phrase duration (r-r and l-r, or r-r and r-l); if both the beginning and the end are asynchronous, then there are four ways (r-r, r-l, l-r, and l-l). Therefore, phrase durations were determined in all possible ways from the measurements of both BR and LR and compared to Epstein's durations, in an effort to reduce the discrepancies. The results are indicated in the last column of Table I ("Min diff"). If there is no entry in that column, it indicates that the minimal discrepancy is shown in one of the two preceding columns. An entry that is not enclosed in parentheses indicates that the discrepancy could be reduced by considering left-hand onsets.
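The search over measurement criteria described above amounts to enumerating every combination of start and end onsets and keeping the one closest to the reference duration. The following is a rough sketch of that idea, not the procedure actually used for Table I; the function names and all onset times are hypothetical.

```python
from itertools import product


def phrase_durations(start_onsets, end_onsets):
    """All phrase durations obtainable from the onsets at each boundary.

    Each argument maps a hand label ('r' or 'l') to an onset time in seconds.
    A boundary without asynchrony has a single entry.
    """
    return {
        f"{s}-{e}": end_onsets[e] - start_onsets[s]
        for s, e in product(start_onsets, end_onsets)
    }


def min_discrepancy(durations, reference_s):
    """Label and signed difference of the duration closest to the reference."""
    label = min(durations, key=lambda k: abs(durations[k] - reference_s))
    return label, durations[label] - reference_s


# Hypothetical onset times (s); both boundaries are asynchronous here.
start = {"r": 10.000, "l": 9.950}
end = {"r": 25.000, "l": 24.900}
durations = phrase_durations(start, end)
print(sorted(durations))  # ['l-l', 'l-r', 'r-l', 'r-r']
print(min_discrepancy(durations, reference_s=14.960)[0])  # l-l
```

With only one asynchronous boundary the same code yields two candidate durations, and with none it yields one, matching the counts given in the text.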
Thus, the two rather large differences for phrases 12 and 13 could be eliminated by assuming that Epstein took the onset of phrase 13 to be in the left hand, so that phrase 12 ranged from a right-hand to a left-hand onset (r-l) and phrase 13 ranged from a left-hand to a right-hand onset (l-r). Similarly, the discrepancies for phrases 6 and 7 could be reduced in this way, but that for phrase 6 nevertheless remained very large. The entries in parentheses indicate that a discrepancy could be reduced, but only at the expense of increasing that for an adjacent phrase.¹¹ Thus, even when allowance is made for a variable criterion in dealing with asynchronies, some large differences remain between Epstein's measurements and the present data.

It is possible that these differences are due to inaccuracies in BR's and LR's measurements. Therefore, the measurements were repeated about two months later on new, unmarked copies of the audio file. With the exception of phrases 4 and 5, BR replicated his durations within 1 ms, and LR (who was less experienced in waveform measurements and used a coarser resolution) within 11 ms. For phrases 4 and 5 there were larger discrepancies, due to the difficulty of determining the right-hand onset of phrase 5. (The left-hand onset was closely replicated.) The remeasured durations of phrase 4 were 17.146 s (BR) and 17.122 s (LR), while those of phrase 5 were 13.605 s (BR) and 13.629 s (LR). It can be seen that the previous disagreement between BR and LR was much reduced, but the discrepancy with Epstein's measurements was not. Therefore, it can now be stated with some confidence that the differences between Epstein's durations (whose reliability is unknown) and the present measurements are not due to inaccuracies in the latter.

Table II makes the same comparisons for the durations of the initial bars in the first 15 phrases. Even though they are rather long in duration, Epstein considered these units as plausible candidates for a ground beat. Again, there are considerable differences with Epstein's measurements, few of which can be resolved by considering left-hand onsets.¹² There are enormous discrepancies in phrases 2 and 3, where it seems that Epstein omitted a whole beat from the first bar. Other differences are within the range of variation observed in Table I. Those in phrases 4 and 6 may indicate that Epstein measured only up to the onset of the grace note that precedes the downbeat of the second bar. If so, this indicates an inconsistency in measurement criteria, for the same grace note occurs in phrases 1 and 12, where the differences were in the opposite direction. (BR and LR never marked the grace note onsets.)

--------------------------
Insert Table II here
--------------------------

The re-measurement by BR and LR of the first-bar durations revealed discrepancies with their first set of measurements in some phrases, up to 34 ms for BR and up to 43 ms for LR. Note onsets initiating the second bar were generally more difficult to determine than those initiating the first bar. However, the second measurements of BR and LR were generally in better agreement than their first measurements, whereas the differences with Epstein's durations were reduced in only a few phrases, such as Nos. 1, 5, and 11, where they were not very large to begin with. The larger discrepancies cannot be explained by measurement errors in this laboratory. However, the re-measurement demonstrates that a certain amount of measurement error was inevitable in the first-bar durations.

Table III presents the comparisons for the initial beat durations in each phrase, which are Epstein's prime candidates for a ground beat, so that accuracy is of special importance. However, the differences are as large as in the previous tables. They are particularly large in phrases 2, 3, and 4, where BR and LR happened to be in close agreement. The differences can be reduced by assuming left-hand phrase onsets, at the cost of increasing some phrase duration discrepancies. Most of the other differences are within 30 ms. BR's and LR's second measurements revealed a measurement error of about the same magnitude.

---------------------------
Insert Table III here
---------------------------

However, differences in beat durations much smaller than 30 ms can wreak havoc with phase synchrony ratios. For example, in phrase 1 Epstein found that the first beat (709 ms) fit 19.06 times into the phrase duration (13.512 s), which was within his 0.1 acceptability limit. In the present data for right-hand onsets (Tables I and III), the first beat (BR: 0.694 s; LR: 0.698 s) fits 19.32 or 19.21 times into the phrase duration (BR: 13.406 s; LR: 13.409 s); neither value is within the acceptability limit. Note also that the 4-ms difference in measured beat durations between BR and LR causes a difference of 0.11 in the phase fit ratio!

It should be quite clear by now that any attempt to replicate Epstein's phase fit ratios is doomed to failure. The human measurement error is larger than the degree of accuracy required, particularly where beat durations are concerned. Some of the variability, however, is between investigators, and the possibility remains (however unlikely it may seem in view of his tape-marking method) that Epstein's measurements are more valid and reliable than those of BR and LR. In that case, his data should provide stronger evidence for integral phase fits than do BR's and LR's measurements. Therefore, phase fit ratios were calculated for all phrases, using the first beat and the first bar in each phrase as the most plausible ground beat candidates. Only the durations based on right-hand onset times (Tables I-III) were considered, because they are what the author would normally have chosen. The ratios are shown in Table IV. They are compared with Epstein's values, as copied from his Example 11.1d (ST: p. 386) or computed from his Example 11.1c (ST: p. 384).

Epstein found significant phase fits of the initial beat in four phrases. None of these is replicated in the present data. BR's measurements suggest first-beat phase fits in two phrases, LR's measurements in five.
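The phrase 1 example can be reproduced with a short sketch. It assumes that a phase fit means a ratio lying within 0.1 of an integer and treats that bound as inclusive, which the text does not specify; the function names are hypothetical.

```python
def phase_fit_ratio(phrase_s, beat_s):
    """Number of times the candidate ground beat fits into the phrase."""
    return phrase_s / beat_s


def is_phase_fit(ratio, limit=0.1):
    """True if the ratio lies within `limit` of the nearest integer (assumed inclusive)."""
    return abs(ratio - round(ratio)) <= limit


# Phrase 1 durations in seconds, as quoted in the text: (phrase, first beat).
cases = {
    "Epstein": (13.512, 0.709),
    "BR": (13.406, 0.694),
    "LR": (13.409, 0.698),
}

ratios = {name: phase_fit_ratio(p, b) for name, (p, b) in cases.items()}
for name, ratio in ratios.items():
    print(name, round(ratio, 2), is_phase_fit(ratio))
# Epstein 19.06 True
# BR 19.32 False
# LR 19.21 False

# Effect of the 4-ms difference in beat duration between BR and LR.
print(round(ratios["BR"] - ratios["LR"], 2))  # 0.11
```

With a ratio near 19, an error of a few milliseconds in a beat of roughly 0.7 s is enough to move the ratio by more than the whole acceptability limit.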
Since the chance expectation is three, neither the present data nor Epstein's represent convincing evidence for phase synchrony at the beat level.

The situation is a little better at the bar level, which is less affected by measurement error. Here Epstein found five significant phase fits, BR six, and LR five. The present data fully confirm two of Epstein's integral fits (phrases 8 and 14), and one partially (phrase 10). However, they also suggest phase fits in phrases 4 and 7, where Epstein found none, and they do not replicate his findings in phrases 2 and 9. (Note that his ratios are way off in phrases 2 and 3, due to the abnormally short first-bar durations measured there.) The probability of obtaining six or more phase fits by chance (from a table of the binomial distribution) is about 0.07. None of the three sets of data thus provides convincing support for initial-bar phase fits, and Epstein's data are no stronger in that regard than are BR's and LR's. They are all likely to be chance findings, and that conclusion is quite independent of any discrepancies due to measurement error.

----------------------------
Insert Table IV here
----------------------------

In five phrases, Epstein found significant phase synchrony with respect to other units that a priori seem rather implausible candidates for a ground beat. For the sake of completeness, these ratios were also computed and are listed in Table V. They confirm Epstein's findings in two phrases, and partially in two more. His phase fit for bar 2 in phrase 6 is not replicated. Of course, the larger the unit of the hypothetical ground beat, the less the critical ratio is affected by measurement error. This is the reason why there is better agreement with Epstein's data at this level than at the first-bar and first-beat levels.

Despite this partial agreement, the data do not demonstrate that phase synchrony exceeds chance expectations. For each of the four different ground beat candidates considered in Table V, there is a chance probability of 0.2 for finding a phase fit. If all four candidates are considered simultaneously, the probability is 0.59 (see above).
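The chance figures for Table V can be checked with a minimal sketch. It assumes that the ground beat candidates are independent, each with a chance probability of 0.2, and that the five phrases are independent of one another; the function name is hypothetical.

```python
P_SINGLE = 0.2  # chance that one candidate yields a ratio within 0.1 of an integer


def p_any_fit(n_candidates, p_single=P_SINGLE):
    """Probability that at least one of n independent candidates shows a phase fit."""
    return 1 - (1 - p_single) ** n_candidates


p_four = p_any_fit(4)
print(round(p_four, 2))  # 0.59

# Expected number of phrases, out of five, with at least one fitting candidate.
expected_fits = 5 * p_four
print(round(expected_fits))  # 3

# Probability that all five phrases show a fit purely by chance.
p_all_five = p_four ** 5
print(round(p_all_five, 2))  # 0.07
```

The independence assumption is a simplification, since candidate units within a phrase share onsets, so the figures should be read as rough guides.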
Thus, the chance expectation is that three of five selected phrases will show a phase fit for at least one of the candidate units, and even Epstein's finding of phase fits in all five phrases is not significantly above chance. (The probability of this occurring is $0.59^5 = 0.07$.) Since Epstein may have considered ground beat candidates beyond those that ultimately proved to fit, the probability may be even higher.

---------------------------
Insert Table V here
---------------------------

From this detailed examination of the Chopin Mazurka data, it may be concluded that, in this performance at least, the phase synchrony theory is entirely without empirical support.

Chopin Waltz. The results for the Lipatti performance of the Chopin Waltz fully confirm the preceding observations and therefore will be described only briefly. This performance seemed easier to measure than the Novaes Mazurka, so that closer agreement among the three sets of measurements was expected. We will start here with a table of the phase fit ratios and work backwards from it to trace some of the differences.

Epstein considered a variety of different ground beat candidates here, seven in all: initial beat, initial bar, initial two bars, as well as other individual beats in bars 1 and 2. The phase fits are shown in Table VI. Epstein found significant fits in all but one of the phrases. (Two alternative fits are listed for phrase 9.) Two of these fits are problematic, however: In phrase 8, the final beat was omitted from the phrase duration for the dubious reason that it was elongated and out-of-phase (ST: p. 403), and the duration of the last phrase was apparently determined by measuring to the cessation of the final chord, which is not only a questionable landmark but also one whose phase relationship to the hypothetical ground beat seems irrelevant. Here, the last phrase was measured to the onset of the final bar, which explains the smaller ratios (which did not yield integral phase fits). Also, the final beat was included in phrase 9, which may explain the absence of phase fits there. In phrase 6, BR and LR were unable to determine the onset of the second beat, due to considerable rhythmic liberties in Lipatti's playing at that point.

----------------------------
Insert Table VI here
----------------------------

Epstein's results were replicated in several phrases (3, 7, 10, 11, 12, 15). In others, however, there were large discrepancies, not only in the degree of phase fit but also in the size of the ratios. In phrase 2, Epstein reports a ground beat (bar 2, beat 1) of 0.698 s, whereas BR and LR measured 0.457 and 0.431 s, respectively. It is not clear what may have caused such a large error in this position. In phrase 8, Epstein's ground beat candidate (the first bar) is 1.417 s long, whereas BR and LR find 1.910 and 1.930 s, respectively. Here Epstein may have measured only to the onset of the grace note preceding the second beat (bar 57), whereas BR and LR included the grace note in the beat duration. In phrase 1, BR and LR found the ground beat candidate (the first beat) to be 0.668 and 0.666 s in duration, respectively, compared to Epstein's 0.656 s; this was sufficient to result in a nonsignificant phase fit. Consideration of left-hand onsets would have increased the discrepancy in this case. Similarly, in phrase 4, BR and LR find 0.597 and 0.603 s, respectively, for the second beat of bar 2, compared to Epstein's 0.619 s.

A more detailed comparison of the Chopin Waltz measurements does not seem necessary in view of the fundamental problems with Epstein's theory and data. Suffice it to say that his phase fit ratios again could easily represent a chance finding. The probability of finding an integral fit for any of seven different ground beat candidates is $1 - (0.8)^7$ or 0.79. The probability of obtaining phase fits for all 15 phrases then is 0.03 (from a table of the binomial distribution). Since at least two of Epstein's phase fits are dubious, his results cannot be accepted as significant evidence.
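The two chance probabilities quoted for the Waltz can be recomputed with the sketch below, which replaces the table lookup with a direct binomial sum. It assumes independent ground beat candidates and independent phrases; the function names are hypothetical.

```python
from math import comb


def p_any_fit(n_candidates, p_single=0.2):
    """Probability that at least one of n independent candidates shows a phase fit."""
    return 1 - (1 - p_single) ** n_candidates


def binomial_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_phrase = round(p_any_fit(7), 2)  # seven candidates per phrase
print(p_phrase)  # 0.79

p_all_15 = binomial_tail(15, 15, p_phrase)  # fits in all 15 phrases
print(round(p_all_15, 2))  # 0.03
```

The same `binomial_tail` function gives the probability for any smaller number of fitting phrases by lowering `k`.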
Of course, the situation would be different if Epstein had predicted the specific ground beats on independent theoretical or perceptual grounds. This, however, he did not do, and it cannot be done properly ex post facto. In fact, it would seem difficult under any circumstances to rationalize some of the ground beats chosen.

DISCUSSION

A close examination of Epstein's phase synchrony theory and data has led to the conclusion that the theory rests on implausible or untenable assumptions, that Epstein's empirical methodology is flawed, and that his data do not provide any evidence in support of his theory. The following specific criticisms were presented:

(1) The theory is inadequate as a theory of rubato because it has no implications for the timing within a phrase.

(2) The theory is based on unrealistic assumptions concerning the underlying biological timing mechanism, particularly the initiation, maintenance, and cessation of a covert ground beat.

(3) The theory has implausible psychological implications with regard to the possible coexistence of two timing systems.

(4) The assumption that the ground beat can be found and measured at the acoustic surface is questionable.

(5) The admittance of an arbitrary number of ground beats in a phrase and of ground beats of various sizes and locations weakens the theory and leads to implausible and musically meaningless results.

(6) Epstein's tests of the theory do not take into account the high chance probabilities that the assumptions under (5) engender in the absence of specific predictions. His phase fits are likely to be due to chance alone.

(7) Epstein's measurements contain inaccuracies as well as unavoidable measurement error, both of which make his data difficult to replicate.

To these criticisms must now be added several additional points that have barely been touched upon so far. One has to do with artists' ability to replicate their own performances. Given the accuracy required to achieve phase synchrony, it seems extremely unlikely that an artist would be able to maintain phase synchrony in successive performances, even though these performances may be perceived as identical. While the replicability of individual timing profiles can be quite high (see, e.g., Repp, 1995b), there is nevertheless some uncontrolled variability in every artist's performance, amounting to at least 10% of the variance.

Epstein did not examine repeated performances by the same artist, but the Chopin pieces do contain repeated passages that might be expected to show the same ground beats and phase fits. For example, in the Mazurka, phrases 1, 2, 3, 4, 6, 7, 12, and 13 are all closely related, differing only in ornamentation. Yet, they yield very different ground beats in Epstein's analysis (see Tables IV and V). Phrases 3, 6, and 12 are virtually identical, as are phrases 2 and 7 and phrases 8 and 10. Yet, they generally do not have the same ground beat, and if they do, the number of ground beats they supposedly contain is different. Similar observations may be made about the Chopin Waltz.

Still, it could be argued that the differences indicate that the artist did not intend to play these passages the same way each time. Perhaps more consistency would be obtained when repeated performances of the whole piece were examined in which the artist tried to realize the same expressive intentions, as would be the case in multiple takes for a recording. Epstein is aware of this (ST: pp. 368-369), but instead of actually obtaining such multiple takes and checking pianists' consistency, he gratuitously generalizes to repeated performances, apparently assuming that they would yield identical phase fits. This assumption is likely to be unwarranted.
Admittedly, it is difficult to obtain repeated performances by a famous artist. Instead, let us examine briefly some performances by the author, an amateur pianist, which were recorded in close succession on a digital piano in MIDI format. While these performances are obviously not at the same artistic level as those of Novaes and Lipatti (they were not even polished renditions but spontaneous, unrehearsed readings), they have the advantage of being free of human measurement error and of being realizations of the same expressive intentions. To the author at least, their expressive timing sounds pleasing and appropriate. Their timing profiles at the beat level were determined from the onsets of the highest note in each chord, which is usually in the right hand. The correlation of the complete timing profiles (interbeat intervals) was 0.92 for each pair of performances. Thus, about 85% of the beat-level variance was under the author's control.

Table VII shows the phase fits of initial beats and initial bars in the three performances. The average number of phase fits is consistent with chance expectations (three phrases out of 15), as it was in Novaes's performance. The main point of the table, however, is that the phase fits are not replicable from one performance to the next. The variability of phase fit ratios at the beat level usually extends across two or three integers; the variability at the bar level is about one third of that at the beat level, but still way too large to yield replicable ratios.

Epstein may be inclined to dismiss these data as not being worthy of consideration because they derive from an amateur. Indeed, a professional pianist who has rehearsed and memorized the piece may show smaller variability. However, his or her variability would have to be about 30 times smaller to ensure replicable phase fit ratios, especially at the beat level. It seems extremely unlikely that such high accuracy can be achieved by anyone, because artists are not machines. Therefore, phase fit ratios, especially at the beat level but probably also at the bar level, are necessarily a product of chance.

-----------------------------
Insert Table VII here
-----------------------------
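The step from a correlation of 0.92 to "about 85% of the variance" is the usual squaring of the correlation coefficient, and the chance expectation of three phrases follows from a probability of 0.2 per phrase. A minimal sketch is given below; the `pearson` helper and the two short interbeat-interval profiles are hypothetical illustrations, not the recorded data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long timing profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def shared_variance(r):
    """Proportion of variance shared by two profiles correlated at r."""
    return r ** 2


print(round(shared_variance(0.92), 2))  # 0.85

# Chance expectation of phase fits among 15 phrases for one candidate unit.
expected_chance_fits = 0.2 * 15

# Hypothetical interbeat intervals (seconds) from two takes of the same passage.
take_a = [0.62, 0.58, 0.60, 0.66, 0.74, 0.61, 0.59, 0.70]
take_b = [0.63, 0.57, 0.61, 0.68, 0.71, 0.60, 0.60, 0.72]
r_ab = pearson(take_a, take_b)
```

Applied to real interbeat intervals, `shared_variance(r_ab)` estimates how much of the beat-level timing is reproduced from one take to the next.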

It may also be noted that the author's second performance showed 10 phase fits at the beat and bar levels combined whereas his third performance showed only one. According to Epstein's theory, the second performance should have been better than the third. In fact, however, the two performances were extremely similar in quality and probably indistinguishable to the ear. Their different numbers of phase fits simply represent random variability.

A final point to be made in connection with these performances is that, even though they were spontaneous and unrehearsed, they do instantiate controlled rubato. If there had been no timing control, their timing profiles would not have been as similar as they were. Epstein's theory fails as a theory of rubato (in addition to all its other failings) because it is intended to apply only to the performances of truly outstanding artists. Its aim is not to explain the control of expressive timing in general but to provide an objective index of aesthetic quality and possibly a recipe for achieving such quality through design when musical intuitions are weak. However, it fails equally with respect to this narrow goal.

What form might a proper theory of rubato take? There is currently no complete model, but Todd (1985, 1992, 1995) has made significant strides. In the author's view, Epstein's fundamental error is his assumption of a rigid ground beat when in fact a flexible rhythmic framework is required. The constraints governing rubato are inherent in that flexible framework itself, not in an external reference beat. Todd models flexible timing in terms of a prototypical timing curve, originally assumed to be parabolic (Todd, 1985) but later modified to consist of two branches representing linear acceleration and deceleration of tempo (Todd, 1992, 1995). The point(s) at which acceleration changes to deceleration and the extent of tempo change are free parameters.
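The idea of a two-branch tempo curve with a free turning point can be illustrated with a rough sketch. It is not Todd's formulation: it assumes, purely for illustration, that tempo changes linearly with normalized position in the phrase, it covers a single phrase level only, and all names and parameter values are hypothetical.

```python
def tempo_curve(x, start, peak, end, turn=0.5):
    """Tempo (beats per minute) at normalized phrase position x in [0, 1].

    Tempo rises linearly from `start` to `peak` at position `turn`,
    then falls linearly to `end` at the close of the phrase.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    if x <= turn:
        # Accelerating branch; a turn of 0 means the phrase starts at its peak.
        return start + (peak - start) * (x / turn) if turn > 0 else peak
    # Decelerating branch (final ritard).
    return peak + (end - peak) * ((x - turn) / (1 - turn))


def interbeat_intervals(n_beats, start, peak, end, turn=0.5):
    """Interbeat intervals (seconds) sampled at n_beats evenly spaced positions."""
    positions = [i / (n_beats - 1) for i in range(n_beats)]
    return [60.0 / tempo_curve(x, start, peak, end, turn) for x in positions]


# Hypothetical phrase: begins at 60 bpm, peaks at 90 bpm, ends at 50 bpm.
ibis = interbeat_intervals(9, start=60.0, peak=90.0, end=50.0, turn=0.4)
```

Moving `turn` or changing the distance between `peak` and the end points corresponds to the free parameters mentioned above: the point at which acceleration gives way to deceleration and the extent of the tempo change.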
The tempo curve is replicated at several levels of a hierarchic grouping structure, and the surface timing is the product of several superimposed timing functions. This model captures major features of expressive timing that are tied to phrase structure, such as initial acceleration and final ritard, and it describes the manner in which such tempo changes tend to be executed by fine artists. Explanations of detailed timing microstructure and of individual differences among artists are beyond the reach of any current model. In fact, Repp (1995b) has argued that researchers should model typical expressive timing (such as an average timing profile) before attempting to explain individual differences, which may be conceptualized as deviations from some norm or prototype. The expressive timing of outstanding artists is often unusual and original, hence not representative (Repp, submitted). Whatever the constraints on rubato are, should not the typical, average musician be expected to be more constrained than the most imaginative artists?

Higgins (1991) has pointed out that many theorists tend to focus on the resolution of tension rather than on the prolongation and enjoyment of tension itself. Epstein's focus on phase synchrony at phrase boundaries is a prime example of this tendency. According to him, rubato merely "add[s] excitement to a performance," whereas "a large part of the gratification in good rubato playing lies in ... return to phase synchrony" (ST: p. 373). This focus seems to be misplaced. It is the continuous temporal shaping of the unfolding music that provides gratification (if it is well done). Contrary to Epstein's assumption, phrase boundaries are probably the least important points in the precise control of rubato: Although they delimit and weakly constrain the general shape of the tempo curve, the slowing in tempo at phrase boundaries implies a local relaxation of constraints, in agreement with Fitts's law in movement and Weber's law in perception. In other words, performers have considerable latitude in choosing the extent of a final ritard because differences in duration are difficult to discriminate at that point.
In fact, if there were an underlying ground beat, it would make more sense to hypothesize that phase synchrony is established in the middle of phrases. There may well be a baseline tempo underlying rubato performances that is reached intermittently but deviated from, primarily by slowing down, especially at phrase boundaries.

However, it is not necessary to postulate phase synchrony in this process; a simple tempo memory is sufficient.

Epstein's phase synchrony theory of rubato is an outgrowth of the larger theoretical framework presented in his book, which is primarily concerned with justifying tempo proportionality in performance. The bankruptcy of his phase synchrony theory of rubato suggests that his theory of tempo proportionality is also in need of a critical examination.¹³

References

Epstein, D. (1995). Shaping time: Music, the brain, and performance. New York: Schirmer Books.

Higgins, K. M. (1991). The music of our lives. Philadelphia, PA: Temple University Press.

Large, E. W., & Jones, M. R. (submitted). The dynamics of attending: How we track time varying events.

Large, E. W., & Kolen, J. F. (1994). Resonance and the perception of musical meter. Connection Science, 6, 177-208.

Repp, B. H. (1995a). Acoustics, perception, and production of legato articulation on the piano. Journal of the Acoustical Society of America, 97, 3862-3874.

Repp, B. H. (1995b). Expressive timing in Schumann's "Träumerei": An analysis of performances by graduate student pianists. Journal of the Acoustical Society of America, 98, 2413-2427.

Repp, B. H. (in press). Review of "Shaping time: Music, the brain, and performance" by David Epstein. Music Perception.

Repp, B. H. (submitted). The aesthetic quality of a quantitatively average music performance: Two preliminary experiments.

Todd, N. [P. McA.] (1985). A model of expressive timing in tonal music. Music Perception, 3, 33-58.

Todd, N. P. McA. (1992). The dynamics of dynamics: A model of musical expression. Journal of the Acoustical Society of America, 91, 3540-3550.

Todd, N. P. McA. (1995). The kinematics of musical expression. Journal of the Acoustical Society of America, 97, 1940-1949.