The biology and evolution of music: A comparative perspective

Cognition xxx (2006) xxx-xxx
www.elsevier.com/locate/cognit

W. Tecumseh Fitch *
School of Psychology, University of St. Andrews, St. Andrews, Fife, KY16 9JP, UK

* Fax: +44 1334 463042. E-mail address: wtsf@st-andrews.ac.uk.
0010-0277/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.cognition.2005.11.009

Abstract

Studies of the biology of music (as of language) are highly interdisciplinary and demand the integration of diverse strands of evidence. In this paper, I present a comparative perspective on the biology and evolution of music, stressing the value of comparisons both with human language, and with those animal communication systems traditionally termed "song". A comparison of the design features of music with those of language reveals substantial overlap, along with some important differences. Most of these differences appear to stem from semantic, rather than structural, factors, suggesting a shared formal core of music and language. I next review various animal communication systems that appear related to human music, either by analogy (bird and whale "song") or potential homology (great ape bimanual drumming). A crucial comparative distinction is between learned, complex signals (like language, music and birdsong) and unlearned signals (like laughter, ape calls, or bird calls). While human vocalizations clearly build upon an acoustic and emotional foundation shared with other primates and mammals, vocal learning has evolved independently in our species since our divergence with chimpanzees. The convergent evolution of vocal learning in other species offers a powerful window into psychological and neural constraints influencing the evolution of complex signaling systems (including both song and speech), while ape drumming presents a fascinating potential homology with human instrumental music. I next discuss the archeological data relevant to music evolution, concluding on the basis of prehistoric bone flutes that instrumental music is at least 40,000 years old, and perhaps much older. I end with a brief review of adaptive functions proposed for music, concluding that no one selective force (e.g., sexual selection) is adequate to explaining all aspects of human music. I suggest that questions about the past function of music are unlikely to be answered definitively and are thus a poor choice as a research focus for biomusicology. In contrast, a comparative approach to music promises rich dividends for our future understanding of the biology and evolution of music.

1. Introduction

There has recently been a surge of interest in the biology and evolution of music, or "biomusicology" (Avanzini, Faienza, Minciacchi, Lopez, & Majno, 2003; Peretz & Zatorre, 2003; Wallin, Merker, & Brown, 2000; Zatorre & Peretz, 2001). Human music is based upon a diverse set of perceptual mechanisms, some shared with most other vertebrates, and therefore with a very long evolutionary history, and some potentially unique to our species. For example, goldfish can learn to distinguish baroque music from blues (Chase, 2001), suggesting that some mechanisms involved in music perception date back to the earliest jawed vertebrates (some 500 million years ago). In contrast, even nonhuman primates seem unable to recognize melodies as purely relational structures, as does a newborn child, suggesting that this aspect of music perception evolved in the last few million years (D'Amato, 1988; Hauser & McDermott, 2003). Although all of the mechanisms involved in music perception and production may be grouped together, for convenience, as the "music faculty" or the "capacity for music", it is important to remember that different components of this capacity may have different evolutionary histories. Thus, discussing "Music" as an undifferentiated whole, or as a unitary cognitive module, risks overlooking the fact that music integrates a wide variety of domains (cognitive, emotional, perceptual, motor, ...), may serve a variety of functions (mother-infant bonding, mate choice, group cohesion, ...) and may share key components with other systems like language or speech. Thus, questions like "When did music evolve?"
or "What is music for?" seem unlikely to have simple unitary answers.

In this paper, I will discuss the biology and evolution of music from an explicitly multicomponent perspective, distinguishing particularly between vocal song and instrumental music. I offer some ways of defining these components, and review the comparative literature for evidence as to when, how and why some of them evolved. I will argue that the comparative database, while still far from complete, is sufficiently rich today to lead to some interesting conclusions and further hypotheses about human music.

I will stress the importance of the pluralistic approach to biological questions urged by Tinbergen (1963), who distinguished four categories of answers: mechanistic, developmental, phylogenetic and functional. Thus, the simple question "Why do birds sing?" has many correct answers, all of which are important for a complete understanding of birdsong. At a mechanistic level, birds sing because they have a complex vocal organ and neural song circuits that are activated when hormone levels are high. Other mechanistic possibilities (e.g., the bird sings because singing feels good, or to enjoy the song's beauty), although harder to address experimentally, may also be valid. Developmentally, the bird sings because it was raised in an environment full of conspecific songs, which it learned (many birds will not sing properly otherwise). Phylogenetically, all birds share a syrinx, indicating that this unique vocal organ evolved near the beginning of bird evolution. Finally, functionally (adaptively), birds sing because their ancestors who sang out-reproduced those that did not. This may have been because they attracted more or better mates, defended better territories, or both. Crucially, Tinbergen stressed that all four types of question are equally valid and interesting. The four domains are essentially orthogonal, and their answers independent.

Although adaptive questions in the last category are fascinating, they are typically the hardest of Tinbergen's four categories to address. One reason for this is the lability of function: the evolutionary function(s) of a particular trait often change substantially over time (cf. Reeve & Sherman, 1993), a phenomenon termed exaptation (Gould, 1991; Gould & Vrba, 1982). Three examples directly relevant to music include the mammalian middle ear bones, which started as jaw supports but now function in audition; the vertebrate laryngeal cartilages, which began as gill supports but now function in sound production; and the lungs, which are homologous to the swim bladder in fish, a floatation control system, but are used in breathing and vocalization in tetrapods. Such lability has led some theorists to advocate a notion of adaptation which ignores past function (Reeve & Sherman, 1993), but while the difficulty of extrapolating from current to past function is widely acknowledged, most conceptions of adaptation still involve historical function in some way (West-Eberhard, 1992). Thus, although experiments addressing current utility can be performed (e.g., for birdsong, see below), these do not necessarily demonstrate ancestral function. Even for birdsong, which has been intensively studied, the outcomes are often ambiguous (Kroodsma & Byers, 1991).
For human music, an activity whose current utility is quite obscure, adaptive questions appear even more challenging. However, it is important to realize that functional questions concerning adaptation (particularly past or original function) do not need to be answered for productive research to proceed on mechanistic, developmental and phylogenetic questions. Thus, although I will briefly review adaptive hypotheses of music in this paper, I will stress their tentative nature, the lack of convincing evidence concerning any of them, and the independence of progress in biomusicology from their answers.

Another theme of this paper will be the value of comparisons between music and language to a richer biological understanding of both. For example, vocal learning is a prerequisite of both song and speech, and the study of vocal learning in animals thus may shed light on both of these distinctive human behaviours. There are both deep similarities and quite obvious differences between music and language, both of which are uniquely human considered in toto, but each of which involves ancient mechanisms shared with other animals. Recent trends in biolinguistic research offer some valuable lessons for the newer field of biomusicology, and suggest that the similarities and differences between music and language could profitably be more intensively exploited, both theoretically and experimentally. Ultimately, we might expect more rapid progress in understanding the biology of music, because music has better analogs in the natural world (e.g., bird or whale song, convergently evolved) and because several human musical abilities have plausible homologous precursors in our primate relatives. But at present, discussions of the biology and evolution of language have progressed further, and thus biolinguistics provides a context and framework for my discussion here (Fitch, 2005; Hauser, Chomsky, & Fitch, 2002).

2. Comparative approaches to music

A comparative approach to music has at least three sides. The first and oldest type of comparison is inter-cultural, among human musics: the topic of a vast literature in comparative musicology and ethnomusicology. Cross-cultural data are clearly an important prerequisite to the search for musical universals, one arm of a biological approach to characterizing the capacity for music. However, as many book-length reviews of this field are already available, I will not attempt to review this large topic here, but will simply make use of its results (see Blacking, 1976; Kunst, 1974; Merriam, 1964; Nettl, 1983; Sachs, 1940; Titon, Koetting, McAllester, Reck, & Slobin, 1984). The second comparison, also within our own species, is that between music and language (and other allied systems like dance and poetry). I will discuss this intraspecific comparison between cognitive domains in some detail in the next section. Finally, the comparisons between human music, both vocal and instrumental, and musical behaviours of various sorts among animals will be the second and main focus of my review here (which will focus on vertebrates). I will argue that, when properly delimited, animal signaling systems provide a rich source of insights into the biology of human music, with both homologous systems (e.g., bimanual drumming in great apes) and analogous systems (e.g., birdsong) available for further detailed study.

2.1. Linguistic comparisons: Design features of human music

In a classic paper, Hockett (1960) outlined a number of characteristics of language which he called design features. These are listed in Table 1.
By singling out particular aspects of language as points of comparison with animal communication systems, Hockett's paper provided a spur to further research in animal communication, with the outcome that several features he believed to be unique to human language were subsequently documented in animals. Hockett also discussed instrumental music in his paper but excluded vocal music (without explanation), and his discussion thus forms a reasonable starting place for a dissection of music into sub-components. For this comparison, I must obviously exclude lyrical music, which, because it incorporates language, automatically inherits any linguistic design features. Thus, in this context, I mean by vocal music all music generated by the vocal tract but lacking distinct words (e.g., humming, jazz scat singing, Central Asian formant singing, etc.).

Table 1
Hockett's (1960) design features of language

Language design feature        Instrumental music   Vocal music   Innate human calls
1. Vocal auditory channel      No                   Yes           Yes
2. Broadcast transmission      Yes                  Yes           Yes
3. Rapid fading                Yes                  Yes           Yes
4. Interchangeability          No                   Yes?          Yes
5. Total feedback              Yes                  Yes           Yes
6. Specialization              Yes                  Yes           Yes
7. Semanticity                 No                   No            No?
8. Arbitrariness               No                   No            No?
9. Displacement                No                   No            No
10. Duality of patterning      No                   No            No
11. Productivity               Yes                  Yes           No
12. Discreteness               Yes                  Yes           No
13. Cultural transmission      Yes                  Yes           No

Design features of language (Hockett, 1960). Thirteen features all argued by Hockett to be present in spoken language, as compared with instrumental and vocal music, and innate human calls (e.g., laughter, crying, screaming, moaning...). Brief explanations of non-obvious terms (see Hockett (1960) for detailed description and discussion): 4, interchangeability (one can say anything one can understand vs. not all saxophone listeners can play the instrument); 5, total feedback (you hear what you're saying or playing); 6, specialization (signal triggers desired results with negligible direct energy expenditure; unlike forcing someone manually); 7, semanticity (words associated with things); 9, displacement (a capacity to refer to non-present objects or events); 10, duality of patterning (meaningless elements combine to produce a large number of meaningful elements); 11, productivity (novelty, also tied to counterfactuality); 12, discreteness (digital vs. analog).

As is clear in Table 1, most of Hockett's design features of language are shared by music (interchangeability is an interesting exception). Furthermore, most of the nonshared features appear to derive from one core difference between music and language: referentiality or semanticity. Language can be used to convey an unlimited set of discrete, propositional meanings, and music cannot. While music is typically composed of a discrete set of fundamental units (notes and beats), these do not map onto equally discrete meanings semantically. The lack of this form of meaning in music directly leads to the absence of arbitrariness, displacement and duality of patterning (combination of meaningless elements into meaningful words, and thence to sentences).
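As a minimal sketch of the comparison just described, the Table 1 entries for music can be encoded directly and queried for the language features that music lacks. The values are copied from the table; the names `TABLE_1` and `features_lacking` are illustrative assumptions, not part of Hockett's framework, and the tentative "Yes?" entry is stored with its question mark so that it is not counted as a clear "No".

```python
# Table 1 entries (after Hockett, 1960), copied from the text.
# Spoken language is "Yes" on all thirteen features by construction.
# Tuple order: (instrumental music, vocal music). Names are illustrative.
TABLE_1 = {
    "Vocal auditory channel": ("No", "Yes"),
    "Broadcast transmission": ("Yes", "Yes"),
    "Rapid fading": ("Yes", "Yes"),
    "Interchangeability": ("No", "Yes?"),
    "Total feedback": ("Yes", "Yes"),
    "Specialization": ("Yes", "Yes"),
    "Semanticity": ("No", "No"),
    "Arbitrariness": ("No", "No"),
    "Displacement": ("No", "No"),
    "Duality of patterning": ("No", "No"),
    "Productivity": ("Yes", "Yes"),
    "Discreteness": ("Yes", "Yes"),
    "Cultural transmission": ("Yes", "Yes"),
}


def features_lacking(column):
    """Return the language features marked 'No' for one kind of music.

    column: 0 for instrumental music, 1 for vocal music.
    """
    return [name for name, values in TABLE_1.items() if values[column] == "No"]


vocal_gaps = features_lacking(1)
instrumental_gaps = features_lacking(0)
print("Vocal music lacks:", vocal_gaps)
print("Instrumental music lacks:", instrumental_gaps)
```

For vocal music the query returns only semanticity and the three features that the text traces back to it; instrumental music additionally lacks the vocal auditory channel and interchangeability.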
Thus, for example, music lacks duality of patterning by definition, because neither A# nor a sequence of notes means anything in the same way that "dog" does. Although certain types of music may blur this distinction (e.g., the use of motives to signify a character or season in Western music, or the use of drummed patterns or whistled speech to imitate language), these seem both clearly marginal to human music in general, and plausibly parasitic on language in the same way as is lyrical music. Thus, as a first approximation, Hockett's framework would characterize vocal music as "speech minus meaning". This difference in referentiality between music and language does not imply that music has no meaning, of course, but simply that the mapping between signal and interpretation is quite different in music and language. Indeed, it seems likely that the affective and aesthetic power of music derives from these differences. Music, rather than being semantically deficient relative to language, encourages a complementary mode of interpretation that is a major source of its appeal (Cross, 2003). Thus, it is more accurate to characterize music as being like language without propositional, combinatorial meaning.

Of course, from a musicological point of view, characterizing music by what it lacks relative to language seems limited, if not derogatory. Music has its own unique features as well, elements that language lacks. As a first step in the direction of characterizing human music in its own terms, I will now propose some design features of music, based on results from ethnomusicology (e.g., Arom, 2000; Nettl, 2000). Unfortunately, ethnomusicologists have traditionally been wary of, or even hostile to, the search for musical universals for largely historical and sociological reasons (Nettl, 1983, 2000), in rather sharp contrast to linguistics, where the search for universals is considered productive and respectable. Thus, this list of design features of music is intended to provide a concrete basis for the comparative discussion below, and for future discussions, but is not intended to be definitive or exhaustive.

In Table 2 I present a list of some suggested basic design features of music. The first three features are shared with language. By complexity I mean that musical signals (like linguistic signals) are more complex than the various innate vocalizations available in our species (groans, sobs, laughter and shouts). Although complexity can be measured and quantified in various ways (see, e.g., Homer & Selman, 2001; Shmulevich & Povel, 2000; Simon, 1972; Weng, Bhalla, & Iyengar, 1999), there is no single widely used metric applicable to all musics (Pressing, 1998), so it would be premature to specify any absolute threshold for complexity at present. Oddly, Hockett did not include complexity on his list of design features, although its necessity for language is implicit in his discussion. A signal at a certain level of complexity is clearly a prerequisite for conveying an unlimited number of complex meanings, and therefore a prerequisite for human language.
Second, music, like language, is generative: it uses rule-governed combinations and permutations of a limited number of notes or syllables to generate an unlimited number of hierarchically structured signals (Merker, 2002). Note that a second component of linguistic generativity (in the technical sense), a symmetry between listener and speaker that Hockett termed interchangeability, is not typically present in instrumental music. One can understand and appreciate a viola or an oboe performance despite being unable to play either instrument. In contrast, any human has a basic capacity to sing, and an ability to reproduce basic melodies vocally, if not beautifully, appears to be an early developing human universal. Finally, musical styles, like individual languages, are learned by experience and thus culturally transmitted.

Table 2
Design features of music

Design feature                   Language?   Innate calls?
1. Complexity                    Yes         No
2. Generativity                  Yes         No
3. Culturally transmitted        Yes         No
4. Discrete pitches              No          No
5. Isochronic                    No          No
6. Transposability               Yes         ?
7. Performative context          No          No
8. Repeatable (repertoire)       No          No
9. A-referentially expressive    No          Yes

Proposed design features of human music (see text Section 2.1 for explanation and discussion).

Music also differs from language (and innate calls) in several important ways (cf. Merker, 2002). The most obvious is that most of the world's musics rely on a discrete set of pitches (a scale) from which notes are chosen to build melodies (Nettl, 2000). Although there are exceptions (e.g., some singing styles, or in some rhythmic African music where defined pitch is absent), this is a key feature differentiating song (with discrete pitch) from speech (with continuously variable pitch). Second, in the temporal domain, music tends to be isochronic, meaning that there is a regular periodic pulse (also termed the beat, or tactus) which provides a reference framework for other temporal features of the music (Arom, 2000; Merker, 2002). Note that isochronicity is a relative feature: virtually no music is perfectly isochronous, and some musical styles rather freely vary the underlying pulse. Although there are some styles of speech which use an isochronic framework (e.g., ritualistic speech or poetry), and certain non-isochronic musical genres (e.g., sung lament), isochronicity is a core feature of most of the world's musics. Thus, the clearest differences between music and language are that music relies on a discretization of both pitch and time, while in language production both are free to vary continuously. Discrete time and pitch make music more acoustically predictable than language, and thus enhance acoustic integration between multiple individuals in an ensemble, or between notes in harmonic music.

A feature of musical melody that is shared with language is that musical pitch structures are transposable: a melody is considered the same when it is performed or sung on a higher starting note.
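Transposability can be illustrated with a small sketch, under the assumption that a melody is represented as a list of fundamental frequencies in hertz and that intervals are measured in equal-tempered semitones (12 times the base-2 logarithm of a frequency ratio). The function names and the example frequencies are hypothetical, chosen only for illustration.

```python
import math


def intervals_in_semitones(frequencies_hz):
    """Describe a melody by the intervals between successive notes.

    Each interval is 12 * log2(f_next / f_prev), rounded to the nearest
    semitone. Absolute pitch is discarded; only the relations remain.
    """
    return [
        round(12 * math.log2(later / earlier))
        for earlier, later in zip(frequencies_hz, frequencies_hz[1:])
    ]


def transpose(frequencies_hz, semitones):
    """Shift every note by the same number of semitones."""
    factor = 2 ** (semitones / 12)
    return [f * factor for f in frequencies_hz]


# Hypothetical four-note melody (roughly C4, D4, E4, C4).
low_voice = [261.63, 293.66, 329.63, 261.63]
# The same melody sung an octave higher.
high_voice = transpose(low_voice, 12)

same_melody = intervals_in_semitones(low_voice) == intervals_in_semitones(high_voice)
shares_a_pitch = any(f in low_voice for f in high_voice)
print(intervals_in_semitones(low_voice), same_melody, shares_a_pitch)
```

Both renditions reduce to the interval sequence [2, 2, -4], so they count as the same melody even though they share no absolute frequency.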
This is because, in human music, a melody is defined by the relationships between notes, not just the absolute frequencies of the individual notes. Thus, two singers with very different pitch ranges (e.g., a man and a woman) can still sing the same melody, despite using a different set of pitches to do so. The same is true of speech (although pitches are not discrete in speech): a sentence spoken by a woman is the same as one spoken by a man an octave lower. This free transposability may represent a key difference between human and animal melody perception (D'Amato, 1988; Hauser & McDermott, 2003; but see Wright, Rivera, Hulse, Shyan, & Neiworth, 2000). It is less clear that innate calls possess transposability: is high-pitched tittering laughter the same as a similar signal at a lower pitch? I have left this undecided in Table 2.

Three further proposed design features of music differentiate music from language less clearly. First, human music typically occurs in specific performative contexts: particular songs or styles recur in specific social contexts, especially ritualistic contexts stressing supernatural or mystical themes (Arom, 2000; Cross, 2003; Nettl, 2000). These contexts vary considerably from culture to culture: Western classical music may have very specific contexts (e.g., the opera house) compared to folk musics (Nettl, 1995), but all cultures seem to differentiate celebratory music from dirges or laments, men's music from women's music, lullabies from work songs, or draw some similar distinctions. This leads to a second unusual feature of music: that songs or performances are typically repeated (often with great frequency) in the appropriate context: a musical system contains a repertoire of different, identifiable pieces (Nettl, 1983). Unlike spoken utterances, musical performances are typically repeatable, without any obvious decrement (and sometimes an increment) in enjoyment. One might listen to the Beatles' "Let It Be" or Beethoven's "Moonlight Sonata" scores of times, and keep coming back for more. Language is different. One may enjoy seeing the same play or movie several times, and children sometimes seem to have an endless appetite for repeated stories, but the vast majority of linguistic utterances are uttered once and never repeated. There is an interesting area of overlap between these two features of music and language, however, in ritual language such as prayers, blessings, invocations, etc., or in theatrical performances and traditional storytelling. Finally, so-called "phatic" communication, such as greetings or farewells, is highly repeatable. However, such formulaic utterances have often been singled out by linguists as peculiar (Wray, 2002), and their very similarity to music seems to differentiate them from ordinary language.

The final proposed design feature is easily the most difficult to pinpoint, and the topic of a vast, ancient and controversial literature: the question of meaning in music (see also Jackendoff & Lerdahl, in press). On the one hand, as discussed above, music is clearly not meaningful in the way language is (able to convey an unlimited number of propositional thoughts or meanings with arbitrary specificity). On the other hand, music is not meaningless: music is expressive in some different, hard-to-define sense. It is often said that music expresses the emotions.
It is clear both intuitively, and from an increasing body of experimental work, that music can have profound effects on arousal and mood (e.g., Blood & Zatorre, 2001; Thompson, Schellenberg, & Husain, 2001), and this is clearly an important component of the meaning of music. However, limiting musical meaning to emotion seems insufficiently general, if not procrustean, since music can also be abstracted away from emotion, and the complex cognitive processing involved in music perception is, in an important sense, prior to any experienced emotion (cf. Sloboda, 1985). Furthermore, there is commonly a mapping between music's acoustic form and movement, especially dance (another human universal; Nettl, 1983), and this music-dance mapping is not easily subsumed by the term emotion (Cross, 2003). Movement provides objectively measurable ways of examining the expressive nature of music. We can easily match the tempo and gestural form (Bierwisch, 1979) of a piece of music to an appropriate sequence of movements in dance, or reject inappropriate mappings, and this demonstrates that some kind of non-arbitrary mapping between the sonic domain of music and various other domains exists (see also Clynes, 1977, 1995; Juslin & Sloboda, 2001; Trainor & Schmidt, 2003). Ian Cross has suggested that the difficulty in pinning down the nature of this mapping actually reflects a crucial aspect of musical meaning (Cross, 2003), which is the capacity of music to imbue any situation with meaningfulness, which is nonetheless potentially quite different for different participants; Cross has dubbed this "floating intentionality". Without entering into further discussion, I will cautiously denote this final design feature of music, a gestural form which includes flexible mappings to both mood and movement, as its capacity to be a-referentially expressive. It is worth noting that, although the primary expressive mode of language is referential and propositional, speech can also express emotionality through paralinguistic aspects of prosody (often appropriately termed musical aspects of speech), and there are profound similarities between speech and music in this respect (reviewed in Juslin & Laukka, 2003).

2.2. Biological comparisons: Defining animal song and instrumental music

As both Darwin and Tinbergen stressed, a crucial component of research on phylogeny and function is the comparative method: the use of data from living species to draw inferences about extinct ancestors and adaptive function. Because the songbird syrinx does not fossilize, the comparative method provides the only evidence about the structure and function of the ancestral syrinx, and much the same can be said of the human larynx. Two types of similarities must be distinguished, homology and analogy, because they support different types of inferences. Homologous traits are present in two or more species by virtue of common descent: they are inherited from a common ancestor (although perhaps in changed form). Homologous characters are critical for deducing phylogeny (systematic relationships among species) and thus are a traditional focus of taxonomic research. For example, we share a large number of homologous characters with our nearest relatives, the chimpanzees, including a propensity to use simple tools. We can thus conclude that our last common ancestor with chimpanzees (the LCA), who lived in Africa some 5-7 million years ago, had a propensity to use tools. The second class of trait is equally important, though typically less well-studied: analogous traits. These are similar traits that were not present in a common ancestor, but have evolved independently in two lineages. Flight in bats and birds is a good example: the common ancestor of bats and birds was an early terrestrial reptile that could not fly. The many similarities between bats and birds (wings, light weight, high metabolism, etc.)
result from parallel selection pressures to excel in the aerial niche, and are the clearest evidence of adaptation to this niche. An important example of an analogous trait is vocal learning in humans and birds, which has evolved separately in multiple lineages (see below). In general, analogous traits provide evidence regarding adaptation while homologous traits reflect ancestry, and both types play important roles in the comparative method.

In the next section, I will offer a more detailed review of some complex animal behaviors that are often considered "musical" or termed "song." But before doing so I will offer some provisional definitions to justify what I do not review. I will not attempt to define human music, considered as a monolithic whole. Despite a long history of attempts, no uncontroversial definition is currently accepted, and in any case "music" as a term is both a new one in English, and not shared in various other languages, which may lump music with dance or celebration under a single term (cf. Merker, 2000, 2002; Nettl, 2000). Nonetheless, music in most cultures is easily and unambiguously singled out (from both language, and other vocalizations and activities) by members of that culture (Nettl, 2000). Hence defining human music seems both difficult and unnecessary: we should focus rather on the subcomponents of the music faculty. Here, I will single out two subcomponents of animal "musical" behaviour that can be readily defined: song and instrumental music (particularly percussion).

2.2.1. Vocal music or song

The term "music" has traditionally been freely applied to birdsong. Aristotle observed that bird song was learned (Aristotle, 350 BC), and by Darwin's time, the similarities between human and bird "music" were well-enough understood for him to suggest that they are evolutionary analogs (Darwin, 1871). The term "bird song" is used not only by biologists but also by musicologists, and this tradition is old (e.g., "bird music" in Scholes, 1938) and continues unbroken until the present day ("animal music," Marler, 2000; Marler & Slabbekoorn, 2004; Slater, 2001). It was once thought that birdsong is the only form of animal music ("But for humans, birds are perhaps Nature's only musicians," Scholes, 1938, p. 107), but the more recent discovery of complex, learned vocalizations in marine mammals invalidated this belief (Payne & McVay, 1971). The justification for traditionally singling out birdsong from other animal vocalization (such as frog or cricket calls) has varied. However, two continuous threads can be discerned: birdsong is complex, and it is learned. Various other issues, including tonality, diatonicity or accompaniment by dance are often brought up as incidental or aesthetic criteria, but not as definitive of song. Following in this tradition, animal "song" can be defined simply as complex, learned vocalization. Almost coincidentally, this definition of song (based on findings in ethology) also applies to humans, with one caveat, that music lacks composite, propositional meaning, which is necessary only to distinguish it from spoken language. Some objective definition of this sort is required for productive discussion and further analysis of animal song. Any definition linking music specifically to humans would be useless in this context, defining away animal song by fiat, and requiring us to coin some new term for bird or whale vocalizations.
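The two-part definition just given (complex and learned) can be made operational in a very rough way. The sketch below is only illustrative: it treats a vocalization as a sequence of syllable labels and uses the size of the zlib-compressed sequence as a crude stand-in for a description-length measure of complexity. The function names, the threshold `min_bits`, and the toy data are all assumptions introduced here, not part of any published method, and whether a vocalization is learned must of course be established experimentally rather than computed.

```python
import random
import zlib


def description_length_bits(syllables):
    """Crude complexity proxy: bits in the zlib-compressed syllable sequence.

    Assumption for illustration only; this is an upper bound on, not a
    computation of, a true minimum description length.
    """
    encoded = " ".join(syllables).encode("utf-8")
    return 8 * len(zlib.compress(encoded, 9))


def is_song(syllables, learned, min_bits):
    """Apply the two-part definition: learned AND sufficiently complex.

    `min_bits` is a hypothetical threshold; the text does not specify one.
    """
    return learned and description_length_bits(syllables) >= min_bits


# Hypothetical data: a monotonous call versus a varied sequence drawn
# from 36 syllable types.
rng = random.Random(0)
monotonous = ["a"] * 200
varied = [f"s{rng.randrange(36)}" for _ in range(200)]

# The varied sequence compresses far less well than the monotonous one.
print(description_length_bits(monotonous), description_length_bits(varied))

# An unlearned vocalization is not song under this definition, however complex.
print(is_song(varied, learned=False, min_bits=0))  # False
```

Under this sketch no aesthetic judgement enters: a vocalization qualifies only if the `learned` flag is set from independent evidence and its compressed size clears whatever threshold the analyst has chosen.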
In the same way that defining flight independently of phylogeny allows the exploration of convergent adaptation in birds and bats, an objective definition of song is a first step towards rigorous comparative analysis. The definition above is simple, including almost all the phenomena traditionally called "song" (in humans, birds and more recently whales) and excluding most of the inappropriate contenders (e.g., frog or cricket "song," neither of which is learned). Learning can obviously be verified experimentally (e.g., Marler, 1970b; Owren, Dieter, Seyfarth, & Cheney, 1993). The term "complexity" may seem slippery, but other reasonable possibilities (e.g., "generativity," Marler, 2000) are quite difficult to measure objectively, while complexity can be quantified by various metrics (minimum description length is particularly attractive, e.g., Pressing, 1998; Rissanen, 1997; Weng et al., 1999). Thus, no aesthetic criteria or matters of taste need enter into this definition, and it rejects nothing by fiat: if the complex 36-syllable vocalizations of the Madagascan frog Boophis (Narins, Lewis, & McClelland, 2000) were shown to be learned, this would constitute frog song. For the remainder of this article I will drop the quotes and use animal song to denote complex, learned vocalizations.

By this definition, song has evolved repeatedly in vertebrates. The classic example, birdsong, is based on an ability to learn vocalizations that has evolved at least three times in birds (songbirds, parrots and hummingbirds; Doupe & Kuhl, 1999). I will focus here on songbirds, which are by far the best-understood of these groups. Songbirds (technically termed oscine passerines) represent the most speciose suborder

(oscine) of the largest order of birds (Passeriformes); there are about 4000 songbird species (of about 9000 bird species total). The sister clade of songbirds is the suboscine group, which in general do not learn their display vocalizations (which are nonetheless often called "songs" because of their similar function in mate attraction and territorial defense). It was clear to Darwin, and has remained unargued ever since, that bird song is analogous, not homologous, to human song (our common ancestor, a Paleozoic reptile, did not sing), and the same can be said for whale and seal song. By the definition above, there are no singing primates except humans: although certain primate calls are traditionally termed "song" (e.g., the complex vocalizations of gibbons (Geissmann, 2000), or the simple but beautiful and haunting call of the indri, a Madagascan prosimian), there is no evidence of vocal learning of complex vocalizations in any nonhuman primate (Janik & Slater, 1997).

2.2.2. Instrumental music or drumming

In addition to song, the use of "sound tools" to create music is nearly universal among human cultures (Nettl, 1983). I will define instrumental music as the use of the limbs or other body parts to produce structured, communicative sound, possibly using additional objects. In general in the wild, such behaviours are non-tonal and percussive, and we can freely substitute the term "drumming" (e.g., "ape drumming" or "woodpecker drumming"). In sharp contrast to song, which has evolved repeatedly, instrumental music is quite rare among vertebrates.
Intriguingly, however, the best examples are found in our nearest relatives, the African great apes, specifically in chimpanzees (who drum on tree buttresses or other resonant structures as a component of their complex dominance displays (Arcadi, Robert, & Mugurusi, 2004; Arcadi, Robert, & Boesch, 1998; Goodall, 1986)) and gorillas (who drum on their bodies, and sometimes on objects, during aggressive displays and play (Schaller, 1963)). Enculturated bonobos also have considerable instrumental capacities, including both spontaneous rhythmic drumming on objects and playing musical keyboards (Savage-Rumbaugh, pers. comm.). I know of no reports of drumming behaviour in orangutans. Thus, bimanual drumming in African great apes provides an intriguing possibility of a homologue of human instrumental music in our nearest cousins, which evolved after our split from orangutans and gibbons. Further parallels to drumming are quite rare in vertebrates, the most prominent examples being woodpeckers (Dodenhoff, Stark, & Johnson, 2001; Stark, Dodenhoff, & Johnson, 1998) and various desert rodent species (e.g., Randall, 1997). The only attested form of instrumental music involving more than one object is by palm cockatoos, who drum against hollow trees with sticks (Wood, 1984, 1988). Because this topic is both poorly studied and rarely mentioned in discussions of the biology of music I will review it in detail in Section 3.4.

2.2.3. Are human and animal song analogous?

Despite this long history of considering bird song as an analog to human music, some recent treatments have questioned this (for discussion see Slater, 2001). In particular, a recent paper by Hauser and McDermott rejects the analogy between bird or whale song and human song as of little use, relative to the laboratory studies of

animal music perception they stress (Hauser & McDermott, 2003, p. 667). These authors give three reasons for this rejection. First, they state that the behavioural context of animal song is extremely limited relative to human song, and defined by its role in the adaptive context of territory defense and mate attraction. But animal song does occur outside of these contexts (e.g., subsong and whisper song, see below) and human music is similarly limited in many cultures (e.g., songs or even whole styles that are only appropriate in church, or at weddings, funerals or birthday parties, or people who sing exclusively in the shower). Such limitations certainly do not disqualify these performances from being music. More importantly, I see no reason that shared adaptive context should be a pre-requisite of biological analogy: the fact that flight is used in some species to capture prey and in others to escape predation or pursue mates does not make it less analogous. Analogy is a property of mechanisms, and should be based on objective, formal criteria subject to empirical test, not on inferred adaptive function.

Hauser and McDermott's second argument for disqualifying animal song as analogous to human song is that animal song functions solely in communication with no evidence of solo performances, practice or productions for entertainment, while human singing is characteristically produced for pure enjoyment. This statement is misleading, because young male songbirds do sing alone, seemingly practicing for later adult performances (a behaviour termed "subsong"), and even adult songbirds sometimes sing quietly and alone (termed "whisper song"). Again, it is unclear why adaptive functional considerations should exclude analogy.
In any case, to contrast "pure enjoyment" (a proximate causal explanation) with communicative function (an ultimate adaptive explanation) is to conflate two separate levels of biological explanation (Tinbergen, 1963).

Finally, Hauser and McDermott reject the analogy between animal and human song because in most non-human singing species, singing is predominantly a male behaviour, which is not true for humans. But there are many bird species in which females sing as much as males (Langmore, 1998; Riebel, 2003) and some human cultures in which conspicuous musical performances are limited mainly to males (Titon et al., 1984). Although it does appear to be the case that only male whales sing (Croll et al., 2002; Payne & Payne, 1985), and it is clear that most of the birdsong in temperate regions is performed by males, female song and duetting are much more common in poorly studied tropical species. Since most bird species live in the tropics, our perception of the frequency of female song in birds may be somewhat skewed for accidental historical reasons (Langmore, 1998). But even to the extent that the generalization concerning male song is true, it is unclear why this should disqualify male-specific song from analogy with human song. In many insect species, only males have wings, but this fact provides no grounds for rejecting the analogy between male winged flight and that of other species.

I conclude that none of these arguments provides compelling grounds for rejecting the traditional analogy between human and animal song, dating back to Aristotle and championed by Darwin, Marler and many others. Field or laboratory studies of animal music-making provide a complement to laboratory work on perception, not an alternative. In the next section, I will give an overview of the comparative

data on animal song. Given the objective definition of song I propose, and a circumscribed subset of animal vocalizations to be isolated, it will be seen that the analogy is quite rich. Indeed, I will argue, a careful consideration of animal song provides an empirical basis for hypotheses about both some general constraints on the evolution of complex signaling systems, and specific aspects of musical form that may result from constraints imposed by the vertebrate nervous system in producing and processing such complex signals (e.g., hierarchicality, see Section 3.2).

3. Literature review: The comparative biology of music

In this section, I will briefly review comparative data concerning animal music-making, in particular song and instrumental music, using the definitions above as my guide. Another source of useful and important data on animal musical abilities comes from laboratory studies examining animals' perception of human music, e.g., the finding that goldfish and pigeons can distinguish between and generalize about musical styles (Chase, 2001; Porter & Neuringer, 1984), or the difficulties monkeys have in generalizing about melody transpositions other than the octave (D'Amato, 1988; Wright et al., 2000). However, there are several comprehensive reviews of this topic already in the literature (Carterette & Kendall, 1999; McDermott & Hauser, 2005), so I will focus here on studies of animals' spontaneous production of song or drumming. Because such behaviour can be observed and recorded, the ethological literature on this topic is a rich source of insights into the biology of music.

3.1. Bird song

The most obvious analog of human song in the animal world is birdsong. Bird song has traditionally been differentiated from other avian vocalizations ("calls") by its complexity and, in songbirds, by the fact that it is learned (Catchpole & Slater, 1995; Langmore, 1998; Riebel, 2003).
Other factors such as seasonality (e.g., singing in the spring), function (e.g., defending a territory) or sex differences (singing mostly by males) are also associated with, but not diagnostic of, song. The existence of song learning was already noted by Aristotle (Aristotle, 350 BC), and vocal learning provided an important basis for Darwin's suggestion that human and bird "music" are evolutionary analogs (Darwin, 1871). Since that time it has become clear that vocal learning is a key factor, in most bird species, distinguishing song from calls (which are typically innate), and furthermore that vocal learning is a quite rare ability among mammals (Janik & Slater, 1997). Nonetheless, there are a few types of birds which are said to "sing" on functional grounds (because the vocalization functions in territoriality or courtship) but whose vocalizations are not complex (e.g., doves) or are not learned (some suboscines). I will explicitly exclude such species here. The ability to vocally learn, and the resulting possibility for cultural transmission, dialect formation and the like, is a critical consideration when discussing the analogy with human music (or language). Thus (following my definition of animal song) this discussion is focused only on learned song.

Birds are one of the four classes of terrestrial vertebrates (Class Aves) and they are perhaps the most highly vocal (the other contenders are frogs and toads, Class Amphibia, and some mammals, Class Mammalia). One of the key features differentiating birds from other vertebrates is their vocal organ, called the syrinx, which is possessed by all birds (in at least some form), and only by birds. The syrinx is a complex structure which lies in the chest, at the base of the trachea, between the lungs (for an overview of birdsong production see Suthers, 1999). The syrinx is devoted entirely to sound making, and is highly diverse among different orders of birds (and, often, diagnostically similar within an order; King, 1989). The existence of two independent sound sources within the oscine syrinx (Greenewalt, 1968; Suthers, 1990) allows these birds to create two completely independent pitches (e.g., to theoretically sing both parts of a Bach two-part invention simultaneously), making the oscine syrinx arguably the most sophisticated vocal instrument known. Although birds have a larynx (which is the sound-producing organ in most vertebrates, including humans), no bird is known to use the larynx to make sounds. Although there is a rough correlation between the complexity of this vocal organ and song complexity (passerine birds have the most complex songs and the most complex syrinx), the relationship is imperfect in that many species with a complex syrinx sing relatively simple songs. Furthermore, complexity can vary considerably within closely related clades with identical syringeal anatomy. Thus, neural factors must play a key role in controlling song complexity, with the most important factor being a capacity for vocal learning. Vocal learning has evolved convergently at least thrice in birds, in songbirds, parrots and hummingbirds (Catchpole & Slater, 1995).
True songbirds (oscine passerines) form one large division of passerine birds. The other passerines are called suboscines and generally are not vocal learners. However, there is now evidence of vocal learning in the suboscine bellbirds (Kroodsma, pers. comm.), which would make a fourth group. Thus, song is learned in only three of 23 orders of birds (Passeriformes, Psittaciformes and Trochiliformes). However, these three orders are among the most speciose and account for more than half of the world's bird species. Thus, it is safe to say that birds make up the vast majority of animal species with a capacity for vocal learning.

Ethologists have studied birdsong intensively over the last 50 years, and a huge amount is known about the mechanisms, function and ontogeny of birdsong. This makes birdsong a very rich source of comparative data relevant to the biology and evolution of music. Despite the clear fact that singing is done mainly by males in many species, there is a growing realization of the importance of female singing (Langmore, 1998) and of female perceptual learning (e.g., to differentiate among singing males (Riebel, 2003)). It appears that the traditional assumption that only males sing results partly from historical accident: in temperate regions, male singing is the norm, but in tropical species, both duetting and solo female song increasingly appear to be common or even typical (Morton, 1996). Another intriguing fact is that females of some species that do not normally sing can be induced to sing via male hormone treatment. Thus, the neural mechanisms for song are in place in females (perhaps as part of their song evaluation system), even though not normally expressed. Female song was long overlooked in many temperate species, and female song was often treated as an aberrant,

non-adaptive trait (e.g., Darwin, 1871). If females and males look identical, it was often assumed that a singing individual must be a male. But with laparoscopic sexing and tagging it has become increasingly clear that even in Europe or North America female song is surprisingly common, and a normal feature of the social system in numerous birds. For instance, European robin (Erithacus rubecula) females sing during the autumn to defend winter territories (Kriner & Schwabl, 1991), and female song typifies the common North American cardinal (Cardinalis cardinalis) (Ritchison, 1986). Careful experimentation has revealed both mate attraction and territorial functions for female song in different species (Langmore, 2000), though territoriality appears to predominate (Farabaugh, 1982). Still, true sex role reversal (females displaying and males choosing) is rare, and male song remains in most species with female song. Thus, although few doubt the overarching historical role of sexual selection in the evolution of birdsong, the traditional assumption that birdsong is always a sexually selected trait, dating back to Darwin (Darwin, 1871), may need to be reevaluated in such cases. Because song in humans is sexually egalitarian, with both women and men having the potential to be excellent singers, female bird song is quite relevant to the evolution of song in our own species.

3.1.1. The adaptive functions of birdsong

Since Darwin (1871), many authors have based hypotheses about the evolution of human music on the adaptive functions of birdsong, so it is worth considering this topic in some detail. We know more about the function of birdsong than about any other complex acoustic communication system, and the possibility of doing experiments to test functional hypotheses means that ethologists have gone beyond correlations to make experimental tests of causality (Kroodsma & Byers, 1991; Searcy & Yasukawa, 1996).
The ability to use hormone treatments, remove males from their territories or mute them, raise birds in isolation or with artificial tutors, etc., allows some empirically grounded statements about the current utility of birdsong (in sharp contrast to, for example, whale song or human music). However, perhaps surprisingly, the status of birdsong as an adaptation in the traditional historical sense (Reeve & Sherman, 1993; West-Eberhard, 1992; Williams, 1966) remains unclear. The reason is that neither song itself, nor the organs producing it, fossilize, so we have very little to go on in terms of documenting its past history. Furthermore, the current utility of song appears to be quite labile, even within species or between closely related species, limiting our ability to reconstruct common ancestors to the relatively recent past. Thus, although it is widely agreed that sexual selection played a role in the evolution of passerine song, there is little specific discussion of the past function(s) of birdsong in the literature, nor is there likely to be. See (Kroodsma & Byers, 1991) for a more detailed discussion, and (Catchpole & Slater, 1995) for more examples. The traditionally given functions of male birdsong are two: territorial defense and mate attraction/courtship (Catchpole & Slater, 1995). The first of these has been rigorously tested in a few passerine species. There are two key types of evidence. In the first, a male is removed from his territory and replaced by a loudspeaker broadcasting his song. Such studies have demonstrated that territories without broadcast song