Cooing, Crying, and Babbling: A Link between Music and Prelinguistic Communication


Michael Byrd, Casady Bowman, and Takashi Yamauchi
(mabyrd@neo.tamu.edu, casadyb@neo.tamu.edu, takashi-yamauchi@tamu.edu)
Department of Psychology, Mail Stop 4235, Texas A&M University, College Station, TX 77843 USA

Abstract

Like language, the human capacity to create music is one of the most salient and unique markers that differentiates humans from other species (Cross, 2005). In the following study, the authors show that people's ability to perceive emotions in infants' vocalizations (e.g., cooing and babbling) is linked to the ability to perceive timbres of musical instruments. In one experiment, 180 synthetic baby sounds were created by rearranging spectral frequencies of cooing, babbling, crying, and laughing made by 6- to 9-month-old infants. Undergraduate participants (N=145) listened to each sound one at a time and rated the emotional quality of the synthetic baby sounds. The results of the experiment showed that five acoustic components of musical timbre (e.g., roll-off, mel-frequency cepstral coefficients, attack time and attack slope) could account for nearly 50% of the variation in the emotion ratings made by undergraduate students. The results suggest that the same mental processes are probably applied for the perception of musical timbres and that of infants' prelinguistic vocalization.

Keywords: Emotion; Language; Music

Introduction

Infants use a variety of vocal sounds, such as cooing, babbling, crying, and laughing, to express their emotions. Infants' prelinguistic vocal communications are highly affective in the sense that they evoke specific emotions (happiness, frustration, anger, hunger, and/or joy) without conveying concrete ideas. In this sense, infants' vocal communication parallels music. Music is highly affective, yet it is conceptually limited (Cross, 2005; Ross, 2009). The interaction between music and language has attracted much attention recently (Chen-Hafteck, 2011; Cross, 2001; Masataka, 2007).
However, despite their similarities, little attention has been paid to the relationship between music and prelinguistic vocalizations (Chen-Hafteck, 2011; Cross, 2001; He, Hotson, & Trainor, 2007; Masataka, 2007). If music and language are highly related, what is the relationship between infants' vocal communications, such as babbling, and music? In the study described below, we analyze acoustic cues of infants' vocalization and demonstrate that emotions created by prelinguistic vocalization can be explained to a large extent by the acoustic cues of sound that differentiate timbres of musical instruments, potentially implicating that the same mental processes are applied for the perception of musical timbres and that of infants' vocalizations. The paper is organized as follows: we review related work examining the link between prelinguistic vocalization and music, followed by an overview of the experiment. After discussing our timbre extraction and sound creation methods, we introduce one experiment that investigates the connection between music and prelinguistic communication.

Related Work

Infants begin life with the ability to make different sounds: first cooing and crying, then babbling. Next they form one word, and then two, followed by full sentences and speech. In the first ten months, infants progress from simple sounds that are not expressed in the phonetic alphabet, to babbling, which is an important step in infants' learning how to speak (Gros-Louis, West, Goldstein, & King, 2006; Oller, 2000). Musical instruments and infants' vocalizations both elicit emotional responses while conveying little information about what the sender is trying to express. Music can have a very powerful effect on its listeners, as we all have a piece of music that will bring back emotions. Music can convey at least three universal emotions: happiness, sadness and fear (Fritz et al., 2009). These emotions are similar to the emotions expressed by infants with their limited sounds (Dessureau, Kurowski, & Thompson, 1998; Zeifman, 2001; Zeskind & Marshall, 1988). Both infants and music convey meaning without the use of words.
Infants rely on their voices and non-verbal/non-word sounds to communicate, and it is these sounds that inform the listener of how important and of what type of danger the infant is facing, such as being too cold, hungry, or being left alone (Dessureau et al., 1998; Zeifman, 2001; Zeskind & Marshall, 1988). Across cultures, songs sung while playing with babies are fast, high in pitch, and contain exaggerated rhythmic accents, whereas lullabies are lower, slower and softer. Infants will use cues in both music and language to learn the rules of a culture. Motherese, a form of speech used by adults in interacting with infants, often consists of singing to infants using a musical, sing-song voice that mimics babies' cooing by using a higher pitch. An infant's caregiver will use a higher pitch when speaking to an infant, as it helps the infant learn and also draws their attention (Fernald, 1989). In summary, research shows that there is a close link between infants' vocal communication and music. This link is demonstrated through the babbling and cooing sounds used by infants to communicate, and also by mothers' use of motherese to assist infants' learning of language in a sing-song manner. Infants are able to use the same cues

from both music and language to facilitate learning in both domains. Given these close connections, it is likely that the same mental processes are involved for the perception of instrumental sounds and the perception of infants' vocalizations. The beginning stages of this idea are investigated in one experiment by examining the emotion perception of synthetic baby sounds.

Overview of the Study

In the Emotion Rating Experiment described below, we tested the general hypothesis that the same mental process is involved for the perception of infants' vocalization and that of timbres of musical instruments. More specifically, we hypothesize that the acoustic components of timbre will be significant predictors of emotion. If this is true, then there should be a plausible link between musical timbre and prelinguistic vocal timbre, also indicating a link for mental processing in the two domains. We employed an audio synthesizer program and created 180 different synthetic baby sounds by combining spectral frequencies of real baby sounds. In the experiment, our undergraduate participants (N=145) listened to the synthetic baby sounds one at a time and rated affective qualities of these sounds. Later, we extracted musical timbres from the synthetic baby sounds and examined the extent to which the emotion ratings made by our undergraduate students were accounted for by the timbres of the synthetic baby sounds. Timbre is an important perceptual feature of both music and speech. Timbre is defined as the acoustic property that distinguishes two sounds (for example, those of the flute and the piano) of identical pitch, duration, and intensity (Hailstone et al., 2009; McAdams & Cunibile, 1992). The classic definition of timbre states that two different timbres result from the sound of different amplitudes (of harmonic components) of a complex tone in steady state (Helmholtz, 1885). Timbre is a sound quality that encompasses the aspect of sound that is used to distinguish it from other sounds of the same pitch, duration, and loudness.
The timbre properties of attack time, attack slope, zero-cross, roll-off, brightness, mel-frequency cepstral coefficients, roughness, and irregularity are well known in music perception research as the main acoustic cues that correlate with the perception of timbre of musical instruments (Hailstone et al., 2009). Our assumption is that if infants' vocal sounds are perceived in the same manner as the timbres of musical instruments are perceived, these same acoustic properties can account for the perception of emotions in infants' vocalization. Using principal components analysis (PCA), we summarized emotion ratings made by our undergraduate participants into two principal dimensions to reduce the data, and applied stepwise regression to evaluate the extent to which our predictors (the acoustic timbre components) accounted for emotion ratings for synthesized baby sounds. Below, we briefly describe our timbre extraction method and the method of creating synthetic baby sounds.

Timbre Extraction

This section describes acoustic cues relating to timbre in detail, as well as the computational procedure for extracting these cues. The purpose of using these acoustic cues is to act as predictors in regression analyses that can explain perceived emotions of our synthetic baby sounds. The acoustic cues were chosen based on their use in musical timbre (see Lartillot & Toiviainen, 2007). Eight acoustic properties of timbre (attack time, attack slope, zero-cross, roll-off, brightness, mel-frequency cepstral coefficients, roughness, and irregularity) were extracted from all sound stimuli using MIRToolbox in Matlab (Lartillot, Toiviainen, & Eerola, 2008). These acoustic properties are known to contribute to the perception of timbre in music independently of melody and other musical cues (Hailstone et al., 2009). The acoustic features were extracted from the synthesized sounds rated in the Emotion Rating Experiment. Attack time is the time in seconds it takes for a sound to travel from an amplitude of zero to the maximum amplitude of a given sound signal, or, more simply, the temporal duration of the attack.
Some features of timbre, such as attack time, contribute to the perception of emotion in music (Gabrielsson & Juslin, 1996; Juslin, 2000; Loughran, Walker, O'Neill & O'Farrell, 2001), which suggests that features of timbre can at least in part determine the emotional content of music (Hailstone et al., 2009).

Attack time is computed using the equation of a line, y = mx + b; it is part of a sound's amplitude envelope, where m is the slope of the line and b is the point where the line crosses the vertical axis (t=0). Figure 1 gives an illustration of attack time. The horizontal segments below the x-axis indicate the time it takes in seconds to reach the maximum peak of each frame for which the attack time was calculated.

Figure 1. Attack times of an audio file. A through d are separate attack times, indicated by the distance from the black line to the red line.

Attack slope is the attack phase of the amplitude envelope of a sound, also interpreted as the average slope leading to the attack time. This can also be calculated using the equation of a line, y = mx + b, where m is the slope of the line and b is the point where the line crosses the vertical axis (t=0); see Figure 2. The red line in Figure 2 indicates the slope of the attack.
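The attack-time and attack-slope computations described above can be sketched as follows. This is an illustrative Python/NumPy stand-in (the study itself used MIRToolbox in Matlab): a single-attack simplification with a crude rectified envelope, not the toolbox's frame-wise envelope follower, and the function name `attack_features` is invented for the example.

```python
import numpy as np

def attack_features(signal, sr):
    """Estimate attack time and attack slope from a signal's amplitude
    envelope. A single-attack simplification: MIRToolbox estimates these
    frame by frame from a smoothed envelope; this sketch does not."""
    env = np.abs(signal)              # crude rectified amplitude envelope
    peak = int(np.argmax(env))        # sample index of maximum amplitude
    attack_time = peak / sr           # seconds from amplitude zero to the peak
    # average slope of the line y = m*x + b rising from t = 0 to the peak
    attack_slope = env[peak] / attack_time if attack_time > 0 else 0.0
    return attack_time, attack_slope

# a 1 s test signal whose envelope ramps up for 0.5 s, then decays
sr = 1000
t = np.arange(sr) / sr
sig = np.where(t < 0.5, t / 0.5, 1.0 - (t - 0.5))
at, sl = attack_features(sig, sr)
```

With the ramp above, the envelope peaks at 0.5 s, so the attack time is 0.5 s and the average attack slope is 1.0 / 0.5 = 2.0 per second.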

Figure 2. Attack slope of an audio file. The red arrow indicates the duration (attack time) over which the attack slope is calculated.

Zero-cross is the number of times a sound signal crosses the x-axis; this accounts for noisiness in a signal. It is calculated using the following equation, where sign is 1 for positive arguments and 0 for negative arguments, and x[n] is the time-domain signal for frame t:

Z_t = (1/2) * sum_{n=1}^{N-1} |sign(x[n]) - sign(x[n-1])|

Roll-off is the amount of high frequencies in a signal, which is specified by a cut-off point. The roll-off frequency is defined as the frequency where the response is reduced by -3 dB. This is calculated using the following equation, where M_t[n] is the magnitude of the Fourier transform at frame t and frequency bin n, and R_t is the cutoff frequency (see Figure 3):

sum_{n=1}^{R_t} M_t[n] = 0.85 * sum_{n=1}^{N} M_t[n]

Brightness is the amount of energy above a specified frequency, typically set at 1500 Hz; this is related to the spectral centroid. The term brightness is also used in discussions of sound timbres, in rough analogy with visual brightness. Timbre researchers consider brightness to be one of the strongest perceptual distinctions between sounds. Acoustically it is an indication of the amount of high-frequency content in a sound, and uses a measure such as the spectral centroid; see Figure 3.

Figure 3. Brightness of an audio file. To the right of the red dashed line is the amount of energy above 1500 Hz, or the brightness of the sound.

Roughness is sensory dissonance, the perceived harshness of a sound; this is the opposite of consonance (harmony) within music or even single-tone harmonics. Both consonance and dissonance are relevant to emotion perception (Koelsch, 2005). Roughness is calculated by computing the peaks within a sound's spectrum and measuring the distance between peaks; dissonant sounds have irregularly placed spectral peaks as compared to consonant sounds with evenly spaced spectral peaks. Formally, roughness is calculated using the following equation, where a_j and a_k are the amplitudes of the components and g(f_cb) is a standard curve.
This was first proposed by Plomp and Levelt (1965):

R = (1/2) * sum_{j=1}^{n} sum_{k=1}^{n} a_j * a_k * g(f_cb)

Following extraction of the value for roughness from the sound stimuli, principal components analysis was used to reduce the dimensions of the roughness data.

Mel-frequency cepstral coefficients (mfcc) represent the power spectrum of a sound. This power spectrum is based on a linear transformation from actual frequency to the Mel scale of frequency. The Mel scale is based on a mapping between actual frequency and perceived pitch, as the human auditory system does not perceive pitch in a linear manner. Mel-frequency cepstral coefficients are the dominant features used in speech recognition as well as some music modeling (Logan, 2001). Frequencies in the Mel scale are equally spaced and approximate the human auditory system more closely than the linearly spaced frequency bands used in a normal cepstrum. Due to a large data output, prior to analyses the mfcc data were reduced using principal components analyses to create a workable set of data. A cutoff criterion of 80% was used to represent the variability in the original mfcc data. Figure 4 shows the numerical mel-frequency cepstral coefficient rank values for the 13 mfcc components returned. Thirteen components are returned due to the concentration of the signal information in only a few low-frequency components.

Figure 4. Mel-frequency cepstral coefficients (mfcc) of an audio file. This figure shows the acoustic component mfcc. Each bar represents the numerical (rank coefficient) value computed for the thirteen components returned.

Irregularity of a spectrum is the degree of variation between the peaks of a spectrum (Lartillot et al., 2008). It is calculated using the following equation, where irregularity is the sum of the square of the difference in amplitude between adjoining partials in a sound:

I = sum_{k=1}^{N-1} (a_k - a_{k+1})^2
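For concreteness, the zero-cross, roll-off, brightness, and irregularity measures defined above can be sketched in Python/NumPy. These are simplified, single-frame stand-ins for the MIRToolbox computations (the function names are invented for the example, and `brightness` returns a magnitude ratio rather than an energy in physical units):

```python
import numpy as np

def zero_cross(x):
    # Z_t = 1/2 * sum |sign(x[n]) - sign(x[n-1])|
    s = np.sign(x)
    return 0.5 * np.sum(np.abs(np.diff(s)))

def spectral_rolloff(x, sr, fraction=0.85):
    # lowest frequency below which `fraction` of the magnitude spectrum lies
    mag = np.abs(np.fft.rfft(x))
    cum = np.cumsum(mag)
    idx = int(np.searchsorted(cum, fraction * cum[-1]))
    return np.fft.rfftfreq(len(x), 1.0 / sr)[idx]

def brightness(x, sr, cutoff=1500.0):
    # share of spectral magnitude above the cutoff frequency
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    return float(np.sum(mag[freqs > cutoff]) / np.sum(mag))

def irregularity(partial_amps):
    # sum of squared differences between adjoining partial amplitudes
    a = np.asarray(partial_amps, dtype=float)
    return float(np.sum(np.diff(a) ** 2))
```

For a pure 100 Hz tone sampled at 1000 Hz, for instance, `spectral_rolloff` returns 100 Hz, since all of the spectral magnitude sits in a single bin.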

Creating Synthetic Baby Sounds

We created 180 short (2-second) synthetic baby sounds from ten real infant sounds: five males and five females, ranging from ages 6 to 9 months, making screaming, laughing, crying, cooing and babbling sounds. These sounds were chosen to create novel stimuli emulating human prelinguistic sounds. Among these sounds, four (one screaming boy, one crying boy, one screaming girl and one crying girl) were audio-recorded directly from two volunteer infants in Nacogdoches, Texas, using an Olympus Digital Voice WS-400S recorder. The sounds of babbling and cooing boys and girls were taken from audio files downloaded from a sound effects website (http://www.freesounds.org), and the sounds of a laughing boy and girl were taken from files downloaded from YouTube (http://www.youtube.com). These infant sounds were decomposed by four laboratory assistants into amplitude and spectral frequency components by applying a fast Fourier transform using a sound editing software program (SPEAR; Klingbeil, 2005). Arbitrarily chosen spectral frequencies of one sound (e.g., the babbling sound of a boy) were mixed with arbitrarily chosen spectral frequencies of another sound (e.g., a cooing girl) and then modified by means of amplitude, or shifting frequencies, to convey one of the basic emotions: happy, sad, anger, or fear (Ekman, 2002). For each sound pair, four sounds were created to sound happy, sad, angry, and fearful. In this manner, each sound pair (45 pairs in total, all possible pairs of the 10 real sounds) was used to create four affective sounds, which was decided subjectively by the laboratory assistants. The total 180 sound stimuli were normalized, and white noise was taken out prior to and after creation of each sound stimulus.

Emotion Rating Experiment

The goal of the experiment was to obtain empirical ratings from college students examining the emotional quality of the synthetic baby sounds that we created. To analyze the link between emotion ratings and acoustic cues, a stepwise regression analysis was employed.
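The spectral-recombination idea behind the stimulus creation above (decomposing two sounds with an FFT and mixing their spectral components) can be sketched as follows. This is a hypothetical stand-in: the laboratory assistants hand-picked and modified partials in SPEAR, whereas this toy function mixes randomly chosen FFT bins, and the name `spectral_mix` is invented for the example.

```python
import numpy as np

def spectral_mix(a, b, take_from_a=0.5, seed=0):
    """Hybrid sound: each spectral bin is taken from sound `a` with
    probability take_from_a, otherwise from sound `b`. An illustrative
    stand-in for the SPEAR-based recombination described in the text,
    which mixed hand-picked partials rather than random FFT bins."""
    A, B = np.fft.rfft(a), np.fft.rfft(b)
    rng = np.random.default_rng(seed)
    mask = rng.random(A.shape) < take_from_a   # True -> bin comes from a
    return np.fft.irfft(np.where(mask, A, B), n=len(a))
```

Setting `take_from_a=1.0` reconstructs sound `a` exactly (to floating-point precision), which is a useful sanity check on the round trip through the FFT.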
Participants were instructed to listen to sound stimuli and rate each sound on five emotion categories: happy, sad, angry, fearful, and disgusting (Ekman, 1992; Johnson-Laird & Oatley, 1989). Each scale ranged from 1 to 7, 1 being strongly disagree (the degree to which the stimulus sounded like one of the five emotions) and 7 being strongly agree. Stimuli were presented in random order.

Results

This section starts with descriptive statistics of the emotion ratings, followed by the results from a stepwise regression analysis, which examined the extent to which emotion ratings given to the synthetic baby sounds were explained by their timbre properties. For the regression analysis, average emotion scores were calculated for individual synthetic sounds by collapsing over individual participants, yielding a 179 sounds x 5 emotion dimensions matrix. By applying principal component analysis (PCA), this matrix was summarized in a 179 x 2 matrix, with the two columns corresponding to the two principal components identified by the PCA procedure. The first two orthogonal components explained 88.1% and 7.1% of the variance of the emotion rating data, respectively.

Descriptive Statistics. Behavioral data (Figure 5) show overall observations for each emotion from the emotion rating data. From the whiskers of the box plot for the emotion data, it is apparent that there is variation within the data. The highest rating for the emotion data did not exceed a value of 6 on the scale of 1-7. The median of the ratings for emotion varied between approximately 2.5 and 4.75 within the emotion rating data. Of all 179 sounds rated, most were rated as angry, as indicated by the median of the data for anger. The sounds were rated least like the emotion happy, as the median for this emotion was the lowest for all sounds rated on the five emotions.

Participants. A total of 145 undergraduate students (73 males, 73 females) participated in this experiment for course credit. Participants were randomly assigned to one of two groups that listened to 90 or 89 sounds of the 179 total sounds.
Stimuli were randomly assigned to one of two groups; no participants were in both groups.

Materials. Stimuli were taken from the 180 synthetic baby sounds that were created from a group of ten recorded real infant sounds (see the Creating Synthetic Baby Sounds section for the details of the sound creation).

Procedure. Participants were presented with 90/89 sounds using customized Visual Basic software through JVC Flats stereo headphones. Each stimulus's maximum volume was adjusted and normalized.

Figure 5. Box plot of observations for emotion ratings. Each box in the figure indicates one emotion rated by participants. The median is indicated by the red line in the center of each box, the edges indicate the 25th and 75th percentiles, the whiskers of each plot indicate the extreme data points, and outliers are plotted outside of the whiskers.
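The analysis pipeline reported next (PCA to summarize the 179 x 5 rating matrix into component scores, then stepwise regression of those scores on the timbre predictors) can be sketched with synthetic stand-in data. The names `pca_scores` and `forward_stepwise` are invented for the example, and real stepwise selection also applies significance tests for predictor entry and removal, which this greedy toy omits:

```python
import numpy as np

def pca_scores(X, k=2):
    """Project rows of X onto the first k principal components
    (the 179 x 5 rating matrix -> 179 x 2 step in the text)."""
    Xc = X - X.mean(axis=0)                        # center each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)                # explained-variance ratios
    return Xc @ Vt[:k].T, explained[:k]

def forward_stepwise(X, y, steps=3):
    """Greedy forward selection: each step adds the predictor that most
    raises R-squared, mimicking stepwise regression on the 17 timbre
    predictors (without the usual significance-based entry tests)."""
    n, p = X.shape
    chosen, r2_path = [], []
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    for _ in range(steps):
        best_r2, best_j = -np.inf, None
        for j in [c for c in range(p) if c not in chosen]:
            A = np.column_stack([np.ones(n), X[:, chosen + [j]]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ beta
            r2 = 1.0 - float(resid @ resid) / ss_tot
            if r2 > best_r2:
                best_r2, best_j = r2, j
        chosen.append(best_j)
        r2_path.append(best_r2)
    return chosen, r2_path
```

On data where one predictor dominates, the first selected column is that predictor, and the R-squared path mirrors the per-predictor contributions plotted in Figure 6.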

Regression analysis. A stepwise regression analysis was used to analyze the collected rating data and timbre components, to determine which components could best explain the emotion rating data. Seventeen total predictors were used in the stepwise regression to analyze the emotion ratings made by participants: attack time, attack slope, zero-cross, roll-off, brightness, mel-frequency cepstral coefficients 1-6, roughness 1-4, and irregularity. Due to a large data output, the mfcc data were reduced using principal components analyses to create a workable set of data. There were originally 13 numerical mel-frequency cepstral coefficient rank values returned. These 13 rank values were reduced to 6, accounting for 78% of the total mfcc data. Roughness was also reduced in the same way using PCA, from 79 components to four components that described 80% of the original roughness data. These predictors were used to analyze the emotion ratings made by participants. The results of the regression for the first principal component (PCA 1) indicated that five acoustic features significantly predicted emotion ratings: roll-off (β = -.386, p < .001), mfcc 6 (β = .218, p < .001), attack time (β = .248, p < .001), mfcc 3 (β = -.202, p < .002), and attack slope (β = .034, p < .034); see Table 1 for the percent explained by principal component 1.

Table 1: Significant acoustic components for emotion PCA 1 and PCA 2

Predictors      PCA 1     PCA 2
% explained     88%       7.1%
Attack time     .23***    .31***
Attack slope    .12*
Irregularity              -.16*
Mfcc 1                    -.24**
Mfcc 3          -.19**
Mfcc 6          .22***
Roughness                 .21**
Zero-cross                .25**
Roll off        -.41***

* p < .05, ** p < .01, *** p < .001.

The second principal component (PCA 2) showed that five acoustic features significantly predicted emotion ratings: mfcc 1 (β = -.244, p < .001), zero-cross (β = .250, p < .002), attack time (β = .305, p < .001), roughness 2 (β = .208, p < .006), and irregularity (β = -.159, p < .024) (Table 1).

Figure 6. R-squared for emotion judgment principal components 1 (PCA 1, panel a) and 2 (PCA 2, panel b).
This figure shows the proportion of R-squared contributed by each addition of a predictor to the model for principal components I and II from the emotion judgments.

Figure 6 shows the proportion of R-squared contributed by each addition of a predictor to the model for PCA 1 (a) and PCA 2 (b). Looking at the values of R-squared, it is apparent that roll-off was best able to describe the emotion ratings, accounting for 30% of the emotion ratings for PCA 1. The second principal component does show several significant acoustic cues that predict emotion; however, none are as strong as in the first principal component.

General Discussion

Music and language are perhaps two of the most cognitively complex and emotionally expressive sounds invented by humans. Recently, the evolutionary origins of music and language have attracted much attention from researchers across a broad spectrum (Cross, 2001, 2005; Hauser et al., 2002; Kirby, 2007). The present study, examining the relationship between infants' vocalizations (cooing, babbling, crying and screaming) and the perception of musical timbres, suggests that the link between music and language can go even further back, to the prelinguistic level of development. Our Emotion Rating Experiment indicates that nearly 50% of the variation in emotions created by synthetically produced infant sounds can be explained by a small number of acoustic cues pertaining to musical timbres. Among those, roll-off, which quantifies the amount of high frequencies in a signal, turned out to be the most important cue. The second most important property, mfcc (mel-frequency cepstral coefficients), corresponds to perceived pitch in the human auditory system; mfccs are the dominant features used in speech recognition and music modeling (Logan, 2001). Given these findings, we conjecture that high-frequency sounds are probably taken as the robust cue of emotion attribution, and more fine-grained distinctions of emotion are made by extracting speech-related cues.
The ability to discriminate sounds is said to be present even in primitive animals such as carp (Chase, 2001), implying that this ability evolved early in history. Some animals have sounds and/or calls that can convey the emotions of finding something of interest or of fear (Hauser,

Chomsky, & Fitch, 2002). Such abilities were probably present even before music was fully developed in its current form.

Acknowledgments

We would like to thank Na Yung Yu and Ricardo Gutierrez-Osuna for their valuable comments. The first two authors, MB and CB, contributed to this study an equal amount, and the order of their authorship was determined by a coin toss.

References

Chase, A. R. (2001). Music discriminations by carp (Cyprinus carpio). Animal Learning and Behavior, 29, 336-353.
Cross, I. (2001). Music, mind and evolution. Psychology of Music, 29, 95-102.
Cross, I. (2005). Music and meaning, ambiguity and evolution. In D. Miell, R. MacDonald, & D. Hargreaves (Eds.), Musical Communication (pp. 27-43). New York: Oxford University Press.
Dessureau, B. K., Kurowski, C. O., & Thompson, N. S. (1998). A reassessment of the role of pitch and duration in adults' responses to infant crying. Infant Behavior and Development, 21, 367-371.
Ekman, P. (1992). Are there basic emotions? Psychological Review, 99, 550-553.
Fernald, A. (1989). Intonation and communicative intent in mothers' speech to infants: Is the melody the message? Child Development, 60, 1497-1510.
Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., & Koelsch, S. (2009). Universal recognition of three basic emotions in music. Current Biology, 19, 573-576.
Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer's intention and the listener's experience. Psychology of Music, 24, 68-91.
Gros-Louis, J., West, M. J., Goldstein, M. H., & King, A. P. (2006). Mothers provide differential feedback to infants' prelinguistic sounds. International Journal of Behavioral Development, 30, 112-119.
Hailstone, J. C., Omar, R., Henley, S., Frost, C., Kenward, M., & Warren, J. D. (2009). It's not what you play, it's how you play it: Timbre affects perception of emotion in music. Quarterly Journal of Experimental Psychology, 62, 2141-2155.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002).
The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569-1579.
He, C., Hotson, L., & Trainor, L. J. (2007). Mismatch responses to pitch changes in early infancy. Journal of Cognitive Neuroscience, 19, 878-892.
Helmholtz, H. v. (1885). On the Sensations of Tone as a Physiological Basis for the Theory of Music. London: Longmans, Green, and Co.
Johnson-Laird, P. N., & Oatley, K. (1989). The language of emotions: An analysis of a semantic field. Cognition & Emotion, 3, 81-123.
Juslin, P. N. (2000). Cue utilization in communication of emotion in music performance: Relating performance to perception. Journal of Experimental Psychology: Human Perception and Performance, 26, 1797-1813.
Kirby, S. (2007). The evolution of language. In R. Dunbar & L. Barrett (Eds.), Oxford Handbook of Evolutionary Psychology (pp. 669-681). Oxford: Oxford University Press.
Klingbeil, M. (2005). Software for spectral analysis, editing, and synthesis. In Proceedings of the ICMC (pp. 107-110). Barcelona, Spain.
Koelsch, S. (2005). Neural substrates of processing syntax and semantics in music. Current Opinion in Neurobiology, 15, 207-212.
Lartillot, O., Toiviainen, P., & Eerola, T. (2008). A Matlab toolbox for music information retrieval. In C. Preisach, H. Burkhardt, L. Schmidt-Thieme, & R. Decker (Eds.), Data Analysis, Machine Learning and Applications: Studies in Classification, Data Analysis, and Knowledge Organization (pp. 261-268). New York: Springer.
Logan, B., & Robinson, T. (2001). Adaptive model-based speech enhancement. Speech Communication, 34, 351-368.
Loughran, R., Walker, J., O'Neill, M., & O'Farrell, M. (2001). The use of mel-frequency cepstral coefficients in musical instrument identification. In Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR), 2005, Finland (pp. 1825-1828).
Masataka, N. (2007). Music, evolution and language. Developmental Science, 10, 35-39.
McAdams, S., & Cunibile, J. C. (1992). Perception of timbral analogies. Philosophical Transactions: Biological Sciences, 336, 383-389.
Lartillot, O., & Toiviainen, P. (2007). A Matlab toolbox for musical feature extraction from audio. In Proceedings of the International Conference on Digital Audio Effects, Bordeaux.
Oller, D. K. (2000). The Emergence of the Speech Capacity. Mahwah, NJ: Lawrence Erlbaum.
Plomp, R., & Levelt, W. J. M. (1965). Tonal consonance and critical bandwidth. Soesterberg: Institute for Perception RVO-TNO, National Defense Research Council T.N.O.
Ross, B. (2009). Challenges facing theories of music and language co-evolution. Journal of the Musical Arts in Africa, 6, 61-76.
Zeskind, P. S., & Marshall, T. R. (1988). The relation between variations in pitch and maternal perceptions of infant crying. Child Development, 59, 193-196.