Lecture 2 What we hear: Basic dimensions of auditory experience

Harvard-MIT Division of Health Sciences and Technology
HST.725: Music Perception and Cognition
Prof. Peter Cariani
Lecture 2. What we hear: Basic dimensions of auditory experience
www.cariani.com

What we hear: dimensions of auditory experience
Hearing: ecological functions (distant warning, communication, prey detection; works in the dark)
Detection, discrimination, recognition, reliability, scene analysis
Operating range: thresholds, ceilings, & frequency limits
Independent dimensions of hearing & general properties
  Pitch
  Timbre (sound quality)
  Loudness
  Duration
  Location
  Distance and size
Perception of isolated pure tones
Interactions of sounds: beatings, maskings, fusions
  Masking (tones vs. tones, tones in noise)
  Fusion of sounds & the auditory "scene": how many objects/sources/voices/streams?
Representation of periodicity and spectrum
  Power spectrum and auditory filter metaphors
  Analytical (Helmholtz) vs. Gestalt (Stumpf) perspectives

Hearing: ecological functions
Distant warning of predators approaching
Identification of predators
Localization/tracking of prey
Conspecific communication
  Mating/competition
  Cooperation (info. sharing)
  Territory
Navigation in the dark
http://www.pbs.org/wgbh/nova/wolves/
http://www.pbs.org/lifeofbirds/songs/index.html
Image: bat-eared fox
http://www.essex.ac.uk/psychology/hearinglab/index.htm

The auditory scene: basic dimensions
Attributes of sounds
  Loudness (intensity)
  Pitch (dominant periodicity)
  Timbre (spectrum)
  Duration
  Location (bearing, range)
Temporal organization
  Events
  Notes
  Temporal patterns of events
Organization of sounds
  Voices, instruments
  Streams
  Objects
  Sources

Auditory qualities in music perception & cognition
Pitch: melody, harmony, consonance
Timbre: instrument voices
Loudness: dynamics
Organization: fusions, objects. How many voices?
Rhythm: temporal organization of events
Longer pattern: repetition, sequence
Mnemonics: familiarity, novelty
Hedonics: pleasant/unpleasant
Affect: emotional associations, meanings
Semantics: cognitive associations/expectations

Basic auditory qualities
Dimensions of auditory perception: Pitch, Location, Timbre, Loudness
TEMPORAL EVENT STRUCTURE: Meter, sequence
FUSION: Grouping into separate objects; temporal co-occurrence; harmonic structure
John Lurie, "Car Cleveland", music from Stranger than Paradise

Visual scene: Line, Shape, Texture, Lightness, Color, Transparency, Objects, Apparent distance, Apparent size, etc.
LIFE magazine cover: Margaret Bourke-White, Fort Peck Dam, Montana (1st Life cover), November 23, 1936.

Sound level basics
Sound pressure levels are measured relative to an absolute reference (re: 20 micropascals, denoted Sound Pressure Level or SPL).
Since the instantaneous sound pressure fluctuates, the average amplitude of the pressure waveform is measured using the root-mean-square (RMS) value (Moore, pp. 9-12):
$\mathrm{RMS}(x) = \sqrt{\frac{1}{N}\sum_{t=1}^{N} x_t^2}$
where $x_t$ is the amplitude of the waveform at each instant $t$ in the sample and $N$ is the number of instants.
Because the dynamic range of audible sound is so great, magnitudes are expressed on a logarithmic scale, in decibels (dB). A decibel of amplitude expresses the ratio of two amplitudes (RMS pressures $P_1$ and $P_{\mathrm{reference}}$) and is given by the equation:
$\mathrm{dB} = 20 \log_{10}(P_1 / P_{\mathrm{reference}})$
20 dB = 10-fold change in RMS level
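The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the test signal (one cycle of a sine wave with a 1 Pa peak) are assumptions chosen for the example.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 micropascals), as on the slide


def rms(samples):
    """Root-mean-square: square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def spl_db(samples, p_ref=P_REF):
    """Level in dB SPL of a pressure waveform whose samples are in pascals."""
    return 20.0 * math.log10(rms(samples) / p_ref)


# Hypothetical test signal: one full cycle of a sine wave, 1 Pa peak amplitude.
n = 1000
tone = [math.sin(2 * math.pi * k / n) for k in range(n)]
louder = [10.0 * x for x in tone]  # same tone at 10 times the amplitude

print(round(rms(tone), 4))                      # RMS of a unit sine is 1/sqrt(2)
print(round(spl_db(tone), 1))                   # level of the 1 Pa tone
print(round(spl_db(louder) - spl_db(tone), 1))  # a 10-fold change is 20 dB
```

Scaling the waveform by 10 raises the level by 20 dB regardless of the reference pressure, which is why the decibel is convenient for relative comparisons.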

Decibel scale for relative amplitudes (levels) (rules of thumb)
 20 dB = 10-fold change in amplitude
 10 dB = 3+ fold change
  6 dB = 2-fold change in amplitude
  3 dB = 1.4-fold change
  2 dB = 1.26-fold change (26%)
  1 dB = 1.12-fold change (12%)
  0 dB = 1-fold change (no change)
 -6 dB = 1/2-fold change
-20 dB = 1/10-fold change
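These rules of thumb follow from inverting the decibel definition, ratio = 10^(dB/20). The short sketch below (function names are illustrative, not from the slides) regenerates the table so the rounded values can be checked.

```python
import math


def db_to_amplitude_ratio(db):
    """Amplitude (RMS pressure) ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 20.0)


def amplitude_ratio_to_db(ratio):
    """Level difference in dB corresponding to an amplitude ratio."""
    return 20.0 * math.log10(ratio)


for db in (20, 10, 6, 3, 2, 1, 0, -6, -20):
    print(f"{db:+4d} dB -> amplitude x {db_to_amplitude_ratio(db):.3f}")
```

The "6 dB = 2-fold" rule is a rounding of 20 log10(2), which is about 6.02 dB.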

Perceptual functions
Subjective vs. objective measures
Subjective measures
  Magnitude estimation
Objective measures
  Detection: capability of distinguishing the presence or absence of a stimulus (or some aspect of a stimulus, e.g. AM detection)
    Threshold: the value of a stimulus parameter at which a stimulus can be reliably detected
    Sensation level (SL): sound level re: threshold
  Discrimination: capability of distinguishing between two stimuli
    Difference limen: the change in a stimulus parameter required for reliable discrimination; the just-noticeable difference (jnd)
    Weber fraction: difference limen expressed as a proportional change (e.g. Δf/f)
    Matching task
    Two-alternative forced choice (2AFC)
  Recognition: correct identification of a particular stimulus

Dynamic range
0 dB SPL is set at 20 micropascals.
60 dB SPL is therefore a 1000-fold change in RMS pressure over 0 dB.
A typical background sound level is 50-60 dB SPL.
Dynamic range describes the range of sound pressure levels.
The auditory system registers sounds from 20 dB to >> 120 dB SPL.
The auditory system has a dynamic range in excess of 100 dB (!), or a factor of 10^5 = 100,000 in amplitude.
It is quite remarkable that musical sounds remain recognizable over most of this range. This is a fundamental aspect of hearing that all auditory theories must address: how auditory percepts remain largely invariant over this huge range (perceptual constancy).
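To make the size of this range concrete, the sketch below converts levels in dB SPL back to RMS pressures in pascals using the 20 micropascal reference stated above. The function name is an assumption for illustration only.

```python
P_REF = 20e-6  # 0 dB SPL reference: 20 micropascals


def spl_to_pascals(db_spl):
    """RMS pressure in pascals for a given level in dB SPL."""
    return P_REF * 10.0 ** (db_spl / 20.0)


for level in (0, 20, 60, 120):
    print(f"{level:3d} dB SPL = {spl_to_pascals(level):.6g} Pa")

# A 100 dB span corresponds to a 10^5-fold range in RMS pressure.
print(spl_to_pascals(120) / spl_to_pascals(20))
```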

Typical sound levels in music

Pain                                 > 130 dB SPL
Loud rock concert                      120 dB SPL
Loud disco                             110 dB SPL
fff                                    100 dB SPL
f (forte, strong)                       80 dB SPL
p (piano, soft)                         60 dB SPL
ppp                                     40 dB SPL
Lower limit: threshold of hearing        0 dB SPL

Musical notation ranges from ppp to fff (Pierce, Science of Musical Sound, p. 325).

On origins of music dynamics notation (http://www.wikipedia.org/wiki/pianissimo):
In music, the word dynamics refers to the volume of the sound. The Renaissance composer Giovanni Gabrieli was one of the first to indicate dynamics in music notation. The two basic dynamic indications in music are piano, meaning "softly" or "quietly", usually abbreviated as p; and forte, meaning "loudly" or "strong", usually abbreviated as f. More subtle degrees of loudness or softness are indicated by mp, standing for mezzo-piano, and meaning "half-quiet"; and mf, mezzo-forte, "half loud". Beyond f and p, there is ff, standing for "fortissimo", and meaning "very loudly"; and pp, standing for "pianissimo", and meaning "very quietly". To indicate even more extreme degrees of intensity, more ps or fs are added as required. fff (fortississimo) and ppp (pianississimo) are found in sheet music quite frequently, but more than three fs or ps is quite rare. It is sometimes said that pppp stands for pianissississimo, but such words are very rarely used either in speech or writing, even when present in a score. There is some evidence that this use of an increasing number of letters to indicate greater extremes of volume stems from a convention dating from the 17th century where p stood for piano, pp stood for più piano (literally "more quietly") and, by extension, ppp indicated pianissimo. Antonio Vivaldi seems to have written using this convention, but it was largely replaced by the above, more familiar, system by the middle of the 18th century.

Typical sound pressure levels in everyday life Disco (Reproduced courtesy of WorkSafe, Department of Consumer and Employment Protection, Western Australia (www.safetyline.wa.gov.au). The graphic being that at the bottom of: http://www.safetyline.wa.gov.au/institute/level2/course18/lecture54/l54_03.asp)

Demonstrations Demonstrations using waveform generator Relative invariance of pitch & timbre with level Loudness matching Pure tone frequency limits Localization

Loudness Dimension of perception that changes with sound intensity (level) Intensity ~ power; Level~amplitude Demonstration using waveform generator Masking demonstrations Magnitude estimation Loudness matching

Sound level meters and frequency weightings
[Figure: sound meter frequency weightings. Relative response (dB), from 5 down to -45, versus frequency (Hz) on a logarithmic axis from 10^2 to beyond 10^4.]

Intensity discrimination improves at higher sound levels
Best Weber fraction ΔL/L is about 1 dB
[Figure: just noticeable intensity difference ΔL (dB, 1 to 7) versus sensation level (10 to 80 dB SL).]
A comparison of just noticeable intensity differences (averaged across frequencies) for various species. Man (open symbols): red (Dimmick & Olson, 1941), orange (experiment I), blue (experiment III, Harris, 1963); cat: purple (Raab & Ades, 1946; Elliott & McGee, 1965); rat: pink (Henry, 1938; Hack, 1971); mouse: brown (Ehret, 1975b); parakeet: green (Dooling & Saunders, 1975b). Figure adapted from cited sources above.

Loudness as a function of pure tone level & frequency
[Figure: constant-loudness curves labeled by loudness level (in phons), running from the threshold of hearing to the limit of pain (120). Horizontal axis: frequency (Hz), 20 to 10,000. Vertical axes: sound pressure level (dB), 0 to 120, and pressure, 2 x 10^-5 to 2 x 10 Newtons/m^2. The labels "70 f" and "50 p" appear on the plot.]
Absolute detection thresholds are on the order of 1 part in a million, pressure ~1/1,000,000 atm (Troland, 1929).
Constant-loudness curves for persons with acute hearing. All sinusoidal sounds whose levels lie on a single curve (an isophon) are equally loud. A particular loudness-level curve is designated as a loudness level of some number of phons. The number of phons is equal to the number of decibels only at the frequency 1,000 Hz.

Loudness perception: perceived growth of loudness w. level
[Figure: perceived loudness in sones (0.01 to 100, logarithmic) versus intensity in decibels (20 to 140), with curves for 100 Hz, 1,000 Hz, and 10,000 Hz tones.]
Perceived loudness of tones of various frequencies as a function of physical intensity.
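The slide does not give a formula for the sone scale. A commonly used approximation, assumed here rather than taken from the lecture, is that 1 sone corresponds to a loudness level of 40 phons and that loudness doubles for every 10-phon increase (reasonable above roughly 40 phons). A minimal sketch under that assumption:

```python
def phons_to_sones(phons):
    """Approximate loudness in sones for a loudness level in phons.

    Assumes the conventional rule: 1 sone at 40 phons, doubling per 10 phons.
    The rule is only a rough guide below about 40 phons.
    """
    return 2.0 ** ((phons - 40.0) / 10.0)


for level in (40, 50, 60, 80, 100):
    print(f"{level:3d} phons ~ {phons_to_sones(level):g} sones")
```

For a 1,000 Hz tone the loudness level in phons equals the level in dB SPL, so under this rule a 60 dB SPL tone at 1,000 Hz is about 4 sones.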

Loudness perception: population percentiles
[Figure: intensity level (dB, 0 to 120) versus frequency (Hz, 20 to 20,000), with hearing-threshold curves for the 1%, 10%, 50%, and 90% percentiles and the threshold of feeling.]
Curves showing threshold of hearing at various frequencies for a group of Americans: 1 percent of the group can hear any sound with an intensity above the 1 percent curve; 5 percent of the group can hear any sound with an intensity above the 5 percent curve; and so on.

Hearing loss with age
[Figure: loss in dB (0 to 40) versus frequency in cps (31 to 8000), with curves for ages 20, 30, 40, 50, and 60 yr.]
Progressive loss of sensitivity at high frequencies with increasing age. The audiogram at 20 years of age is taken as a basis of comparison. (From Morgan, 1943, after Bunch, 1929.)

Dynamic range of some musical instruments
Please see Figure 8.5 in John R. Pierce, The Science of Musical Sound. Rev. ed. New York: Freeman, c1992. ISBN: 0716760053.

Range of pitches of pure & complex tones
Pure tone pitches
  Range of hearing (~20-20,000 Hz)
  Range in tonal music (100-4000 Hz)
Most (tonal) musical instruments produce harmonic complexes that evoke pitches at their fundamental frequencies (F0s)
  Range of F0s in tonal music (30-4000 Hz)
  Range of missing fundamental (30-1200 Hz)

Emergent pitch
[Figure: line spectra (frequency axis 0 to 2000) and autocorrelations (positive part; lag axis 0 to 20) for two stimuli: a harmonic complex with a missing F0 and a 200 Hz pure tone.]
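The point of this comparison can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the lecturer's analysis: the sample rate, the duration, and the choice of harmonics (600, 800, and 1000 Hz, whose common fundamental of 200 Hz is absent from the spectrum) are all hypothetical. It shows that the largest autocorrelation peak of the missing-F0 complex falls at the same lag as that of a 200 Hz pure tone.

```python
import math

FS = 8000   # sample rate in Hz (assumed for this sketch)
N = 800     # number of samples: 100 ms of signal


def harmonic_complex(freqs_hz):
    """Sum of equal-amplitude cosines at the given frequencies."""
    return [sum(math.cos(2 * math.pi * f * n / FS) for f in freqs_hz)
            for n in range(N)]


def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at one positive lag (in samples)."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def dominant_periodicity_hz(x, f_min=50.0, f_max=1000.0):
    """Frequency whose period gives the largest autocorrelation value."""
    lags = range(int(FS / f_max), int(FS / f_min) + 1)
    best_lag = max(lags, key=lambda lag: autocorrelation(x, lag))
    return FS / best_lag


missing_f0 = harmonic_complex([600, 800, 1000])  # no energy at 200 Hz
pure_tone = harmonic_complex([200])

print(dominant_periodicity_hz(missing_f0))  # 200.0
print(dominant_periodicity_hz(pure_tone))   # 200.0
```

Both signals share a 5 ms periodicity even though their line spectra have no component in common, which is the sense in which the pitch of the complex is "emergent".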

Correlograms: interval-place displays (Slaney & Lyon) Frequency (CF) Autocorrelation lag

Frequency ranges of (tonal) musical instruments
[Figure: instrument frequency ranges, with axis ticks from 0.25 to 10k and marked frequencies at 27 Hz, 110 Hz, 262 Hz, 440 Hz, 880 Hz, and 4 kHz.]

Frequency ranges: hearing vs. musical tonality
(Courtesy of Malcolm Slaney (Research Staff Member of IBM Corporation). Used with permission.)
Temporal neural mechanism: musical tonality (octaves, intervals, melody), 30-4000 Hz
Place mechanism: range of hearing (ability to detect sounds), ~20-20,000 Hz
Figure markers: 100 Hz, 2 kHz

Duplex time-place representations
Temporal representation: level-invariant; strong (low fc, low n); weak (high fc, high n; F0 < 100 Hz). Similarity to interval pattern.
Place-based representation: level-dependent; coarse. Similarity to place pattern.
cf. Terhardt's spectral and virtual pitch
[Figure: frequency axis marked at 30, 100, 1k, and 10k.]

Pitch dimensions: height & chroma
[Figure: notes from C3 to C6 (C, E, and G in each octave) arranged along a tone-height axis and around a chroma circle.]
Contrast between one-dimensional and two-dimensional models of pitch perception. Notes of a scale played on an ordinary instrument spiral upward around the surface of a cylinder, but computer-generated notes can form a Shepard scale that goes around in a circle.
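One way to make the height/chroma distinction concrete is to split a frequency into a continuous height coordinate and a circular pitch-class coordinate. The sketch below assumes 12-tone equal temperament with A4 = 440 Hz and borrows MIDI-style note numbering for the height axis; none of these conventions come from the slides, and the function name is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def height_and_chroma(freq_hz, a4_hz=440.0):
    """Return (height in semitones, chroma name, octave number) for a frequency.

    Height is a MIDI-style continuous note number (A4 = 69), an assumed
    convention. Chroma is the position around the 12-step circle, so tones
    an octave apart share a chroma but differ in height.
    """
    height = 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(height)
    chroma = NOTE_NAMES[nearest % 12]
    octave = nearest // 12 - 1
    return height, chroma, octave


for f in (220.0, 440.0, 880.0, 261.63):
    height, chroma, octave = height_and_chroma(f)
    print(f"{f:7.2f} Hz -> height {height:6.2f}, chroma {chroma}, octave {octave}")
```

The three A tones (220, 440, 880 Hz) land on the same chroma while their heights differ by 12 semitones each, corresponding to one turn of the spiral.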

Pitch height and pitch chroma Please see figures 1, 2, and 7 in Roger N. Shepard. Geometrical approximations to the structure of musical pitch. Psychological Review 89 (4): 305-322, 1982.

JND's
[Figure, three panels, human data: Weber fraction (Δf/f, 10^-6 to 10^-2) versus frequency (0.2 to 10 kHz); difference limen (Δf in Hz, 10^-3 to 10^1) versus duration (4 to 500 ms); difference limen versus level (0 to 80 dB SL).]
Typical human performance for pure-tone frequency discrimination.

Pure tone pitch discrimination becomes markedly worse above 2 kHz
Weber fractions for frequency (Δf/f) increase 1-2 orders of magnitude between 2 kHz and 10 kHz
[Figure: human data. Weber fraction (Δf/f, 10^-6 to 10^-2) versus frequency (0.2 to 10 kHz).]
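A Weber fraction is easier to interpret once converted to a difference limen in Hz or in cents (hundredths of a semitone). The sketch below shows the conversion; the Weber fractions used in the example are hypothetical round numbers chosen for illustration, not values read from the figure.

```python
import math


def difference_limen_hz(freq_hz, weber_fraction):
    """Just-noticeable frequency change in Hz: delta_f = (delta_f / f) * f."""
    return freq_hz * weber_fraction


def difference_limen_cents(weber_fraction):
    """The same just-noticeable change expressed in cents (1200 per octave)."""
    return 1200.0 * math.log2(1.0 + weber_fraction)


# Hypothetical Weber fractions, two orders of magnitude apart.
for freq_hz, w in ((1000.0, 0.002), (8000.0, 0.2)):
    print(f"{freq_hz:6.0f} Hz, delta_f/f = {w}: "
          f"{difference_limen_hz(freq_hz, w):.1f} Hz, "
          f"{difference_limen_cents(w):.1f} cents")
```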

Pure tone pitch discrimination improves at longer tone durations and at higher sound pressure levels
[Figure, two panels, human data: difference limen (Δf in Hz, 10^-3 to 10^1) versus duration (4 to 500 ms) and versus level (0 to 80 dB SL).]

Note durations in music
[Figure: distributions of note durations in milliseconds (logarithmic axis, 50 to 5000) for Twinkle Twinkle, God Rest Ye Merry, Camptown Races, Love Me Tender, Yankee Doodle, Happy Birthday, Skip To, Rock-A-Bye Baby, and overall.]
Image adapted from: McAdams and Bigand. Thinking in Sound: The Cognitive Psychology of Human Audition. Oxford University Press, 1993.

Timbre: a multidimensional tonal quality
Tone texture, tone color; distinguishes voices, instruments
Stationary aspects (spectrum): vowels
Dynamic aspects (spectrum, intensity, pitch, attack, decay): consonants
(Photos courtesy of Pam Roth, Per-Ake Bystrom, and Miriam Lewis. Used with permission.)
http://www.wikipedia.org/

Stationary spectral aspects of timbre
[Figure: waveforms (time, 0 to 20 ms), power spectra (frequency, 0 to 4 kHz), and autocorrelations (interval, 0 to 15 ms) for the vowels [ae] and [er], each at F0 = 100 Hz and F0 = 125 Hz. Annotations: formant-related structure (vowel quality, timbre); pitch periods, 1/F0 (100 Hz, 125 Hz).]

Timbre dimensions: spectrum, attack, decay Series of figures from Handel, S. 1989. Listening: an Introduction to the Perception of Auditory Events. MIT Press. Used with permission.

Masking (tone vs. tone) Demonstration: tones in noise; tones vs. tones

Masking audiograms Wegel & Lane, 1924

1000 Hz pure tone masker
Please see http://www.zainea.com/masking2.htm for a discussion of masking.

Tone on tone masking curves (Wegel & Lane, 1924)

From masking patterns to "auditory filters" as a model of hearing
(Courtesy of Prof. Chris Darwin (Dept. of Psychology at the University of Sussex). Used with permission.)
Power spectrum
Filter metaphor
Notion of one central spectrum that subserves perception of pitch, timbre, and loudness

2.2. Excitation pattern
Using the filter shapes and bandwidths derived from masking experiments we can produce the excitation pattern produced by a sound. The excitation pattern shows how much energy comes through each filter in a bank of auditory filters. It is analogous to the pattern of vibration on the basilar membrane. For a 1000 Hz pure tone the excitation pattern for a normal and for a SNHL (sensori-neural hearing loss) listener look like this:
The excitation pattern to a complex tone is simply the sum of the patterns to the sine waves that make up the complex tone (since the model is a linear one). We can hear out a tone at a particular frequency in a mixture if there is a clear peak in the excitation pattern at that frequency. Since people suffering from SNHL have broader auditory filters, their excitation patterns do not have such clear peaks. Sounds mask each other more, and so they have difficulty hearing sounds (such as speech) in noise.
(Chris Darwin, U. Sussex, http://www.biols.susx.ac.uk/home/chris_darwin/perception/lecture_notes/hearing3/hearing3.html)
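The excitation-pattern idea described above can be sketched as a bank of filters whose outputs are summed over the sinusoidal components of a sound. The quoted notes do not specify a filter shape, so this sketch makes several assumptions: a rounded-exponential (roex) weighting function, the Glasberg and Moore equivalent rectangular bandwidth formula ERB = 24.7(4.37 f/1000 + 1) Hz, and a hypothetical `broadening` factor standing in for the wider filters associated with SNHL. It is an illustration of the linear model, not a calibrated hearing model.

```python
import math


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth of the filter centered at fc (assumed formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def roex_weight(f_hz, fc_hz, broadening=1.0):
    """Power weighting applied by the filter centered at fc to a component at f.

    `broadening` is a hypothetical factor: values above 1 widen every filter.
    """
    p = 4.0 * fc_hz / (erb_hz(fc_hz) * broadening)
    g = abs(f_hz - fc_hz) / fc_hz
    return (1.0 + p * g) * math.exp(-p * g)


def excitation_pattern(components, centers_hz, broadening=1.0):
    """Energy passed by each filter; components is a list of (freq_hz, power).

    The model is linear, so the pattern for a complex tone is the sum of the
    patterns for its sinusoidal components.
    """
    return [sum(power * roex_weight(f, fc, broadening) for f, power in components)
            for fc in centers_hz]


centers = list(range(500, 2001, 50))
tone = [(1000.0, 1.0)]
normal = excitation_pattern(tone, centers)
broad = excitation_pattern(tone, centers, broadening=3.0)

peak = centers[normal.index(max(normal))]
print("peak of excitation pattern at", peak, "Hz")
i = centers.index(1500)
print("excitation at 1500 Hz, normal vs broadened:", normal[i], broad[i])
```

With broader filters the pattern falls off more slowly away from 1000 Hz, so a second tone nearby would produce a less distinct peak, consistent with the account of increased masking given above.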

Shapes of perceptually-derived "auditory filters" (Moore)
[Figure, three panels: relative gain (dB, 0 to -50) versus frequency (0.5 to 2.0 kHz) for filters labeled a to e; relative excitation level (dB, 0 to -50) versus filter center frequency (0.5 to 2.0 kHz), points a to e; excitation level (dB, 0 to 90) versus frequency (0.5 to 10 kHz, log scale).]

Binaural localization
Azimuth: interaural time differences (20-600 µs); interaural level differences
Elevation: received spectrum of broadband sounds (pinna effects)
Please see Figure 2.1 in Woodworth, Robert Sessions, 1869-1962. Experimental Psychology. New York: H. Holt and company, c1938.

Interaural time difference and localization of sounds
[Figure: interaural time difference (msec, 0 to 0.6) versus angle (0° to 180°).]
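A curve of this general shape can be approximated with the classic spherical-head formula usually attributed to Woodworth, ITD = (a/c)(θ + sin θ), where a is the head radius, c the speed of sound, and θ the azimuth. The slide does not state which model underlies the figure, so the formula, the head radius (8.75 cm), the speed of sound (343 m/s), and the mirroring of rear angles onto the front are all assumptions of this sketch.

```python
import math

HEAD_RADIUS_M = 0.0875        # assumed spherical head radius
SPEED_OF_SOUND_M_S = 343.0    # assumed speed of sound in air


def woodworth_itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source.

    azimuth_deg runs from 0 (straight ahead) through 90 (directly to one
    side) to 180 (straight behind). Rear angles are folded onto the front
    hemisphere, which assumes a front-back symmetric head.
    """
    folded = azimuth_deg if azimuth_deg <= 90.0 else 180.0 - azimuth_deg
    theta = math.radians(folded)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


for angle in range(0, 181, 30):
    print(f"{angle:3d} deg -> {woodworth_itd_seconds(angle) * 1000.0:.3f} ms")
```

Under these assumptions the largest value occurs at 90° and is a little over 0.6 ms, in line with the upper end of the range quoted for interaural time differences.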