Timbre perception


Harvard-MIT Division of Health Sciences and Technology
HST.725: Music Perception and Cognition
Prof. Peter Cariani
Timbre perception
www.cariani.com

Timbre perception
  Timbre: tonal quality (distinct from pitch, loudness, duration or location)
  Defines separate voices, musical coloration
  Multidimensional space: not completely understood
  Two general aspects: spectrum & dynamics
  Stationary spectrum
    Spectral center of gravity ("brightness")
    Formant structure; harmonicity
  Amplitude-frequency-phase dynamics
    Amplitude dynamics (attack, decay)
      amplitude modulation (roughness)
    Frequency dynamics
      relative timings of onsets and offsets of partials
      frequency modulation (vibrato)
    Phase dynamics (noise, phase coherence, chorus effect)
  Analogy with phonetic distinctions in speech
    Vowels (stationary spectra; formant structure)
    Consonants (dynamic contrasts: amplitude, frequency & noise)
  Temporal integration windows and timbral fusion
  Some neural correlates
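The "spectral center of gravity" in the outline above is commonly computed as an amplitude-weighted mean frequency. The following is a minimal sketch of that idea, not the lecture's own procedure. The helper names, the 220 Hz fundamental, and the harmonic amplitudes are illustrative assumptions.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency (Hz) of the magnitude spectrum."""
    windowed = signal * np.hanning(len(signal))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitudes) / np.sum(magnitudes))

def harmonic_tone(f0, amplitudes, sample_rate=16000, duration=0.5):
    """Sum of the first len(amplitudes) harmonics of f0 (hypothetical test tone)."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    return sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(amplitudes))

sample_rate = 16000
# Same pitch, different spectral weighting: energy concentrated low vs. high.
dull = harmonic_tone(220.0, [1.0, 0.5, 0.25, 0.125], sample_rate)
bright = harmonic_tone(220.0, [0.125, 0.25, 0.5, 1.0], sample_rate)

centroid_dull = spectral_centroid(dull, sample_rate)
centroid_bright = spectral_centroid(bright, sample_rate)
print(f"dull: {centroid_dull:.0f} Hz, bright: {centroid_bright:.0f} Hz")
```

Both tones share the same fundamental, so the difference in centroid reflects spectral weighting only, which is the sense in which the centroid is taken as a correlate of brightness.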

Timbre ~ sonic texture, tone color
Please see Paul Cezanne, Apples, Peaches, Pears, and Grapes (c. 1879-80); oil on canvas, 38.5 x 46.5 cm; The Hermitage, St. Petersburg, at http://www.ibiblio.org/wm/paint/auth/cezanne/sl/


Texture Roughness

Stationary and dynamic factors in timbre perception
  Periodicity (noise-like or tone-like)
  Harmonicity (is this properly an aspect of timbre?)
  Phase coherence (noise: incoherent; tones: coherent)
  Smoothness or roughness
  Stationary spectrum
    Formant structure
  Amplitude-frequency-phase dynamics
    Amplitude dynamics (attack, decay)
      amplitude modulation (roughness)
    Frequency dynamics
      relative timings of onsets and offsets of partials
      frequency modulation (vibrato)
    Phase dynamics (phase shifts, chorus effect)
  Analogy with phonetic distinctions in speech
    Vowels (stationary spectra; formant structure)
    Consonants (dynamic contrasts: amplitude, frequency & noise)

Stationary and dynamic factors in timbre perception
  Stationary spectrum
    Formant structure; harmonicity
  Amplitude-frequency-phase dynamics
    Amplitude dynamics (attack, decay)
      amplitude modulation (roughness)
    Frequency dynamics
      relative timings of onsets and offsets of partials
      frequency modulation (vibrato)
    Phase dynamics (noise, phase coherence, chorus effect)
  Analogy with phonetic distinctions in speech
    Vowels (stationary spectra; formant structure)
    Consonants (dynamic contrasts: amplitude, frequency & noise)

Timbre: a multidimensional tonal quality
  tone texture, tone color
  distinguishes voices, instruments
  Stationary aspects (spectrum): vowels
  Dynamic aspects (spectrum, intensity, pitch; attack, decay): consonants
  Photos courtesy of Pam Roth, Per-Ake Bystrom, and Miriam Lewis. http://www.wikipedia.org/

Harmonicity; frequency dynamics
  Harmonic instruments: violin, trumpet, guitar (more harmonic, stationary spectra)
  Non-harmonic instruments: marimba, timpani, gong (more inharmonic, time-varying spectra)
  Spectrograms: Rafael A. Irizarry's Music and Statistics Demo, http://www.biostat.jhsph.edu/~ririzarr/demo/demo.html
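One simple way to put a number on the harmonic/inharmonic contrast is to ask how far each partial lies from the nearest integer multiple of the fundamental. The sketch below is an illustration under stated assumptions: the function name and the measure itself are hypothetical choices, and the "bar" partials use the textbook mode ratios of an ideal uniform free bar (about 1 : 2.756 : 5.404 : 8.933), not measurements of a real marimba, whose bars are tuned.

```python
import numpy as np

def inharmonicity(partials_hz, f0_hz):
    """Mean distance of each partial from its nearest harmonic of f0, in units of f0."""
    partials = np.asarray(partials_hz, dtype=float)
    harmonic_numbers = np.maximum(np.round(partials / f0_hz), 1.0)
    deviation = np.abs(partials - harmonic_numbers * f0_hz) / f0_hz
    return float(np.mean(deviation))

f0 = 220.0
# Idealized string: partials at exact integer multiples of the fundamental.
string_like = [f0 * n for n in (1, 2, 3, 4)]
# Idealized uniform free bar (assumption: textbook mode ratios, untuned bar).
bar_like = [f0 * r for r in (1.0, 2.756, 5.404, 8.933)]

score_string = inharmonicity(string_like, f0)
score_bar = inharmonicity(bar_like, f0)
print(f"string-like: {score_string:.3f}, bar-like: {score_bar:.3f}")
```

A score of zero means every partial sits on a harmonic; larger values indicate partials that fall between harmonics.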

Some methods for studying the perceptual space
  1. Try to derive the structure of the space from the dimensionality of responses
       Similarity magnitude estimations
       Similarity rankings
       Multidimensional scaling
  2. Systematically vary acoustic parameters known to influence timbre to find acoustic correlates of perceptual dimensions, e.g.
       Formant structure
       Attack and decay parameters
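To make the multidimensional scaling step concrete, here is a minimal sketch of classical (metric) MDS, which places items in a low-dimensional space so that inter-point distances approximate the given dissimilarities. This is one textbook variant, not necessarily the procedure used in the studies cited in these slides. The dissimilarity matrix is fabricated from four hypothetical points purely so the recovery can be checked. Real listener ratings are noisy and are often analyzed with non-metric variants.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=2):
    """Embed items so Euclidean distances approximate the dissimilarities."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared dissimilarities to get a Gram matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical "true" positions of four sounds in a 2-D timbre space.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
# Stand-in for averaged dissimilarity ratings (here: exact distances).
ratings = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(ratings, n_dims=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(recovered, 3))
```

The recovered configuration matches the original only up to rotation and reflection, so the axes still have to be interpreted afterwards, for example by correlating them with acoustic parameters as in method 2.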

Grey (1975) Timbre: Perceptual dimensions Figure from Butler, David. The Musician's Guide to Perception and Cognition. Schirmer, 1992. Also see: Grey, J., and J. Moorer. "Perceptual Evaluations of Synthesized Musical Instrument Tones." J. Acoustical Society of America 63 (1977): 1493-1500.

Timbre dimensions: spectrum, attack, decay

Music based on timbral contrasts
  Kurt Schwitters, Ur Sonata (1932), perf. George Melly
  From Miniatures [Pipe/Cherry Red], various artists. Label: Cbc Records/Musica Viva (Can), catalog #1043. Audio CD (January 19, 1997), 1 disc. ASIN: B000003WYQ

Stationary spectral aspects of timbre
[Figure: waveforms (time, 0-20 ms), power spectra (frequency, 0-4 kHz), and autocorrelations (interval, 0-15 ms) for the vowels [ae] and [er], each at F0 = 100 Hz and F0 = 125 Hz. Formant-related patterns correspond to vowel quality (timbre); pitch periods (1/F0) correspond to the 100 Hz and 125 Hz voice pitches.]
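The relation between the autocorrelation and the pitch period 1/F0 in the figure above can be reproduced with a synthetic harmonic complex. The sketch below is an assumption-laden illustration: the formant frequencies are rough round values chosen for the example, and the Lorentzian weighting of harmonics is a crude stand-in for a vocal-tract filter, not a speech synthesizer.

```python
import numpy as np

def synthetic_vowel(f0, formants_hz, sample_rate=8000, duration=0.1,
                    bandwidth_hz=100.0):
    """Harmonic complex whose harmonics are weighted by simple formant peaks."""
    t = np.arange(int(round(sample_rate * duration))) / sample_rate
    signal = np.zeros_like(t)
    n_harmonics = int((sample_rate / 2 - 1) // f0)  # stay below Nyquist
    for k in range(1, n_harmonics + 1):
        freq = k * f0
        gain = sum(1.0 / (1.0 + ((freq - f) / bandwidth_hz) ** 2)
                   for f in formants_hz)
        signal += gain * np.cos(2 * np.pi * freq * t)
    return signal

def pitch_period_ms(signal, sample_rate, min_ms=2.0, max_ms=15.0):
    """Lag (ms) of the largest autocorrelation peak within [min_ms, max_ms]."""
    centered = signal - np.mean(signal)
    acf = np.correlate(centered, centered, mode="full")[len(centered) - 1:]
    lo = int(min_ms * sample_rate / 1000)
    hi = int(max_ms * sample_rate / 1000)
    best_lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return 1000.0 * best_lag / sample_rate

sample_rate = 8000
ae_formants = (700.0, 1700.0, 2400.0)   # illustrative values, an assumption
er_formants = (500.0, 1350.0, 1700.0)   # illustrative values, an assumption

periods = {
    (name, f0): pitch_period_ms(synthetic_vowel(f0, formants, sample_rate),
                                sample_rate)
    for name, formants in (("ae", ae_formants), ("er", er_formants))
    for f0 in (100.0, 125.0)
}
print(periods)
```

Changing the formants reshapes the fine structure of the autocorrelation at short lags, while the position of the main peak follows 1/F0 (10 ms at 100 Hz, 8 ms at 125 Hz) for both vowels.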

Formants and the vocal tract
  Heed, Hid, Head, Had, Hod, Hawed, Hood, Who'd

Vowel F1-F2 space
[Figure: the vowels of heed, hid, head, had, heard, hud, hod, hood, who'd, and hawed plotted on logarithmic frequency axes, one with ticks from 750 to 2250 Hz and the other with ticks from 300 to 800 Hz.]

Time-domain analysis of auditory-nerve-fiber firing rates
Neural responses to /da/ at 69 dB SPL, from Miller and Sachs (1983).
[Figure: responses of fibers ordered from low CFs to high CFs against peristimulus time (ms), with F1, F2, and F3 marked.]
Reprinted with permission from Secker-Walker HE, Searle CL. 1990. Time-domain analysis of auditory-nerve-fiber firing rates. J. Acoust. Soc. Am. 88 (3): 1427-36. Copyright 1990, Acoustical Society of America. Used with permission.

Vowels: population-interval coding of timbre (vowel formant structure)
[Figure: signal autocorrelation for [ae] (magnitude vs. interval, 0-15 ms) and population-wide distributions of short intervals for four vowels, including [α], [u], and [æ] (population interval histograms: number of intervals vs. interval in ms). Labeled features: voice pitch, formant structure, 1/F1.]
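A population-interval histogram of the kind described above can be sketched by pooling all-order interspike intervals across fibers. The code below is a toy illustration, not a model of the recordings in the figure: the spike trains are simulated under the assumption that each hypothetical fiber fires at most once per 10 ms pitch period, with random skipped cycles and a small timing jitter.

```python
import numpy as np

def all_order_intervals(spike_times_ms, max_interval_ms=15.0):
    """Intervals between every pair of spikes (not only adjacent ones)."""
    spikes = np.sort(np.asarray(spike_times_ms, dtype=float))
    diffs = spikes[None, :] - spikes[:, None]
    return diffs[(diffs > 0) & (diffs <= max_interval_ms)]

def population_interval_histogram(spike_trains_ms, max_interval_ms=15.0,
                                  bin_ms=0.5):
    """Sum the all-order interval histograms of many fibers."""
    edges = np.arange(0.0, max_interval_ms + bin_ms, bin_ms)
    counts = np.zeros(len(edges) - 1, dtype=int)
    for train in spike_trains_ms:
        intervals = all_order_intervals(train, max_interval_ms)
        hist, _ = np.histogram(intervals, bins=edges)
        counts += hist
    return edges, counts

rng = np.random.default_rng(0)
period_ms = 10.0  # 1/F0 for a hypothetical 100 Hz voice pitch
trains = []
for _ in range(20):  # 20 simulated fibers
    cycles = np.arange(100)[rng.random(100) < 0.7]   # skip ~30% of cycles
    jitter = rng.normal(0.0, 0.2, size=cycles.size)  # 0.2 ms timing jitter
    trains.append(cycles * period_ms + jitter)

edges, counts = population_interval_histogram(trains)
peak_center_ms = edges[np.argmax(counts)] + 0.25  # center of the tallest bin
print(f"most common interval: about {peak_center_ms:.2f} ms")
```

In this toy case the most common interval falls at the stimulus period. With formant-driven fine structure in the spike timing, additional peaks at shorter intervals would be expected, which is the part of the histogram the slide associates with timbre.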

Coding of vowel quality (timbre) (Reprinted with permission from Secker-Walker HE, Searle CL. 1990. Time-domain analysis of auditory-nerve-fiber firing rates. J. Acoust. Soc. Am. 88 (3): 1427-36. Copyright 1990, Acoustical Society of America.)

Please See Hirahara, Cariani, Delgutte (1996)

Please see Figures 6, and 7 in Hirahara, Cariani, Delgutte (1996)

Spectrum as a function of intensity (trumpet)
Please see Figure 4-3 in Butler, David. The Musician's Guide to Perception and Cognition. New York: Schirmer Books; Toronto: Maxwell Macmillan Canada; New York: Maxwell Macmillan International, c1992.

Singer's formant Graph, Fig 11.12 on page 138 of Cook, Perry, ed. Music, Cognition & Computerized Sound MIT Press 2001. Used with permission.


Amplitude dynamics (envelope, intensity contour)
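An amplitude envelope and a rough attack time can be estimated by rectifying the waveform and smoothing it. The sketch below uses a moving average and a 10%-to-90% rise criterion. Both are illustrative choices (assumptions), and the smoothing window limits how short an attack can be resolved. The two test tones are synthetic, with hypothetical 5 ms and 100 ms linear attacks.

```python
import numpy as np

def amplitude_envelope(signal, sample_rate, window_ms=10.0):
    """Rectify and smooth with a moving average of window_ms."""
    window = max(1, int(sample_rate * window_ms / 1000))
    kernel = np.ones(window) / window
    return np.convolve(np.abs(signal), kernel, mode="same")

def attack_time_ms(envelope, sample_rate, low=0.1, high=0.9):
    """Time for the envelope to first rise from low*peak to high*peak."""
    peak = np.max(envelope)
    start = int(np.argmax(envelope >= low * peak))
    end = int(np.argmax(envelope >= high * peak))
    return 1000.0 * (end - start) / sample_rate

def tone_with_attack(attack_ms, sample_rate=16000, duration=0.5, freq=440.0):
    """Sine tone with a linear onset ramp of attack_ms (hypothetical stimulus)."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    ramp = np.clip(t / (attack_ms / 1000.0), 0.0, 1.0)
    return ramp * np.sin(2 * np.pi * freq * t)

sample_rate = 16000
fast_ms = attack_time_ms(
    amplitude_envelope(tone_with_attack(5.0, sample_rate), sample_rate),
    sample_rate)
slow_ms = attack_time_ms(
    amplitude_envelope(tone_with_attack(100.0, sample_rate), sample_rate),
    sample_rate)
print(f"fast attack: {fast_ms:.1f} ms, slow attack: {slow_ms:.1f} ms")
```

The two tones have identical steady-state spectra, so any perceived difference between them would be carried by the amplitude dynamics alone.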

Frequency dynamics of note onsets (clarinet)
Please see Figure 4-4 in Butler, David. The Musician's Guide to Perception and Cognition. New York: Schirmer Books; Toronto: Maxwell Macmillan Canada; New York: Maxwell Macmillan International, c1992. ISBN: 0028703413.

Time-course of harmonics
Please see Figure 3 in Deutsch, D., ed. The Psychology of Music. San Diego: Academic Press, 1999.

Speech neurogram
Please see Delgutte, B. "Auditory Neural Processing of Speech." In The Handbook of Phonetic Sciences. Edited by W. J. Hardcastle and J. Laver. London: Blackwell, 1995.

Possible interval-based neural correlates for basic phonetic distinctions

Voice pitch (80-400 Hz)
  Acoustic distinction: pitch contours, changes over time
  Phonetic class: voice pitch, F0; prosody
  Interval correlates: most common interval; running interval changes

Spectral pattern
  Acoustic distinction: stationary formant pattern; low-frequency nasal resonances
  Phonetic class and examples: vowels [u], [ae], [i]; nasals [m], [n]
  Interval correlates: intervals for periodicities 50-5000 Hz; cross-BF intervals (?)

Spectro-temporal pattern
  Acoustic distinction: formant transitions (fast transition, slow transition)
  Phonetic class and examples: consonants [b], [d], [g]; semivowels [w], [r], [y]; diphthongs [ay], [aw], [ey]
  Interval correlates: timing of FM responses (?); slow changes in interval distribution; low-frequency modulations, interactions

Spectral dispersion
  Acoustic distinction: noise excitation (frication)
  Phonetic class and examples: fricative consonants /f/, /s/, /ʃ/, /v/, /θ/
  Interval correlates: semi-periodic temporal structure; phase incoherence

Voiced-unvoiced
  Phonetic class and examples: voiced/unvoiced stop consonants [b]/[p]; fricatives [v]/[f]; whispered/voiced
  Interval correlates: presence of harmonic structure in intervals

Voice onset time (VOT)
  Interval correlates: prominent interval between onset/offset responses

Dynamic amplitude patterns
  Acoustic distinction: amplitude time profiles; abrupt/gradual changes (buildup/decay)
  Phonetic class and examples: affricate/fricative, /tʃ/ vs /ʃ/ (chip vs ship)
  Interval correlates: degree of interval dispersion; adaptation + running interval buildup patterns (changes in autocorrelation shape)

Rhythm
  Acoustic distinction: metrical aspects
  Phonetic class: word rhythm; speaking rate
  Interval correlates: longer interval patterns (50-500 msec)

Duration
  Phonetic class: duration
  Interval correlates: prominent interval between onset & offset responders

Suprasegmental structure
  Acoustic distinction: word time pattern
  Phonetic class: whole word patterns
  Interval correlates: longer time structures

Two-formant vowel sweeps
[Figure: interspike interval (ms, 0-15) vs. peristimulus time (ms, 0-2000) for a high-F2 formant sweep ([i]-[æ]-[i]) and a low-F2 formant sweep ([u]-[a]-[u]), with interval bands labeled 1/F0, 1/F1, and n/F1. Units shown: auditory nerve fiber (CF 1.4 kHz, Thr 2.0, SR 90.7); PVCN Chop-S (CF 2.1 kHz, Thr 5.3, SR 17.7); AVCN Pri-N (CF 1.5 kHz, Thr 8.8, SR 247.1); DCN Pauser (CF 1.3 kHz, Thr 24.8, SR 0.0). Inset: sweep trajectories through i, I, æ (high F2) and α, u (low F2) in the plane of first formant (250-750 Hz) vs. second formant (1000-2000 Hz).]

Two-formant vowel sweeps: low-F2 formant sweep [u]-[a]-[u]
[Figure: sweep trajectory in the plane of first formant (250-750 Hz) vs. second formant (1000-2000 Hz); interspike interval (ms, 0-15) vs. peristimulus time (ms, 0-2000) for an auditory nerve fiber (CF 1.4 kHz, Thr 2.0, SR 90.7), with bands labeled 1/F0 and n/F1.]

Two-formant vowel sweeps: high-F2 formant sweep [i]-[æ]-[i]
[Figure: sweep trajectory in the plane of first formant (250-750 Hz) vs. second formant (1000-2000 Hz); interspike interval (ms, 0-15) vs. peristimulus time (ms, 0-2000) for an auditory nerve fiber (CF 1.4 kHz, Thr 2.0, SR 90.7), with bands labeled 1/F0, n/F1, and 1/F1.]

GUY-BUY-DIE
[Figure: spectrogram of the spoken words; frequency (0-5000 Hz) vs. time (0-1.6 s).]
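A spectrogram like the one above is a sequence of short-time power spectra. The sketch below computes one with a Hann-windowed short-time Fourier transform using NumPy only. The frame and hop lengths are illustrative assumptions, and since the original recording is not available here, the input is a synthetic rising glide used as a crude stand-in for a formant transition.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Short-time power spectra: returns (times_s, freqs_hz, power[frame, bin])."""
    frame = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    window = np.hanning(frame)
    starts = np.arange(0, len(signal) - frame + 1, hop)
    frames = np.stack([signal[s:s + frame] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame, d=1.0 / sample_rate)
    times = (starts + frame / 2) / sample_rate
    return times, freqs, power

sample_rate = 10000
t = np.arange(sample_rate) / sample_rate  # 1 second
# Linear glide from 500 Hz to 2000 Hz (hypothetical test signal).
f_start, f_end = 500.0, 2000.0
phase = 2 * np.pi * (f_start * t + 0.5 * (f_end - f_start) * t ** 2)
glide = np.sin(phase)

times, freqs, power = spectrogram(glide, sample_rate)
peak_track = freqs[np.argmax(power, axis=1)]  # strongest frequency per frame
print(f"first frame: {peak_track[0]:.0f} Hz, last frame: {peak_track[-1]:.0f} Hz")
```

The frame length sets the trade-off between time and frequency resolution: 25 ms frames give 40 Hz bins here, which resolves the glide but would blur faster transitions.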

Timbre: summary
  Time-invariant properties (static)
    Stationary spectrum
    Relatively well understood & characterized
  Time-varying properties (dynamic)
    Amplitude dynamics (envelope)
    Frequency dynamics (spectral changes, vibrato)
    Phase shifts (chorus effect & electronic contexts)
    Relatively poorly understood & characterized

Reading/assignment for next meeting, Tuesday, March 2
  Topic: consonance, dissonance, and roughness
  Reading: Deutsch, Rasch & Plomp chapter re: beats, combination tones, and consonance
  Also: Burns chapter on intervals & scales (look at the section on consonance)