Topic 1. Auditory Scene Analysis


What is Scene Analysis? (from Bregman's ASA book, Figure 1.2)
ECE 477 - Computer Audition, Zhiyao Duan 2018

Auditory Scene Analysis
The cocktail party problem (from http://www.justellus.com/)

It's very difficult!

The Ear

The Cochlea
Each point on the basilar membrane resonates to a particular frequency.
At the resonance point, the membrane moves.

A Movie! (thanks to Howard Hughes Medical Institute)

Spectrogram: violin

Spectrogram: female

If they sound together: violin + female
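
Why the mixture is harder to read than either source alone: mixing is additive, so the partials of both sources end up in one spectrum. The following is a minimal sketch, not the slide's actual example. The two "sources" are synthetic harmonic tones with hypothetical fundamentals (200 Hz and 300 Hz) standing in for the violin and the female recording, and the sample rate, length, and threshold are assumptions chosen so that every partial falls exactly on a DFT bin.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)
N_SAMPLES = 400     # gives 20 Hz bin spacing, so every partial below falls on a bin


def harmonic_tone(f0, n_harmonics):
    """Sum of sine partials at f0, 2*f0, ... with 1/h amplitudes."""
    return [
        sum(math.sin(2 * math.pi * f0 * h * t / SAMPLE_RATE) / h
            for h in range(1, n_harmonics + 1))
        for t in range(N_SAMPLES)
    ]


def peak_frequencies(signal, min_magnitude=10.0):
    """Frequencies (Hz) of DFT bins whose magnitude exceeds min_magnitude."""
    n = len(signal)
    peaks = []
    for k in range(n // 2):
        # Naive DFT: fine for a few hundred samples, too slow for real audio.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > min_magnitude:
            peaks.append(k * SAMPLE_RATE // n)
    return peaks


source_a = harmonic_tone(200, 3)  # hypothetical source: partials at 200, 400, 600 Hz
source_b = harmonic_tone(300, 3)  # hypothetical source: partials at 300, 600, 900 Hz
mixture = [a + b for a, b in zip(source_a, source_b)]  # mixing is additive

print(peak_frequencies(source_a))  # [200, 400, 600]
print(peak_frequencies(source_b))  # [300, 600, 900]
print(peak_frequencies(mixture))   # [200, 300, 400, 600, 900]
```

In the mixture, nothing in the list of peaks says which source each partial came from, and the 600 Hz peak belongs to both sources at once.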

How about this? Cocktail party

Auditory Scene Analysis
Studies the mechanism of the human auditory system to answer these questions:
  How many sources at a time?
  Which frequency components belong to the same source?
  How does a source evolve?
  Where are the sources?

Vision vs. Audition
Visual scenes mainly describe objects that reflect light
  Shape, color, brightness, texture, etc.
Auditory scenes mainly describe sources that emit sound
  Time, frequency, loudness, location, etc.
Visual objects occlude; auditory objects overlap

Analyzing auditory scenes is like analyzing visual scenes where
  Objects are half-transparent
  Objects change transparency
  Objects disappear and reappear unexpectedly

The Analysis-Synthesis Process
Decompose the acoustic scene into a collection of segments
Group segments into streams
  Simultaneous vs. sequential
  This is the main concern of ASA

Exclusive Allocation
The X tones are allocated differently depending on whether the C tones are played, and this affects our perception of the A and B tones.

Simultaneous vs. Sequential
Things that affect the grouping of the A, B, and C tones:
  Frequency difference between A and B
  Frequency difference between B and C
  Synchronization between B and C

Stream Segregation
High and low tones are segregated when played fast
Can you tell the order of the tones?

Segregation depends on
  Time gap between tones within a stream
  Frequency gap between the two streams
Let's look at a demo: http://auditoryneuroscience.com/sceneanalysis/streaming-alternating-tones
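
To experiment with the two factors above, an alternating high/low sequence can be synthesized directly. This is a minimal sketch under stated assumptions, not the code behind the linked demo: the function names, sample rate, fade length, and the example frequencies and durations are all illustrative choices.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz (assumed)


def tone(freq_hz, dur_s, fade_s=0.005):
    """One sine tone with short linear fades to avoid clicks."""
    n = round(dur_s * SAMPLE_RATE)
    n_fade = max(1, round(fade_s * SAMPLE_RATE))
    samples = []
    for i in range(n):
        gain = min(1.0, i / n_fade, (n - 1 - i) / n_fade)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
    return samples


def alternating_tones(f_high, f_low, tone_s, gap_s, n_pairs):
    """High-low-high-low... sequence; each tone is followed by gap_s of silence."""
    silence = [0.0] * round(gap_s * SAMPLE_RATE)
    out = []
    for _ in range(n_pairs):
        for freq in (f_high, f_low):
            out.extend(tone(freq, tone_s))
            out.extend(silence)
    return out


def write_wav(path, samples):
    """Save samples in [-1, 1] as 16-bit mono PCM."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        frames = struct.pack("<%dh" % len(samples),
                             *(int(0.8 * 32767 * s) for s in samples))
        wav.writeframes(frames)


# Illustrative settings: fast with a wide frequency gap, slow with a narrow one.
fast_wide = alternating_tones(1000, 400, tone_s=0.08, gap_s=0.02, n_pairs=10)
slow_narrow = alternating_tones(500, 450, tone_s=0.2, gap_s=0.2, n_pairs=10)
# write_wav("fast_wide.wav", fast_wide)      # uncomment to listen
# write_wav("slow_narrow.wav", slow_narrow)
```

Note that the time gap between tones within one stream is tone_s + 2 * gap_s here, since a tone of the other stream sits in between.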

Stream Segregation in Music
Two streams
Toccata and Fugue in D minor, J.S. Bach
http://www.youtube.com/watch?v=r_tu63ypb6i (violin performance! 2'47")

Occlusions in Vision
The occlusion in this example helps with the grouping of the fragments

Masking in Audition
Sinusoids
Speech

Primitive vs. Learned
H1-L1-H2-L2
L2-H2-L1-H1
Infants cannot discriminate the two stimuli, which indicates that they performed stream segregation of the high and low tones.
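
The reasoning behind this conclusion can be checked with a small sketch. It assumes that each stimulus is presented as a repeating cycle (the slide does not say so explicitly), and the helper names are made up for illustration. As whole cycles the two orders differ, but once the tones are split into a high stream and a low stream, each stream is the same repeating pattern in both stimuli.

```python
def split_streams(cycle):
    """Split one cycle of tone labels into (high stream, low stream)."""
    high = [t for t in cycle if t.startswith("H")]
    low = [t for t in cycle if t.startswith("L")]
    return high, low


def same_cycle(a, b):
    """True if b is a rotation of a, i.e. both repeat as the same pattern."""
    return len(a) == len(b) and any(a[i:] + a[:i] == b for i in range(len(a)))


forward = ["H1", "L1", "H2", "L2"]
backward = ["L2", "H2", "L1", "H1"]

high_f, low_f = split_streams(forward)
high_b, low_b = split_streams(backward)

print(same_cycle(forward, backward))  # False: the integrated sequences differ
print(same_cycle(high_f, high_b))     # True: high streams are the same cycle
print(same_cycle(low_f, low_b))       # True: low streams are the same cycle
```

A listener who hears one integrated sequence has something to discriminate; a listener who hears two segregated streams does not.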

Primitive Grouping Mechanisms
For simultaneous grouping
  Periodicity
  Common onset and offset
  Common amplitude and frequency modulation
For sequential grouping
  Proximity in frequency and time
  Continuous or smooth transition
  Related rhythm
  Common spatial location
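
As a toy illustration of one cue from the list above, common onset, the sketch below groups frequency components whose onsets fall close together in time. It is a deliberately simplified sketch, not a model of the auditory system: the function name, the 30 ms tolerance, and the example partials are all assumptions made for illustration.

```python
def group_by_onset(partials, tolerance_s=0.03):
    """Group (frequency_hz, onset_s) partials by common onset.

    Partials are visited in onset order; a new group starts whenever the
    onset is more than tolerance_s later than the previous partial's onset.
    """
    groups = []
    last_onset = None
    for freq, onset in sorted(partials, key=lambda p: p[1]):
        if last_onset is None or onset - last_onset > tolerance_s:
            groups.append([])
        groups[-1].append(freq)
        last_onset = onset
    return groups


# Hypothetical partials: one source starts near 0 s, another near 0.25 s.
partials = [(440, 0.005), (300, 0.25), (220, 0.0), (600, 0.255), (660, 0.01)]
print(group_by_onset(partials))  # [[220, 440, 660], [300, 600]]
```

A real system would combine several cues (periodicity, common modulation, and so on), since sources that start together cannot be separated by onset alone.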

Primitive vs. Learned
Listening to the stimulus repeatedly can improve performance in ASA tasks.
It is easier to follow a friend's voice than a stranger's in a noisy environment
  Prior knowledge of timbre helps
Music training helps in analyzing musical audio scenes
  Prior knowledge of music theory, composition rules, music style, etc. helps

Extreme Capability in Music ASA
Wolfgang Amadeus Mozart
"In Rome, he (14 years old) heard Gregorio Allegri's Miserere once in performance in the Sistine Chapel. He wrote it out entirely from memory, only returning to correct minor errors..."
-- Gutman, Robert (2000). Mozart: A Cultural Biography
Can we make computers compete with Mozart?

What is CASA?
Computational ASA: "the challenge of constructing a machine system that achieves human performance in ASA." -- E.C. Cherry
To computationally extract individual streams from one or two recordings of an acoustic scene
The definition of CASA makes no reference to the underlying mechanism that a system should adopt, but many systems are based on the principles of processing in the human auditory system.

CASA System Overview (from the CASA book, Figure 1.5)
ASA findings
Mimic human auditory system
Prior knowledge of sound sources

CASA vs. Computer Audition
Both have the same goal.
The term CASA has come to be associated with a perceptually motivated approach.
Computer Audition is open to any kind of approach, including purely engineering ones.