Speaking loud, speaking high: non-linearities in voice strength and vocal register variations. Christophe d'Alessandro, LIMSI-CNRS, Orsay, France



Content of the talk
Introduction: voice quality
1. Voice quality dimensions
2. Source/filter model in time and frequency
3. Non-linearities: voice quality dimensions vs. voice acoustic parameters, using synthesis
Application: performative synthesis

Voice, Speech, Singing, Meaning and Expression
Functions of voice in communication:
1. Linguistic and pragmatic functions: to convey linguistic meaning (ideas, concepts, facts) and to perform speech acts (command, promise). Mainly associated with phonemes and words (double articulation). Can be noted in writing.
2. Expressive function: to make audible attitudes, feelings, emotions, personality, mood. Speech beyond (or below) linguistic meaning. Mainly associated with prosody and voice quality. Difficult to note in writing. "The music of speech".
3. Musical function: singing, a non-linguistic but highly structured form of communication.

Voice quality: a prosodic feature?
- Prosodic parameters are usually restricted to pitch, duration, pauses and some sort of intensity parameter.
- But intonation and voice quality are linked (e.g. voice registers).
- In some languages, voice quality has a phonological status (e.g. the strangled tones of Vietnamese).
- In all languages, voice quality has a pragmatic function.
- Synthesis of expressive speech has demonstrated that convincing, natural-sounding results are impossible to obtain without dealing with voice quality parameters.

Voice quality: expression of emotions?
- Vocal expression of emotions and attitudes is one of the main domains of application for voice quality studies.
- Although it has been studied for a long time in psychology, it can be considered an emerging research domain in many areas of speech communication: speech recognition and synthesis, but also speech coding.
- Voice quality is crucial for singing, theatre and other aesthetic vocalizations.

Questions related to voice quality
- Voice quality is still a rather fuzzy concept: what is the timbre of a voice?
- What are the domains of variation of everyday speech?
- How can voice quality dimensions such as vocal effort, vocal tension or noise in the voice be measured and quantified?
- What are the physical and perceptual correlates of voice quality?
- What are the relationships between voice quality and other aspects of prosody?

Voice quality dimensions
- A promenade in the landscape of voice quality, speech, singing
- Phonation dimensions
- Vocal tract dimensions

Voice quality dimensions
- Syllabic or sentence-level voice quality
- Dimensions are often defined according to production (instead of perception)
- Based on settings of respiration, articulation and phonation

Speech production model
Four main parts:
1. Respiration
2. Phonation
3. Articulation
4. Radiation

Speech production model
Voice quality is in:
1. Respiration: laughter, subglottal pressure
2. Phonation: phonation types, voice registers, effort, tension, voicing, noise
3. Articulation: smile, rounding, rate, strength, front/back, vocal tract length

Voice quality dimensions: examples (1)
1 female speaker, 1 sentence: "Il est sorti avant le jour." ("He went out before daybreak."), with various vocal qualities.
Sound examples by dimension:
- Breathiness: Whisper, Semivoiced, Modal voiced, Pressed 1 to 3
- Nasalisation: Nasal 2, Nasal 1, Denasalized 1, Denasalized 2
- Roughness: Modal, Rough 1 to 3
- Creakiness: Modal, Creaky 1 to 3
- Vocal tract length: Long 2, Long 1, Modal, Short 1, Short 2
- Tension: Lax, Modal, Tense 1 to 3
- Lips: Rounded 2, Rounded 1, Modal, Retracted 1, Retracted 2
- Pitch: Low 2, Low 1, Modal, High 1 to 6
- Loudness: Weak, Modal, Loud 1 to 7 (including Loud 6b and Loud 7b)
- Laughs: Laugh 1 to 8; Smiling, Smiling 1 to 3
- Others: 1 to 6

Voice quality dimensions: examples (2)
1 male speaker, 1 sentence: "She (has) left for a great party today", with various vocal qualities.
Sound examples by dimension:
- Modal voice: modal 1 to 6
- Nasalization: nasal 1 to 4
- Roughness / creakiness: rough 1 to 3, creaky 1 to 3
- Vocal tract: short 1, short 2, long 1, long 2, opened, closed 1, closed 2
- Tension: relax 1, relax 2, tensed 1 to 6
- Lip protrusion: round 1, round 2, smile 1, smile 2
- Pitch: low 1 to 3, high 1 to 4
- Loudness (1): whisper, soft 1 to 5
- Loudness (2): loud 1 to 6, strong, shout 1 to 3
- Laughs: laugh 1 to 4
- Others: left, central, right, clear, yawn, theatrical, ominous 1, ominous 2, mysterious, dark

Phonation types
The three main sources of sound in the larynx are (Catford, 1977):
1. Vocal fold vibration (voiced speech)
2. Turbulent noise produced through open vocal folds (unvoiced speech)
3. Ventricular band vibrations (ventricular speech)
These sources can also combine:
4. Mixtures of voiced, noisy and ventricular phonation types

Phonation types: sound examples
1. Vocal fold vibration (voiced speech)
2. Turbulent noise produced through open vocal folds (unvoiced speech)
3. Ventricular band vibrations (ventricular speech)
4. Mixtures of voiced, noisy and ventricular phonation types
5. Polyphonic voice (ventricular + vocal folds)

Main voice quality dimensions
Four main dimensions:
1. Voice registers (voice mechanisms): creak, modal, falsetto, whistle
2. Noise: breathiness, hoarseness
3. Pressure: pressed/lax voice, strangled tones
4. Effort: accentuation, force

Voice registers
Creak
  Description: very low frequency, periodic air pulses.
  Production: mechanism 0 of vocal fold vibration. Thick and heavy vocal folds, low subglottal pressure, low mean flow.
Modal
  Description: usual voice for most males and low-pitched females, low to medium F0 register.
  Production: mechanism 1 of vocal fold vibration. Thick and heavy vocal folds vibrating along their whole length.
Falsetto
  Description: usual voice for high-pitched females, high F0 register.
  Production: mechanism 2 of vocal fold vibration. Thin and light vocal folds, vibrating along about 2/3 of their anterior length.

Ventricular phonation
Ventricular
  Description: a harsh quality, with a lot of aperiodicities, low F0.
  Production: produced between the ventricular bands, or false vocal folds.
Ventricular creak
  Description: very low frequency, periodic air pulses.
  Production: ventricular band vibration, low subglottal pressure, low mean flow.

Aperiodicities
Breath phonation
  Description: unvoiced speech.
  Production: glottis wide open, high mean flow.
Breathy voice
  Description: a mixture of breath and voice.
  Production: incomplete fold closure. High mean flow. Glottal chink.
Whisper
  Description: unvoiced speech.
  Production: narrowed opening compared to breath phonation, low mean flow.
Whispery voice
  Description: a mixture of whisper and voice.
  Production: incomplete fold closure. Low mean flow. Narrow glottal chink.
Hoarse voice
  Description: irregular, rough quality.
  Production: a voice with structural aperiodicities, jitter or shimmer.
Multiphony
  Description: a voice with multiple F0 and/or sub-harmonics.
  Production: dissymmetric vibration of the vocal folds, or combination of ventricular and voiced vibrations.

Lax-tense voice
Tense
  Description: a hard or sharp quality, audible glottal formant.
  Production: adduction of the posterior part of the vocal folds.
Lax
  Description: a relaxed, soft voice quality.
  Production: abduction of the posterior part of the vocal folds.

Vocal effort
Loud
  Description: a strong voice, with much vocal force.
  Production: high sub-glottal pressure, high tension of the vocal folds, moderate flow, high voicing amplitude.
Flow voice
  Description: a strong voice, with high amplitude of voicing and flow.
  Production: normal sub-glottal pressure and tension of the vocal folds, high flow, high voicing amplitude.
Weak
  Description: a weak voice, without vocal force.
  Production: low sub-glottal pressure, low tension of the vocal folds, low flow, low voicing amplitude.

The voice registers dimension
Voice registers depend on the underlying voice mechanism:
- Mechanism 0: vocal fry (creaky voice). Very low F0, thick and heavy vocal folds, low sub-glottal pressure, low mean flow.
- Mechanism I: modal voice. Usual voice for males and low-pitched females, low to medium F0 register. Thick and heavy vocal folds vibrating along their whole length.
- Mechanism II: falsetto voice. Usual voice for high-pitched females, high F0. Thin and light vocal folds, vibrating along about 2/3 of their anterior length.
- Mechanism III: whistle. Very high pitch, mostly children and possibly females.

The voice registers dimension
[Figure: modal voice and falsetto voice (Henrich and Castellengo, 2001; after Vennard, 1967)]

The voice registers dimension
[Figure: glissando, baritone (Henrich, 2001)]

The voice registers dimension
[Figure: glissando, countertenor (Henrich, 2001)]

The noise dimension
Represents the relative amount of noise in the speech signal.
1. Additive noises: whispery voice, breathy voice. Turbulent flow at the glottal constriction.
2. Structural noises: hoarseness, roughness.
   1. Jitter: a random fluctuation of the duration of fundamental periods.
   2. Shimmer: a random fluctuation of amplitude across successive periods.
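The jitter and shimmer definitions above can be illustrated with a minimal sketch. It assumes the common "local" formulation (mean absolute difference between consecutive cycles, divided by the mean), which is one of several conventions and is not specified in the talk. The function names and sample values are hypothetical.

```python
from statistics import mean


def local_jitter(periods):
    """Mean absolute difference between consecutive periods, relative to the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)


def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive peak amplitudes, relative to the mean amplitude."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    return mean(diffs) / mean(amplitudes)


# Hypothetical measurements for five successive glottal cycles.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]  # seconds
amplitudes = [1.00, 0.95, 1.02, 0.98, 1.00]         # arbitrary linear units

print(f"jitter  = {100 * local_jitter(periods):.2f} %")
print(f"shimmer = {100 * local_shimmer(amplitudes):.2f} %")
```

A perfectly periodic signal gives zero for both measures; hoarse or rough voices would be expected to give larger values.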

The noise dimension: sound examples
1. Additive noises
   1. Whispery voice: narrow glottis
   2. Breathy voice: wide glottis, voicing constriction
2. Structural noises: hoarseness, roughness

The pressed/lax dimension
The vocal folds can be pressed together more or less strongly at their posterior extremities (arytenoid cartilages):
1. Pressed voice: sometimes called tense or sharp voice quality.
2. Lax voice: if the arytenoids are separated, a chink is created at the posterior part of the glottis.
Note that this pressed quality may be relatively independent of the vocal effort.

The pressed/lax dimension: sound examples
1. Pressed voice: sometimes called tense or sharp voice quality.
2. Lax voice: if the arytenoids are separated, a chink is created at the posterior part of the glottis.
Note that this pressed quality may be relatively independent of the vocal effort.

The vocal effort dimension
- Important for stress and accentuation
- Important for emotion, affect, attitude, etc.
- Loudness = spectral balance and voice amplitude
- Results from tension and stiffness of the vocal folds and high sub-glottal pressure

The vocal effort dimension: sound examples
1. Speech
   1. Soft
   2. Loud
   3. Shouting
2. Singing
3. Emotions

Voice range profile (phonetogram)
[Figure: voice range profile (Sulter, Wit, Schutte and Miller, 1994)]
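A voice range profile plots the softest and loudest sound pressure level a voice can produce at each pitch. As a rough sketch of how such a profile could be tabulated from measurements, the code below bins hypothetical (F0, SPL) pairs by semitone and keeps the SPL extremes per bin. The semitone grid relative to 440 Hz and the function names are assumptions made for illustration, not the procedure of the cited study.

```python
import math


def semitone_bin(f0_hz, ref_hz=440.0):
    """Nearest semitone relative to ref_hz (0 corresponds to ref_hz itself)."""
    return round(12 * math.log2(f0_hz / ref_hz))


def voice_range_profile(measurements):
    """Map each semitone bin to (min SPL, max SPL) over all (F0 in Hz, SPL in dB) pairs."""
    profile = {}
    for f0_hz, spl_db in measurements:
        key = semitone_bin(f0_hz)
        lo, hi = profile.get(key, (spl_db, spl_db))
        profile[key] = (min(lo, spl_db), max(hi, spl_db))
    return dict(sorted(profile.items()))


# Hypothetical measurements: (F0 in Hz, SPL in dB).
data = [(220.0, 55.0), (220.0, 82.0), (440.0, 60.0), (440.0, 90.0), (445.0, 95.0)]
for semitone, (soft, loud) in voice_range_profile(data).items():
    print(f"{semitone:+d} st: {soft:.0f} to {loud:.0f} dB")
```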

Vocal tract settings
- Important for emotion, affect, attitude, etc.
- Important for styles
- Co-variation with the source
- Very few systematic acoustic studies

The vocal tract dimension: sound examples
1. Smiling
2. Rounding
3. Bite block
4. Lengthening
5. Shortening
6. Yawning

Conclusions on voice quality dimensions
- About 4 main dimensions for phonation (+ pitch/F0)
- Vocal tract dimensions of voice quality mostly unknown
- Respiration dimension of voice quality mostly unknown

Modelling the voice source
- Voice source signal models
- Time-domain and spectral parameters
- Physical model and signal models

Glottal flow models: time domain
Examples:
- Rosenberg C (Rosenberg, 1971)
- LF (Liljencrants & Fant, 1985)
- Klatt (Klatt & Klatt, 1990)
- R++ (Veldhuis, 1998)

Glottal flow models
[Figure panels:]
- KLGLOTT88 (Klatt & Klatt, JASA 1988)
- Rosenberg C (Rosenberg, JASA 1971)
- LF model (Liljencrants, Fant, Lin, KTH-STL, 1985)

A unified set: 5 time-domain parameters (Doval, d'Alessandro & Henrich, Acta Acustica 2006)
- T0: fundamental period
- Av: voiced amplitude
- Oq: open quotient
- αm: asymmetry coefficient (equivalent to speed quotient)
- Qa: return phase quotient
Other parameters of interest:
- J: total flow of a single pulse
- E: negative peak amplitude of the glottal flow derivative

Time-domain equations
In the case of Qa = 0 (abrupt closure), the glottal flow models (GFM) can all be expressed in a common form [equations not preserved in the transcript], where n_g(x, αm), the normalized glottal flow model, depends on the model.
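Since the slide equations are not preserved here, the following is only a sketch of the general idea, under two stated assumptions: that the flow is the normalized pulse scaled in amplitude by Av and in time by the open-phase duration Oq·T0, and that the normalized pulse is the cubic polynomial commonly associated with KLGLOTT88 (fixed asymmetry, peak at 2/3 of the open phase). The function names are hypothetical.

```python
def klglott88_normalized(x):
    """Assumed normalized pulse: cubic on 0 <= x <= 1 with peak value 1 at x = 2/3, zero elsewhere."""
    if 0.0 <= x <= 1.0:
        return 6.75 * x * x * (1.0 - x)  # 27/4 * x^2 * (1 - x)
    return 0.0


def glottal_flow(t, T0, Av, Oq):
    """Glottal flow at time t (s) for a periodic pulse train with abrupt closure (Qa = 0).

    Assumed scaling: Av * n_g(t / (Oq * T0)) within each period of length T0.
    """
    t_in_period = t % T0
    return Av * klglott88_normalized(t_in_period / (Oq * T0))


# One period sampled at 16 kHz, with hypothetical parameter values.
T0, Av, Oq, sample_rate = 0.01, 1.0, 0.6, 16000
pulse = [glottal_flow(k / sample_rate, T0, Av, Oq) for k in range(int(T0 * sample_rate))]
print(f"peak flow = {max(pulse):.3f}, closed-phase samples = {sum(1 for v in pulse if v == 0.0)}")
```

In this sketch a smaller Oq shortens the open phase and lengthens the closed phase without changing the pulse shape, which is the role the open quotient plays in the parameter set above.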

Glottal flow models: frequency domain
Glottal flow: [equation not preserved in the transcript]
Glottal flow derivative: [equation not preserved in the transcript]
- N_g(x, αm): Fourier transform of n_g(x, αm)
- N'_g(x, αm): Fourier transform of n'_g(x, αm)
These two functions depend on the model.

Glottal flow models: spectral description (Doval, d'Alessandro, Henrich, 2006)
"Glottal formant":
\[ F_g = \frac{1}{2\pi}\sqrt{\frac{E}{J}} = \frac{1}{2\pi}\,\frac{1}{O_q T_0}\sqrt{\frac{e_n(\alpha_m)}{j_n(\alpha_m)}} \]
\[ A_g = \sqrt{E J} = A_v \sqrt{e_n(\alpha_m)\, j_n(\alpha_m)} \]
Spectral slope:
\[ F_a = \frac{1}{2\pi\, Q_a (1 - O_q)\, T_0}, \qquad A_a = \frac{E}{2\pi F_a} = \frac{A_g F_g}{F_a} \]
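As a worked example of the glottal formant, the sketch below evaluates Fg = sqrt(E / J) / (2π) numerically, where J is the total flow of one pulse and E the negative peak of the flow derivative. It assumes the same cubic KLGLOTT88-type normalized pulse and Av, Oq·T0 scaling as a stand-in for the model-dependent terms; under those assumptions the result should agree with the closed form sqrt(3) / (π·Oq·T0). All names are hypothetical.

```python
import math


def glottal_formant(T0, Av, Oq, n=10000):
    """Estimate Fg = sqrt(E / J) / (2*pi) for the assumed pulse 6.75 * x^2 * (1 - x)."""
    dx = 1.0 / n
    # j_n: area of the normalized pulse (midpoint rule over the open phase 0..1).
    j_n = sum(6.75 * x * x * (1.0 - x) for x in ((i + 0.5) * dx for i in range(n))) * dx
    # e_n: magnitude of the normalized derivative 6.75 * (2x - 3x^2) at closure, x = 1.
    e_n = abs(6.75 * (2.0 * 1.0 - 3.0 * 1.0 ** 2))
    J = Av * Oq * T0 * j_n       # total flow of a single pulse
    E = Av * e_n / (Oq * T0)     # negative peak amplitude of the flow derivative
    return math.sqrt(E / J) / (2.0 * math.pi)


# Hypothetical settings: F0 = 100 Hz, open quotient from lax (0.8) to tense (0.4).
for Oq in (0.8, 0.6, 0.4):
    print(f"Oq = {Oq:.1f}: Fg = {glottal_formant(0.01, 1.0, Oq):.1f} Hz")
```

The estimate does not depend on Av and rises as Oq decreases, which matches the link between open quotient, glottal formant and tenseness listed in the parameter tables.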

Spectral / time domain: open quotient, asymmetry
[Figure]

Spectral / time domain: spectral tilt
[Figure: effect of E and spectral tilt]

Causal-Anticausal Linear voice source Model (CALM) (Doval, d'Alessandro, Henrich, 2003)
[Figure panels: convergence region for a stable CALM; anticausal filter; causal filter; glottal pulse (CALM vs. R++); frequency response]

Voice quality dimensions and acoustic parameters
- Non-linear relationships between parameters of the acoustic model and voice quality dimensions
- General relationships
- Speaking low-high
- Speaking soft-loud

Voice quality dimensions and source parameters
Dimension | Time domain | Spectral domain
Registers | F0, open quotient | T0, glottal formant
Noise | Noise, jitter, shimmer | Noise, harmonic widths
Tension | Open quotient | Glottal formant
Force | Closure, peak flow | Spectral tilt, amplitude

Voice quality dimensions and source parameters
Parameter | Description | Duality | Main effect on phonation
Time-domain parameters:
Av | Amplitude of voicing | E, Ags | Flow
Oq | Open quotient | Fg | Tenseness
Am | Asymmetry | Bg, Sq | Tenseness, loudness
Qa | Return phase | Fa | Loudness
Alternative time-domain parameters:
E | Derivative peak | Av | Flow, loudness
SPL | Sound pressure level | Av, E | Flow, loudness
Sq | Speed quotient | Bg, Am | Tenseness, loudness
Rd | Amplitude quotient (Fant) | Av, E, F0 | Loudness, tenseness
Aq | Amplitude quotient (Alku) | Av, E | Loudness, tenseness

Voice quality dimensions and source parameters
Parameter | Description | Duality | Main effect on phonation
Alternative time-domain parameters:
E | Derivative peak | Av | Flow, loudness
SPL | Sound pressure level | Av, E | Flow, loudness
Sq | Speed quotient | Bg, Am | Tenseness, loudness
Rd | Amplitude quotient (Fant) | Av, E, F0 | Loudness, tenseness
Aq | Amplitude quotient (Alku) | Av, E | Loudness, tenseness
Spectral parameters:
Fg | Glottal formant frequency | Oq | Tenseness
Bg | Glottal formant bandwidth | Sq, Am | Tenseness, loudness
Fa | Spectral tilt frequency | Qa, Tl | Loudness
Ags | Glottal formant amplitude | Av, SPL | Flow
Alternative spectral parameters:
H1*-H2* | Amplitude difference between the 1st and 2nd harmonics | Oq, Am | Tenseness
H1*-F3* | Amplitude difference between the 1st harmonic and the 3rd formant | Tl, Qa, Fa | Loudness
Tl | Spectral tilt | Qa | Loudness
HRF | Harmonic richness factor | Qa | Loudness
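To make the harmonic amplitude difference concrete, here is a minimal sketch that measures an uncorrected H1-H2 (in dB) on a synthetic two-harmonic signal, using a single-frequency DFT at F0 and 2·F0. The starred measures in the table usually denote values corrected for formant influence; that correction is not attempted here, and the function names and test signal are assumptions for illustration only.

```python
import cmath
import math


def harmonic_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz, from a single-frequency DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
              for k, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


def h1_minus_h2_db(signal, sample_rate, f0_hz):
    """Uncorrected level difference (dB) between the first and second harmonics."""
    h1 = harmonic_amplitude(signal, sample_rate, f0_hz)
    h2 = harmonic_amplitude(signal, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Synthetic signal: exactly 10 periods of a 100 Hz tone whose 2nd harmonic has half the amplitude.
sample_rate, f0 = 8000, 100.0
signal = [math.sin(2 * math.pi * f0 * k / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 2 * f0 * k / sample_rate)
          for k in range(800)]
print(f"H1-H2 = {h1_minus_h2_db(signal, sample_rate, f0):.2f} dB")
```

The analysis window covers a whole number of periods so that the two harmonics do not leak into each other; on real speech, a pitch-synchronous window or a tapered window would be needed.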

Voice quality dimensions and source parameters
Parameter | Description | Duality | Main effect on phonation
Aperiodicities:
Jitter | Period-to-period frequency variations | | Roughness
Shimmer | Period-to-period amplitude variations | | Roughness
PAPR | Periodic-aperiodic ratio | | Breathiness, whisper
LoV | Limit of voicing | | Breathiness, whisper
NTL | Noise spectral tilt | | Breathiness, whisper
IHN | Inter-harmonic noise | | Breathiness, whisper

Speaking high
Low-high dimension:
- Voice registers: signal changes with pitch height (open quotient, amplitude)
- Phonetogram: SPL dependence on pitch height
- Formant tuning: vocal tract changes with pitch height
- FG tuning: open quotient

Speaking loud
Soft-loud dimension:
- Voice spectral tilt changes
- SPL changes
- Noise in the source
- Formant tuning: vowel opening
- F0 rise
- F0 contour
- F1 tuning (Liénard)
- FG tuning
- Peakiness, impulsiveness

Application to performative synthesis
- Formant + CALM source
- Real-time control
- Including non-linear source-filter interactions
- Including a phonetogram
DEMO: Cantor Digitalis
NOLISP 2013, 22/06/13

Acknowledgements
The contributions of Sylvain Le Beux, Nicolas d'Alessandro, Lionel Feugère, Boris Doval and Olivier Perrotin to the Cantor Digitalis are gratefully acknowledged.