Quarterly Progress and Status Report. An attempt to predict the masking effect of vowel spectra


Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report
An attempt to predict the masking effect of vowel spectra
Gauffin, J. and Sundberg, J.
Journal: STL-QPSR, volume 15, number 4, year 1974, pages 057-062
http://www.speech.kth.se/qpsr

V. SPEECH PERCEPTION

A. AN ATTEMPT TO PREDICT THE MASKING EFFECT OF VOWEL SPECTRA
J. Gauffin and J. Sundberg

Abstract
A simple method is suggested for predicting the masking effect of a vowel spectrum. The method is based mainly on empirical findings. It involves two steps: a) prediction of the masked threshold of each single partial in the complex masker tone, and b) prediction of a correction to be added to the envelope of these masked thresholds. These predictions are made by means of equations established by measurements of the masking of one- and two-tone maskers. Observed and predicted masked thresholds for four vowel sounds of different fundamental frequencies show reasonable agreement.

Introduction
The masking effect of complex tones is of considerable interest in speech research as well as in musical acoustics. What partials in a vowel spectrum contribute to the timbre perceived (cf. Chistovich et al. 1973)? And what demands must a solo singer accompanied by an orchestra meet in order to be heard? An answer to such questions requires more knowledge about the masking of complex tones than seems to be available today. The purpose of the present investigation was to explore the possibilities of predicting the masked threshold of a given vowel spectrum masker using simple, empirically based approximations.

Masker properties
The physical variability of vowel spectra is restricted in some important respects. First, the partials constituting a vowel spectrum have harmonic frequencies normally lower than 4 kHz. Second, the strongest partials are generally weaker than, say, 80 dB SPL. Third, the spectrum is characterized by a small number of spectrum envelope peaks corresponding to the formants. Our task is then to investigate the predictability of the masking effect of this particular kind of spectrum.

Experiments
The masker and the probe tone were presented through a loudspeaker 50 cm in front of the subject. The measurements were all made with one subject, seated with a neck rest in an anechoic chamber. The probe tone amplitude shifted twice per second between two values separated by 6 dB, and the subject adjusted the probe tone amplitude so that he could hear only its stronger parts, as a pitch or as roughness. Thereafter the masker signal was switched off and the probe tone SPL was measured with a microphone just at the upper edge of the subject's pinna. This procedure was repeated two or four times for each probe tone frequency, depending on the spread of the readings. In choosing the probe tone frequencies (one per critical band), slow beats between the probe tone and the masker were always avoided. In this way the subject's responses for a given probe tone frequency differed by more than 4 dB only exceptionally.

Results

One- and two-tone maskers
Fig. V-A-2 shows the results obtained from sine-wave maskers. For the masker intensities considered here it seems that the masked thresholds can be roughly approximated by trapezoids (cf. Karnickaya et al. 1973). The horizontal portion is about one critical band wide and lies 20 dB below the masker SPL. The slope towards frequencies below the masker is approximately 50 dB per critical band in our subject. The slope on the other side exhibits a dip about one critical band above the masker frequency for the two strongest maskers measured (83 and 80 dB SPL). According to Greenwood (1971) this dip is caused by intermodulation between the masker and the probe tone, and it disappears when the masker is a sufficiently broad-banded noise or a complex tone (cf. also Ehmer 1959). Therefore, for our purposes, the dip should be disregarded. If so, the masked thresholds above the critical band surrounding the masker frequency can be approximated by a straight line sloping at a rate which depends on the masker intensity. This dependence is illustrated in Fig. V-A-3, which shows the slope as a function of the sensation level of the masker, i.e. the intensity relative to the absolute threshold of the subject.

The sensation level was preferred to the SPL since the values displayed greater spread on an SPL scale. For comparison, the plot also includes values derived from data published by de Maré & Rösler (1950) on sine-wave maskers and by Zwicker (1963) on noise of subcritical bandwidth. The graph suggests a linear relationship between the sensation level of the masker and the slope. Noise maskers seem to give slightly higher values than sine-wave maskers, i.e. the

Fig. V-A-2a. Masked thresholds obtained from single sine-wave maskers with the frequency and SPL indicated by the bars. The solid line is the straight-line approximation of the masked thresholds. The frequency scale (in kHz) is based on the critical bandwidth. The subject's absolute threshold is shown by the dashed curve.

Fig. V-A-2b. Masked thresholds obtained from single sine-wave maskers with the frequency and SPL indicated by the bars. The solid line is the straight-line approximation of the masked thresholds. The frequency scale (in kHz) is based on the critical bandwidth. The subject's absolute threshold is shown by the dashed curve.

Fig. V-A-3. The slope of the masked threshold, in dB per critical band, in the frequency region above the masker frequency, plotted against the masker sensation level (SL, dB). The values were derived from sine-wave maskers (filled symbols) and from noise maskers of subcritical bandwidth according to Zwicker 1963 (open symbols). The masker frequencies were 110, 220, 280, 330, and 1540 Hz (our data); 250, 500, and 1000 Hz (according to de Maré & Rösler 1950); and 650, 1000, and 2000 Hz (according to Zwicker). The solid line shows an average for the noise maskers, and the dashed line gives the average for our data used in the predictions in Fig. V-A-6.

masked thresholds for narrow-band noise appear to fall off more steeply. The difference in slope between sine-wave and narrow-band noise maskers is […] dB per critical band, on the average. Our subject shows good agreement with the data pertaining to averages for other subjects. The following expression gives the best straight-line fit to our data points:

$$a = 16 - 0.16\,SL \quad \text{dB per critical band} \qquad (1)$$

where $a$ is the slope and $SL$ is the sensation level of the masker. The equation seems to be valid provided that SL < 75 dB. Equation (1) is similar to that suggested by Karnickaya et al. (1973). One difference is that we operate with the sensation level of the masker instead of the SPL. Plotting our data on an SPL scale (and disregarding the greater spread) showed that Karnickaya's equation predicted slightly too high values. This may very well be due to intersubject differences.

Fig. V-A-4 shows the masked thresholds obtained from pairs of sine-wave maskers together with the masked thresholds for each of the sine-wave maskers constituting the pair. The masked thresholds of the two-tone maskers do not exactly coincide with the envelope of the individual masked thresholds (see Bilger 1959, Webster et al. 1952, and cf. Zwicker & Scharf 1965). A correction, c, has to be added to the envelope. This correction is large when the distance between the masked thresholds is small. Also, c seems to be smaller where two negatively sloping masked thresholds are involved than in other situations. The c-value of 8.7 dB at 450 Hz for the 110+330 Hz masker is due to the fact mentioned previously that the dip above a sine-wave masker disappears when another masker is added at a lower frequency. Thus, c seems to depend on the distance, Δ, between the masked thresholds of each of the masker components. This dependence is illustrated in Fig. V-A-5. The point just mentioned occurs at Δ = 20 dB, and should be disregarded.
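As a rough illustration, the trapezoid approximation of a single sine-wave masker, with its upper slope taken from eq. (1), might be sketched in Python as follows. The function and parameter names are hypothetical. The probe position is given directly in critical bands relative to the masker, so no particular Hz-to-critical-band conversion is assumed, and the flat part is centred on the masker frequency as in the prediction procedure described for the vowel spectra. The sketch does not limit the result by the absolute threshold, nor does it check the validity range of eq. (1).

```python
def upper_slope_db_per_cb(masker_sl_db):
    """Eq. (1): rate of fall of the threshold above the masker, in dB per critical band."""
    return 16.0 - 0.16 * masker_sl_db


def single_tone_threshold(dz, masker_spl_db, masker_sl_db):
    """Trapezoid approximation of the masked threshold (dB SPL) of one sine-wave masker.

    dz is the probe position relative to the masker in critical bands
    (negative values lie below the masker frequency). Hypothetical helper.
    """
    plateau = masker_spl_db - 20.0       # flat part lies 20 dB below the masker SPL
    if abs(dz) <= 0.5:                   # flat part is one critical band wide
        return plateau
    if dz < 0:                           # lower side: 50 dB per critical band
        return plateau - 50.0 * (-dz - 0.5)
    # upper side: slope depends on the masker sensation level, eq. (1)
    return plateau - upper_slope_db_per_cb(masker_sl_db) * (dz - 0.5)
```

For a masker of 70 dB SPL at a sensation level of 60 dB, eq. (1) gives a slope of 6.4 dB per critical band, so the sketch places the threshold at 50 dB on the flat part and at about 37.2 dB two and a half critical bands above the masker.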
The plot also includes some values derived from Bilger's (1959) measurements on combinations of high- and low-passed noise. Even though the spread is considerable, rough approximations may be obtained with the following equations:

Fig. V-A-5. The difference, c, between the masked threshold of a two-tone masker and the envelope of the masked thresholds of each of these two tones. Δ is the difference between the individual masked thresholds of the tones (see text). The open and filled circles pertain to values obtained for frequencies between the maskers and above the highest masker, respectively. The open squares were derived from Bilger's (1959) measurements on combinations of high- and low-pass filtered noise maskers. The dashed lines show the averages used in the predictions shown in Fig. V-A-6.

$$c = \begin{cases} 5 - 0.15\,\Delta & \text{for } \Delta \le 33\ \text{dB} \\ 0 & \text{for } \Delta > 33\ \text{dB} \end{cases} \qquad (2)$$

$$c = \begin{cases} 9 - 0.20\,\Delta & \text{for } \Delta \le 45\ \text{dB} \\ 0 & \text{for } \Delta > 45\ \text{dB} \end{cases} \qquad (3)$$

where Δ is the difference in dB between the masked thresholds. Eq. (2) applies to cases where two negatively sloping thresholds are involved, and eq. (3) to all other cases.

Vowel spectrum maskers
Using the same method as for the one- and two-tone maskers, our subject determined the masked threshold of four vowel spectra. Two of the vowel sounds were synthesized, and two were natural vowels produced by professional singers. The vowel sounds were reproduced by playing a tape loop. With the subject in situ in the anechoic chamber, the masker spectra were analyzed using the microphone at the subject's pinna.

On the basis of the findings reported above, attempts were made to predict the masked thresholds of these four vowel spectra. The procedure was as follows. First, the masked threshold for each partial in the masker spectrum was constructed. The horizontal part, one critical band wide, was centered symmetrically around the partial frequency and placed 20 dB below the SPL of the partial. The slope towards frequencies below the partial was 50 dB per critical band. The slope on the other side was determined by means of eq. (1). Second, eqs. (2) and (3) were used to calculate the correction, c, to be added to the envelope of the individual masked thresholds. At each frequency only the two highest thresholds were taken into account. The procedure is illustrated in Fig. V-A-6a.

The predicted masked thresholds of the four vowel spectra can be compared with the observed masked thresholds in Fig. V-A-6. Towards frequencies higher than approximately 1.5 kHz, the predicted values tend to be consistently too low. This may be an effect of the fact that, here, two or more partials fall into the same critical band, which would increase the masking.
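The second step of the procedure, combining the individual thresholds at one probe frequency, might be sketched as below. This is only an illustration under stated assumptions: the function names and the input format (a list of threshold levels, each flagged as lying on a negatively sloping portion or not) are hypothetical, and the flat part of a trapezoid is treated as "not negatively sloping", which the text does not spell out.

```python
def correction_db(delta_db, both_negative):
    """Correction c (dB) added to the envelope, following eqs. (2) and (3)."""
    if both_negative:
        # eq. (2): both thresholds are negatively sloping at this frequency
        return 5.0 - 0.15 * delta_db if delta_db <= 33.0 else 0.0
    # eq. (3): all other cases
    return 9.0 - 0.20 * delta_db if delta_db <= 45.0 else 0.0


def predicted_threshold(partial_thresholds):
    """Predicted masked threshold (dB SPL) at one probe frequency.

    partial_thresholds: list of (level_db, negatively_sloping) pairs, one per
    partial, giving that partial's individual masked threshold at the probe
    frequency. Only the two highest thresholds are taken into account.
    """
    if not partial_thresholds:
        raise ValueError("at least one partial threshold is required")
    ranked = sorted(partial_thresholds, key=lambda t: t[0], reverse=True)
    top_level, top_negative = ranked[0]
    if len(ranked) == 1:
        return top_level                      # nothing to combine
    second_level, second_negative = ranked[1]
    delta = top_level - second_level          # distance between the two thresholds
    return top_level + correction_db(delta, top_negative and second_negative)
```

For example, two negatively sloping thresholds of 40 and 30 dB give Δ = 10 dB and, by eq. (2), a correction of 3.5 dB, so the predicted threshold is 43.5 dB. If one of the two is not negatively sloping, eq. (3) gives 7 dB and a predicted threshold of 47 dB.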
In cases where the partials have weak amplitudes, the tape noise may also contribute to the underestimation of the masked threshold. Errors as high as 13 dB occur at single frequencies, e.g. at 700 Hz in the [i] with a 220 Hz fundamental frequency. The reason for this is

Fig. V-A-6a. Observed and predicted masked thresholds (solid and dashed lines, respectively) for the sung vowel spectrum indicated, plotted against frequency (kHz). The lower dashed line is the subject's absolute threshold. The thin lines illustrate the approximations of the masked thresholds for the individual partials.

Fig. V-A-6b. Observed and predicted masked thresholds (solid and dashed lines, respectively) for the synthesized vowel spectrum indicated, plotted against frequency (kHz). The lower dashed line is the subject's absolute threshold.

Fig. V-A-6c. Observed and predicted masked thresholds (solid and dashed lines, respectively) for the synthesized vowel spectrum indicated, plotted against frequency (kHz). The lower dashed line is the subject's absolute threshold.

DE MARÉ, G. and RÖSLER, G.: "Untersuchungen über den Verdeckungseffekt bei Leitungs- und Innenohrschwerhörigkeit", Acta Oto-Lar. 38:2 (1950), pp. 179-190.

EHMER, R. H.: "Masking Patterns of Tones", J. Acoust. Soc. Am. 31:8 (1959), pp. 1115-1120.

GREENWOOD, D. D.: "Aural Combination Tones and Auditory Masking", J. Acoust. Soc. Am. 50:2 (1971), pp. 502-543.

KARNICKAYA, E. G., MUSHNIKOV, V. N., SLEPOKUROVA, N. A., and ZHUKOV, S. Ja.: "Auditory Processing of Steady-State Vowels", Proc. of the Symp. on Auditory Analysis and Perception of Speech, Leningrad; to be publ. by Academic Press, London.

WEBSTER, J. C., MILLER, P. H., THOMPSON, P. O., and DAVENPORT, E. W.: "The Masking and Pitch Shifts of Pure Tones near Abrupt Changes in a Thermal Noise Spectrum", J. Acoust. Soc. Am. 24:7 (1952), pp. 147-152.

ZWICKER, E.: "Über die Lautheit von ungedrosselten und gedrosselten Schallen", Acustica 13 (1963), pp. 194-211.

ZWICKER, E. and SCHARF, B.: "A Model of Loudness Summation", Psych. Rev. 72:1 (1965), pp. 3-26.