Voice segregation by difference in fundamental frequency: Effect of masker type


Mickael L. D. Deroche a)
Department of Otolaryngology, Johns Hopkins University School of Medicine, 818 Ross Research Building, 720 Rutland Avenue, Baltimore, Maryland

John F. Culling
School of Psychology, Cardiff University, Tower Building, Park Place, Cardiff, CF10 3AT, United Kingdom

Abstract: Speech reception thresholds were measured for a voice against two different maskers: either two concurrent voices with the same fundamental frequency (F0) or a harmonic complex with the same long-term excitation pattern and broadband temporal envelope as the masking sentences (speech-modulated buzz). All sources had steady F0s. A difference in F0 of 2 or 8 semitones provided a 5-dB benefit for buzz maskers, whereas it provided a 3- and 8-dB benefit, respectively, for masking sentences. Whether intelligibility of a voice increases abruptly with small ΔF0s or gradually toward larger ΔF0s seems to depend on the nature of the masker.

© 2013 Acoustical Society of America
PACS numbers: Dc, Gv [SGS]
Date Received: July 13, 2013 Date Accepted: October 7,

1. Introduction

Effects of a difference in fundamental frequency (ΔF0) between competing sources were first reported by Brokx and Nooteboom (1982). They used monotonously spoken voices as well as voices that had been resynthesized to have a steady fundamental frequency (F0) and showed that speech recognition improved gradually with increasing ΔF0. Interestingly, with a ΔF0 of one octave, the performance dropped, suggesting that the mechanisms that exploit ΔF0s must be related to the harmonic structure of the competing sources. Later experiments used synthesized vowels rather than resynthesized sentences (Scheffers, 1983; Culling and Darwin, 1993) and found that vowel identification improved sharply for very small ΔF0s and saturated above one semitone.
The mechanisms underlying these effects are still under investigation, but one important piece of the puzzle, which is the focus of the present study, is to understand why the pattern of improvement that results from ΔF0s is very sharp for vowels but gradual for sentences.

1.1 Energetic masking

de Cheveigné et al. (1995, 1997a) showed that identification of simultaneous synthetic vowels depended on their harmonicity; identification of a given vowel was unaffected by its own harmonicity, but was reduced if the competing vowel was inharmonic. Deroche and Culling (2011a) demonstrated a similar effect for the recognition of a voice separated by two semitones from a harmonic complex with a speech-shaped spectral profile. Inharmonicity was generated by F0 modulation and reverberation applied independently to the competing sources. These manipulations resulted in substantial elevation of speech reception threshold (SRT) when they were applied to the masker. As with synthetic vowels, the effect of a ΔF0 on intelligibility of the target voice did not depend on its harmonicity, but instead depended strongly on the masker's harmonicity.

a) Author to whom correspondence should be addressed.
J. Acoust. Soc. Am. 134 (5), November 2013 © 2013 Acoustical Society of America EL465

There are at least three main reasons why harmonic complexes should allow these large releases from energetic masking. First, when harmonic partials are in specific phase relationships, such as cosine, sine, and positive Schroeder-phase, they produce highly modulated waveforms after cochlear filtering. There are short temporal dips, as long as the fundamental period, in within-channel temporal envelopes that may allow listeners a better target-to-masker ratio (TMR) at these specific times, which is facilitated by cochlear compression (Kohlrausch and Sander, 1995; Carlyon and Datta, 1997). However, various forms of this phase hypothesis have been examined and seem implausible (de Cheveigné et al., 1997b; de Cheveigné, 1999). Second, harmonic complexes have spectral dips that allow a better TMR at center frequencies located in between resolved partials. Third, even when a role for temporal and spectral dips is excluded, detection of a narrow band of noise is easier against harmonic than inharmonic complexes (Deroche and Culling, 2011b). Periodicity in the masker waveforms may thus also contribute to the reduced masking of harmonic complexes. It remains unclear, however, how much, and in which conditions (i.e., F0 range, phase settings, spectral profile, masker level), each of these accounts contributes to the release from masking observed experimentally.

1.2 Informational masking

In the presence of several voices, there might be an ambiguity as to which voice one should attend to, resulting in informational masking (Kidd et al., 2005). For unprocessed sentences, energetic and informational masking occur together.
However, Kidd et al. processed the target and masking sentences such that they occupied different frequency bands, ensuring that energetic masking would be largely absent. They observed large amounts of masking for speech-on-speech configurations, but not for speech-on-noise configurations. In order to release this informational masking, listeners can use a variety of cues to group sounds into sequential streams: F0 (Darwin and Hukin, 2000; Darwin et al., 2003; Drullman and Bronkhorst, 2004), signal-to-noise ratio (Brungart, 2001; Brungart et al., 2001), spatial separation (Darwin and Hukin, 2000; Freyman et al., 2001; Hawley et al., 2004; Kidd et al., 2005; Lee and Shinn-Cunningham, 2008), priming by the target talker or onset cues (Freyman et al., 2004), vocal-tract length, sex difference, and prosody (Culling and Porter, 2004; Darwin et al., 2003; Brungart et al., 2001), and even tactile cues (Drullman and Bronkhorst, 2004). In the case of overlapping dialogue, it is unclear what proportion of the observed ΔF0 effects can be attributed to a release of energetic masking, and what proportion to a release of informational masking.

1.3 Aim of the present experiment

Identification of competing vowels improves with very small ΔF0s and is not improved further by increasing the ΔF0 beyond one or two semitones. One would hope that the mechanisms underlying these improvements in vowels would also be involved in the segregation of voices. However, recognition of a voice in the presence of competing voices improves more gradually as ΔF0 increases (Bird and Darwin, 1998). If the same mechanisms are involved, why is the pattern of improvement so different? The present study attempts to answer this question by (a) testing whether the sharp improvement with small ΔF0s can occur with full sentences (or whether it is restricted to vowels), and (b) testing whether the gradual improvement with ΔF0 depends on the masker type.
To this aim, the benefit of 2 or 8 semitones ΔF0 was measured, using target sentences masked by two masking sentences, spoken by the same talker as the target, or masked by harmonic complexes matched in both temporal and spectral envelope, hereafter termed speech-modulated buzz. Little or no informational masking was expected in the presence of buzz maskers, whereas using the same materials as in the present study, Hawley et al. (2004) found evidence of informational masking when two simultaneous masking voices were used. Note that superimposing a large number of masking voices would reduce or eliminate informational masking, but a single voice would also result in little informational masking, especially when the same sentence is used throughout a given SRT measurement and displayed on the screen in front of the listener.

2. Method

2.1 Listeners

Two different groups of listeners took part in the experiment: the first group had 12 listeners; the second had 8 listeners. They were all undergraduate students, aged between 20 and 30 years, who were paid for their participation. All listeners reported normal hearing and English as their first language. Each listener attended a single 45-min experimental session.

2.2 Stimuli and conditions

Two types of maskers were used: speech-modulated buzz and two concurrent masking sentences. For the first group of listeners, target and masker had either the same F0 or F0s that were two semitones apart. For the second group, target and masker had either the same F0 or F0s that were eight semitones apart. The ΔF0 benefit could then be compared across the two masker types (within-subjects) and the two ΔF0 sizes (between-subjects).

Speech materials were 80 sentences recorded from a male speaker of American English in the original IEEE list format (IEEE, 1969). The Praat PSOLA package resynthesized each sentence with a specified F0 throughout. Target sentences were monotonized at 110 Hz, at an F0 2 semitones higher, and at an F0 8 semitones higher. Eight masking sentences, different from any of the targets, were monotonized at 110 Hz, and then added in pairs to create four 2-voice maskers. The ΔF0 was thus always generated by varying the target's F0. Note that changing the masker's F0 over a large range results in very different opportunities for listeners to glimpse spectrally in between masker partials.
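The target F0s follow from the 110-Hz masker F0 by simple semitone arithmetic. The sketch below assumes equal-tempered semitones (a frequency ratio of 2^(1/12) per semitone); the function name `shift_semitones` is hypothetical, and the printed values are computed here rather than quoted from the article.

```python
def shift_semitones(f0_hz: float, semitones: float) -> float:
    """Return the frequency lying `semitones` equal-tempered semitones above f0_hz.

    Assumption: one semitone is a frequency ratio of 2 ** (1 / 12).
    """
    return f0_hz * 2.0 ** (semitones / 12.0)


MASKER_F0_HZ = 110.0  # masker F0 stated in the text

if __name__ == "__main__":
    # The three target conditions described in the text: 0, 2, and 8 semitones.
    for delta_semitones in (0, 2, 8):
        target_f0 = shift_semitones(MASKER_F0_HZ, delta_semitones)
        print(f"{delta_semitones} semitones above {MASKER_F0_HZ:.0f} Hz: {target_f0:.2f} Hz")
```

Under this assumption, a 2-semitone shift from 110 Hz gives roughly 123.5 Hz and an 8-semitone shift roughly 174.6 Hz, both well short of the octave (220 Hz) at which Brokx and Nooteboom (1982) saw performance drop.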
The buzz maskers were created from a broadband sine-phase harmonic complex, based on an F0 fixed at 110 Hz. A sine-phase complex was chosen because it was a closer approximation to glottal pulse excitation than a random-phase complex. This complex was filtered with a Hamming-window based linear-phase finite impulse response filter with 5000 coefficients designed to match the average long-term excitation pattern of the monotonized masking sentences (as in Deroche and Culling, 2011a). In order to further increase their similarity with the masking sentences, the temporal envelopes of the four 2-voice maskers were extracted by half-wave rectification and low-pass filtering with a cutoff at 20 Hz and applied to the complex with a speech-like spectral profile. This manipulation resulted in four speech-modulated buzz maskers. All maskers and target stimuli were equalized to the same root-mean-square power, prior to changes in TMR during the adaptive track. Maskers were presented at 69 dB sound pressure level and the relative target level was adjusted.

2.3 Procedure and equipment

The procedure and equipment were similar to those used by Deroche and Culling (2011a). In each list of ten sentences, listeners attempted to type a transcript. Presented with the actual transcript, containing five highlighted key words, listeners scored themselves, disregarding spelling errors. TMR decreased when three words or more were identified and increased when two words or fewer were identified. Following this 1-up/1-down adaptive rule, measurement of each SRT was taken as the mean TMR in the last eight sentences, tracking 50% intelligibility. For the 2-voice maskers, the two masking sentences were displayed on a computer screen in front of the listener, who was instructed to disregard them and to listen to the target sentence (not displayed on the screen). Note that for this masker type an SRT of 0 dB occurred when the target level was 3 dB higher than each of the two masking sentences.
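The adaptive rule and the 3-dB remark can be illustrated with a short simulation. This is only a sketch of the rule as described above, not the authors' software: the step size (2 dB), the starting TMR, the function names, and the example response sequence are all assumptions made for illustration, since the text does not state them. The level offset assumes two uncorrelated maskers of equal power, whose powers add.

```python
import math
from statistics import mean

KEY_WORDS_PER_SENTENCE = 5  # five highlighted key words per sentence (from the text)


def run_adaptive_track(words_correct, start_tmr_db=0.0, step_db=2.0, n_average=8):
    """Apply the 1-up/1-down rule and return the SRT estimate in dB.

    words_correct: key words identified on each sentence, in presentation order.
    start_tmr_db, step_db: hypothetical values; the text does not specify them.
    The SRT is the mean TMR at which the last `n_average` sentences were presented.
    """
    tmr_db = start_tmr_db
    presented_tmrs = []
    for n_correct in words_correct:
        if not 0 <= n_correct <= KEY_WORDS_PER_SENTENCE:
            raise ValueError("words correct must be between 0 and 5")
        presented_tmrs.append(tmr_db)
        if n_correct >= 3:
            tmr_db -= step_db  # three or more words: make the next sentence harder
        else:
            tmr_db += step_db  # two or fewer words: make the next sentence easier
    return mean(presented_tmrs[-n_average:])


def combined_masker_offset_db(n_maskers=2):
    """Level of n equal-power, uncorrelated maskers relative to one of them (dB)."""
    return 10.0 * math.log10(n_maskers)


if __name__ == "__main__":
    made_up_responses = [5, 4, 3, 1, 3, 2, 4, 1, 3, 2]  # illustrative only, not data
    print(run_adaptive_track(made_up_responses))   # -5.0 for this made-up sequence
    print(round(combined_masker_offset_db(), 2))   # about 3 dB for two maskers
```

Because the track steps down after three or more correct words and up otherwise, it hovers around the TMR at which a listener gets about half of the key words, which is why the mean of the late trials estimates the 50% point.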
Stimuli were presented diotically over Sennheiser HD650 headphones in a single-walled IAC sound-attenuating booth within a sound-treated room. Prior to the experimental session, listeners were familiarized with the task with 3 practice runs using 30 unprocessed sentences not used in the experiment, masked by speech-modulated buzz (1 run) or by 2-voice speech maskers (2 runs), also not used in the rest of the experiment. The following eight runs measured two SRTs for each of the four experimental conditions, which were averaged to give one mean SRT per condition. While each of the 80 target sentences was presented to every listener in the same order, the order of the conditions was rotated for successive listeners, to counterbalance order effects. No sentence was presented twice to a listener within the experiment, and each listener could only sign up once.

3. Results

The left panel of Fig. 1 shows that mean SRTs were overall lower for buzz than for 2-voice speech maskers, and lower when target and maskers differed in F0. The statistical analysis focused on the ΔF0 benefits, extracted in each group from the difference in SRTs between the no-ΔF0 and the ΔF0 conditions and plotted in the right panel. Analysis of variance with one within-subjects factor (masker type) and one between-subjects factor (ΔF0 size) revealed no main effect of masker type [F(1,18) = 1.9, p > 0.05], but a main effect of ΔF0 size [F(1,18) = 14.2, p < 0.001] and a strong interaction [F(1,18) = 17.5, p < 0.001]. The ΔF0 benefit was overall larger at 8 than at 2 semitones, but this depended largely on the masker type. Post hoc pairwise comparisons revealed that for the speech-modulated buzz masker, the ΔF0 benefit was similar at 2 and 8 semitones [F(1,18) = 0.2, p > 0.05], whereas for the speech maskers, it was greater by 5 dB for 8 than 2 semitones [F(1,18) = 22.7, p < 0.001].
At 2-semitones ΔF0, the benefit was smaller for speech maskers than for speech-modulated buzz [F(1,18) = 4.9, p < 0.05], and vice versa at 8-semitones ΔF0 [F(1,18) = 12.9, p < 0.01].

Fig. 1. (Left panel) Mean speech reception thresholds for two types of masker: speech-modulated buzz or two concurrent masking sentences, with the same F0 or a different F0 than that of the target. One group of listeners was tested for a ΔF0 of two semitones; the other group for a ΔF0 of eight semitones. Error bars are ±1 standard error of the mean. (Right panel) Mean benefits of 2- and 8-semitones ΔF0 for the two masker types. Error bars are ±1 standard error of the mean.

4. Discussion

A ΔF0 of 2 semitones provided a benefit similar to that of a ΔF0 of 8 semitones with speech-modulated buzz maskers (Fig. 1, left side of the right panel). When the harmonic structure of a voice is shifted very little from that of a speech-modulated speech-shaped harmonic complex, intelligibility of this voice improves quite dramatically, by 5 dB, but does not seem to improve further by increasing the ΔF0 size. This sharp pattern of improvement with ΔF0 (i.e., improvements for very small ΔF0s which saturate above 2 semitones) has only been observed previously with vowel stimuli (Culling and Darwin, 1993; de Cheveigné et al., 1997a,b). In contrast, the use of two concurrent sentences as maskers produced a more gradual pattern of improvement with increments in ΔF0: a 3-dB benefit at 2 semitones and an 8-dB benefit at 8 semitones (Fig. 1, right side of the right panel). So the reason why intelligibility of a target voice keeps improving as its F0 is set further and further apart from a competing voice (e.g., Bird and Darwin, 1998) may not be the use of sentences per se, but the fact that the maskers were sentences. Masking sentences may automatically engage phonetic or lexical processing whereas buzz maskers may not. In other words, masking sentences may involve informational masking, whereas buzz maskers may not.

Despite our efforts to match the temporal and spectral envelopes of the two masker types, there were breaks in the voicing of speech maskers in which glimpses could potentially have been available. So there was, in principle, more energetic masking with buzz than with speech maskers, and yet intelligibility of the target voice was better against the buzz than against the two other voices. It is therefore likely that informational masking is responsible for this large increase in threshold with speech maskers. Listeners may then use ΔF0s to obtain release from both the energetic and the informational masking invoked by masking voices, but a relatively large ΔF0 seems necessary. It may be that a large ΔF0 is required for competing voices to be perceptually segregated (Darwin et al., 2003), so the release from informational masking may not be accessible until the competing F0s are quite far apart.

The benefit of a 2-semitone ΔF0 was significantly smaller for masking sentences than for buzz. The release from energetic masking may be reduced with sentences due to breaks in voicing which, as a result of unvoiced consonants, occurred more frequently than in the buzz maskers. The more interrupted the masker's F0, the less beneficial a ΔF0 may be.
With masking voices, therefore, the release from energetic masking may be somewhat reduced and the release from informational masking may require large ΔF0s to be effective, resulting in more gradual improvements with ΔF0s.

Finally, it is important to bear in mind that there are a variety of instantaneous ΔF0s in realistic situations of conversation. Their overall benefit may depend on the average ΔF0 over the long term of sentences (which may be as large as one octave for gender differences) and how robust the mechanisms underlying ΔF0 effects are to different degrees of F0 modulation.

5. Conclusion

The present experiment investigated listeners' ability to recognize a male talker in the presence of speech-modulated buzz or two same-male-talker maskers when they differ in F0 from the target by 0, 2, or 8 semitones. With speech-modulated buzz maskers, the ΔF0 benefit was about 5 dB at 2 and 8 semitones, reminiscent of the ΔF0 effects observed for double-vowels. With speech maskers, the ΔF0 benefit increased from 3 to 8 dB at 2 and 8 semitones, respectively. Therefore the pattern of improvement with ΔF0 seems to depend on the masker type, particularly whether it is linguistic or not.

Acknowledgment

This research was supported by the UK EPSRC while both authors were at Cardiff University.

References and links

Bird, J., and Darwin, C. J. (1998). Effects of a difference in fundamental frequency in separating two sentences, in Psychophysical and Physiological Advances in Hearing, edited by A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis (Whurr, London), pp
Brokx, J., and Nooteboom, S. (1982). Intonation and the perceptual separation of simultaneous voices, J. Phonetics 10,
Brungart, D. (2001). Informational and energetic masking effects in the perception of two simultaneous talkers, J. Acoust. Soc. Am. 109,
Brungart, D., Simpson, B., Ericson, M., and Scott, K. (2001). Informational and energetic masking effects in the perception of multiple simultaneous talkers, J. Acoust. Soc. Am. 110,
Carlyon, R. P., and Datta, A. J. (1997). Excitation produced by Schroeder-phase complexes: Evidence for fast-acting compression in the auditory system, J. Acoust. Soc. Am. 101,
Culling, J. F., and Darwin, C. J. (1993). Perceptual separation of simultaneous vowels: Within and across-formant grouping by f0, J. Acoust. Soc. Am. 93,
Culling, J. F., and Porter, J. S. (2004). Effects of differences in the accent and gender of competing voices on speech segregation, in Auditory Signal Processing: Physiology, Psychoacoustics and Models, edited by D. Pressnitzer, A. de Cheveigné, S. McAdams, and L. Collet (Springer Verlag, New York), pp
Darwin, C. J., Brungart, D. S., and Simpson, B. D. (2003). Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers, J. Acoust. Soc. Am. 114,
Darwin, C. J., and Hukin, R. W. (2000). Effectiveness of spatial cues, prosody and talker characteristics in selective attention, J. Acoust. Soc. Am. 107,
de Cheveigné, A. (1999). Waveform interactions and the segregation of concurrent vowels, J. Acoust. Soc. Am. 106,
de Cheveigné, A., Kawahara, H., Tsuzaki, M., and Aikawa, K. (1997a). Concurrent vowel segregation. I. Effects of relative amplitude and F0 difference, J. Acoust. Soc. Am. 101,
de Cheveigné, A., McAdams, S., Laroche, J., and Rosenberg, M. (1995). Identification of concurrent harmonic and inharmonic vowels: A test of the theory of harmonic cancellation and enhancement, J. Acoust. Soc. Am. 97,
de Cheveigné, A., McAdams, S., and Marin, C. (1997b). Concurrent vowel segregation. II. Effects of phase, harmonicity and task, J. Acoust. Soc. Am. 101,
Deroche, M. L. D., and Culling, J. F. (2011a). Voice segregation by difference in fundamental frequency: Evidence for harmonic cancellation, J. Acoust. Soc. Am. 130,
Deroche, M. L. D., and Culling, J. F. (2011b). Narrow noise band detection in a complex masker. Masking level difference due to harmonicity, Hear. Res. 282,
Drullman, R., and Bronkhorst, A. (2004). Speech perception and talker segregation: Effects of level, pitch, and tactile support with multiple simultaneous talkers, J. Acoust. Soc. Am. 116,
Freyman, R., Balakrishnan, U., and Helfer, K. (2001). Spatial release from informational masking in speech recognition, J. Acoust. Soc. Am. 109,
Freyman, R., Balakrishnan, U., and Helfer, K. (2004). Effect of number of masking talkers and auditory priming on informational masking in speech recognition, J. Acoust. Soc. Am. 115,
Hawley, M., Litovsky, R., and Culling, J. (2004). The benefit of binaural hearing in a cocktail party: Effect of location and type of masker, J. Acoust. Soc. Am. 115,
IEEE (1969). IEEE recommended practice for speech quality measurements, IEEE Trans. Audio Electroacoust. 17,
Kidd, G., Mason, C., Brughera, A., and Hartmann, W. M. (2005). The role of reverberation in release from masking due to spatial separation of sources for speech identification, Acta Acust. Acust. 91,
Kohlrausch, A., and Sander, A. (1995). Phase effects in masking related to dispersion in the inner ear. II. Masking period patterns of short targets, J. Acoust. Soc. Am. 97,
Lee, A. K. C., and Shinn-Cunningham, B. G. (2008). Effects of reverberant spatial cues on attention-dependent object formation, J. Assoc. Res. Otolaryngol. 9,
Scheffers, M. T. M. (1983). Sifting vowels: Auditory pitch analysis and sound segregation, Ph.D. thesis, Rijksuniversiteit Groningen, The Netherlands.


More information

Pitch. The perceptual correlate of frequency: the perceptual dimension along which sounds can be ordered from low to high.

Pitch. The perceptual correlate of frequency: the perceptual dimension along which sounds can be ordered from low to high. Pitch The perceptual correlate of frequency: the perceptual dimension along which sounds can be ordered from low to high. 1 The bottom line Pitch perception involves the integration of spectral (place)

More information

The presence of multiple sound sources is a routine occurrence

The presence of multiple sound sources is a routine occurrence Spectral completion of partially masked sounds Josh H. McDermott* and Andrew J. Oxenham Department of Psychology, University of Minnesota, N640 Elliott Hall, 75 East River Road, Minneapolis, MN 55455-0344

More information

INTRODUCTION J. Acoust. Soc. Am. 107 (3), March /2000/107(3)/1589/9/$ Acoustical Society of America 1589

INTRODUCTION J. Acoust. Soc. Am. 107 (3), March /2000/107(3)/1589/9/$ Acoustical Society of America 1589 Effects of ipsilateral and contralateral precursors on the temporal effect in simultaneous masking with pure tones Sid P. Bacon a) and Eric W. Healy Psychoacoustics Laboratory, Department of Speech and

More information

I. INTRODUCTION. Electronic mail:

I. INTRODUCTION. Electronic mail: Neural activity associated with distinguishing concurrent auditory objects Claude Alain, a) Benjamin M. Schuler, and Kelly L. McDonald Rotman Research Institute, Baycrest Centre for Geriatric Care, 3560

More information

Tempo and Beat Analysis

Tempo and Beat Analysis Advanced Course Computer Science Music Processing Summer Term 2010 Meinard Müller, Peter Grosche Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Tempo and Beat Analysis Musical Properties:

More information

Topic 10. Multi-pitch Analysis

Topic 10. Multi-pitch Analysis Topic 10 Multi-pitch Analysis What is pitch? Common elements of music are pitch, rhythm, dynamics, and the sonic qualities of timbre and texture. An auditory perceptual attribute in terms of which sounds

More information

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES

MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES MUSICAL INSTRUMENT RECOGNITION WITH WAVELET ENVELOPES PACS: 43.60.Lq Hacihabiboglu, Huseyin 1,2 ; Canagarajah C. Nishan 2 1 Sonic Arts Research Centre (SARC) School of Computer Science Queen s University

More information

MASTER'S THESIS. Listener Envelopment

MASTER'S THESIS. Listener Envelopment MASTER'S THESIS 2008:095 Listener Envelopment Effects of changing the sidewall material in a model of an existing concert hall Dan Nyberg Luleå University of Technology Master thesis Audio Technology Department

More information

We realize that this is really small, if we consider that the atmospheric pressure 2 is

We realize that this is really small, if we consider that the atmospheric pressure 2 is PART 2 Sound Pressure Sound Pressure Levels (SPLs) Sound consists of pressure waves. Thus, a way to quantify sound is to state the amount of pressure 1 it exertsrelatively to a pressure level of reference.

More information

Spatial-frequency masking with briefly pulsed patterns

Spatial-frequency masking with briefly pulsed patterns Perception, 1978, volume 7, pages 161-166 Spatial-frequency masking with briefly pulsed patterns Gordon E Legge Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA Michael

More information

Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio

Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio Application Of Missing Feature Theory To The Recognition Of Musical Instruments In Polyphonic Audio Jana Eggink and Guy J. Brown Department of Computer Science, University of Sheffield Regent Court, 11

More information

A Behavioral Study on the Effects of Rock Music on Auditory Attention

A Behavioral Study on the Effects of Rock Music on Auditory Attention A Behavioral Study on the Effects of Rock Music on Auditory Attention Letizia Marchegiani 1 and Xenofon Fafoutis 2 1 Language and Speech Laboratory, Faculty of Arts, University of Basque Country l.marchegiani@laslab.org

More information

Why are natural sounds detected faster than pips?

Why are natural sounds detected faster than pips? Why are natural sounds detected faster than pips? Clara Suied Department of Physiology, Development and Neuroscience, Centre for the Neural Basis of Hearing, Downing Street, Cambridge CB2 3EG, United Kingdom

More information

Pitch: The Perceptual Ends of the Periodicity; but Of What Periodicity?

Pitch: The Perceptual Ends of the Periodicity; but Of What Periodicity? Pitch: The Perceptual Ends of the Periodicity; but Of What Periodicity? 1 Minoru TSUZAKI ; Sawa HANADA 1,2 ; Junko SONODA 1,3 ; Satomi TANAKA 1,4 ; Toshio IRINO 5 1 Kyoto City University of Arts, Japan

More information

Topic 4. Single Pitch Detection

Topic 4. Single Pitch Detection Topic 4 Single Pitch Detection What is pitch? A perceptual attribute, so subjective Only defined for (quasi) harmonic sounds Harmonic sounds are periodic, and the period is 1/F0. Can be reliably matched

More information

EFFECT OF REPETITION OF STANDARD AND COMPARISON TONES ON RECOGNITION MEMORY FOR PITCH '

EFFECT OF REPETITION OF STANDARD AND COMPARISON TONES ON RECOGNITION MEMORY FOR PITCH ' Journal oj Experimental Psychology 1972, Vol. 93, No. 1, 156-162 EFFECT OF REPETITION OF STANDARD AND COMPARISON TONES ON RECOGNITION MEMORY FOR PITCH ' DIANA DEUTSCH " Center for Human Information Processing,

More information

1 Introduction to PSQM

1 Introduction to PSQM A Technical White Paper on Sage s PSQM Test Renshou Dai August 7, 2000 1 Introduction to PSQM 1.1 What is PSQM test? PSQM stands for Perceptual Speech Quality Measure. It is an ITU-T P.861 [1] recommended

More information

UNIVERSITY OF DUBLIN TRINITY COLLEGE

UNIVERSITY OF DUBLIN TRINITY COLLEGE UNIVERSITY OF DUBLIN TRINITY COLLEGE FACULTY OF ENGINEERING & SYSTEMS SCIENCES School of Engineering and SCHOOL OF MUSIC Postgraduate Diploma in Music and Media Technologies Hilary Term 31 st January 2005

More information

2 Autocorrelation verses Strobed Temporal Integration

2 Autocorrelation verses Strobed Temporal Integration 11 th ISH, Grantham 1997 1 Auditory Temporal Asymmetry and Autocorrelation Roy D. Patterson* and Toshio Irino** * Center for the Neural Basis of Hearing, Physiology Department, Cambridge University, Downing

More information

Do Zwicker Tones Evoke a Musical Pitch?

Do Zwicker Tones Evoke a Musical Pitch? Do Zwicker Tones Evoke a Musical Pitch? Hedwig E. Gockel and Robert P. Carlyon Abstract It has been argued that musical pitch, i.e. pitch in its strictest sense, requires phase locking at the level of

More information

Consonance perception of complex-tone dyads and chords

Consonance perception of complex-tone dyads and chords Downloaded from orbit.dtu.dk on: Nov 24, 28 Consonance perception of complex-tone dyads and chords Rasmussen, Marc; Santurette, Sébastien; MacDonald, Ewen Published in: Proceedings of Forum Acusticum Publication

More information

Modeling sound quality from psychoacoustic measures

Modeling sound quality from psychoacoustic measures Modeling sound quality from psychoacoustic measures Lena SCHELL-MAJOOR 1 ; Jan RENNIES 2 ; Stephan D. EWERT 3 ; Birger KOLLMEIER 4 1,2,4 Fraunhofer IDMT, Hör-, Sprach- und Audiotechnologie & Cluster of

More information

Music Source Separation

Music Source Separation Music Source Separation Hao-Wei Tseng Electrical and Engineering System University of Michigan Ann Arbor, Michigan Email: blakesen@umich.edu Abstract In popular music, a cover version or cover song, or

More information

CHAPTER 20.2 SPEECH AND MUSICAL SOUNDS

CHAPTER 20.2 SPEECH AND MUSICAL SOUNDS Source: STANDARD HANDBOOK OF ELECTRONIC ENGINEERING CHAPTER 20.2 SPEECH AND MUSICAL SOUNDS Daniel W. Martin, Ronald M. Aarts SPEECH SOUNDS Speech Level and Spectrum Both the sound-pressure level and the

More information

I. LISTENING. For most people, sound is background only. To the sound designer/producer, sound is everything.!tc 243 2

I. LISTENING. For most people, sound is background only. To the sound designer/producer, sound is everything.!tc 243 2 To use sound properly, and fully realize its power, we need to do the following: (1) listen (2) understand basics of sound and hearing (3) understand sound's fundamental effects on human communication

More information

MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION

MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION Michael Epstein 1,2, Mary Florentine 1,3, and Søren Buus 1,2 1Institute for Hearing, Speech, and Language 2Communications and Digital

More information

Effect of room acoustic conditions on masking efficiency

Effect of room acoustic conditions on masking efficiency Effect of room acoustic conditions on masking efficiency Hyojin Lee a, Graduate school, The University of Tokyo Komaba 4-6-1, Meguro-ku, Tokyo, 153-855, JAPAN Kanako Ueno b, Meiji University, JAPAN Higasimita

More information

PSYCHOACOUSTICS & THE GRAMMAR OF AUDIO (By Steve Donofrio NATF)

PSYCHOACOUSTICS & THE GRAMMAR OF AUDIO (By Steve Donofrio NATF) PSYCHOACOUSTICS & THE GRAMMAR OF AUDIO (By Steve Donofrio NATF) "The reason I got into playing and producing music was its power to travel great distances and have an emotional impact on people" Quincey

More information

A SEMANTIC DIFFERENTIAL STUDY OF LOW AMPLITUDE SUPERSONIC AIRCRAFT NOISE AND OTHER TRANSIENT SOUNDS

A SEMANTIC DIFFERENTIAL STUDY OF LOW AMPLITUDE SUPERSONIC AIRCRAFT NOISE AND OTHER TRANSIENT SOUNDS 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 A SEMANTIC DIFFERENTIAL STUDY OF LOW AMPLITUDE SUPERSONIC AIRCRAFT NOISE AND OTHER TRANSIENT SOUNDS PACS: 43.28.Mw Marshall, Andrew

More information

Experiments on tone adjustments

Experiments on tone adjustments Experiments on tone adjustments Jesko L. VERHEY 1 ; Jan HOTS 2 1 University of Magdeburg, Germany ABSTRACT Many technical sounds contain tonal components originating from rotating parts, such as electric

More information

POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS

POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS POST-PROCESSING FIDDLE : A REAL-TIME MULTI-PITCH TRACKING TECHNIQUE USING HARMONIC PARTIAL SUBTRACTION FOR USE WITHIN LIVE PERFORMANCE SYSTEMS Andrew N. Robertson, Mark D. Plumbley Centre for Digital Music

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Musical Acoustics Session 3pMU: Perception and Orchestration Practice

More information

Effect of harmonic rank on sequential sound segregation

Effect of harmonic rank on sequential sound segregation Downloaded from orbit.dtu.dk on: Jan 06, 2019 Effect of harmonic rank on sequential sound segregation Madsen, Sara Miay Kim; Dau, Torsten; Moore, Brian C.J. Published in: Hearing Research Link to article,

More information

THE PSYCHOACOUSTICS OF MULTICHANNEL AUDIO. J. ROBERT STUART Meridian Audio Ltd Stonehill, Huntingdon, PE18 6ED England

THE PSYCHOACOUSTICS OF MULTICHANNEL AUDIO. J. ROBERT STUART Meridian Audio Ltd Stonehill, Huntingdon, PE18 6ED England THE PSYCHOACOUSTICS OF MULTICHANNEL AUDIO J. ROBERT STUART Meridian Audio Ltd Stonehill, Huntingdon, PE18 6ED England ABSTRACT This is a tutorial paper giving an introduction to the perception of multichannel

More information

International Journal of Computer Architecture and Mobility (ISSN ) Volume 1-Issue 7, May 2013

International Journal of Computer Architecture and Mobility (ISSN ) Volume 1-Issue 7, May 2013 Carnatic Swara Synthesizer (CSS) Design for different Ragas Shruti Iyengar, Alice N Cheeran Abstract Carnatic music is one of the oldest forms of music and is one of two main sub-genres of Indian Classical

More information

Speech and Speaker Recognition for the Command of an Industrial Robot

Speech and Speaker Recognition for the Command of an Industrial Robot Speech and Speaker Recognition for the Command of an Industrial Robot CLAUDIA MOISA*, HELGA SILAGHI*, ANDREI SILAGHI** *Dept. of Electric Drives and Automation University of Oradea University Street, nr.

More information

Concert halls conveyors of musical expressions

Concert halls conveyors of musical expressions Communication Acoustics: Paper ICA216-465 Concert halls conveyors of musical expressions Tapio Lokki (a) (a) Aalto University, Dept. of Computer Science, Finland, tapio.lokki@aalto.fi Abstract: The first

More information

CTP431- Music and Audio Computing Musical Acoustics. Graduate School of Culture Technology KAIST Juhan Nam

CTP431- Music and Audio Computing Musical Acoustics. Graduate School of Culture Technology KAIST Juhan Nam CTP431- Music and Audio Computing Musical Acoustics Graduate School of Culture Technology KAIST Juhan Nam 1 Outlines What is sound? Physical view Psychoacoustic view Sound generation Wave equation Wave

More information

Loudness and Sharpness Calculation

Loudness and Sharpness Calculation 10/16 Loudness and Sharpness Calculation Psychoacoustics is the science of the relationship between physical quantities of sound and subjective hearing impressions. To examine these relationships, physical

More information

Acoustic Prosodic Features In Sarcastic Utterances

Acoustic Prosodic Features In Sarcastic Utterances Acoustic Prosodic Features In Sarcastic Utterances Introduction: The main goal of this study is to determine if sarcasm can be detected through the analysis of prosodic cues or acoustic features automatically.

More information

Music Representations

Music Representations Lecture Music Processing Music Representations Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Book: Fundamentals of Music Processing Meinard Müller Fundamentals

More information

Welcome to Vibrationdata

Welcome to Vibrationdata Welcome to Vibrationdata Acoustics Shock Vibration Signal Processing February 2004 Newsletter Greetings Feature Articles Speech is perhaps the most important characteristic that distinguishes humans from

More information

Real-time magnetic resonance imaging investigation of resonance tuning in soprano singing

Real-time magnetic resonance imaging investigation of resonance tuning in soprano singing E. Bresch and S. S. Narayanan: JASA Express Letters DOI: 1.1121/1.34997 Published Online 11 November 21 Real-time magnetic resonance imaging investigation of resonance tuning in soprano singing Erik Bresch

More information

Analysis, Synthesis, and Perception of Musical Sounds

Analysis, Synthesis, and Perception of Musical Sounds Analysis, Synthesis, and Perception of Musical Sounds The Sound of Music James W. Beauchamp Editor University of Illinois at Urbana, USA 4y Springer Contents Preface Acknowledgments vii xv 1. Analysis

More information

Auditory Illusions. Diana Deutsch. The sounds we perceive do not always correspond to those that are

Auditory Illusions. Diana Deutsch. The sounds we perceive do not always correspond to those that are In: E. Bruce Goldstein (Ed) Encyclopedia of Perception, Volume 1, Sage, 2009, pp 160-164. Auditory Illusions Diana Deutsch The sounds we perceive do not always correspond to those that are presented. When

More information

Quarterly Progress and Status Report. Violin timbre and the picket fence

Quarterly Progress and Status Report. Violin timbre and the picket fence Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Violin timbre and the picket fence Jansson, E. V. journal: STL-QPSR volume: 31 number: 2-3 year: 1990 pages: 089-095 http://www.speech.kth.se/qpsr

More information

ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION

ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION ONLINE ACTIVITIES FOR MUSIC INFORMATION AND ACOUSTICS EDUCATION AND PSYCHOACOUSTIC DATA COLLECTION Travis M. Doll Ray V. Migneco Youngmoo E. Kim Drexel University, Electrical & Computer Engineering {tmd47,rm443,ykim}@drexel.edu

More information

Appendix D. UW DigiScope User s Manual. Willis J. Tompkins and Annie Foong

Appendix D. UW DigiScope User s Manual. Willis J. Tompkins and Annie Foong Appendix D UW DigiScope User s Manual Willis J. Tompkins and Annie Foong UW DigiScope is a program that gives the user a range of basic functions typical of a digital oscilloscope. Included are such features

More information

2. AN INTROSPECTION OF THE MORPHING PROCESS

2. AN INTROSPECTION OF THE MORPHING PROCESS 1. INTRODUCTION Voice morphing means the transition of one speech signal into another. Like image morphing, speech morphing aims to preserve the shared characteristics of the starting and final signals,

More information

The perception of concurrent sound objects through the use of harmonic enhancement: a study of auditory attention

The perception of concurrent sound objects through the use of harmonic enhancement: a study of auditory attention Atten Percept Psychophys (2015) 77:922 929 DOI 10.3758/s13414-014-0826-9 The perception of concurrent sound objects through the use of harmonic enhancement: a study of auditory attention Elena Koulaguina

More information

CSC475 Music Information Retrieval

CSC475 Music Information Retrieval CSC475 Music Information Retrieval Monophonic pitch extraction George Tzanetakis University of Victoria 2014 G. Tzanetakis 1 / 32 Table of Contents I 1 Motivation and Terminology 2 Psychacoustics 3 F0

More information

AN ALGORITHM FOR LOCATING FUNDAMENTAL FREQUENCY (F0) MARKERS IN SPEECH

AN ALGORITHM FOR LOCATING FUNDAMENTAL FREQUENCY (F0) MARKERS IN SPEECH AN ALGORITHM FOR LOCATING FUNDAMENTAL FREQUENCY (F0) MARKERS IN SPEECH by Princy Dikshit B.E (C.S) July 2000, Mangalore University, India A Thesis Submitted to the Faculty of Old Dominion University in

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Psychological and Physiological Acoustics Session 5aPP: Recent Trends

More information

On Human Capability and Acoustic Cues for Discriminating Singing and Speaking Voices

On Human Capability and Acoustic Cues for Discriminating Singing and Speaking Voices On Human Capability and Acoustic Cues for Discriminating Singing and Speaking Voices Yasunori Ohishi 1 Masataka Goto 3 Katunobu Itou 2 Kazuya Takeda 1 1 Graduate School of Information Science, Nagoya University,

More information

Department of Psychology, University of York. NIHR Nottingham Hearing Biomedical Research Unit. Hull York Medical School, University of York

Department of Psychology, University of York. NIHR Nottingham Hearing Biomedical Research Unit. Hull York Medical School, University of York 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 1 Peripheral hearing loss reduces

More information

Digital music synthesis using DSP

Digital music synthesis using DSP Digital music synthesis using DSP Rahul Bhat (124074002), Sandeep Bhagwat (123074011), Gaurang Naik (123079009), Shrikant Venkataramani (123079042) DSP Application Assignment, Group No. 4 Department of

More information

A prototype system for rule-based expressive modifications of audio recordings

A prototype system for rule-based expressive modifications of audio recordings International Symposium on Performance Science ISBN 0-00-000000-0 / 000-0-00-000000-0 The Author 2007, Published by the AEC All rights reserved A prototype system for rule-based expressive modifications

More information

Singing voice synthesis in Spanish by concatenation of syllables based on the TD-PSOLA algorithm

Singing voice synthesis in Spanish by concatenation of syllables based on the TD-PSOLA algorithm Singing voice synthesis in Spanish by concatenation of syllables based on the TD-PSOLA algorithm ALEJANDRO RAMOS-AMÉZQUITA Computer Science Department Tecnológico de Monterrey (Campus Ciudad de México)

More information

Influence of tonal context and timbral variation on perception of pitch

Influence of tonal context and timbral variation on perception of pitch Perception & Psychophysics 2002, 64 (2), 198-207 Influence of tonal context and timbral variation on perception of pitch CATHERINE M. WARRIER and ROBERT J. ZATORRE McGill University and Montreal Neurological

More information

Sound Quality Analysis of Electric Parking Brake

Sound Quality Analysis of Electric Parking Brake Sound Quality Analysis of Electric Parking Brake Bahare Naimipour a Giovanni Rinaldi b Valerie Schnabelrauch c Application Research Center, Sound Answers Inc. 6855 Commerce Boulevard, Canton, MI 48187,

More information

Symmetric interactions and interference between pitch and timbre

Symmetric interactions and interference between pitch and timbre Symmetric interactions and interference between pitch and timbre Emily J. Allen a) and Andrew J. Oxenham Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455 (Received 17 July

More information

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE Copyright SFA - InterNoise 2000 1 inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE I-INCE Classification: 6.1 INFLUENCE OF THE

More information

Speaking in Minor and Major Keys

Speaking in Minor and Major Keys Chapter 5 Speaking in Minor and Major Keys 5.1. Introduction 28 The prosodic phenomena discussed in the foregoing chapters were all instances of linguistic prosody. Prosody, however, also involves extra-linguistic

More information

2018 Fall CTP431: Music and Audio Computing Fundamentals of Musical Acoustics

2018 Fall CTP431: Music and Audio Computing Fundamentals of Musical Acoustics 2018 Fall CTP431: Music and Audio Computing Fundamentals of Musical Acoustics Graduate School of Culture Technology, KAIST Juhan Nam Outlines Introduction to musical tones Musical tone generation - String

More information

ACCURATE ANALYSIS AND VISUAL FEEDBACK OF VIBRATO IN SINGING. University of Porto - Faculty of Engineering -DEEC Porto, Portugal

ACCURATE ANALYSIS AND VISUAL FEEDBACK OF VIBRATO IN SINGING. University of Porto - Faculty of Engineering -DEEC Porto, Portugal ACCURATE ANALYSIS AND VISUAL FEEDBACK OF VIBRATO IN SINGING José Ventura, Ricardo Sousa and Aníbal Ferreira University of Porto - Faculty of Engineering -DEEC Porto, Portugal ABSTRACT Vibrato is a frequency

More information

Measurement of overtone frequencies of a toy piano and perception of its pitch

Measurement of overtone frequencies of a toy piano and perception of its pitch Measurement of overtone frequencies of a toy piano and perception of its pitch PACS: 43.75.Mn ABSTRACT Akira Nishimura Department of Media and Cultural Studies, Tokyo University of Information Sciences,

More information

Processing Linguistic and Musical Pitch by English-Speaking Musicians and Non-Musicians

Processing Linguistic and Musical Pitch by English-Speaking Musicians and Non-Musicians Proceedings of the 20th North American Conference on Chinese Linguistics (NACCL-20). 2008. Volume 1. Edited by Marjorie K.M. Chan and Hana Kang. Columbus, Ohio: The Ohio State University. Pages 139-145.

More information

9.35 Sensation And Perception Spring 2009

9.35 Sensation And Perception Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 9.35 Sensation And Perception Spring 29 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. Hearing Kimo Johnson April

More information

Sound design strategy for enhancing subjective preference of EV interior sound

Sound design strategy for enhancing subjective preference of EV interior sound Sound design strategy for enhancing subjective preference of EV interior sound Doo Young Gwak 1, Kiseop Yoon 2, Yeolwan Seong 3 and Soogab Lee 4 1,2,3 Department of Mechanical and Aerospace Engineering,

More information