Neuroscience Letters xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Neuroscience Letters

journal homepage: www.elsevier.com/locate/neulet

Earlier timbre processing of instrumental tones compared to equally complex spectrally rotated sounds as revealed by the mismatch negativity

Corinna A. Christmann a,∗, Thomas Lachmann a, Stefan Berti b

a Center for Cognitive Science, Cognitive and Developmental Psychology Unit, University of Kaiserslautern, Kaiserslautern, Germany
b Department of Clinical Psychology and Neuropsychology, Institute for Psychology, Johannes Gutenberg-University Mainz, Mainz, Germany

highlights

- We tested whether harmonically rich sounds have sensory processing benefits.
- Equally complex natural sounds and spectrally rotated sounds were applied.
- The mismatch negativity was measured in a passive oddball paradigm.
- MMN peak latency was shorter in natural compared to the spectrally rotated sounds.
- Our results suggest that processing benefits are not an effect of sound complexity.

article info

Article history:
Received 9 May 2014
Received in revised form 9 August 2014
Accepted 20 August 2014
Available online xxx

Keywords:
Event-related brain potentials (ERP)
Mismatch negativity (MMN)
Auditory processing
Stimulus complexity
Musical sounds
Spectrally rotated sounds

abstract

Harmonically rich sounds have been shown to be processed more efficiently by the human brain compared to single sinusoidal tones. To control for stimulus complexity as a potentially confounding factor, tones and equally complex spectrally rotated sounds have been used in the present study to investigate the role of the overtone series in sensory auditory processing in non-musicians.
Timbre differences in instrumental tones with equal pitch elicited a MMN which peaked earlier than that elicited by the spectrally rotated sounds, indicating that harmonically rich tones are processed faster than non-musical sounds without an overtone series, even when pitch is not the relevant information.

© 2014 Published by Elsevier Ireland Ltd.

1. Introduction

Prior research demonstrated that harmonically rich sounds, like vowels and tones, are processed more efficiently on the sensory level (e.g., faster discrimination of sounds) by the human brain compared to single sinusoidal tones [1–3]. These findings were explained in terms of pitch facilitation [4], meaning that the harmonic overtone series supports pitch perception. Moreover, improved discrimination has also been reported in a study manipulating the duration of the stimuli, for which pitch information is not relevant [1].

∗ Corresponding author at: Center for Cognitive Science, Cognitive and Developmental Psychology Unit, University of Kaiserslautern, Postfach 3049, 67653 Kaiserslautern, Germany. Tel.: +49 631 205 3630; fax: +49 631 205 5034. E-mail address: cchristm@rhrk.uni-kl.de (C.A. Christmann).

Single sinusoidal tones are composed of a single frequency; they contain no harmonics of the fundamental frequency and are therefore physically less complex than harmonically rich sounds. It is thus hard to decide whether enhanced processing of complex sounds is a result of the harmonic structure, or solely of higher stimulus complexity. For instance, Berti [5] demonstrated that complex sounds may trigger different routes of sensory processing compared to sinusoidal tones. This might be due to the broader activation of neurons in the auditory sensory cortices by complex stimuli compared to sinusoidal stimuli.
In addition, it has been shown that instrumental tones are processed in a different cortical area than chords [6] (see [7] for a recent review), indicating that stimulus complexity might influence the processing of musical sounds. To address this question more directly, the present study tested whether musical sounds (tones) possess processing advantages compared to non-musical

http://dx.doi.org/10.1016/j.neulet.2014.08.035
0304-3940/© 2014 Published by Elsevier Ireland Ltd.
sounds without a harmonic overtone series, when stimulus complexity is controlled.

One objective and reliable correlate of automatic sensory processing is the so-called mismatch negativity (MMN) of the human event-related potential (ERP) [8,9]. Within a classical oddball paradigm, the ERP elicited by a frequently presented standard stimulus is subtracted from the ERP elicited by an infrequently presented deviant stimulus. The resulting difference curve shows a negative peak between 150 and 250 ms, known as the MMN. Early MMN studies applied pure sinusoidal tones [8,10,11], but it has been shown that the MMN can also be elicited with stimuli of higher complexity, like noise bursts, harmonic tones, chords, and speech stimuli such as vowels, syllables and words [12]. Moreover, the MMN is also sensitive to differences in timbre [13–15] and is therefore very suitable for the investigation of the sensory processing of musical stimuli.

To test for processing differences, we applied an auditory oddball paradigm with two types of complex stimuli, each serving as standard and as deviant. As mentioned above, if the proposed effective processing of musical sounds is due to pitch facilitation, the MMN elicited by harmonically rich sounds should differ from that elicited by equally complex sounds lacking the harmonic structure. In order to create non-musical sounds with the same complexity as the tones, we chose a procedure that was originally introduced for speech research: spectral rotation [16–18]. If the center frequency is chosen carefully (in particular, it must not be a harmonic of the original tone), the overtone series of the tone is eliminated completely by this procedure. Therefore, the resulting spectrally rotated sound is perceived as noise-like.
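The subtraction and peak-latency logic described above can be sketched in a few lines of Python/NumPy. This is an illustrative fragment with our own function and variable names, not the analysis code used in the study (which relied on MATLAB toolboxes); the ERPs below are synthetic toy data:

```python
import numpy as np

def mmn_difference(standard_erp, deviant_erp, fs, t0=-0.2, win=(0.1, 0.3)):
    """Deviant-minus-standard difference wave and the latency (in s,
    relative to stimulus onset) of its most negative peak inside `win`.
    `fs` is the sampling rate; `t0` is the time of the first sample."""
    diff = deviant_erp - standard_erp
    times = t0 + np.arange(diff.size) / fs
    mask = (times >= win[0]) & (times <= win[1])
    peak_idx = np.flatnonzero(mask)[np.argmin(diff[mask])]
    return diff, times[peak_idx]

# Toy ERPs: the deviant carries an extra negativity around 180 ms
fs = 500.0
times = -0.2 + np.arange(350) / fs
standard = np.zeros_like(times)
deviant = -2.0 * np.exp(-((times - 0.18) ** 2) / (2 * 0.02 ** 2))
diff, latency = mmn_difference(standard, deviant, fs)
```

With these toy inputs, the recovered peak latency falls at the injected 180 ms negativity, mirroring how the MMN peak is read off a real difference curve.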
To the best of our knowledge, this is the first time that spectral rotation has been applied to musical stimuli in auditory research.

2. Material and methods

Fourteen right-handed students of the Johannes Gutenberg-University Mainz, Germany, participated in the study. All participants were native speakers of German and reported normal hearing. Moreover, in accordance with the Declaration of Helsinki, all participants gave written consent after the nature of the experiment was explained to them. The data sets of three participants were excluded because of a poor signal-to-noise ratio in the EEG data due to a high number of artifacts. The remaining eleven participants were aged 18–30 years (5 male). Four of them had never learned to play an instrument; the others had never played an instrument at a professional level.

Two different natural tones were used, one generated by a saxophone, the other by a clarinet. The pitch was 185 Hz (F#) for both, so the two tones differed only with respect to timbre; both sounds had a duration of 45 ms. To create non-harmonic stimuli with the same complexity as the natural tones, we produced one spectrally rotated counterpart for each tone, using a MATLAB (Version R2011a, Mathworks, Ismaning, Germany) script for stimulus processing provided by Scott and colleagues [20]. Comparable to prior research, 2000 Hz was chosen as the center frequency for the rotation; that is, all frequencies are mirrored at this frequency. In order to maintain the perceived naturalness of the tones, which could be affected by the use of the low-pass filter (for more details see [16,17,21]), we modified the procedure to create spectrally rotated stimuli with a complete spectrum. This was achieved by adding all frequencies of the tone above 4000 Hz to the spectrally rotated stimulus with Audition (Version CS5.5, Adobe, Munich, Germany).
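The rotation step can be illustrated with a short FFT-based sketch (Python/NumPy here rather than the original MATLAB script; `spectrally_rotate` and its parameter names are our own, and the code assumes 2 × center ≤ fs/2). Components below 4000 Hz are mirrored around the 2000 Hz center, while the band above 4000 Hz is carried over unchanged, as in the modified procedure described above:

```python
import numpy as np

def spectrally_rotate(x, fs, center=2000.0, upper=4000.0):
    """Mirror all spectral components below `upper` around `center`,
    leaving components above `upper` untouched (illustrative sketch)."""
    n = x.size
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    df = fs / n
    out = spec.copy()
    for k, f in enumerate(freqs):
        if f < upper:
            # the component at f is taken from the mirrored frequency 2*center - f
            out[k] = spec[int(round((2.0 * center - f) / df))]
    return np.fft.irfft(out, n)

# A 1000 Hz tone should end up at 3000 Hz after rotation around 2000 Hz
fs, n = 16000, 16000
t = np.arange(n) / fs
rotated = spectrally_rotate(np.sin(2 * np.pi * 1000 * t), fs)
```

Because a harmonic series (f0, 2·f0, 3·f0, …) is not, in general, mapped onto another harmonic series by this mirroring, the rotated version loses its overtone structure while keeping the same number of spectral components.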
As a result, the lower part of the spectrum, below 4000 Hz, was modified by the inversion, whereas the upper frequencies were not affected (see Fig. 1).

Fig. 1. Spectrograms of the two tones (above) and the two spectrally rotated tones (below). The sounds based on the clarinet are on the left side, those based on the saxophone on the right side. Frequencies above 4000 Hz were not affected by the spectral rotation. Below 4000 Hz, all frequencies were mirrored around 2000 Hz. (The black line is only included for display purposes and marks 4000 Hz.)
Fig. 2. Difference curves for the natural tones and the spectrally rotated sounds at all electrode positions (F3, Fz, F4, Cz, Pz, LM, RM; x-axes: −200 to 400 ms; y-axes: voltage in µV). Positive values are plotted upwards. In both conditions (natural tones and spectrally rotated sounds) deviant stimuli elicited a MMN; in addition, polarity is reversed at the mastoids due to the nose reference. (A) Difference waves for the clarinet and (B) difference waves for the saxophone.

The resulting stimuli show exactly the same range of frequencies and the same complexity as the unfiltered tones (see also [17,22]). The overtone series is eliminated by the rotation, as the fundamental frequency is changed; therefore, the spectrally rotated sounds are perceived as noise-like. After the EEG session, each participant completed a questionnaire in which he/she was asked to identify all kinds of stimuli heard during the session. All participants identified the musical stimuli as tones and the spectrally rotated tones as some kind of noise.

A classical oddball paradigm with a stimulus-onset asynchrony (SOA) of 500 ms was applied. All stimuli were presented binaurally via an external soundcard (UGM96, ESI Audiotechnik GmbH, Leonberg, Germany) and closed headphones (Beyerdynamic DT 770) at a sound pressure level (SPL) of 60 dB(A). The SPL was measured with an artificial head (HSM III.0, HEAD acoustics, Aachen, Germany).
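An oddball block of this kind can be generated as follows. This is an illustrative sketch with our own function names; the constraint that two deviants never occur back-to-back is a common convention in oddball designs that we assume here, not a detail reported in the text:

```python
import random

def oddball_block(n_std=1050, n_dev=200, seed=0):
    """Pseudorandom oddball sequence ('S' = standard, 'D' = deviant).
    Deviants are placed into distinct gaps between standards, so no two
    deviants are adjacent (an assumed, not reported, constraint)."""
    rng = random.Random(seed)
    # distinct insertion gaps, processed back-to-front so earlier
    # insertions do not shift the remaining gap indices
    gaps = sorted(rng.sample(range(n_std + 1), n_dev), reverse=True)
    seq = ['S'] * n_std
    for g in gaps:
        seq.insert(g, 'D')
    return seq

block = oddball_block()
```

With 1050 standards and 200 deviants the deviant probability is 200/1250 = 16%, matching the proportions used per block.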
Presentation (Version 4.5, Neurobehavioral Systems, Albany, California) was used to control the experimental protocol. All sessions took place in an acoustically attenuated and electrically shielded chamber. There were separate blocks for the natural tones (clarinet and saxophone) and the spectrally rotated sounds. Additionally, every stimulus was presented as standard in one block and as deviant in another. In total, four blocks were presented to each participant; the sequence of the blocks was varied across participants. Each block contained 1050 standard stimuli (p = 84%) and 200 deviant stimuli (p = 16%).

The electroencephalogram (EEG) was recorded continuously with a SynAmps amplifier (Neuroscan, Sterling, Virginia); electrode impedance was kept under 5 kΩ. Seven Ag/AgCl electrodes were attached according to the 10–20 system at F3, Fz, F4, Cz and Pz, and additionally on the left and right mastoids (LM and RM). The reference electrode was placed on the tip of the nose. Moreover, the vertical and horizontal electrooculogram (EOG) was recorded to control for eye movements. The EEG was recorded with a sampling rate of 500 Hz; an online 0.05–70 Hz band-pass and an additional 50 Hz notch filter were applied.

All EEG analyses were performed with the MATLAB toolbox ERPLAB [23], which is integrated in the EEGLAB toolbox [24]. First, an offline band-pass filter from 1 to 30 Hz was applied. The ERPs were computed separately for each standard and deviant. The time window ranged from 200 ms before to 500 ms after stimulus onset; the first 200 ms served as baseline for the averaged signal. The first 10 standards of each block were excluded from ERP
computation. In addition, all EEG epochs containing artifacts (i.e., eye movements or blinks) were excluded from further analysis. In detail, epochs in which the absolute (peak-to-peak) voltage change in the EOG within a 200 ms sliding time window exceeded 75 µV were not included in ERP averaging. Each individual ERP was based on at least 20 trials. The difference curve for each stimulus type (clarinet, saxophone, spectrally rotated clarinet, spectrally rotated saxophone) was formed by subtracting the ERP of the standard from the ERP of the deviant. The latency of the local amplitude peak of each difference curve was established within a time window between 100 and 300 ms for each electrode individually. The size of the MMN was calculated as the area under the difference curve within a time window of 50 ms around the peak latency [25].

Table 1
Mean peak latency (in ms) and standard error of the mean (SE) for each instrument (clarinet vs. saxophone), stimulus type (tone vs. rotated sound) and electrode position.

                        F3      Fz      F4      Cz      Pz      LM      RM
Clarinet
  Tone          Mean  174.23  167.45  165.79  172.72  180.96  182.0   165.82
                SE     10.23    9.9     9.99   10.66    4.68   13.64   11.9
  Rotated sound Mean  211.79  210.2   211.86  196.62  187.70  211.79  222.87
                SE     10.35   11.7    10.30    4.22    7.0     9.58    9.7
Saxophone
  Tone          Mean  188.2   187.7   187.63  171.84  189.98  180.36  184.53
                SE      9.94   10.0    10.4    12.07    7.32    8.46    9.55
  Rotated sound Mean  206.52  209.90  206.84  200.98  204.0   208.04  217.83
                SE     11.50   11.73   11.45   12.48    6.90    7.33    8.0

3. Results

We conducted two repeated-measures analyses of variance (ANOVA), one based on the area of the MMN, the other on the latency of the local amplitude peak. Three within-subject factors were included: Stimulus Type (natural tone vs. spectrally rotated sound), Type of Instrument (clarinet vs.
saxophone), and Position of Electrode (F3, Fz, F4, Cz, Pz, LM, RM). Greenhouse–Geisser correction was applied whenever the numerator degrees of freedom were two or more; uncorrected degrees of freedom and corrected p-values are reported.

The ANOVA based on the area of the MMN revealed a main effect of Position of Electrode, F(6, 60) = 33.47, p < .01, ηp² = .77, with decreasing size from frontal to central sites and the mastoids. There were no other significant main effects or interactions. Fig. 2 depicts the difference waveforms for each Stimulus Type, Position of Electrode and Type of Instrument (see Fig. 2(A) for the clarinet and (B) for the saxophone). The MMN differed significantly from zero at all relevant electrodes (all ts(10) ≥ 3.82, ps < .01).

In contrast to the area of the MMN, the ANOVA based on the peak latency revealed a main effect of Stimulus Type (see also Fig. 2), F(1, 10) = 30.1, p < .01, ηp² = .67. The MMN elicited by the tones was earlier compared to that elicited by the spectrally rotated tones, t(10) = 3.90, p < .01, d = 1.17 for the clarinet, and t(10) = 2.20, p = .05, d = 0.67 for the saxophone. All remaining main effects and interactions did not reach significance. Mean latencies and standard errors of the mean for each Type of Stimulus, Instrument and Position of Electrode are depicted in Table 1.

4. Discussion

Harmonically rich sounds have been shown to be processed more efficiently by the human brain compared to single sinusoidal tones [1–3], as the overtone series seems to facilitate pitch detection [4]. To control for stimulus complexity as a potentially confounding factor with regard to the harmonic structure, we applied natural tones and equally complex spectrally rotated sounds. Morphologically similar MMN responses were elicited in both stimulus conditions. Therefore, stimulus complexity cannot explain the latency of the MMN, which, however, differed between the two types of stimuli, suggesting more efficient processing of the stimuli preserving the whole overtone series [4]. Importantly, the results of the present study are in line with the findings reported by Čeponienė and colleagues [1]: differences in duration were processed more efficiently in harmonically rich tones compared to single sinusoidal tones. Our findings, together with the studies by Čeponienė and colleagues [1] and Tervaniemi and colleagues [2,3], support the assumption that sensory processing benefits from the internal structure of complex sounds (e.g., harmonic structure). In addition, our study suggests that stimulus complexity is capable of enhancing the processing of auditory stimuli in a more general fashion (see also [5]), e.g., by recruiting additional neurons at the level of the auditory sensory system.

The main purpose of the present study was to manipulate one sensory stimulus feature while maintaining stimulus complexity. We did this by means of spectral rotation, a procedure originally developed to transform speech into non-speech stimuli of similar complexity [16]. This shows that the approach is not limited to speech research. While in prior research the effect of complexity was measured by comparing complex vs. less complex sounds, our procedure allowed us to modify the structure within a complex stimulus. The present study demonstrates that differences in the configuration of complex stimuli can influence their processing on the sensory level. As mentioned above, this facilitation effect is not restricted to pitch perception.

5. Conclusions

The present results support the idea of a general processing benefit for complex sounds and demonstrate that harmonically rich tones are processed more efficiently (i.e., with regard to the present study, faster) compared to non-musical sounds without an overtone series. Moreover, the present study showed that the procedure of spectral rotation can also be applied in the context of musical notes.

References

[1] R. Čeponienė, K. Yaguchi, A. Shestakova, P. Alku, K. Suominen, et al., Sound complexity and speechness effects on pre-attentive auditory discrimination in children, Int. J. Psychophysiol. 43 (3) (2002) 199–211.
[2] M. Tervaniemi, T. Ilvonen, J. Sinkkonen, A. Kujala, K. Alho, et al., Harmonic partials facilitate pitch discrimination in humans: electrophysiological and behavioral evidence, Neurosci. Lett. 279 (1) (2000) 29–32.
[3] M. Tervaniemi, E. Schröger, M. Saher, R. Näätänen, Effects of spectral complexity and sound duration on automatic complex-sound pitch processing in humans – a mismatch negativity study, Neurosci. Lett. 290 (1) (2000) 66–70.
[4] M. Tervaniemi, Musical sound processing: EEG and MEG evidence, in: I. Peretz, R.J. Zatorre (Eds.), The Cognitive Neuroscience of Music, Oxford University Press, Oxford, 2003, pp. 294–309.
[5] S. Berti, Automatic processing of rare versus novel auditory stimuli reveal different mechanisms of auditory change detection, Neuroreport (2012) 441–446.
[6] K. Alho, M. Tervaniemi, M. Huotilainen, J. Lavikainen, H. Tiitinen, et al., Processing of complex sounds in the human auditory cortex as revealed by magnetic brain responses, Psychophysiology 33 (4) (1996) 369–375.
[7] S. Koelsch, Towards a neural basis of music perception – a review and updated model, Front. Psychol. 2 (2011) 110.
[8] R. Näätänen, A.W.K. Gaillard, S. Mäntysalo, Early selective-attention effect on evoked potential reinterpreted, Acta Psychol. 42 (1978) 313–329.
[9] R. Näätänen, Mismatch negativity (MMN) as an index of central auditory system plasticity, Int. J. Audiol. 47 (s2) (2008) 16–20.
[10] R. Näätänen, Orienting and evoked potentials, in: H.D. Kimmel, E.H. van Olst, J.F. Orlebeke (Eds.), The Orienting Reflex in Humans, Erlbaum, New Jersey, 1979, pp. 61–75.
[11] R. Näätänen, P.T. Michie, Early selective attention effects on the evoked potential. A critical review and reinterpretation, Biol. Psychol. 8 (1979) 81–136.
[12] R. Näätänen, P. Paavilainen, T. Rinne, K. Alho, The mismatch negativity (MMN) in basic research of central auditory processing: a review, Clin. Neurophysiol. 118 (12) (2007) 2544–2590.
[13] K.N. Goydke, E. Altenmüller, J. Möller, T.F. Münte, Changes in emotional tone and instrumental timbre are reflected by the mismatch negativity, Cogn. Brain Res. 21 (3) (2004) 351–359.
[14] M. Tervaniemi, I. Winkler, R. Näätänen, Pre-attentive categorization of sounds by timbre as revealed by event-related potentials, Neuroreport 8 (1997) 2571–2574.
[15] P. Toiviainen, M. Tervaniemi, J. Louhivuori, M. Saher, M. Huotilainen, et al., Timbre similarity: convergence of neural, behavioral, and computational approaches, Music Percept. 16 (1998) 223–241.
[16] B. Blesser, Speech perception under conditions of spectral transformation: I. Phonetic characteristics, J. Speech Hear. Res. 15 (1972) 5–41.
[17] C.A. Christmann, S. Berti, C. Steinbrink, T. Lachmann, Differences in sensory processing of German vowels and physically matched non-speech sounds as revealed by the mismatch negativity (MMN) of the human event-related brain potential (ERP), Brain Lang. 136 (2014) 8–18.
[18] S.K. Scott, R.J.S. Wise, The functional neuroanatomy of prelexical processing in speech perception, Cognition 92 (1–2) (2004) 13–45.
[19] S. Koelsch, E. Schröger, M. Tervaniemi, Superior pre-attentive auditory processing in musicians, Neuroreport 10 (6) (1999) 1309–1313.
[20] S.K. Scott, C. Blank, S. Rosen, R.J.S. Wise, Identification of a pathway for intelligible speech in the left temporal lobe, Brain 123 (12) (2000) 2400–2406.
[21] B.C.J. Moore, C.-T. Tan, Perceived naturalness of spectrally distorted speech and music, J. Acoust. Soc. Am. 114 (1) (2003) 408–419.
[22] C.A. Christmann, T. Lachmann, C. Steinbrink, Evidence for a general auditory processing deficit in developmental dyslexia from a discrimination paradigm using speech vs. non-speech sounds matched in complexity, 2014 (submitted for publication).
[23] J. Lopez-Calderon, S.J. Luck, ERPLAB: an open-source toolbox for the analysis of event-related potentials, Front. Hum. Neurosci. 8 (2014) 213.
[24] A. Delorme, S. Makeig, EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis, J. Neurosci. Methods 134 (1) (2004) 9–21.
[25] M. Beauchemin, L. De Beaumont, Statistical analysis of the mismatch negativity: to a dilemma, an answer, Tutor. Quant. Methods Psychol. 1 (1) (2005) 18–24.