Audio Engineering Society. Convention Paper. Presented at the 130th Convention 2011 May London, UK


Audio Engineering Society Convention Paper
Presented at the 130th Convention, 2011 May 13-16, London, UK

The papers at this Convention have been selected on the basis of a submitted abstract and extended précis that have been peer reviewed by at least two qualified anonymous reviewers. This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

Development of a Virtual Performance Studio with application of Virtual Acoustic Recording Methods

Iain Laird 1, Dr Damian Murphy 2, Dr Paul Chapman 1 and Seb Jouan 3

1 Digital Design Studio, Glasgow School of Art, The Hub @ Pacific Quay, Pacific Drive, Glasgow, UK
i.laird1@student.gsa.ac.uk, p.chapman@gsa.ac.uk

2 Audio Lab, Department of Electronics, University of York, Heslington, York, UK
dtm3@ohm.york.ac.uk

3 Arup, 225 Bath Street, Glasgow, UK
seb.jouan@arup.com

ABSTRACT

A Virtual Performance Studio (VPS) is a space that allows a musician to practice in a virtual version of a real performance space, in order to acclimatise to the acoustic feedback received on stage before physically performing there. Traditional auralisation techniques allow this by convolving the direct sound from the instrument with the appropriate impulse response measured on stage. In order to capture only the direct sound from the instrument, a directional microphone is often used at small distances from the instrument. This can give rise to noticeable tonal distortion due to the proximity effect and spatial sampling of the instrument's directivity function.
This work reports on the construction of a prototype VPS system and goes on to demonstrate how an auralisation can be significantly affected by the placement of the microphone around the instrument, contributing to a reported "PA effect". Informal listening tests have suggested a general preference for auralisations which combine signals from multiple microphones placed around the instrument.

1. INTRODUCTION

It has previously been observed that musicians are sensitive to changes in the acoustic response of a performance space, and that they will adjust their playing technique according to the acoustic response they receive on stage [1]. A Virtual Performance Studio (VPS) is an application of established auralisation techniques which allows a musician to play into a virtual version of a performance space and receive the equivalent acoustic response they would expect to hear when playing on stage. Such a system would allow a musician to acclimatise to a performance space before a performance and adjust their technique accordingly in

rehearsal, rather than on stage in front of an audience. There are many exciting research strands that a technology such as this could support, including investigations into musician stage fright and the possible involvement of musicians in concert hall stage design, as well as applications in more generic virtual reality systems. A simple conceptual VPS consists of the musician practicing in a semi-anechoic space equipped with an ambisonic loudspeaker array. The direct sound of the instrument is captured by a microphone placed close to the instrument and is then convolved in real time with an ambisonic impulse response of the target space, with source and receiver placed in locations reflecting the musician's head (receiver) and instrument (source). The resulting acoustic feedback is then replayed to the musician. The aim of this system is to provide perceptual acoustic conditions equivalent to those of the real space represented by the impulse responses. To avoid system instability or re-auralising the reverberant material, a directional microphone is placed close to the instrument. Gorzel and Kearney [2] have shown that there are observable tonal distortion effects present in auralisations utilising a close, directional microphone technique. This paper reports on the construction of a prototype VPS and the subjective and objective observations made, with specific reference to microphone technique. The paper also reports on a repetition of Kearney's work in Virtual Acoustic Recording in the context of the VPS. The paper is organised as follows: first, an overview of the prototype system and the technologies used is presented; then, a report is made on some initial subjective and objective observations during the testing of this system; finally, the work describes the application of an inverse filtering technique as pioneered by Kearney [10] and discusses its use within the context of a VPS.

2. VIRTUAL PERFORMANCE SPACE

The Virtual Performance Space was developed in October 2010 in the ixdlab at Pacific Quay in Glasgow. The ixdlab is an auralisation suite capable of playing back ambisonic audio over a 12-channel loudspeaker array. It has been acoustically treated to ensure very low background noise and minimal room presence imposed upon any audio played over the array. The space also features a retro-projected stereographic display to facilitate combined 3D audio-visual presentations. The VPS is a prototype ambisonic-based virtual performance environment in which a musician can play into a virtual space and receive the appropriate 3D acoustic response as if they were physically there.

2.1. System Architecture

A schematic view of the initial system is shown in Figure 1. The musician plays into a single AKG C414B microphone (cardioid), which is powered and raised to line level with an M-Audio Audio Buddy and sent to a PC via a SonicCore A16 AD/DA converter connected to an RME Digiface soundcard. The audio is routed to MaxMSP, where a patch convolves the incoming signal with an ambisonic impulse response using four SIR2 VST convolution reverb objects [3] (one for each channel of the ambisonic impulse response). The resulting B-format signal is transferred to another MaxMSP patch, which decodes the ambisonic audio and sends the resultant 12 channels of audio through the RME Digiface and SonicCore A16 to a calibrated array of 12 Yamaha MSP5A powered loudspeakers which surround the musician in a dodecahedral arrangement. The SIR2 plug-ins are set to output a 100% wet and 0% dry signal and are otherwise set carefully to avoid introducing any unwanted pre-delay or other effects. The ambisonic impulse response was obtained using CATT-Acoustic modeling software [4].

Figure 1: Schematic of VPS system showing real-time and offline subsystems

2.2. Impulse Response Modeling

The signal from the instrument is auralised using a B-format impulse response obtained from an acoustic model of a fictional performance space. The model was constructed in Rhino and then imported into CATT-Acoustic using the dxf2geo tool which comes bundled with the CATT software package. The source was modeled as a directional point source 70cm above the ground and 20cm in front of the musician's head, which was modeled 1.4m above the ground as an ambisonic receiver. These values were chosen to reflect a seated musician. The directional response was modeled using CLF data obtained from [5]. The impulse response was modeled with 50,000 rays tracked over a period of 5000ms, and was post-processed within CATT to omit the direct sound of the impulse response: as the musician will receive the direct sound from their own instrument, it does not require auralisation. The impulse responses were then output as four mono 32-bit .wav files for import into the SIR2 convolvers.

Figure 2: Plans of acoustic model used for initial VPS testing

The space shown in Figure 2 was given surface characteristics obtained from [6] to give a sensible decay characteristic. At the musician's position on stage, the Sabine reverberation time of the space is highest at 125 Hz (1.37s) and lowest at 4 kHz (0.79s). The model includes two additional sources on stage, which were used to help verify the model's operation and to provide the functionality of playing with other virtual musicians.
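The real-time convolution stage described in section 2.1 can be sketched offline as follows. This is an illustrative Python reconstruction, not the authors' MaxMSP/SIR2 patch; the instrument signal and impulse response below are toy stand-ins.

```python
# Offline sketch of the four-convolver stage: a mono close-mic signal
# is convolved with each channel (W, X, Y, Z) of a first-order
# ambisonic impulse response. All signals here are toy stand-ins.
import numpy as np
from scipy.signal import fftconvolve

def auralise_bformat(dry, ir_wxyz):
    """dry: (N,) mono instrument signal; ir_wxyz: (4, M) B-format IR.
    Returns a (4, N + M - 1) B-format signal ready for decoding."""
    return np.stack([fftconvolve(dry, ir_wxyz[ch]) for ch in range(4)])

fs = 44100
dry = np.random.randn(fs)            # one second of "instrument" signal
ir = np.zeros((4, fs // 2))
ir[:, 0] = [1.0, 0.7, 0.0, 0.1]      # toy IR: a single early reflection
wet = auralise_bformat(dry, ir)
print(wet.shape)                     # (4, 66149)
```

In the real system each of the four convolutions runs in its own SIR2 instance in real time; the B-format result would then pass to an ambisonic decoder feeding the 12 loudspeakers.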
An additional receiver was modeled as an audience member, to give the functionality of listening back to the musician's performance from a remote perspective.

2.3. Objective Analysis

To help validate the approach and the operation of the system, it was decided to use a logarithmic sine sweep measurement technique within the VPS to compare the acoustic properties of the virtual space with those obtained from the acoustic model. This technique is widely used to measure the impulse response of enclosed spaces and is based on work completed by Farina et al [7]. The experiment utilised two separate subsystems: the first was the VPS system as described previously, and the second created the logarithmic sine sweep, played it through a loudspeaker and recorded the ambisonic response from a Soundfield microphone. The Soundfield ST350 microphone was set up in the sweet spot of the ambisonic array at a height of approximately 1.4m, with the loudspeaker located at a height of 70cm

and 20cm in front of the microphone to match the details of the acoustic model. The signal from the loudspeaker was captured by the AKG C414B and auralised through the VPS; this response was recorded by the Soundfield microphone. For the purposes of the comparison, the direct sound was retained in the modeled response but was omitted in the .wav files used for convolution. The measured sine sweeps obtained on each channel of the Soundfield microphone were convolved with the inverse sweep generated previously. The convolution was achieved by complex multiplication in the frequency domain using Matlab. The resulting signals contain the impulse response of the system in the second half of the signal, with any non-linear distortion in the system appearing in the first half. The non-linear parts of the signal can then be truncated, leaving the response of the measured system [7]. Initial comparisons of the measured and modeled responses showed significant latency in the VPS, which was found to be caused mainly by the soundcard buffer size. Reduction to the lowest stable buffer size of 64 samples improved the performance of the system. Further latency compensation was achieved by carefully omitting the propagation delay found before the first reflection of the impulse response used by the convolvers. These additional edits were made using Audacity at the sample-edit level. The impulse response was obtained for each B-format channel and compared in Matlab with the modeled impulse responses from CATT. The direct sound of the measured response appears to lag the modeled response by approximately 0.5ms; this delay was deemed negligible. Figure 3 below shows a comparison of the measured and modeled impulse responses from the W channel. It was noticed in the comparison that the phase of the first reflection in the measured impulse response was incorrect or inverted.
Assuming the performer is positioned at a central location on stage, the floor reflection will always be the first received by the performer. This VPS attempted to reconstruct this reflection using an ambisonic array with four loudspeakers at floor level; however, there will naturally be some reflection from the physical floor, which may interfere or conflict with the synthesized reflection. The floor is also very close to the radius of the speaker array, which may cause errors when decoding a reflection from this direction. Therefore, it may be necessary to omit this reflection from the convolution system and recreate it naturally with a representative wooden stage floor. Further work is necessary to determine the perceptual significance of this first reflection and also the role of the floor in simulations such as this.

Figure 3: Comparison of the modeled impulse response (top) and the measured impulse response of the W channel (bottom). The reflection at 0.011s in the measured response is caused by a nearby TV screen.
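The swept-sine measurement used in section 2.3 can be sketched as follows. This is an illustrative Python reconstruction of Farina's method under assumed sweep parameters (2 s, 20 Hz to 20 kHz), not the authors' Matlab implementation; an ideal loop-back is simulated by deconvolving the sweep against itself.

```python
# Sketch of the exponential sine-sweep measurement (Farina): the
# "measured" signal here is the sweep itself (a perfect system), and
# deconvolution is done by frequency-domain multiplication with an
# amplitude-compensated time reversal of the sweep.
import numpy as np

fs, T, f1, f2 = 44100, 2.0, 20.0, 20000.0    # assumed sweep parameters
n = int(fs * T)
t = np.arange(n) / fs
R = np.log(f2 / f1)
sweep = np.sin(2 * np.pi * f1 * T / R * (np.exp(t * R / T) - 1.0))
# Inverse filter: time-reversed sweep with a decaying envelope that
# compensates for the sweep's extra energy at low frequencies.
inverse = sweep[::-1] * np.exp(-t * R / T)
# Deconvolution by complex multiplication in the frequency domain.
ir = np.fft.irfft(np.fft.rfft(sweep, 2 * n) * np.fft.rfft(inverse, 2 * n))
peak = int(np.argmax(np.abs(ir)))
# The linear impulse response peaks near sample n - 1; harmonic
# distortion products of a real system would appear before it.
```

This mirrors the property exploited in the text: the linear response occupies the second half of the deconvolved signal, and non-linear artefacts fold into the first half, where they can be truncated.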

Figure 4: Spectrogram of modeled impulse response (top) and measured impulse response (bottom)

Furthermore, certain musical instruments, such as cellos and basses, have a register that extends to low frequencies and often use the stage as a resonating surface. It remains unclear whether the absence of such a surface can affect the perception of the content of acoustic feedback. Some related work in this area has been published recently by Abercrombie and Braasch [8], who attempt to recreate the tactile sensations of low-frequency vibrations through a stage floor. This would further support the idea of using a real stage floor for virtual performance studios. It was noted that the overall spectral decay characteristics measured in the VPS (Figure 4) were similar to those of the model. The modeled response clearly shows the separate octave-band decays which are a consequence of the modeling technique. The measured response is not as distinct but retains much of the same decay characteristic. The measured impulse responses also revealed a large-amplitude reflection arriving soon after the first reflection at the musician position (0.011s). This was found to be caused by a TV screen situated near the receiver, as the time of arrival was approximately equal to the propagation time from the source to the screen and back to the receiver. This may have an impact on how visual projections are made in future VPS systems. It also highlights the potential of other distorting objects, such as solid music stands, which could introduce additional early reflections but could also obstruct the view of the speakers in the ambisonic array, potentially affecting the spatial accuracy of such a system. The phase of some of the reflections was found to differ between the measured and modeled impulse responses, as reported by Gorzel et al [2]: measured impulse responses generally show a zero mean value, whereas the modeled impulse responses do not.

Further tests in this area will involve using this technique to obtain acoustic metrics such as Early Decay Time (EDT), RT60 and clarity (C80), which can be compared with those displayed by the acoustic model.

2.4. Subjective Analysis

To gain further insight into the operation of the VPS, a saxophonist was asked to play in the environment and later make informal comments on the experience; the major points are listed below. The musician used the system as described previously, with the microphone pointing down the bell of the instrument. The musician was seated with his head positioned in the sweet spot of the ambisonic array and was asked to practice scales and a few short pieces for as long as he wished, as per his normal practice routine.

2.4.1. Musician Movement

A factor which contributed to the discomfort of the musician was the placement of the fixed microphone near the instrument. As the musician played, it was evident that the level of virtual acoustic feedback would change as he moved and gestured near the microphone. This led to discomfort, as the musician could not freely gesture and move whilst practicing.

2.4.2. Reverberation Level

It was evident from the comments provided that the overall level of virtual acoustic feedback had a significant effect on the perceived realism of the simulation. With the virtual acoustic feedback presented at an insufficient level, it was obvious that the musician was playing in an acoustically treated room and not the hall itself; set too high, and the system would become unstable. Further work is therefore needed to develop a method of producing a realistic gain architecture for the system. This may involve the use of Gade's support ratio [9] or a direct-to-reverberant sound level. There may be a need to obtain or model the sound power output of the instrument for this purpose.

2.4.3. Dynamic Range Compression

The use of this auralisation technique was found to give unnatural amplification to sounds which do not normally contain enough energy to propagate beyond the first-order reflections. Sounds such as key clicks and breath noise were found to be auralised at the same level as the tonal content. This was found to contribute to the PA effect described below. This highlights the need for an additional processing stage in the VPS which performs dynamic expansion of the audio signal, in order to attenuate lower-amplitude sounds from the musician while allowing the auralisation of the desired parts of the instrument sound. Future work remains in finding appropriate expansion ratios which attenuate the signal in an acoustically accurate way without distortion caused by aspects such as hysteresis.

2.4.4. PA Effect

The musician reported the feeling that the performance was not truly acoustic in the virtual space; rather, it sounded as if it had been amplified using a PA system and then played through into the virtual space. The tone of the virtual acoustic feedback did not appear to completely relate to the perception of the overall tone of the instrument, i.e. it was too bright with the microphone placed near the saxophone bell. This was thought to be due to numerous factors, including the use of geometrical modeling and the application of a fixed directivity function onto a signal that has already sampled part of that directivity. These observations highlighted that a close microphone technique may be contributing some tonal distortion of the instrument, and instigated a more concentrated effort to investigate this particular aspect, as documented below.
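The dynamic expansion stage proposed in section 2.4.3 might be sketched as follows. The threshold, ratio and analysis window are assumed values chosen for illustration, not parameters derived in this work.

```python
# Minimal downward-expander sketch: signal whose short-term level
# falls below a threshold (e.g. key clicks, breath noise) is
# attenuated before auralisation. Threshold, ratio and window length
# are assumed illustrative values.
import numpy as np

def downward_expander(x, fs, threshold_db=-40.0, ratio=3.0, win_ms=10.0):
    """Attenuate samples whose short-term RMS falls below threshold_db."""
    win = max(1, int(fs * win_ms / 1000))
    # Short-term RMS envelope via a moving average of x^2.
    env = np.sqrt(np.convolve(x**2, np.ones(win) / win, mode="same") + 1e-12)
    env_db = 20 * np.log10(env)
    # Below threshold, reduce level by (ratio - 1) dB per dB of undershoot.
    under = np.minimum(env_db - threshold_db, 0.0)
    gain_db = under * (ratio - 1.0)
    return x * 10 ** (gain_db / 20)

fs = 44100
loud = 0.5 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # tonal content
quiet = 0.001 * np.random.randn(fs)                        # "key clicks"
y = downward_expander(np.concatenate([loud, quiet]), fs)
```

A static sample-by-sample gain like this ignores attack/release smoothing, which is where the hysteresis issues mentioned above would arise in practice.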
3. MICROPHONE PLACEMENT AND PROXIMITY EFFECT

The change in timbre introduced by a directional microphone placed at small distances in front of a sound source is a well-documented effect, known as the proximity or bass tip-up effect, owing to the emphasis placed on the low-frequency part of the spectrum. Kearney and Gorzel [2] showed that this effect, amongst others, can contribute to the listener's perceived quality of auralised material. This section documents a repetition of work done by the aforementioned authors and builds upon it within the context of a VPS. It also investigates the importance of microphone placement around the musician. An experiment was devised based on a repetition of Kearney's proximity effect test [10], subsequently creating non-real-time, binaural auralisations using

multiple microphones placed near different musical instruments. The aim of the experiment was to observe the tonal differences in auralisations obtained from recordings at different locations around the instrument. The first part of the experiment involves using a microphone with a switchable polar pattern and a loudspeaker to obtain the spectrum of the microphone at close distances from the source. These spectra are processed to obtain a correction filter which compensates for the proximity effect of the cardioid response. The second part is a recording session in semi-anechoic conditions, recording musicians with a pair of the previously used microphones placed at different positions around the instrument, at the same distance described before. The correction filter is applied to these recordings, which are then auralised offline with a binaural impulse response obtained from a CATT model of a simple space. Finally, these recordings are auditioned by a small group of four informal listeners to test whether they can hear the differences in timbre from each microphone.

3.1. Proximity Effect Correction

Regularised inverse filters can be used to correct for spectral distortion inherent in the auralisation system, including the proximity effect caused by placement of the microphone at low radii from the instrument. The experiment used an AKG C414B microphone with a switchable polar pattern to record sounds at a small distance from a Genelec 1029A loudspeaker. A logarithmic sine sweep was played through the loudspeaker and recorded with both omnidirectional and cardioid polar patterns. The difference between the two frequency responses exposed the extent of the proximity effect, which was then corrected using a regularised inverse filtering technique pioneered by Kirkeby and Nelson [11]. The regularised inverse filters are obtained using (1) as follows:

H(ω) = Ŝ(ω) / (Ŝ(ω)·S(ω) + β(ω))     (1)

where H(ω) is the frequency response of the inverse filter, S(ω) is the frequency response of the system, Ŝ(ω) is its complex conjugate and β(ω) is a frequency-dependent error function. In this experiment, S(ω) refers to the difference between the cardioid and omnidirectional spectra, which is obtained by complex division in the frequency domain.

The loudspeaker and microphone were connected to an M-Audio 1814 soundcard, which was connected by FireWire to an Intel MacBook. The distance between the loudspeaker and microphone was measured as 8cm in front of the centre of the cabinet; this distance was chosen to avoid the microphone being too close to the musician while playing, as the same microphone position was required when recording the musicians. The impulse response of the loudspeaker was obtained using a logarithmically swept sine wave produced in Matlab, swept for 10 seconds between 1Hz and 22050Hz. A Matlab script also produces the inverse sweep, which was then convolved with the measured sweep by multiplication in the frequency domain. The sine sweeps were played and recorded using Audacity at a sample rate of 44.1 kHz and a 32-bit floating-point bit depth. Equation (1) is applied in Matlab to the difference between the cardioid and omnidirectional frequency responses, obtained using complex division in the frequency domain. The resulting frequency response is the inverse filter needed to correct signals recorded by the cardioid microphone at this distance. β was kept at a constant value of 0.005 over the whole spectrum. The impulse response of the filter was obtained by applying an Inverse Fast Fourier Transform to the filter frequency response and applying a circular shift, effectively swapping the first and second halves of the impulse response. This response was used by the FILTER function in Matlab to process the recorded signals offline.
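The regularised inversion of equation (1) can be illustrated with a short numerical sketch. The system response S(ω) below is a toy low-frequency boost standing in for the measured cardioid-to-omni difference; it is not the C414B data.

```python
# Regularised inversion (Kirkeby & Nelson): H = conj(S) / (conj(S)*S + beta).
# S is a toy "bass tip-up" response, not the measured microphone data.
import numpy as np

fs, n = 44100, 4096
f = np.fft.rfftfreq(n, 1 / fs)
S = 1.0 + 2.0 / (1.0 + (f / 200.0) ** 2)      # toy low-frequency boost
beta = 0.005                                   # constant regularisation
H = np.conj(S) / (np.conj(S) * S + beta)       # equation (1)
# Applying the filter flattens the boost: |H(w) * S(w)| stays close to 1.
flatness = np.abs(H * S)
# Causal FIR version: inverse FFT followed by a circular shift that
# swaps the first and second halves, as described in the text.
h = np.roll(np.fft.irfft(H, n), n // 2)
```

The regularisation term β prevents the filter from applying unbounded gain where S(ω) is small, at the cost of leaving a slight residual error (here under 0.05 dB with β = 0.005).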

Figure 5: Measured spectra of cardioid polar pattern (solid), omnidirectional polar pattern (dashed) and corrected cardioid pattern (dotted)

Figure 6: Frequency response of regularised inverse filter where β = 0.005

3.2. Instrument Recording

Two C414B microphones were placed at the same 8cm distance (as used in the previous measurement) from the practicing musicians, who were asked to play individually in a semi-anechoic environment. The microphones were placed with reference to the 3-to-1 guideline used by studio engineers to avoid phase cancellation effects. This rule recommends that microphones be placed apart by at least three times the distance from the source to the microphone. Musicians were asked to play any phrase or piece they liked, and to avoid any excessive movement whilst doing so. The recordings were made using Audacity running on an Intel MacBook with an M-Audio 1814 FireWire soundcard, at a sample rate of 44.1 kHz and a 32-bit floating-point bit depth. Microphones were placed near the mouth of the flautist and also adjacent to their right hand at the other end of the instrument. For the clarinettist, microphones were placed at the bell of the instrument and also adjacent to the musician's right hand. For the alto saxophone, microphones were placed at the bell and also adjacent to the lower bend of the instrument, next to the C key. All microphones were kept as close to 8cm away from the instrument as was practically possible. Three output .wav files were produced for each instrument: two were from each of the microphones recording the instrument, and the third was an equal mix of both channels to a mono .wav file. This

would allow comparison of auralisations at either microphone location against a mixture of them both.

3.3. Auralisation

An acoustic model of a simple space was made using a method similar to the initial tests documented previously. The source was modeled as an omnidirectional point source 70 cm above the ground and 20 cm in front of the musician's head, which was modeled as a binaural receiver 1.4 m above the ground. This reflected the position of the instrument and musician seated on stage. The impulse response was modeled with 50,000 rays tracked over a period of 5000 ms. The impulse responses were then output as four mono 32-bit .wav files for import into the SIR2 convolvers. The modeled space is used as a temporary meeting room in the same building as the ixdlab (Figure 7). It has a metal floor, plaster walls and two large glass windows at either end of the space. The ceiling has many visible ducts and pipes running across it; this was modeled as a flat surface with increased diffusion coefficients. The room has a volume of 274.4 m3 and a modeled maximum T30 of 2.57 seconds at 2 kHz. This modeled space will be used in future work aiming to compare the use of modeled impulse responses against measured impulse responses as part of a VPS. The proximity of the room to the ixdlab will also allow comparison between simulations and the real space. Once the recordings were corrected for proximity effect, each recording was auralised using a binaural impulse response obtained from the acoustic model as described in 2.2, substituting the ambisonic receiver for a binaural receiver. The HRTF used to render the binaural material within CATT was obtained from the LISTEN project carried out at IRCAM [12], using response file Listen_1003_plain_44.dat. The auralisations were made to be auditioned offline by a group of casual listeners, so the direct sound from the simulation was included in the impulse response used.
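The offline auralisation step amounts to convolving each corrected recording with the left- and right-ear channels of the modeled binaural impulse response. The sketch below illustrates this in NumPy under our own function name; the experiment itself used the SIR2 convolver, and the peak normalisation here is our addition to keep the output within range.

```python
import numpy as np

def auralise_binaural(recording, brir):
    """Convolve a mono recording with a binaural impulse response.

    `recording`: 1-D array (the corrected close-mic signal).
    `brir`: 2-D array of shape (samples, 2), left/right ear IRs.
    Returns a stereo array normalised to avoid clipping.
    """
    out = np.column_stack(
        [np.convolve(recording, brir[:, ch]) for ch in range(2)]
    )
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out
```

Because the listening was done offline, the direct sound can simply be left in the impulse response, as described above; a real-time VPS would instead need to exclude it and rely on the musician's own direct sound.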
The auralisations were played to a small group of informal listeners with no prior knowledge of the experiment. They were invited to listen to the recordings and comment on the timbre of the instruments they had heard. The recordings were presented from a laptop running Audacity over a pair of headphones whilst the listeners were situated in the ixdlab.

Figure 7 Plans of the model used for the proximity effect auralisations

3.4. Results

3.4.1. Objective Results

To confirm the correct operation of the correction filter, the logarithmic sine sweep recorded previously by the cardioid microphone was filtered using the calculated inverse filter. The amplitude envelope of the signal produced (Figure 8) closely matched that of the sine sweep recorded with the omnidirectional microphone, confirming correct operation. A slight delay was also noted between the corrected and omnidirectional sine sweeps, caused by the length of the filter impulse response. The results shown in Figure 5, which show a clear emphasis of frequencies between 100 Hz and 1 kHz for the cardioid, also agree with those obtained by Kearney [10].

Figure 8 Time trace of the sine sweep recorded with the cardioid microphone (top), the omnidirectional microphone (middle) and the corrected sine sweep (bottom)

3.4.2. Spectra of Microphone Signals

To illustrate the differences between recorded signals, the spectrum was obtained for a single note in the recording from each microphone, in addition to the mixed-down auralisation. Figure 9 shows the differences in spectra for one particular note from the saxophone recordings.

Figure 9 Spectra recorded from one note on the alto saxophone. The dot-dashed spectrum was obtained from the bend microphone, the dashed spectrum from the bell microphone, and the solid spectrum shows the two combined.
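Single-note spectra of the kind compared in Figure 9 can be computed with a windowed FFT along the following lines. This is a NumPy sketch, not the authors' analysis code: the function name, window choice and FFT size are our assumptions.

```python
import numpy as np

def note_spectrum(excerpt, fs, n_fft=8192):
    """Magnitude spectrum (dB) of a short excerpt containing one note."""
    windowed = excerpt * np.hanning(len(excerpt))  # reduce spectral leakage
    mag = np.abs(np.fft.rfft(windowed, n_fft))
    mag_db = 20 * np.log10(mag + 1e-12)            # avoid log of zero
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    return freqs, mag_db

# equal mono mix of the two microphone channels, as described in 3.2
# (hypothetical variable names):
# mix = 0.5 * (bell_note + bend_note)
```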

It can be seen in Figure 9 that the microphone placed at the bell captures significantly more energy between 800 Hz and 4 kHz than the recording made at the bend of the instrument. The recording made at the bend microphone has more low-frequency content, especially in the fundamental and the next three partials of this note. When mixed together, the spectrum of the resultant signal contains the dominant parts of both signals, i.e. low-frequency resonance and high-frequency detail. As the directivity of any musical instrument fluctuates constantly due to a great many factors, the differences observed in the spectra will also vary with microphone position, the note being played and the playing technique of the musician.

3.5. Subjective Results

The tonal difference between the auralised microphone signals was identified by the majority of listeners on all instruments. The difference was more noticeable on the clarinet and saxophone recordings than on the flute recording. The saxophone recording at the bell of the instrument was generally reported as sounding very bright, lacking the characteristic resonance of the instrument. The microphone placed at the lower bend of the instrument was thought to lack high-frequency content and detail by comparison. The mix of both recordings was thought to sound like a more complete capture of the instrument's timbre. The clarinet recording at the bell was also described as having a brighter tone than the recording made at the other location, next to the right hand of the musician. It was found to contain a noticeable amount of breath noise, and also picked up a particular resonance on one of the notes, which sounded much louder than the others in the phrase auditioned. The other signal, from near the right hand of the musician, was generally reported to lack high-frequency content, although to a lesser extent than with the saxophone.
The recording of the flute adjacent to the right hand of the musician was again reported to lack the high-frequency detail of the recording made near the mouth of the flautist. The majority of listeners noticed that intakes of breath made by the musician were much louder in the recording made near the flautist's mouth.

4. DISCUSSION

The proximity effect correction appeared to function well in this set of experiments; however, some fundamental practical issues remain with this particular application of the technology. The first is that, in order to correct for proximity effect, a microphone with a switchable polar pattern was used to accurately obtain the target response. This limits the VPS to similar microphones, which may not be desirable. Furthermore, the use of fixed microphone positions restricts the musician to a certain position relative to the microphones so that the inverse filter can optimally compensate for proximity to the source. Holding such a position when practicing may become very uncomfortable for the musician. This work has shown informally that careful microphone placement is necessary when constructing a VPS, as it has a noticeable effect on timbre. Ongoing developments in the field of spherical harmonic analysis may have an application for VPS systems, as this approach aims to capture the time-varying directivity function of the musical instrument. However, practical problems involved in the fixed position of the instrument inside a spherical microphone array must be addressed, alongside issues regarding musician comfort. The auralisation of the non-musical parts of the instrument's sound, i.e. key clicks and breath sounds, at the same amplitude as the tonal content appears to contribute to the perception of the PA effect. The extent of this will depend on many factors, including where the microphones are placed around the instrument.
A form of dynamics processor such as an expander may improve this by attenuating sounds that are lower in amplitude than the main sound to be auralised.

5. CONCLUSIONS

It is evident from this initial research that much work remains to be done in the development of an effective real-time interactive virtual performance system. This work has reported on the construction of a prototype VPS utilizing modeled ambisonic impulse responses and real-time convolution. It went on to apply previous work in spectral correction to recorded instruments before auralisation and confirmed

results reported by Kearney. Informal listening tests have shown that there are noticeable differences in tone after auralisation caused by the placement of microphones around an instrument, and the implications of this for future VPS systems have been demonstrated and discussed.

6. FUTURE WORK

Future work on this particular experiment will involve spatially separating the virtual sources in CATT to more accurately reflect the positions of the microphones. This will test whether there are any perceivable differences in acoustic feedback caused by mixing down the recorded signals for a mono-source auralisation. As the acoustic model was of a room close to the VPS system, it is hoped to perform an equivalent measured auralisation to compare the performance of the model against a measurement. Additionally, it will be possible to allow the musician to play in the real space beforehand, to compare how both of these modeling techniques perform against a real-life scenario. This will form the basis for a larger body of work.

7. ACKNOWLEDGEMENTS

Thanks to all the volunteer musicians and test subjects involved. This work was supported by Arup.

8. REFERENCES

[1] Ueno, K., Kosuke, K., & Kawai, K. (2007). Musicians' adjustment of performance to room acoustics, Part 1: Experimental performance and interview in simulated soundfield. 19th International Congress on Acoustics, Madrid.

[2] Gorzel, M., Kearney, G., & Boland, F. R. (2010). Virtual acoustic recording: an interactive approach. Proc. of the 13th International Conference on Digital Audio Effects (DAFx), Graz, Austria, September 2010.

[3] SIR2 Convolution Reverb VST Plug-in, Christian Knufinke, 2007. http://www.knufinke.de/sir/sir2.html Accessed September 2010.

[4] CATT Acoustic Modeling Software V8.0k. http://www.catt.se Accessed August 2010.

[5] Physikalisch-Technische Bundesanstalt, Working Group 1.63. Clarinet Directivity CLF Files. http://www.ptb.de/en/org/1/16/163/directivity/richtchar.htm Accessed August 2010.

[6] Sharland, I. (2005). Flakt Woods Practical Guide to Noise Control (9th Edition). Flakt Woods Limited.

[7] Farina, A. (2000). Simultaneous measurement of impulse response and distortion with a swept-sine technique. 108th AES Convention, Paris, France.

[8] Abercrombie, C. & Braasch, J. (2010). Auralization of audio-tactile stimuli from acoustic and structural measurements. Journal of the Audio Engineering Society, Vol. 58 (10), 818-827.

[9] Gade, A. C. (1989). Investigations of musicians' room acoustic conditions in concert halls. Part 1: Methods and laboratory experiments. Acustica, Vol. 69, 193-203.

[10] Kearney, G. (2009). Auditory scene synthesis using virtual acoustic recording and reproduction. PhD thesis, University of Dublin, Trinity College.

[11] Kirkeby, O. & Nelson, P. (1996). Fast deconvolution of multi-channel systems using regularisation. ISVR Technical Report No. 255, Southampton.

[12] Warusfel, O. (2002). LISTEN HRTF Database, IRCAM. http://recherche.ircam.fr/equipes/salles/listen/index.html Accessed August 2010.