Hidden melody in music playing motion: Music recording using optical motion tracking system


PROCEEDINGS of the 22nd International Congress on Acoustics

General Musical Acoustics: Paper ICA2016-692

Hidden melody in music playing motion: Music recording using optical motion tracking system

Min-Ho Song (a)

(a) fourms group, Department of Musicology, University of Oslo, Norway, minho.song@imv.uio.no

Abstract

This paper presents a feasibility study of recording sound using optical marker-based motion tracking cameras. An optical marker-based motion tracking system records the motion of moving objects using multiple high-speed infrared (IR) cameras. Recent developments enable such devices to capture detailed motion with a high spatial precision of 0.01 m and a sampling rate of up to 10 kHz. Therefore, not only the global movements of the human body or handheld instruments but also local acoustic vibrations are recorded within the motion data, and these can be transformed into the actual sound radiating from an acoustic instrument. To evaluate the feasibility, several lightweight reflective markers were attached at various positions on string instruments. Several musical excerpts were selected with the cameras' Nyquist limit in mind. The instruments were played by professional players, who varied the loudness of the excerpts. The playing motions were recorded with a high-quality optical motion tracking system. Since the global motion trajectory is relatively slow, with frequency components below 10 Hz, an audible signal could be retrieved from the motion tracking data with a high-pass filter. Although current professional motion tracking systems require a very high signal-to-noise ratio and can only retrieve sound well below 5 kHz, the results of the experiment show that an optical marker-based motion tracking system can be useful for recording sound information from the visual domain.

Keywords: Optical Motion Tracking, Sonification, Music Playing Motion, Sound Retrieval

1 Introduction

An optical motion tracking system is one of the most widely used motion data acquisition methods in studies of musical gesture and motion. The system consists of multiple high-speed infrared (IR) cameras that record the trajectories of moving points (retro-reflective markers) in three-dimensional space over time. Recent improvements allow these cameras to record at HD resolution and at high frame rates (up to 10 kHz), which makes it possible to capture detailed, fast motion that could not be seen before. In this study, we try to retrieve the sound radiated by a musical instrument from the local acoustic vibrations recorded within the musician's global movements, using optical motion tracking cameras.

Related works use high-speed cameras to recover sound from visual data [1, 2]. Since these works use image-processing techniques rather than physical markers, they have the advantage of a simpler data acquisition process. However, we chose marker-based recording for this study because it has some advantages over markerless methods. First, marker-based measurement is more accurate than markerless techniques. Second, fast-varying data can be retrieved even from a moving object because the 3D trajectory of the object is also known.

The method is straightforward: the trajectories of the attached reflective markers are recorded at a high frame rate, the global movement caused by the musician's motion is removed, and the remaining signal is transformed into sound.
2 Constraints

2.1 Nyquist-Shannon criterion

Table 1: Frequency range of conventional motion capture cameras

                   Optitrack (fps)   ViCon (fps)   Qualisys (fps)
Normal mode        100-360           60-420        180-484
High-speed mode    unknown           unknown       10000

According to the Nyquist-Shannon criterion [3, 4], only sound with a bandwidth lower than half of the optical camera's maximum frame rate (frames per second, fps) can be reconstructed. Table 1 shows the frame rate ranges of several conventional optical motion tracking cameras. It should be noted that, in normal mode, only sound with a spectrum below roughly 200 Hz can theoretically be retrieved. However, the frame rate is tied to the data processing capacity of the device, and if we sacrifice the field of view (FOV), the measurable bandwidth can increase up to 5 kHz (half of 10 kHz). This is not perfect, but it is a reasonable range for retrieving sound.

2.2 Detectability

To detect the position of a marker in space, optical motion capture cameras emit IR light and receive the light reflected back from the marker. If the reflected light energy ($E^{(\mathrm{refl})}$) is lower than the detectability threshold ($E^{(\mathrm{thr})}$) of the photo sensor, the marker cannot be detected. To ensure that the marker is detected, the emitted IR light should have a high intensity ($I^{(\mathrm{emit})}$), or the surface of the marker ($S$) should be large enough to reflect sufficient energy. The condition for successful detection is given in Eq. (1), where $f$ is the camera frame rate.

$$E^{(\mathrm{refl})} \propto \frac{I^{(\mathrm{emit})}}{f}\, S \geq E^{(\mathrm{thr})} \qquad (1)$$

When the camera operates at a high frame rate, the emitted light energy per frame decreases because the shutter time becomes shorter. In other words, the camera does not have enough time to emit adequate light energy. For example, when operating at 200 Hz the camera can ideally keep the shutter open for 5 ms, but at 10 kHz it has only 0.1 ms. This causes a detectability problem and, as Eq. (1) shows, calls for a larger marker. However, attaching large markers to a musical instrument can be a problem because they can restrict the musician's freedom of movement or even change the sound quality of the instrument. Therefore, when recording music-playing motion, the size of the reflective marker should be chosen first, and the retrievable frequency bandwidth is then determined by that choice.

2.3 Residual

Several factors influence the error in estimating the position of markers [5]. Since calculating the position of a marker involves a matrix inversion, it naturally introduces numerical error. In addition, the intrinsic and extrinsic parameters of the motion cameras cannot be free from errors [6]. These errors mean that a marker position can only be determined with limited resolution because of random noise.
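The constraints above can be put into rough numbers. The sketch below is illustrative arithmetic only, not part of the measurement procedure: it computes the Nyquist limit and the longest possible exposure per frame for a given frame rate, and the SNR in decibels of a vibration relative to the random position noise. The function names, and the amplitude and noise values in the usage lines, are hypothetical.

```python
import math


def nyquist_limit_hz(frame_rate_hz):
    """Highest frequency that can be reconstructed at this frame rate."""
    return frame_rate_hz / 2.0


def max_exposure_ms(frame_rate_hz):
    """Upper bound on the shutter time available per frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB for two RMS amplitudes in the same unit."""
    return 20.0 * math.log10(signal_rms / noise_rms)


if __name__ == "__main__":
    for fps in (200, 484, 10000):
        print(fps, "fps:", nyquist_limit_hz(fps), "Hz,", max_exposure_ms(fps), "ms")
    # Hypothetical values: 0.05 mm RMS vibration, 0.01 mm RMS position noise.
    print("SNR:", round(snr_db(0.05, 0.01), 1), "dB")
```

At 200 fps this gives a 100 Hz limit and at most 5 ms of exposure; at 10000 fps it gives 5 kHz and 0.1 ms, in line with the figures quoted in Sections 2.1 and 2.2.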
If the small signal we want to observe has a low signal-to-noise ratio (SNR), it cannot be retrieved, or a special noise reduction filter is needed to increase the SNR.

3 Experiments

3.1 Pre-test: Loudspeaker

In this pre-test, we examined whether fast vibration (without global movement) beyond the frequency range of normal mode can be recorded in the high-speed mode of the cameras. A linear sine sweep (100 Hz to 4 kHz, duration 10 seconds, Figure 1 left-bottom) was fed to a loudspeaker (Marantz LD20), and the vibration of the diaphragm was recorded with four motion capture cameras (Oqus 400, Qualisys Ltd.).

Recording at a sampling rate of 10 kHz was not possible with a 4 mm lightweight half-spherical marker because of the detectability (darkening) problem. At a sampling rate of 8 kHz, we obtained the marker trajectory shown in Figure 1 (middle-top). Comparing the spectrogram of the original linear sine sweep with that of the retrieved signal (Figure 1 middle-bottom), we can see the rising frequency component in the low-frequency region, but it disappears at around 2 kHz (the sweep is audible up to 2 kHz). This is due to a characteristic of the moving-coil loudspeaker: its mechanical impedance increases at higher frequencies [7]. As the mechanical impedance increases, the vibration amplitude of the diaphragm decreases and is masked by the random noise of the camera.

When we increased the input signal by 60 dB, the retrieved sound was audible up to around 3 kHz. This implies that motion cameras can record sound up to several kHz, a range that contains important features of speech and music signals. The result shows that if the random measurement noise is low, or the sound source has high electroacoustic efficiency, higher-frequency sound can be retrieved.

Figure 1: Linear sine sweep signal is retrieved from the loudspeaker diaphragm.
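For readers who want to reproduce a comparable test signal, the following is a minimal sketch of a linear sine sweep with the parameters stated above (100 Hz to 4 kHz over 10 seconds). Sampling it at the 8 kHz camera rate is an assumption made here for illustration; at that rate the upper end of the sweep coincides with the Nyquist limit of 4 kHz. The function names are hypothetical.

```python
import math


def linear_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Return (samples, instantaneous_frequencies) of a linear sine sweep."""
    n_samples = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples, inst_freq = [], []
    for n in range(n_samples):
        t = n / sample_rate_hz
        # The phase is 2*pi times the time integral of the instantaneous frequency.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
        inst_freq.append(f_start_hz + sweep_rate * t)
    return samples, inst_freq


def time_at_frequency(f_hz, f_start_hz, f_end_hz, duration_s):
    """Time in seconds at which the sweep passes through f_hz."""
    return (f_hz - f_start_hz) * duration_s / (f_end_hz - f_start_hz)


if __name__ == "__main__":
    sweep, freq = linear_sweep(100.0, 4000.0, 10.0, 8000.0)
    print(len(sweep), "samples, last instantaneous frequency:", freq[-1], "Hz")
    print("2 kHz reached after", time_at_frequency(2000.0, 100.0, 4000.0, 10.0), "s")
```

Because the sweep is linear in time, the point at which the retrieved component fades in the spectrogram can be converted directly into a frequency with `time_at_frequency` or its inverse.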

3.2 Bass guitar playing

In this experiment, we tried to retrieve sound from a real music-playing situation. We tested various string instruments, but on thin strings the attached marker acted as an external force on the string, so a 6-string bass guitar was selected as the test instrument. Six 4 mm half-spherical markers were attached to the strings, at the positions directly above the piezo transducers. The musician played a chromatic scale on the A string (3rd string) and was asked to move freely while playing.

In Figure 2, the marker trajectory of the A string (the 3rd, yellow point in the middle) shows the slow trend of the playing movement, with a small high-frequency vibration superimposed on it. To remove the player's movement, a high-pass filter with a cutoff frequency of 20 Hz was applied, and the result was transformed into a sound signal. The result is given in Figure 3. Although some high-frequency components are lost, the chromatic scale is clearly retrieved and clearly audible.

Figure 2: 6-string bass guitar playing for the experiment of sound recording using motion capture cameras. The motion of the A string (yellow point in the right-top figure) contains both the musician's motion and the string vibration (right-bottom figure).
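The separation of string vibration from the player's movement can be sketched on synthetic data. The filter design is not specified here, so the cascade of first-order high-pass sections below is an assumption, as are the frame rate, the motion amplitude, and the vibration amplitude; only the 20 Hz cutoff comes from the experiment. The 55 Hz component stands in for the open A string of a bass.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate_hz, stages=4):
    """Cascade of first-order high-pass sections (an assumed filter design)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = list(samples)
    for _ in range(stages):
        filtered = [out[0]]
        for n in range(1, len(out)):
            # Standard discrete RC high-pass recurrence.
            filtered.append(alpha * (filtered[-1] + out[n] - out[n - 1]))
        out = filtered
    return out


def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    fs = 2000.0  # hypothetical camera frame rate in Hz
    t = [n / fs for n in range(int(2 * fs))]
    motion = [50.0 * math.sin(2 * math.pi * 0.5 * x) for x in t]     # player, mm
    vibration = [0.1 * math.sin(2 * math.pi * 55.0 * x) for x in t]  # string, mm
    trajectory = [m + v for m, v in zip(motion, vibration)]
    audio = high_pass(trajectory, cutoff_hz=20.0, sample_rate_hz=fs)
    tail = audio[len(audio) // 2:]  # skip the filter's start-up transient
    print("RMS before:", rms(trajectory), "mm, after:", rms(tail), "mm")
```

A steeper design from a signal-processing library, for example a Butterworth filter built with `scipy.signal.butter`, would be a natural alternative; the cascade is used here only to keep the sketch dependency-free.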

Figure 3: Chromatic scale is played and retrieved using motion capture cameras.

4 Conclusions

The experimental results show that improvements in motion capture cameras now allow us to see fast and small movements that we could not see before. The method has several disadvantages: it requires special cameras, and a reflective marker must be attached at the sound-producing position, which may alter the target sound. However, if the three constraints can be overcome, it should become possible to use smaller markers at higher frame rates and thus to detect very small movements with optical motion capture cameras.

Acknowledgments

The author would like to thank Anders Tveit for participating in the motion recording.

References

[1] Davis, A.; Rubinstein, M.; Wadhwa, N.; Mysore, G. J.; Durand, F.; Freeman, W. T. The visual microphone: passive recovery of sound from video. ACM Transactions on Graphics, 33(4), 2014, pp. 79(1)-79(10).
[2] Akutsu, M.; Oikawa, Y.; Yamasaki, Y. Extract voice information using high-speed camera. Proceedings of Meetings on Acoustics, 19(1), 055019, 2013.
[3] Nyquist, H. Certain topics in telegraph transmission theory. Transactions of the AIEE, 47, 1928, pp. 617-644 (reprinted as classic paper in: Proceedings of the IEEE, 90(2), Feb. 2002).
[4] Shannon, C. Communication in the presence of noise. Proceedings of the Institute of Radio Engineers, 37(1), 1949, pp. 10-21 (reprinted as classic paper in: Proceedings of the IEEE, 86(2), Feb. 1998).
[5] Jensenius, A. R.; Nymoen, K.; Skogstad, S. A.; Voldsund, A. A Study of the Noise-Level in Two Infrared Marker-Based Motion Capture Systems. In Proceedings of the 9th Sound and Music Computing Conference - "Illusions". Logos Verlag Berlin, ISBN 9783832531805, 2012, pp. 258-263.
[6] Pollefeys, M.; Koch, R.; Van Gool, L. Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters. International Journal of Computer Vision, 32(1), 1999, pp. 7-25.
[7] Kinsler, L. E.; Frey, A. R.; Coppens, A. B.; Sanders, J. V. Fundamentals of Acoustics, 4th Edition. ISBN 0-471-84789-5. Wiley-VCH, 1999, pp. 406-411.