MUSI-6201 Computational Music Analysis

Part 5.1: Intensity
Alexander Lerch
November 4, 2015

instantaneous features overview
- text book: Chapter 4: Intensity (pp. 71-78)
- sources: slides (latex) & Matlab github repository

lecture content
- loudness perception and decibels
- dynamics in music
- instantaneous features

introduction
intensity-related descriptors commonly used
- waveform view
  [figure: waveforms x(t) over t [s] for three signals: pop, string quartet, speech]
- level monitoring (PPM, VU, ...)
- terms and definitions

human perception 1/2
perception has a non-linear relation to intensity
model: logarithmic relation
$v_{dB}(n) = 20 \log_{10}\left(\frac{v(n)}{v_0}\right)$
- $v_0$: reference constant (0 dB point)
- digital: $v_0 = 1$ (dBFS)
- scaling factor: 1 dB ≈ JNDL (just noticeable difference in level)

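excursion: level computation
if $v(n) = 0$: computation of $\log_{10}(0)$, which is undefined
work-arounds:
a) add constant $\epsilon$:
   $v_{dB}(n) = 20 \log_{10}(v(n) + \epsilon)$
b) add if statement:
   $v_{trunc}(n) = \begin{cases} v(n), & \text{if } v(n) \geq \epsilon \\ \epsilon, & \text{otherwise} \end{cases}$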

excursion: level computation
[figure: level with added constant, $v_{dB}(n) = 20 \log_{10}(v(n) + \epsilon)$, for $\epsilon$ = 1e-01, 1e-02, 1e-03, 1e-04; both axes $v_{dB}$ [dB]; matlab source: matlab/displaylogepsilon.m]
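The two work-arounds for the level computation can be sketched in a few lines. This is a minimal illustration, not the course's Matlab code: the function names and the default value of eps are assumptions, and v is assumed to be a non-negative magnitude with reference v_0 = 1.

```python
import math

def level_db_add_eps(v, eps=1e-5):
    """Work-around (a): add a small constant before taking the logarithm."""
    return 20.0 * math.log10(v + eps)

def level_db_truncate(v, eps=1e-5):
    """Work-around (b): replace values below eps by eps, then convert to dB."""
    v_trunc = v if v >= eps else eps
    return 20.0 * math.log10(v_trunc)

# v = 0 no longer fails; both variants bottom out near 20*log10(eps)
for v in (0.0, 1e-3, 0.5, 1.0):
    print(v, level_db_add_eps(v), level_db_truncate(v))
```

Variant (a) biases every value slightly upward, most visibly for levels close to the floor; variant (b) leaves all values at or above eps untouched.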


human perception 2/2
decibel scale is not a loudness scale:
- equal-sized steps on the decibel scale are not perceived as equal-sized loudness steps
perceptual loudness depends on:
- frequency
- cochlear resolution
- masking effects

human perception 2/2
[figure: equal-loudness contours for 0, 20, 40, 60, 80, and 90 phon; SPL [dB] over f [Hz]; matlab source: matlab/displayequalloudnesscontours.m]

dynamics in music
score:
- only several rough dynamic steps, e.g.: pp, p, mf, f, ff
- comparably vague instructions on volume modifications, e.g.: crescendo, decrescendo, sf
dynamics influenced by:
- instrumentation
- timbre
- number of voices
- context and musical tension
MIDI:
- 128 velocity steps
- no standardized relation to magnitude, power, ...

features: root mean square 1/2
$v_{RMS}(n) = \sqrt{\frac{1}{K} \sum_{i=i_s(n)}^{i_e(n)} x(i)^2}$

features: root mean square 1/2
value of this feature for the hypothetical prototype signals:
- silence
- sinusoidal (amplitude A)
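The prototype-signal question can be checked numerically. The sketch below is only an illustration: the block length K, the amplitude A, and the number of periods are assumed values, and the sinusoid is chosen so that a whole number of periods fits into the block.

```python
import math

def rms(block):
    """Root mean square of one block of K samples."""
    k = len(block)
    return math.sqrt(sum(x * x for x in block) / k)

K = 1000          # block length in samples (assumed)
A = 0.5           # amplitude of the sinusoid (assumed)
periods = 10      # whole number of periods inside the block (assumed)

silence = [0.0] * K
sine = [A * math.sin(2.0 * math.pi * periods * i / K) for i in range(K)]

# silence: every sample is zero, so the RMS is zero
print(rms(silence))
# sinusoid: the RMS approaches A / sqrt(2), independent of frequency
print(rms(sine), A / math.sqrt(2.0))
```

If the block does not cover whole periods, the result for the sinusoid deviates slightly from A / sqrt(2).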

features: root mean square 1/2
[figure: spectrogram (f [kHz] over t [s]) and waveform x(i) with feature v(n) over t [s]; matlab source: matlab/displayfeatures.m]

features: root mean square 2/2
common variants (sample processing only):
- reduce computational complexity:
  $v_{RMS}^2(n) = \frac{x(i_e(n))^2 - x(i_s(n-1))^2}{i_e(n) - i_s(n) + 1} + v_{RMS}^2(n-1)$
  $v_{RMS}(n) = \sqrt{v_{RMS}^2(n)}$
- single pole approximation:
  $v_{tmp}(i) = \alpha \cdot v_{tmp}(i-1) + (1 - \alpha) \cdot x(i)^2$
  $v_{RMS}(i) = \sqrt{v_{tmp}(i)}$
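Both variants can be sketched for a hop size of one sample. The function names, the test signal, the window length, and the value of alpha are assumptions for illustration; the direct computation is included only as a reference to compare against.

```python
import math

def rms_direct(x, K):
    """Reference: RMS over the last K samples for every n >= K - 1."""
    return [math.sqrt(sum(s * s for s in x[n - K + 1:n + 1]) / K)
            for n in range(K - 1, len(x))]

def rms_recursive(x, K):
    """Sliding RMS: add the newest squared sample, drop the oldest one."""
    v_sq = sum(s * s for s in x[:K]) / K        # first block computed directly
    out = [math.sqrt(v_sq)]
    for n in range(K, len(x)):
        v_sq += (x[n] ** 2 - x[n - K] ** 2) / K
        out.append(math.sqrt(max(v_sq, 0.0)))   # guard against negative round-off
    return out

def rms_single_pole(x, alpha):
    """Smooth x(i)^2 with a one-pole low-pass filter, then take the root."""
    v_tmp = 0.0
    out = []
    for s in x:
        v_tmp = alpha * v_tmp + (1.0 - alpha) * s * s
        out.append(math.sqrt(v_tmp))
    return out

# amplitude-modulated sinusoid as a hypothetical test signal
x = [math.sin(0.05 * i) * (0.2 + 0.8 * (i % 200) / 200.0) for i in range(1000)]
direct = rms_direct(x, 64)
recursive = rms_recursive(x, 64)
print(max(abs(a - b) for a, b in zip(direct, recursive)))  # round-off only
print(rms_single_pole(x, 0.99)[-1])
```

The recursive form needs two multiplications per sample regardless of K but accumulates round-off over long signals; the single-pole form needs no sample buffer at all, at the cost of an exponential rather than rectangular window.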

features: weighted root mean square
x(i) → H(z) → RMS → v(n)
H(z):
- A, B, C weighting
- RLB (BS.1770)
- ...

features: weighted root mean square
[figure: magnitude responses H(f) [dB] over f [Hz] of the weighting filters BS.1770 MC, ITU-R BS.468, A weighting, C weighting, Z weighting; matlab source: matlab/displayloudnessweighting.m]
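As one concrete instance of such a weighting, the A-weighting curve has a commonly quoted closed-form magnitude response. The sketch below evaluates only that analog magnitude in dB at a given frequency; it is not a digital filter H(z) and is not taken from the course repository. The function name is an assumption.

```python
import math

def a_weighting_db(f):
    """A-weighting gain in dB at frequency f in Hz (analog magnitude response)."""
    f2 = f * f
    num = (12194.0 ** 2) * f2 * f2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    # the +2.0 dB offset normalizes the gain to roughly 0 dB at 1 kHz
    return 20.0 * math.log10(num / den) + 2.0

for f in (100.0, 1000.0, 10000.0):
    print(f, round(a_weighting_db(f), 2))
```

To obtain a weighted RMS, the signal would first be filtered with a filter approximating this response and the RMS computed on the filter output.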

features: peak envelope (max)
$v_{Peak}(n) = \max_{i_s(n) \leq i \leq i_e(n)} |x(i)|$

features: peak envelope (max)
[figure: spectrogram (f [kHz] over t [s]) and waveform x(i) with feature v(n) over t [s]; matlab source: matlab/displayfeatures.m]
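The block-wise maximum can be sketched as follows. The function name, the block and hop lengths, and the sample values are assumptions chosen for illustration.

```python
def block_peak(x, block_length, hop_length):
    """Maximum absolute sample value per block (blocks may overlap)."""
    peaks = []
    for start in range(0, len(x) - block_length + 1, hop_length):
        block = x[start:start + block_length]
        peaks.append(max(abs(s) for s in block))
    return peaks

x = [0.0, 0.1, -0.8, 0.3, 0.2, -0.1, 0.05, 0.6]
print(block_peak(x, block_length=4, hop_length=2))  # [0.8, 0.8, 0.6]
```

With overlapping blocks a single large sample shows up in several consecutive feature values, as the first two results show.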

features: peak envelope (PPM) 1/2
[block diagram with the elements $x(i)$, $|x(i)|$, $\alpha_{AT}$, $\lambda$, $z^{-1}$, $v_{PPM}(i)$]

release state ($|x(i)| < v_{PPM}(i-1)$ ⇒ $\lambda = \alpha_{RT}$):
$v_{PPM}(i) = v_{PPM}(i-1) - \alpha_{RT} \cdot v_{PPM}(i-1) = (1 - \alpha_{RT}) \cdot v_{PPM}(i-1)$

attack state ($|x(i)| \geq v_{PPM}(i-1)$ ⇒ $\lambda = 0$):
$v_{PPM}(i) = \alpha_{AT} \cdot \left(|x(i)| - v_{PPM}(i-1)\right) + v_{PPM}(i-1) = \alpha_{AT} \cdot |x(i)| + (1 - \alpha_{AT}) \cdot v_{PPM}(i-1)$
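The attack and release updates translate directly into a per-sample loop. This is a minimal sketch: the function name, the zero initial state, and the coefficient values in the usage lines are assumptions, and the coefficients are passed in directly rather than derived from attack and release times.

```python
def ppm(x, alpha_at, alpha_rt):
    """Peak program meter envelope with separate attack and release coefficients."""
    v_prev = 0.0                      # v_PPM(i-1), assumed to start at zero
    out = []
    for s in x:
        mag = abs(s)
        if mag >= v_prev:
            # attack state: move toward |x(i)| with coefficient alpha_AT
            v = alpha_at * mag + (1.0 - alpha_at) * v_prev
        else:
            # release state: decay the previous value with coefficient alpha_RT
            v = (1.0 - alpha_rt) * v_prev
        out.append(v)
        v_prev = v
    return out

# short burst followed by silence: fast rise, slow decay
envelope = ppm([1.0, 1.0, 0.0, 0.0], alpha_at=0.5, alpha_rt=0.1)
print(envelope)
```

Unlike the maximum per block, the output never jumps to the peak instantly (unless alpha_at is 1) and keeps decaying after the input has dropped.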

features: peak envelope (PPM) 2/2
[figure: spectrogram (f [kHz] over t [s]) and waveform x(i) with feature v(n) over t [s]; matlab source: matlab/displayfeatures.m]
discuss differences between peak meter and max per block

features: zwicker loudness
Stimulus → Outer Ear Transfer Function → Excitation Patterns → Specific Loudness → Overall Loudness → $v_{Loud}$

features: zwicker loudness
outer ear transfer function
[figure; source: D. Hammershøi and H. Møller, "Methods for Binaural Recording and Reproduction," Acta Acustica united with Acustica, vol. 88, no. 3, pp. 303-311, May 2002.]

features: zwicker loudness
excitation patterns
[figure; source: M. Schleske, "Vibrato of the musician," [Online]. Available: http://www.schleske.de/en/our-research/handbook-violinacoustics/vibrato-of-the-musician.html (visited on 07/29/2015).]

features: zwicker loudness
specific loudness
[figure; source: U. of Salford, "Customised metrics," [Online]. Available: https://www.salford.ac.uk/computing-science-engineering/research/acoustics/psychoacoustics/sound-quality-making-products-sound-]

features: zwicker loudness
overall loudness
$v_{Loud} = \sum_i z_i$

derived features
- number or ratio of pauses
- dynamic range
- statistical features from (RMS) histogram
- ...
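A few of these derived features can be sketched from a series of block-wise levels in dB. Everything specific here is an assumption for illustration: the function names, the -60 dB pause threshold, the definition of dynamic range as the spread between the loudest and softest non-pause block, and the example levels.

```python
import statistics

def pause_ratio(levels_db, threshold_db=-60.0):
    """Fraction of blocks whose level falls below the pause threshold."""
    return sum(1 for v in levels_db if v < threshold_db) / len(levels_db)

def dynamic_range(levels_db, threshold_db=-60.0):
    """Spread in dB between the loudest and the softest non-pause block."""
    active = [v for v in levels_db if v >= threshold_db]
    return max(active) - min(active) if active else 0.0

# hypothetical RMS levels in dB, one value per block
levels_db = [-80.0, -30.0, -12.0, -6.0, -75.0, -20.0, -9.0, -90.0]

print(pause_ratio(levels_db))       # 3 of 8 blocks are below the threshold
print(dynamic_range(levels_db))     # -6 dB minus -30 dB
print(statistics.mean(levels_db), statistics.pstdev(levels_db))
```

Other definitions of dynamic range, for example based on percentiles of the level histogram, are equally plausible and less sensitive to single outlier blocks.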

summary
lecture content
1. why are intensity-related features often in dB?
2. how does the dB scale relate to loudness?
3. what are typical intensity-related features?