ENGINEERING REPORT

A Comparison of Some Loudness Measures for Loudspeaker Listening Tests*

RONALD M. AARTS, AES Member
Philips Research Laboratories, 5600 JA, Eindhoven, The Netherlands

Simple weighting methods and the ISO loudness models are compared with listener-adjusted loudness levels. For a loudness level of about 80 phons, B-weighting appeared to be the best method while A-weighting is unreliable.

* Manuscript received 1991 June 17; revised 1991 November 15.
J. Audio Eng. Soc., Vol. 40, No. 3, 1992 March

0 INTRODUCTION

In listening tests, opinions that are formed about sound quality and stereo imaging are influenced by many factors in addition to the one that may be of specific interest; see Toole [1] for a brief overview. One of the sources of variability is loudness. The loudness balancing of loudspeakers during listening tests is considered to be very important. Among a variety of publications [1]-[8] it was recently noted by Gabrielsson et al. [8] that an increase in sound level will increase the perceived fullness and spaciousness, and will give a better clarity and fidelity.

In a previous paper [9] the calculation of the loudness of loudspeakers was discussed. Some standardized loudness calculations were compared with the traditional method relying on A-weighted sound levels and with subjective loudness measurements obtained through listening tests. One of the conclusions of that paper was that the A-weighted sound-level method was not recommended for accurate loudness balancing and that the loudness differences between the loudspeakers are hardly influenced by the program choice. The recommended measures (ISO 532) correlated well with the subjective ratings of the various subjects.

However, many consider these methods to be cumbersome and too complicated for everyday use. It is the aim of the present paper to extend the comparison of loudness measures to include other simple ones, such as the B-, C-, and D-weighting functions. It should be noted that the A-, B-, C-, and D-weighting curves are only intended for rank ordering of noises according to loudness and not for measuring absolute loudness, while the ISO 532 methods are based on psychoacoustical data of human ears and can be used to measure absolute loudness. In the following sections it will be discussed why A-weighting is not recommended, and a better alternative will be examined.

1 EQUAL LOUDNESS

In the equal-loudness-level contours for pure tones, plotted in Fig. 1, two psychoacoustic phenomena may be observed: 1) the contours are heavily frequency dependent and 2) the curves are level dependent. The latter is illustrated by Fig. 2. Fig. 2 shows that the normalized differences between the 80-phon curve and the 20-, 40-, 60-, and 100-phon curves are increasing for decreasing frequency below 200 Hz.

Fig. 1. Normal equal-loudness level contours for pure tones (binaural free-field listening, frontal incidence). From [12]. Horizontal axis: Frequency (Hz).

Fig. 2. Differences between 80-phon curve and 20-, 40-, 60-, and 100-phon curves, respectively. Difference curves have been normalized to 0 dB at 1 kHz. Horizontal axis: Frequency (Hz).

The shapes of equal-loudness contours have been used in the design of sound-level meters, which attempt to give an approximate measure of the loudness of complex sounds. Such meters contain weighting networks so that the meter does not simply sum the power at all frequencies but, instead, weights the power at each frequency according to the shape of the equal-loudness contours. At low sound levels low-frequency components contribute little to the total loudness of a complex sound, so A-weighting is used to reduce the contribution of low frequencies to the final meter reading. At high levels all frequencies contribute more or less equally to the loudness sensation, so that a more nearly linear weighting characteristic, the C network, is used [10]. B-weighting is used for intermediate levels, while D-weighting is used for very high levels, such as aircraft noise.

A-weighting, which is traditionally used for general purposes, is supposed to be an approximation of the 40-phon contour. This level is much too low for loudspeaker listening tests. When the 80-phon contour is used to obtain a weighting function by normalizing it to 0 dB at 1 kHz, the curve labeled 80-phon weighting will result, as shown in Fig. 3. As a reference, A-weighting and B-weighting are also plotted in Fig. 3. It appears, however, that at low frequencies A-weighting is too strong while B-weighting is a reasonable approximation of the 80-phon weighting curve.

Fig. 3. A-, B-, and 80-phon (free-field) weighting functions. Horizontal axis: Frequency (Hz).

Another way to demonstrate the weakness of a simple weighting in general and A-weighting in particular is the following. When a subject listens to a pure tone at 200 Hz or 2 kHz, each with the same sound-pressure level of 60 dB, each tone will give about the same loudness (Fig. 4). The A-weighted value of the 200-Hz tone does not reflect the perceived strength, however. If the subject listens to the two tones simultaneously, with a frequency separation of more than a critical band, the perceived loudness will increase by about 10 phons (GD) with respect to a single tone. (The suffix GD stands for group and diffuse field; see [9] or [11].) The A-weighted level will remain the same as for the 2-kHz tone presented alone. This is due to the too rigorous weighting at low frequencies (at higher levels), and because the addition of signals has different effects in psychoacoustics than for electrical signals. The addition rules for loudness are incorporated in the more advanced loudness measurements, as discussed in [9]. However, if loudspeakers under test are similar, then there is no serious objection to a simpler weighting.

For reference purposes, the A-, B-, C-, and D-weighting functions are plotted together in Fig. 5. In the Appendix a computer procedure to compute these functions is presented. In the following sections these global comparisons will be tested against listener-adjusted loudness levels.

2 SUBJECTIVE LOUDNESS MEASUREMENTS

To test the usefulness of the objective loudness measures, these values will be compared with listener-adjusted loudness levels. These subjective values were obtained by an experiment that is summarized here; the details can be found in [9]. Ten subjects, one at a time, listened to six different loudspeakers LS1-LS6 (the same as those used in [9]), including the standard, or reference, LS1, at a distance of 3.5 m. The loudspeakers were of different brands and covered wide ranges of price and quality. They exhibited very dissimilar frequency responses and different efficiencies, especially at low frequencies. The listening room was a soundproof room arranged and equipped as a normal living room.
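
The two-tone example of Section 1 (Fig. 4) can be checked with a few lines of level arithmetic. The sketch below is an illustration only and is not part of the original report. It assumes the nominal A- and B-weighting corrections tabulated for sound-level meters at 200 Hz and 2 kHz (-10.9 and +1.2 dB for A; -2.0 and -0.1 dB for B), and it assumes that the two tones are uncorrelated so that their levels add on a power basis. The helper name `power_sum_db` is hypothetical.

```python
import math

# Nominal weighting corrections in dB (assumed table values for sound-level meters).
A_CORR = {200: -10.9, 2000: 1.2}
B_CORR = {200: -2.0, 2000: -0.1}


def power_sum_db(levels_db):
    """Combine the levels of uncorrelated components on a power basis."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


# Both tones at 60 dB SPL, as in the text.
tone_spl = {200: 60.0, 2000: 60.0}

total_spl = power_sum_db(tone_spl.values())
total_dba = power_sum_db(tone_spl[f] + A_CORR[f] for f in tone_spl)
total_dbb = power_sum_db(tone_spl[f] + B_CORR[f] for f in tone_spl)

print(f"unweighted: {total_spl:.1f} dB")
print(f"A-weighted: {total_dba:.1f} dB(A)")
print(f"B-weighted: {total_dbb:.1f} dB(B)")
```

Under these assumptions the A-weighted level rises by only about 0.3 dB when the 200-Hz tone is added to the 2-kHz tone, whereas the text states that the perceived loudness increases by about 10 phons (GD). A power sum of weighted levels cannot reproduce that increase, which is the weakness the example is meant to show.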

Fig. 4. Levels of tones. (a) At 200 Hz (60 dB): 49 dB(A), 58 dB(B), 59 phons (GD). (b) At 2 kHz (60 dB): 61 dB(A), 60 dB(B), 60 phons (GD). (c) For both tones simultaneously (63 dB): 61 dB(A), 62 dB(B), 69 phons (GD). Horizontal axis: Frequency (Hz).

Fig. 5. A-, B-, C-, and D-weighting functions. Horizontal axis: Frequency (Hz).

The loudspeakers could not be seen by the subjects, due to an acoustically transparent but visually opaque screen. They were connected to a switching facility which contained a set of high-quality relays, remotely controlled by the subject. [Attenuators] were placed in the signal path from the CD player to the power amplifier. Each loudspeaker could be attenuated by the experimenter by adjusting the knob corresponding to the loudspeaker that was playing. The stimuli were presented by reproducing pink noise via the six different loudspeakers LS1-LS6. The subjects could compare loudspeakers LS2-LS6 to the reference loudspeaker LS1 as often as they desired. The loudspeakers LS2-LS6 were to be matched by the subjects so that they perceived a loudness level equal to that of the standard. The subjects gave a signal to the experimenter to lower or raise the volume of the loudspeaker under test. When the subject was satisfied with all loudness levels (which took approximately 10 min), the level of each attenuator was stored. This was done once per subject. These values were averaged (over the 10 subjects) and are hereafter referred to as $L_{\mathrm{sub}\,j}$.

In a previous experiment [9] it was shown that the subjects could reproduce this task, even after a retention period of 15 min, with good accuracy. The reference loudspeaker was used as an anchor or standard. Its volume setting remained constant during all the tests, resulting in a loudness level of 80 phons (GD) for pink noise. See Table 1 for a comparison with other measures.

Table 1. Comparison of some loudness measures for pink-noise source with SPL of 48 dB in each one-third octave in the range of 20 Hz to 20 kHz.

    SPL      Sound level                          ISO 532A        ISO 532B
    (dB)     (dBA)    (dBB)    (dBC)    (dBD)     [phons (OD)]    [phons (GD)]
    62.91    59.93    60.31    61.69    67.04     75.71           80.33

2.1 Results

To compare the results of the various methods for the loudspeakers tested the error was calculated as

$$\Delta_{mj} = (L_{m1} - L_{mj}) - L_{\mathrm{sub}\,j} \qquad (1)$$

where $L_{m1}$ is the sound level of the reference loudspeaker (LS1) using method m, $L_{mj}$ is the sound level of the jth loudspeaker using method m, and $L_{\mathrm{sub}\,j}$ is the averaged relative level adjusted by the subjects for the jth loudspeaker. The results of the listening test, using Eq. (1), are summarized in Table 2. (The first column is the unweighted sound-pressure level.) The entry HT2 is Hotelling's $T^2$ [15], given as

$$T_m^2 = \delta_m^{t}\,\mathrm{cov}^{-1}\,\delta_m \qquad (2)$$

where $\delta_m$ is the vector of differences of method m (the columns of Table 2) and cov is the covariance matrix of the subjects' ratings. A large value of $T^2$ indicates a large deviation from the subjects' ratings. Clearly, Table 2 shows that D-weighting is not applicable. The second worst method is the A-weighted sound level. The simple B-weighting is surprisingly the best in this test.

Table 2. Difference between objective and subjective measurements.

             SPL       Sound level                              ISO 532A     ISO 532B
             (dB)      (dBA)     (dBB)     (dBC)     (dBD)      [dB (OD)]    [dB (GD)]
    LS2       0.530    -1.140    -0.300     0.490    -1.690     -0.050       -0.460
    LS3       1.235    -1.175     0.405     1.235    -1.845     -0.555       -0.355
    LS4       0.130    -1.360     0.080     0.290    -1.910     -1.130       -0.690
    LS5       1.135    -1.875    -0.185     1.165    -2.785     -1.295       -0.995
    LS6      -0.450    -1.530    -1.580    -0.670    -0.690     -0.030       -0.750
    HT2      13.62     48.22      4.16     14.58     74.35      18.90         6.54
    alpha    <10^-5    <10^-15   ~0.7      <10^-6    <10^-15    <10^-1       ~0.1

The statistical significance of the data presented in Table 2 can be tested against the following zero hypothesis: "The differences between the various methods (unweighted; A-, B-, C-, and D-weighted; ISO-A and ISO-B) and the subjective ratings are due to random variations only." Using the $T^2$ values from Table 2, one cannot reject the zero hypothesis for B-weighting and ISO 532B. The hypothesis is rejected for the other five methods. The entry $\alpha$ in Table 2 denotes the level of significance of the $T^2$ test, which is the probability of making the decision to reject the zero hypothesis when in fact it is true (type I error). The power of this test cannot be calculated explicitly. However, it can be shown that the power of the present test is much higher than the power of a one-dimensional test and is sufficient to reject some methods.

One may conclude that the B-weighting and ISO 532B methods provide results similar to those of the subjective assessments. The five other methods are not consistent with the subjective ratings. It should be noted that the ISO 532 method is intended for general absolute loudness measures applicable for various levels and sound sources, while in this present case only relative loudness measures of comparable sound sources are of interest.

3 CONCLUSIONS

Experimental evaluations were made of seven measurement techniques to identify those that would be useful for the adjustment of loudness levels of loudspeakers for listening tests, at a level of 80 phons. The most satisfactory results were obtained by the use of a B-weighted measure of sound level. This provided results similar to those derived from subjective adjustments by a population of 10 subjects. The elaborate ISO 532B method also gave good results. The A-weighted measure yielded poor results and therefore is not recommended for accurate loudness balancing.

4 REFERENCES

[1] F. E. Toole, "Subjective Evaluation: Identifying and Controlling the Variables," in Proc. AES 8th Int. Conf. (Washington, DC, 1990 May 3-6), pp. 95-100.
[2] F. E. Toole, "Listening Tests--Turning Opinion into Fact," J. Audio Eng. Soc. (Engineering Reports), vol. 30, pp. 431-445 (1982 June).
[3] F. E. Toole, "Subjective Measurements of Loudspeaker Sound Quality and Listener Performance," J. Audio Eng. Soc., vol. 33, pp. 2-32 (1985 Jan./Feb.).
[4] F. E. Toole, "Loudspeaker Measurements and Their Relationship to Listener Preferences, Parts 1 and 2," J. Audio Eng. Soc., vol. 34, pp. 227-235 (1986 Apr.); pp. 323-348 (1986 May).
[5] A. Illényi and P. Korpássy, "Correlation between Loudness and Quality of Stereophonic Loudspeakers," Acustica, vol. 49, pp. 334-336 (1981 Dec.).
[6] Y. Tannaka and T. Koshikawa, "Correlations between Soundfield Characteristics and Subjective Ratings on Reproduced Music Quality," J. Acoust. Soc. Am., vol. 86 (1989 Aug.).
[7] A. Gabrielsson and B. Lindström, "Perceived Sound Quality of High-Fidelity Loudspeakers," J. Audio Eng. Soc., vol. 33, pp. 33-53 (1985 Jan./Feb.).
[8] A. Gabrielsson, B. Hagerman, T. Bech-Kristensen, and G. Lundberg, "Perceived Sound Quality of Reproductions with Different Frequency Responses and Sound Levels," J. Acoust. Soc. Am., vol. 88, pp. 1359-1366 (1990 Sept.).
[9] R. M. Aarts, "Calculation of the Loudness of Loudspeakers during Listening Tests," J. Audio Eng. Soc., vol. 39, pp. 27-38 (1991 Jan./Feb.).
[10] B. C. J. Moore, An Introduction to the Psychology of Hearing (Academic Press, London, 1982).
[11] ISO 532-1975(E), "Acoustics--Method for Calculating Loudness Level," 1st ed. 1975, Int. Standards Organization (1977).
[12] ISO 226-1987(E), "Acoustics--Normal Equal-Loudness Level Contours," Int. Standards Organization (1987).
[13] IEC 651, "Sound Level Meters," Int. Electrotechnical Commission, Geneva, Switzerland (1979).
[14] IEC 537, "Frequency Weighting for Measurements of Aircraft Noise (D-Weighting)," Int. Electrotechnical Commission, Geneva, Switzerland (1976).
[15] B. J. Winer, Statistical Principles in Experimental Design (McGraw-Hill, New York, 1962).

APPENDIX
COMPUTATION OF A-D WEIGHTING FUNCTIONS

A computer procedure is listed in order to compute the A-, B-, C-, and D-weighting functions. The time constants for the filters for A-, B-, and C-weighting are from [13], those for D-weighting from [14].

    PROCEDURE abcdweight(f: (* f = freq. (Hz) *) REAL; VAR aw, bw, cw, dw: REAL);
    (* aw, bw, cw, dw: linear amplitude gains, equal to 1 at 1 kHz *)
    CONST (* normalization constants for unity gain at 1 kHz *)
          ca = 8.0002266419162E-01; cb = 9.8767069950664E-01;
          cc = 6.6709544848173E-09; cd = 6.8966888496476E-05;
          (* pole (p) and zero (z) frequencies in Hz *)
          p1c = 20.6;    p2c = 12200.0;
          p1a = 107.7;   p2a = 737.9;
          p1b = 158.5;
          p1d = 282.7;   p2d = 1160.0;   p3d1 = 1712.0;
          p3d2 = 2628.0; z1d1 = 519.8;   z1d2 = 876.2;
          (* squared pole and zero frequencies *)
          ps1c = sqr(p1c); ps2c = sqr(p2c); ps1a = sqr(p1a); ps2a = sqr(p2a); ps1b = sqr(p1b);
          ps1d = sqr(p1d); ps2d = sqr(p2d); ps3d1 = sqr(p3d1);
          ps3d2 = sqr(p3d2); zs1d1 = sqr(z1d1); zs1d2 = sqr(z1d2);
    VAR f2, hw: REAL;
    BEGIN
      f2 := sqr(f);
      cw := f2 / (f2 + ps1c) / (f2 + ps2c) / cc;
      aw := cw * f2 / sqrt((f2 + ps1a) * (f2 + ps2a)) / ca;
      bw := cw * f / sqrt(f2 + ps1b) / cb;
      (* D-weighting: complex pole pair, then complex zero pair *)
      hw := 1 / (sqr(ps3d1 + ps3d2 - f2) + 4 * f2 * ps3d1);
      hw := hw * (sqr(zs1d1 + zs1d2 - f2) + 4 * f2 * zs1d1);
      dw := f * sqrt(hw / (f2 + ps1d) / (f2 + ps2d)) / cd;
    END;

THE AUTHOR

Ronald M. Aarts was born in Amsterdam, The Netherlands, in 1956. He received a B.Sc. degree in electrical engineering in 1977, then joined the optics group of Philips Research Laboratories where he was engaged in research into servos and signal processing for use in both video long-play players and Compact Disc players. In 1984 he joined the acoustics group of the Philips Research Laboratories and was engaged in the development of CAD tools for loudspeaker systems. Mr. Aarts has published a number of technical papers and reports and holds several patents in his field. He is a member of the AES and the ASA.
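
As an illustration of the Appendix procedure, the following Python sketch transcribes it and applies it to the pink-noise source of Table 1. It is a hedged reading of the listing, not the author's own program. The function names (`abcd_weight`, `to_db`, `weighted_level`) are hypothetical. The band layout is an assumption: 31 one-third-octave bands with exact base-10 center frequencies, each weighted at its center frequency and summed on a power basis. The report does not state how the Table 1 values were computed.

```python
import math

# Pole and zero frequencies (Hz) and 1-kHz normalization constants,
# transcribed from the Appendix procedure.
P1C, P2C = 20.6, 12200.0
P1A, P2A = 107.7, 737.9
P1B = 158.5
P1D, P2D, P3D1, P3D2 = 282.7, 1160.0, 1712.0, 2628.0
Z1D1, Z1D2 = 519.8, 876.2
CA, CB = 8.0002266419162e-01, 9.8767069950664e-01
CC, CD = 6.6709544848173e-09, 6.8966888496476e-05


def abcd_weight(f):
    """Return the linear gains (aw, bw, cw, dw) at frequency f in Hz."""
    f2 = f * f
    cw = f2 / (f2 + P1C**2) / (f2 + P2C**2) / CC
    aw = cw * f2 / math.sqrt((f2 + P1A**2) * (f2 + P2A**2)) / CA
    bw = cw * f / math.sqrt(f2 + P1B**2) / CB
    hw = 1.0 / ((P3D1**2 + P3D2**2 - f2) ** 2 + 4.0 * f2 * P3D1**2)
    hw *= (Z1D1**2 + Z1D2**2 - f2) ** 2 + 4.0 * f2 * Z1D1**2
    dw = f * math.sqrt(hw / (f2 + P1D**2) / (f2 + P2D**2)) / CD
    return aw, bw, cw, dw


def to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


# Assumed setup for the Table 1 check: 31 bands from about 20 Hz to 20 kHz,
# each carrying 48 dB SPL.
centers = [1000.0 * 10.0 ** (n / 10.0) for n in range(-17, 14)]
band_spl = 48.0


def weighted_level(index):
    """Power-sum the band levels; index 0..3 selects A, B, C, D; None is unweighted."""
    total = 0.0
    for fc in centers:
        gain_db = 0.0 if index is None else to_db(abcd_weight(fc)[index])
        total += 10.0 ** ((band_spl + gain_db) / 10.0)
    return 10.0 * math.log10(total)


spl = weighted_level(None)
dba, dbb, dbc, dbd = (weighted_level(i) for i in range(4))
print(f"SPL {spl:.2f} dB, A {dba:.2f}, B {dbb:.2f}, C {dbc:.2f}, D {dbd:.2f}")
```

With this band layout the unweighted sum is 48 + 10 log10(31), or about 62.91 dB, which agrees with the SPL entry of Table 1. The weighted sums should land close to the corresponding Table 1 entries, although small differences are possible because the band layout is assumed.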