Journal of Voice, Vol. 16, No. 1, pp. 1-7
© 2002 The Voice Foundation

Normative Voice Range Profiles of Male and Female Professional Voice Users

L. Heylen, F. L. Wuyts, F. Mertens, M. De Bodt, and P.H. Van de Heyning

Department of Otolaryngology, Head and Neck Surgery, Department of Communication Disorders, University Hospital, University of Antwerp, Antwerp, Belgium

Summary: Normative voice range profiles (VRP) are calculated for a group of male and female teachers, based on 43 and 46 recordings, respectively. All individuals had healthy voices. These normative VRPs contain 95% prediction intervals for both frequency and intensity. They are based on a series of mathematical transformations of the original individual VRPs in order to maintain in the normative VRPs the typical oval VRP shape, including the dip between modal and falsetto register. The normative VRPs presented are directly applicable in the clinical practice of otolaryngologists and speech-language pathologists.

Key Words: Voice range profile, Normal voice, Teachers, Professional voice users, Healthy voice.

Accepted for publication May 18, 2001.
Address correspondence and reprint requests to Professor Dr. Floris L. Wuyts, Department of Otolaryngology, Head and Neck Surgery, Department of Communication Disorders, University Hospital, University of Antwerp, Belgium, Wilrijkstraat 10, B-26 Edegem. e-mail: floris.wuyts@ua.ac.be

INTRODUCTION

The voice range profile (VRP) is a graphical representation that reflects a speaker's ability to produce maximum and minimum vocal intensities at fixed percentages of his maximum frequency range of phonation under controlled conditions of vowel production and mouth opening.1 Measurements derived from the voice range profile appear to be valuable for the assessment of normal and pathological voices since they reflect not just the patient's vocal frequency and intensity capabilities, but also the interaction between these two variables.2 This information is used by the voice clinician for both diagnostic and therapeutic purposes.

Several studies focus on the different characteristics of the VRP, such as highest and lowest frequency or intensity.3-5 Individual characteristics, such as the highest frequency or lowest intensity, may indeed be affected by diseased vocal folds, but most of these characteristics have a low sensitivity and specificity. Therefore, some authors have successfully circumvented this difficulty by introducing an index that combines several variables, thereby increasing the sensitivity and specificity.6 However, the need still exists to inspect the global VRP when assessing the vocal capability of a pathological voice. Therefore, several authors1,7-11 established normative or representative VRPs for different populations. No statistical analysis, however, was performed on these data.

Another type of reported normative VRP is the rescaled VRP.3,4,12,13 In this category we find VRPs in which every individual frequency range is put on a percentage scale with intervals of 10%. By means of interpolation, the minimum and maximum intensities are calculated at these intervals. This method requires every individual VRP in
the clinical practice to be rescaled before it can be compared with the above-mentioned normative VRP. Sulter14 constructed a normative VRP based on the rescaling methodology, but by applying Fourier transformation he succeeded in respecting the original shape. He calculated minimum and maximum intensities with standard deviations for each 10% frequency level. Schultz-Coulon15 published a comparable VRP but without standard deviations or similar measures of variability.

Another approach is established by superposition of the VRPs of several individuals.16 This results in a contour that comprises all the included VRPs. As such, a long and flat "super VRP" is obtained, which lacks the normal oval shape.

Finally, a fourth type of normative VRP is based on statistical calculation of median or mean values.17 At each frequency the median was calculated for both maximal and minimal intensities. A similar procedure was applied by Pabon,5 who used the means. Although these methodologies are an improvement over the exemplary or superimposed VRPs, the lack of standard deviations or other measures of variability makes it difficult to compare a patient's VRP with one of these standard VRPs. Additionally, the calculation of mean or median values for intensities may be easy to do in the mid regions of the VRP, but at the lower and higher frequencies this calculation becomes quite unreliable due to the oval shape of the VRP.

Therefore, we present a normative VRP that contains 95% prediction intervals for both frequency and intensity, and is based on a series of mathematical transformations of the original VRP. By using this method, the typical shape of the VRP is respected. The subject group, consisting of teachers, for whom we first applied this method to construct the normative VRP, can be described as professional voice users.18 According to Koufman and Isaacson,18 professional voice users are characterized by the fact that any voice problem seriously hampers the execution of their daily work.

MATERIALS AND METHODS

Subjects
The VRPs of 43 normal male teachers (mean age = years, range = 22-54 years) and 46 normal female teachers (mean age = 31 years, range = 22-51 years) were recorded. Inclusion was based on a laryngoscopic inspection with a rigid scope by an otolaryngologist to exclude vocal fold pathology such as nodules, polyps, cysts, etc. The perceptual evaluation of voice quality was performed by a trained speech-language pathologist (FM).

Data recording
VRP recording was performed using the Phonomat automatic voice field measuring system (Homoth Medizin-elektronik GmbH & Co. KG, Stimmfeldmessung, Version 3.0) according to the procedure recommended by the Union of European Phoniatricians.1 The Phonomat records the mean fundamental frequency (F0, in Hz) and sound pressure level [dB(A)] of a sustained vowel production and provides the patient with visual feedback. The subjects were instructed to inhale in a comfortable way19 and to sustain the vowel /!/ for at least two seconds using a habitual pitch and loudness, as used during daily speech. Next, they were encouraged to explore their voice range possibilities by decreasing and increasing pitch and loudness as a warm-up exercise.20 A rough contour was then generated by having the subject vocalize at the lowest and highest F0, using the lowest and highest intensities at each frequency extreme. The upper and lower contours were completed for each semitone within the subject's frequency range, so that intensities were recorded for each of the 12 semitones per octave.

PROCEDURE FOR GENERATION OF THE NORMATIVE VRP

The detailed mathematical procedure is explained in the Appendix. The main steps of the normative VRP calculation are summarized here.

Step 1.
The frequencies of the registered VRPs are first converted into semitones in order to avoid the disadvantages of the logarithmic scale for the frequencies.

Step 2. The second manipulation makes all VRPs start at the very same semitone. To this end, each VRP is shifted over a number of semitones, so that all the initial semitones coincide with the lowest semitone of the entire group of VRPs. The number of semitones depends on the lowest semitone of each individual VRP, and so this number differs for the different VRPs. Next, the average and standard error of the number of semitones needed to shift (or translate) all VRPs are calculated. This average translation is denoted as T (see Appendix).

Step 3. Each VRP is compressed so that the final semitones of all VRPs also coincide. This is achieved by multiplying each VRP semitone range by a factor, which again differs for each VRP, since they all have different semitone ranges. The average and standard error of these compression factors are then calculated; the average is represented by C (see Appendix). Figure 1 presents the VRPs of this group after translation and compression.

Step 4. The fourth phase in the process is the interpolation of the individual VRPs in order to have the corresponding intensities of all VRPs for each semitone. Then we calculate the mean upper and lower intensities and the 95% prediction intervals (µ ± 2s) for each semitone separately. However, this yields a mean VRP that needs to be rescaled again. To this end, this norm VRP is stretched by a factor 1/C, since it was first compressed by the average factor C. Then the stretched VRP is shifted over T semitones in the direction opposite to the original shift. Based on the standard errors of T and C, the error on the initial and final semitones is determined. Finally, the semitones are converted again into frequencies. The result is a normative VRP.

RESULTS

The norm VRPs were obtained by shifting the VRPs over an average number of T = (10.5 ± 0.4) semitones for the population of 46 female teachers. Then an average compression of C = 0.697 ± 0.015 was needed to align the individual VRPs for the highest semitone. For the 43 male teachers the average translation was T = (5.8 ± 0.5) semitones and the average compression was C = 0.662 ± 0.017. Figure 2 represents the normative VRPs for female teachers, where the mean intensities (± two standard deviations) are plotted for each semitone.
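As an illustration of Steps 1 to 3, the following minimal Python sketch shows how a translation T and a compression C with their standard errors could be obtained from the frequency limits of individual VRPs. It is not the software used in the study. The function names, the three subjects and their frequency limits are hypothetical, and two choices are assumptions: semitones are rounded to integers (as in the worked example of the Appendix), and the standard error is taken as the sample standard deviation divided by the square root of n.

```python
import math
from statistics import mean, stdev

# One semitone on a log10 frequency axis: log10(2) / 12, about 0.0251.
SEMITONE_LOG10 = math.log10(2) / 12


def hz_to_semitone(f_hz, f_base):
    """Step 1: semitones above the reference frequency f_base."""
    return math.log10(f_hz / f_base) / SEMITONE_LOG10


def shift_and_compression(vrp_limits_hz, f_base):
    """Steps 2-3: per-subject shifts and compression factors.

    vrp_limits_hz holds one (lowest_hz, highest_hz) pair per subject.
    Rounding to whole semitones is an assumption of this sketch.
    """
    lows = [round(hz_to_semitone(lo, f_base)) for lo, _ in vrp_limits_hz]
    highs = [round(hz_to_semitone(hi, f_base)) for _, hi in vrp_limits_hz]
    ranges = [hi - lo for lo, hi in zip(lows, highs)]
    smallest = min(ranges)                      # smallest range in the group
    factors = [smallest / r for r in ranges]    # compression factor per subject
    return lows, factors


def mean_and_standard_error(values):
    """Mean and standard error of the mean (assumed: sample SD / sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical frequency limits in Hz for three subjects (illustration only).
subjects = [(147.0, 1046.0), (165.0, 880.0), (131.0, 988.0)]
shifts, factors = shift_and_compression(subjects, f_base=77.8)
T, se_T = mean_and_standard_error(shifts)
C, se_C = mean_and_standard_error(factors)
```

With real data, `subjects` would hold one pair of frequency limits per teacher, and `f_base` would be the reference frequency of the group (77.8 Hz for the female teachers).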
Based on the normative VRP of Figure 2, we constructed a graph (Figure 3) with three contours, defining the normative VRP as well as the inner (mean − 2 s) and outer (mean + 2 s) limit VRPs. Similarly, the normative VRP for male teachers is depicted in Figure 4, and Figure 5 represents the mean VRP together with the VRP contours for the inner and outer limits of normality. In order to be considered normal, a teacher's VRP should be situated within both limits.

FIGURE 1. The VRPs of the female teachers group after translation and compression. (Horizontal axis: semitones.)

FIGURE 2. Norm VRPs of female teachers (N = 46). The vertical and horizontal error bars indicate the 95% prediction intervals. These intervals contain 95% of the intensity/frequency of the normal population. Note: The figures in large format can be requested from floris.wuyts@ua.ac.be

FIGURE 3. Normative VRP for female teachers with three contours, defining the normative VRP (dots), as well as the inner (mean − 2 s) and outer (mean + 2 s) limit VRPs (lines).

FIGURE 4. Normative VRPs of male teachers (N = 43). The vertical and horizontal error bars indicate the 95% prediction intervals. These intervals contain 95% of the intensity/frequency of the normal population.

FIGURE 5. Normative VRP for male teachers with three contours, defining the normative VRP (dots), as well as the inner (mean − 2 s) and outer (mean + 2 s) limit VRPs (lines).

DISCUSSION

In daily practice, the clinician prefers to make use of visual tools when treating the patient with voice problems. Indeed, human judgment relies heavily on pattern recognition. This is probably one of the reasons for the success of the VRP. Consequently, next to normative data regarding specific characteristics of the VRP, such as highest frequency, lowest intensity, slope of the upper contour, etc.,6 the need exists to establish normative VRPs focusing on the entire shape. Such a normative VRP serves as a tool to estimate the degree of deviance of an individual subject with vocal pathology in reference to the VRP of a healthy subject. As such, the effectiveness of therapy can be assessed by comparing subsequent VRPs.

The normative VRPs of the professional voice users, i.e., the male and female teachers, that we present in this study are based on the calculation of the mean values and 95% prediction intervals, after mathematical translation and compression of the individual VRPs. Note that the shape of the VRP presented here is in concordance with that of an individual subject. For example, the dip between modal and falsetto register is clearly present.

The final goal of the calculation of a normative VRP is to compare an individual's VRP with this norm in order to make a statement about the degree of pathology of the patient's voice, and to evaluate therapy outcome. The innermost and outermost contours on Figures 3 and 5, or the vertical error bars on Figures 2 and 4, represent the 95% prediction intervals. Within these limits 95% of the healthy population is contained. When VRP points of an individual subject lie outside these limits, the VRP of this individual is deviant from normality. The more points that lie outside the normal ranges, the more this VRP can be regarded as pathological.
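The comparison of an individual VRP with the normative limits described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout, the function name `count_points_outside`, and all intensity values are hypothetical, and semitones that are not covered by the norm are simply skipped rather than counted as deviant.

```python
def count_points_outside(patient, norm):
    """Count patient VRP points that fall outside the normative limits.

    patient: {semitone: (softest_dB, loudest_dB)}
    norm:    {semitone: {"lower": (mean - 2s, mean + 2s),
                         "upper": (mean - 2s, mean + 2s)}}
    Returns (number of points outside, number of points compared).
    """
    outside = 0
    compared = 0
    for semitone, (softest, loudest) in patient.items():
        limits = norm.get(semitone)
        if limits is None:
            continue  # semitone not covered by the norm: not evaluated here
        for value, contour in ((softest, "lower"), (loudest, "upper")):
            low, high = limits[contour]
            compared += 1
            if not low <= value <= high:
                outside += 1
    return outside, compared


# Hypothetical normative limits and patient data in dB(A), illustration only.
norm = {
    12: {"lower": (48.0, 60.0), "upper": (70.0, 92.0)},
    13: {"lower": (47.0, 59.0), "upper": (72.0, 94.0)},
}
patient = {12: (62.0, 80.0), 13: (55.0, 68.0), 40: (70.0, 75.0)}
outside, compared = count_points_outside(patient, norm)
```

The ratio of `outside` to `compared` gives a rough count-based impression of deviance; the paper itself does not define a numerical threshold, so any cut-off would be the clinician's choice.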
CONCLUSION

The normative VRPs presented here are directly applicable in the clinical practice of otolaryngologists and speech-language pathologists. They offer a basis for reference in the evaluation of individual voices.

REFERENCES

1. Schutte HK, Seidner W. Recommendation by the Union of European Phoniatricians (UEP): Standardizing Voice Area Measurement/Phonetography. Folia Phoniatr. 1983;35:286-288.
2. Van de Heyning PH, Remacle M, Van Cauwenberge P. Functional assessment of voice disorders. Report of the Royal Belgian Society for Ear, Nose, Throat, Head and Neck Surgery. Acta Oto-Rhino-Laryngologica Belgica. 1996;:249-396.
3. Coleman RF, Mabis JH, Hinson JK. Fundamental frequency-sound pressure level profiles of adult male and female voices. J Speech Hear Res. 1977;20:197-204.
4. Gramming P. The Phonetogram: An Experimental and Clinical Study [doctoral dissertation]. Malmö: University of Malmö; 1988.
5. Pabon JPH. Objective voice-quality parameters in the computer phonetogram. J Voice. 1991;5(3):203-216.
6. Heylen L, Wuyts FL, Mertens F, De Bodt M, Pattyn J, Croux C, et al. Evaluation of the vocal performance of children using a voice range profile index. J Speech Lang Hear Res. 1998;41(2):232-238.
7. Klingholz F, Martin F. Die quantitative Auswertung der Stimmfeldmessung. Sprache-Stimme-Gehör. 1983;7:106.
8. Komiyama S, Watanabe H, Ryu S. Phonetographic relationship between pitch and intensity of the human voice. Folia Phoniatr. 1984;36:1-7.
9. Pedersen MF. Change of voice in puberty in choir girls. Acta Otolaryngol (Stockh). 1984;Suppl. 412:46-49.
10. Uloza V, Siupsinskiene N. Voice range profile as a tool for clinical voice assessment and monitoring. In: Kotby MN, ed. Congress Proceedings of XXIII IALP Congress, Cairo, August 1995. Cairo: IALP; 1995:37.
11. Wolf SK, Stanley D. Quantitative studies on the singing voice. J Acoust Soc Am. 1935;6:255-266.
12. Kotby MN, Abul-Ela MY, Orabi AA. Voice range profile as a quantitative measure of vocal function: normative data. In: Kotby MN, ed. Congress Proceedings of XXIII IALP Congress, Cairo, August 1995. Cairo: IALP; 1995:46-49.
13. Awan SM. Phonetographic profiles and Fo-SPL characteristics of untrained versus trained vocal groups. J Voice. 1991;5:41.
14. Sulter AM. Variation of Voice Quality Features and Effects of Voice Training in Males and Females [doctoral dissertation]. Groningen: R.U. Groningen; 1996.
15. Schultz-Coulon HJ. Stimmfeldmessung. Berlin, Heidelberg: Springer-Verlag; 19.
16. Hacki T. Die Beurteilung der quantitativen Sprechstimmleistungen. Das Sprechstimmfeld im Singstimmfeld. Folia Phoniatr. 1988;:1-196.
17. Böhme G, Stuchlick G. Voice profiles and standard voice profile of untrained children. J Voice. 1995;9:304-307.
18. Koufman JA, Isaacson G. The spectrum of voice dysfunction. In: Koufman JA, Isaacson G, eds. Voice Disorders. Philadelphia, Pa: W.B. Saunders; 1991.
19. Coleman RF. Sources of variation in phonetograms. J Voice. 1993;7:1-14.
20. Sundberg J. Phonatory breathing physiology behind vocal pedagogy. J Res Singing. 1987;10:3-21.

APPENDIX

This section describes the procedure for the normative VRP calculation for female teachers. The procedure consists of several phases.

Phase 1. Conversion of frequency to semitones
If $f_{min}$ is the lowest frequency of all the VRPs, then we denote by $f_b$ the frequency that corresponds with the semitone just below $f_{min}$. Each frequency $f$ of the VRPs is then converted into semitones based on the equation

$$ST = \frac{\log_{10}(f/f_b)}{0.0251}.$$

The number 0.0251 is the factor used to convert Hz to semitones, based on the fact that an octave containing 12 semitones corresponds to a doubling of the frequency, so $\log_{10}(2)/12 = 0.0251$. The lowest frequency that is present among all female teachers is $f_{min} = 84.2$ Hz, so $f_b = 77.8$ Hz. (Comment: This frequency is indeed very low for a female person. The next lowest frequency registered among the female teachers was Hz. However, there was no reason to exclude this person from the data.) The semitone that corresponds with, e.g., 440 Hz then equals $\log_{10}(440/77.8)/0.0251 = 30$. Using $f_b$ rather than $f_{min}$ as the reference was chosen to prevent the lowest semitone of all the VRPs from being zero [$\log_{10}(1) = 0$]. Consequently the pairs of points (frequency, intensity) of all VRPs are converted into (semitones, intensity).

Phase 2. Translation
We define $L_i$ as the lowest semitone for subject $i$.
The index $i$ ranges from 1 to $n$, with $n$ the total number of subjects. The VRP of each subject $i$ is shifted over $L_i$ semitones. So after translation, the semitones of that subject are

$$ST^{T}_{i,k} = ST_{i,k+L_i} - L_i.$$

In this expression $k$ ranges from 0 to $R_i$, with $R_i$ the range (in semitones) of subject $i$. After translation, the semitones of subject $i$ then range from 0 to $R_i$. This procedure is repeated for all subjects. Accordingly, the average translation is

$$T = \frac{1}{n}\sum_{i=1}^{n} L_i.$$

In the example of the female teachers, the VRP of an individual $j$ that starts at 147 Hz and ends at 1046 Hz is converted to semitones 11 and 45, respectively, since $f_b$ is 77.8 Hz. The range is then $R_j = 34$ semitones. Subsequently this VRP is shifted over $L_j = 11$ semitones.

Phase 3. Compression
If $r$ is the smallest range of all VRPs, then $r = \min(R_i)$ over all subjects ($i = 1, \ldots, n$). The VRP of a subject is now compressed by rescaling the semitones 0 to $R_i$ into 0 to $r$, using the compression factor $C_i = r/R_i$. This way all VRPs have the same range $r$. Consequently, the compression reduces the individual semitones of a subject $i$ as

$$ST^{C}_{i,m} = ST^{T}_{i,m} \cdot \frac{r}{R_i},$$

with index $m$ going from 0 to $R_i$. Accordingly, the average compression is calculated as

$$C = \frac{1}{n}\sum_{i=1}^{n} C_i.$$

In the specific case of the VRP of subject $j$ from the population of female teachers, $C_j$ equals 20/34, because the smallest VRP of the female teachers has a range of 20 semitones. The semitones 0, 1, 2, ..., 34 of this subject are compressed to $0,\ 1 \cdot 20/34,\ 2 \cdot 20/34,\ \ldots,\ 34 \cdot 20/34$.

Phase 4. Interpolation
Since this compression differs from one subject to another because of the differences in $R_i$, the rescaled semitones are not the same for all individuals. Consequently the intensities have to be interpolated onto a detailed semitone scale with small intervals, so that the semitones occurring in all individuals are part of this scale. This interpolation was performed on the basis of the detailed semitone scale from 0 to $r$ in steps of 0.05 semitones.
This scale contains $p = r/0.05$ steps.

Phase 5. Calculation of the mean intensities
For each step on the semitone scale we now calculate the mean value, the standard deviation ($s$), and the 95% prediction interval (mean ± 2 $s$) of the intensity.
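Phases 3 to 5 can be sketched as follows in Python for a single contour (upper or lower). This is a minimal illustration, not the authors' implementation: the function names and the three toy contours are hypothetical, linear interpolation is an assumption (the text does not name the interpolation method), and each input list is assumed to be already translated so that its index equals the semitone counted from the subject's lowest semitone.

```python
from statistics import mean, stdev


def interpolate(xs, ys, x):
    """Linearly interpolate ys (given at increasing xs) at position x."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            fraction = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + fraction * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the range covered by xs")


def normative_contour(contours, step=0.05):
    """Mean and mean +/- 2 s of one contour on the detailed semitone scale.

    contours: one list of intensities in dB(A) per subject, one value per
    semitone, index 0 being the subject's lowest semitone (after Phase 2).
    Returns a list of (semitone, mean, mean - 2 s, mean + 2 s) tuples.
    """
    r = min(len(c) - 1 for c in contours)        # smallest range (Phase 3)
    p = round(r / step)                          # number of steps (Phase 4)
    grid = [k * r / p for k in range(p + 1)]     # 0, 0.05, ..., r
    rows = []
    for intensities in contours:
        subject_range = len(intensities) - 1     # R_i of this subject
        compressed = [m * r / subject_range for m in range(subject_range + 1)]
        rows.append([interpolate(compressed, intensities, x) for x in grid])
    result = []
    for k, x in enumerate(grid):
        column = [row[k] for row in rows]        # all subjects at this step
        mu, s = mean(column), stdev(column)      # Phase 5
        result.append((x, mu, mu - 2 * s, mu + 2 * s))
    return result


# Three hypothetical contours in dB(A) with ranges of 2, 4 and 2 semitones.
toy = [
    [50.0, 60.0, 70.0],
    [52.0, 57.0, 62.0, 67.0, 72.0],
    [54.0, 64.0, 74.0],
]
norm_contour = normative_contour(toy)
```

In practice the function would be called twice per group, once with the upper (maximum intensity) contours and once with the lower (minimum intensity) contours.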
Phase 6. Expansion and translation
The detailed semitone scale $ST^{D}$ is subsequently expanded again with a factor $1/C$ and translated with a term $T$:

$$ST^{NORM}_{p} = ST^{D}_{p} \cdot \frac{1}{C} + (T + 1),$$

where the index $p$ ranges from 0 to $r/0.05$. The 1 in the term $T + 1$ is necessary because of the original translation of the VRP to the semitone just under the semitone corresponding with the lowest occurring frequency. In the example of the female teachers, the detailed semitone scale (0, 0.05, 0.10, ..., 20) is expanded and translated to 11.54, 11.61, 11.69, 11.76,...,.84,.92.

Phase 7. Conversion to frequencies
After the selection of the integer semitones from the expanded and translated detailed semitone scale, with the corresponding mean intensities, these semitones are converted again into the corresponding frequencies by means of the following formula:

$$f_N = f_b \cdot 10^{\,ST^{NORM}_{N} \cdot 0.0251},$$

in which $N$ indicates the integer semitones among all $p = r/0.05$ available semitones. In this way, we obtain for the female teachers the frequencies 155, 165, 174, ..., 828 Hz, corresponding with the semitones 12, 13, 14, ..., 41. Since the lowest semitone was 11.54, this one is also added to the normative VRP.

Figures 2 and 4 represent the norm VRPs of female and male teachers, respectively. The vertical error bars indicate the 95% prediction interval. These intervals contain 95% of the intensity of the normal population. Similarly, the horizontal error bars represent the 95% prediction intervals for the initial and final frequencies. The error on the frequencies was calculated on the basis of the error-propagation formula for the semitones (see the formula for $ST^{NORM}$):

$$SE_{ST^{NORM}} = \sqrt{\left(\frac{\partial ST^{NORM}}{\partial C}\right)^{2} SE_{C}^{2} + \left(\frac{\partial ST^{NORM}}{\partial T}\right)^{2} SE_{T}^{2}}.$$

Subsequently the semitones were converted into frequencies again.
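As a worked illustration of Phases 6 and 7 and of the error propagation, the sketch below evaluates the formulas with the values reported for the female teachers. The function names are hypothetical, and two points are assumptions: the ± values reported in the Results are treated as the standard errors of T and C, and the partial derivatives are derived here from the Phase 6 formula (−ST_D/C² with respect to C, and 1 with respect to T).

```python
import math

F_BASE_HZ = 77.8             # f_b for the female teachers (Appendix, Phase 1)
LOG10_PER_SEMITONE = 0.0251  # log10(2) / 12, rounded as in the paper


def to_normative_semitone(st_detailed, c_mean, t_mean):
    """Phase 6: expand by 1/C and translate by T + 1."""
    return st_detailed / c_mean + (t_mean + 1)


def semitone_to_hz(semitone, f_base=F_BASE_HZ):
    """Phase 7: convert a semitone number back to a frequency in Hz."""
    return f_base * 10 ** (semitone * LOG10_PER_SEMITONE)


def semitone_standard_error(st_detailed, c_mean, se_c, se_t):
    """Propagate the standard errors of C and T to a normative semitone."""
    d_dc = -st_detailed / c_mean ** 2   # partial derivative with respect to C
    d_dt = 1.0                          # partial derivative with respect to T
    return math.sqrt((d_dc * se_c) ** 2 + (d_dt * se_t) ** 2)


# Values reported for the female teachers; the +/- values are assumed
# here to be standard errors.
T, SE_T = 10.5, 0.4
C, SE_C = 0.697, 0.015

lowest = to_normative_semitone(0.0, C, T)     # start of the normative VRP
highest = to_normative_semitone(20.0, C, T)   # end of the normative VRP
se_lowest = semitone_standard_error(0.0, C, SE_C, SE_T)
se_highest = semitone_standard_error(20.0, C, SE_C, SE_T)
```

Because the derivative with respect to C grows with the position on the detailed scale, the uncertainty on the final semitone is larger than on the initial one, where only the error on T contributes.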