Lecture 1: What we hear when we hear music

What is music? What is sound? What makes us find some sounds pleasant (like a guitar chord) and others unpleasant (a chainsaw)?

Sound is variation in air pressure. Pressure is a quantitative measurement of how much particles in the air are pressing on each other. The average air pressure at sea level is about 101.3 kPa, or 29.9 inches of mercury. Some people's ears are sensitive enough to detect relative changes in air pressure as low as 0.00000002%.

When you clap your hands or hit a surface, you send out a pressure wave---an increase in pressure---that radiates in all directions. A simplified picture of how the wave travels is that each particle presses more on those around it, and in turn those particles press on those around them. (The particles themselves are not displaced very far; it is the disturbance that travels through the medium. It also bounces off some surfaces, losing a certain percentage of energy depending on the surface.)

What makes the difference between noise and musical tones is the regularity of the variation in air pressure---that is, whether or not the variation of pressure repeats at regular intervals. If instead of clapping your hands you sing a note, you set up a repetitive variation in air pressure; often there are hundreds of these repetitions per second. How often the pressure goes up and down---that is, how many alternations of condensation and rarefaction occur per second---is the frequency of the sound. Frequency is measured in units called Hertz (Hz); for example, 300 Hz means 300 repetitions per second. Most young people can hear sounds down to about 20 Hz and up to about 20,000 Hz, but the upper end of this range comes down rapidly with age.

If you stand in one place and listen to a musical note, your ears and your brain are processing variations in air pressure. If we could attach

a tiny pressure meter to your eardrum, we'd see the needle on that meter going up and down. What your brain and ears analyze is this pattern of ups and downs over time; we can illustrate that pattern by graphing the pressure as a function of time. Even the simplest of musical sounds---one voice singing, or one flute playing---can produce complicated pressure graphs. Here are some recorded from various wind instruments:

[VariousSoundWaves.jpg]

Notice that, at least over the short period of time illustrated, the variation in pressure is periodic---that is, the graph repeats the same shape at regular intervals. The differences between the shapes that repeat in these pressure graphs make a difference in the timbre or tone quality of the sound, and correspond to the difference between the sound of a flute and the sound of a saxophone.

Loudness

How loud a sound seems to us depends on how much energy is being transmitted when the sound wave hits us, but it also depends on how our ears respond to that energy. More precisely, the intensity of a sound when it hits a surface (like your eardrum) is measured by how much energy is flowing per unit area per unit time. (So, the intensity I would be measured in units like watts per square meter.) But the ear (or rather, the nerve endings in the ear) doesn't respond linearly to changes in intensity. Suppose one person is singing a note; when another person joins in it seems louder, but when a third or fourth person joins in the increase in loudness isn't as noticeable. In fact, going from 1 person to 10 people singing is about the same increase in loudness as going from 10 people singing to 100 people singing. In other words, changes in loudness seem to be governed by how much the intensity is multiplied by, not how much is added. In order to model how our perception of loudness behaves in this way, we use a logarithm to measure loudness:

L = 10 log(I/I_0),

where log is the base-10 logarithm and I_0 is the intensity of the softest sound perceptible to human ears, a trillionth of a watt per square meter (10^-12 W/m^2). (By comparison, a 25 watt bulb has intensity about 2 watts per square meter, seen from 1 meter away.) So, with I measured in watts per square meter,

L = 120 + 10 log(I).

Loudness is measured in decibels (dB); when a sound is 10 times more intense, the loudness increases by 10 dB. For example, normal conversation is in the range of 50-60 dB, traffic is around 70 dB, and a subway train is around 90 dB. Most music ranges between 30 and 100 dB, with amplified rock music reaching 110 or 120 dB, the threshold of pain.

The intensity of a sound decreases as we move away from the source, and in fact is proportional to 1 over the distance squared. (This is called the inverse square law.) For example, if you move from a point x to a point y twice as far from the source of the sound, then I_y / I_x = 1/4, and the loudness will also decrease:

L_y - L_x = (120 + 10 log(I_y)) - (120 + 10 log(I_x)) = 10 log(I_y / I_x) = 10 log(1/4) = -6.02,

that is, by about 6 decibels. That's why you'll find me in the back row (if anywhere) at rock concerts.

Note that although we have a formula for loudness, loudness is really a matter of perception, and that perception depends not only on intensity but also on frequency. In fact, sounds with frequencies between about 800 and 8000 Hz will seem louder than sounds of the same intensity at other frequencies (see Figure 1.9, Benson).

Here comes the Math(ematica)

We've seen that what distinguishes musical notes from other sounds is the regular repetition of the same variations in pressure. For example, let's look at the simplest of all regularly repeating shapes, the sine wave: (pressure) y = a sin(b t), where t = time.
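To get a feel for this shape before analysing it, here is a minimal Mathematica sketch that plots y = a sin(b t). The values a = 1 and b = 2π are illustrative assumptions, not values taken from the lecture; with them the curve should trace out three identical waves over the three seconds plotted.

```mathematica
(* Illustrative values (assumptions): amplitude a = 1, rate b = 2 Pi *)
With[{a = 1, b = 2 Pi},
  (* Pressure variation y = a sin(b t) over 3 seconds *)
  Plot[a Sin[b t], {t, 0, 3},
    AxesLabel -> {"t (seconds)", "pressure"}]]
```

Changing a stretches the graph vertically, while changing b squeezes or stretches it horizontally.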

Suppose time is in seconds; then this shape repeats every 2π/b seconds. This is the period of the sound; the frequency is the reciprocal: f = b/(2π). So, a sine wave of frequency f has equation y = a sin(2π f t).

Example: Type into Mathematica the command

    Play[ Sin[2π 220 t], {t,0,3} ]

This plays, for 3 seconds, a sound whose pressure graph is a sine wave whose frequency is 220 cycles per second (i.e., 220 Hertz). Now, everyone sing that note; then, just the men; then, just the women. Notice that we're not all singing the same note! Those with higher voices will be singing a related note, given by

    Play[ Sin[2π 440 t], {t,0,3} ]

whose frequency is 440 Hertz. (This note may be familiar to you; it's the standard frequency for the A above middle C.)

The way we think of these two sounds as the same is called octave equivalence. In other words, two notes are octave equivalent if their frequencies differ by a factor of 2 (or, more generally, by an integer power of 2). So, an octave is an interval---that is, a ratio between frequencies---that corresponds to doubling or halving the frequency. These changes are called going up an octave and going down an octave, respectively. For example, the 440 Hertz tone is an octave above the starting tone (220 Hertz), and the tone produced by

    Play[ Sin[2π 110 t], {t,0,3} ]

would be one octave below the starting tone, while that produced by

    Play[ Sin[2π 880 t], {t,0,3} ]

is two octaves above the starting tone. Sounds produced by sine waves are called "pure tones".
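The octave relationships among these pure tones can be tabulated directly. The following is a small sketch, with the starting frequency of 220 Hz taken from the examples above and the names f0 and octaves chosen here purely for illustration; it lists the tones from one octave below to two octaves above the starting tone and builds a one-second pure tone for each.

```mathematica
(* Starting tone from the examples above *)
f0 = 220;

(* Going up k octaves multiplies the frequency by 2^k;
   k = -1, 0, 1, 2 gives 110, 220, 440 and 880 Hz *)
octaves = f0 * 2^Range[-1, 2]

(* A list of 1-second pure tones, one per frequency *)
Table[Play[Sin[2 Pi f t], {t, 0, 1}], {f, octaves}]
```

Listening to the four sounds in order is one way to hear that each doubling of frequency gives "the same note, higher".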

There's a certain boring quality to these sounds; we get more interesting sounds when we add together sine waves of different frequencies. First, two pure tones an octave apart:

    Play[.3 Sin[2π 220 t]+.7 Sin[2π 440 t],{t,0,2}]

We can adjust how much of each tone goes into the sound by changing the weight each sine function gets in the sum; this is 70% upper, 30% lower. We can even adjust the weighting as the sound plays:

    Play[ t Sin[2π 220 t] + (4 - t) Sin[2π 440 t], {t, 0, 4}]

But if we bring the frequency of the upper tone down toward the lower tone, we get something ugly. As the frequencies approach each other, we start getting a weird throbbing sound:

    Play[Sin[2π 220 t] + Sin[2π 226 t], {t, 0, 2}]

[plot]

These are known as "beats"; here's how they arise. The sum of two sine functions can be expressed as a sine times a cosine. This is because of the angle sum formulas

    sin(A+B) = sin(A) cos(B) + cos(A) sin(B)
    sin(A-B) = sin(A) cos(B) - cos(A) sin(B)

which add to give

    sin(A+B) + sin(A-B) = 2 sin(A) cos(B).

In this case, if A+B = 2π 226 t and A-B = 2π 220 t, then A = 2π 223 t and B = 2π 3 t. So, the result is 2 cos(2π 3t) sin(2π 223t), a sine wave of frequency 223 Hertz (halfway between the frequencies of the two waves we added) but whose amplitude is 2 cos(2π 3t). This cosine completes 3 cycles per second, but the loudness depends only on its size, which peaks twice in each cycle (once at +2 and once at -2), so the sound throbs 6 times per second. So, the frequency of the beats is exactly the difference between the two frequencies we're combining.

Piano tuners take advantage of the phenomenon of beats when they need to tune two piano strings to the same note; they keep on hitting the key that causes both strings to be struck, and adjusting the tuning pegs until the beating stops.

Why sine waves?

When you hear a sound, the sound wave sets a bunch of tiny strings in your ear vibrating. The way a string vibrates is

similar to the way a spring oscillates up and down. In both cases, the way in which the displacement y from rest position changes in time is modelled by a differential equation

    d^2y/dt^2 = -(k/m) y    (*)

Where this comes from: Newton's law F = m a says that the acceleration (which is d^2y/dt^2) is proportional to the force applied to the object. The force experienced by the spring is proportional to how far away from rest position it is (Hooke's law). This distance is given by y, but the force acts in the opposite direction, so that F = -k y for some positive constant k. So, F = m a translates to -k y = m d^2y/dt^2. Bringing the constants to one side gives (*).

The most general solution to this differential equation is

    y = a cos(sqrt(k/m) t) + b sin(sqrt(k/m) t)

for arbitrary a and b. (You can check this.) So, why don't we talk about cosine waves? This general solution can always be rewritten as just a sine wave shifted by adding a constant to t:

    a cos(sqrt(k/m) t) + b sin(sqrt(k/m) t) = c sin(sqrt(k/m) t + phi),

where c = sqrt(a^2+b^2) and phi is the angle with sin(phi) = a/c and cos(phi) = b/c (so phi = arctan(a/b) when b is positive). Here, c is the amplitude of the sine wave, its maximum displacement from rest; a larger amplitude means a more intense sound.

Another reason why we're interested in sine (and cosine) waves is that any more complicated musical tone of frequency f can be expressed as a sum of pure tones of frequencies f, 2f, 3f, and so on, giving a sum

    a1 cos(2π f t) + b1 sin(2π f t) + a2 cos(2π 2f t) + b2 sin(2π 2f t) + a3 cos(2π 3f t) + b3 sin(2π 3f t) + ...

(Technically, this sum could have infinitely many terms; it's called a Fourier series.) In other words, real-world sounds made by voices or musical instruments are composed of many sine waves added together, whose frequencies are all multiples of one lowest frequency (called the fundamental). The mixture of these frequencies determines the timbre of the note: the difference between the tone

of, say, a clarinet and a trumpet lies in how large or small the coefficients a1, b1, a2, b2, ... are. This way of assembling sounds from trig functions is known as Fourier synthesis.

The Fourier approach can be justified mathematically (so that the shape of any pressure wave can be approximated as well as you like by a Fourier series) but it's also justifiable on musical grounds. Research into how the ear and the brain operate together to process sounds has shown that the basilar membrane inside your inner ear is actually a machine for doing Fourier analysis---that is, calculating what the coefficients a_k are (see Benson, pages 9-11). Essentially, different spots on the membrane vibrate in response to disturbances at different frequencies, and the nerve endings at those spots report the frequency data to your brain. Because we have only finitely many nerve endings, tiny changes in frequency may not be perceptible; the smallest change most of us can perceive is about 0.3%, and that's for frequencies at the high end of our range of hearing.

Frequency versus Pitch

As our brains identify the various frequencies involved in a note, we assign a single frequency as the pitch of the note. Usually, this is the lowest frequency of all the sine and cosine terms that make up the Fourier series of the note. (This lowest frequency is known as the fundamental of the note, and the multiples of this frequency are the overtones.) However, sometimes our perception of pitch can be affected by amplitude: a low bass note seems lower if it is louder. Your brain can also be tricked by so-called auditory illusions, where it fills in the fundamental note even if it isn't there. For example, which of the following sounds seems to have the lower pitch?

    nu = 40;
    Play[ Sin[ nu (2 Pi (t +.25))] + Sin[2 nu (2 Pi t)] +.5 Sin[3 nu (2 Pi t)] +.25 Sin[5 nu (2 Pi t)], {t, 0, 4}, PlayRange -> {0, 2}]
    Play[ Sin[2 nu (2 Pi t)] +.5 Sin[3 nu (2 Pi t)]+.25 Sin[5 nu (2 Pi t)], {t,0,4}, PlayRange -> {0,2}]
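One way to compare the two sounds is to look at their pressure graphs. The sketch below reuses nu = 40 from the commands above; the names withFundamental and withoutFundamental are introduced here only for illustration. Since 80, 120 and 200 are all multiples of 40, both waveforms repeat every 1/40 of a second, even though the second contains no 40 Hz term.

```mathematica
nu = 40;

(* First sound: includes the 40 Hz term *)
withFundamental[t_] := Sin[nu (2 Pi (t + .25))] + Sin[2 nu (2 Pi t)] +
   .5 Sin[3 nu (2 Pi t)] + .25 Sin[5 nu (2 Pi t)];

(* Second sound: the same sum with the 40 Hz term removed *)
withoutFundamental[t_] := Sin[2 nu (2 Pi t)] + .5 Sin[3 nu (2 Pi t)] +
   .25 Sin[5 nu (2 Pi t)];

(* Greatest common divisor of the remaining frequencies 80, 120, 200 *)
GCD[2 nu, 3 nu, 5 nu]

(* Three periods of length 1/nu; both graphs repeat at the same interval *)
Plot[{withFundamental[t], withoutFundamental[t]}, {t, 0, 3/nu},
  PlotLegends -> {"with 40 Hz term", "without 40 Hz term"},
  AxesLabel -> {"t (seconds)", "pressure"}]
```

The shared repetition interval is the information your brain can use to fill in a fundamental that is not physically present.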

The second one sounds deeper, but the first one has a lower fundamental...

Slightly more about octave equivalence

Octave equivalence is an example of an abstract mathematical concept called an equivalence relation, which we'll study in more detail later. Octave equivalence can be applied to more than just single notes. We can also apply it to melodic lines and to chords (i.e., combinations of 3 or more notes). However, for melodic lines it only makes sense to apply it to the whole phrase, rather than just to selected notes.

Example: the first phrase of "Amazing Grace", then an octave higher, and then with random octave skips inserted.

For chords, no one would dispute that moving all the notes of the chord up or down an octave leaves the chord the same. If we selectively move some of the notes in the chord up or down an octave, this is called inverting the chord. (A bad term; a mathematician would call it permuting the chord, since it rotates the lowest note up to the top.) This operation doesn't change the set of pitch classes in the chord, but it changes the blend of sounds in the chord, and the harmonic function of the chord is also different. (E.g., the second inversion of a major triad is known as a 6-4 chord, and functions more like a dominant than the tonic.)

Assignment

Read Benson sections 1.1 through 1.5 and 1.8.
Read Harkleroad pages 1 through 10.
Homework problems: Benson p. 19 #1, 2 (hint: work backwards using formula (1.7.2)); p. 25 #1, 2 (hint: see formula (1.8.8)); and the following:
A. What frequency is (i) one octave below 300 Hz? (ii) two octaves above 500 Hz?
B. If you move three times as far away from a sound source, by how many decibels does its loudness decrease?