
Guide to Mixing v1.0
Nick Thomas
February 8, 2009

This document is a guide to the essential ideas of audio mixing, targeted specifically at computer-based producers. I am writing it because I haven't been able to find anything similar freely available on the Internet. The Internet has an incredible wealth of information on this subject, but it is scattered across a disorganized body of articles and tutorials of varying quality and reliability. My aim is to consolidate all of the most important information in one place, all of it verified and fact-checked.

This guide will not tell you about miking techniques, or how to track vocals, or what frequency to boost to make your guitars really kick. There's plenty of stuff written already on mixing live-band music. This guide is specifically for computer-based electronic musicians, and so it is tailored to their needs.

On the other hand, this guide does not assume that you are making club-oriented dance music. Certainly the advice in here is applicable to mixing electro house or hip-hop, but it is equally applicable to mixing ambient or IDM.[1] That said, dance music does pose special mixing challenges, such as the tuning of percussion tracks and the achievement of loudness, and these challenges are given adequate time, since they are relevant to many readers.

In this document, I assume only very basic prior knowledge of the concepts of mixing. You should know your way around your DAW. You should know what a mixer is, and what an effect is, and how to use them. You should probably have at least heard of equalization, compression, and reverb. You should have done some mixdowns for yourself, so that you have the flavor of how the whole process works. But that's really all you need to know at this point.

I do not claim to be an expert on any of this material. I have, however, had this guide peer-reviewed by a number of people, many of them more knowledgeable about mixing than I. Therefore, I think it's fair to say that at the very least it does not contain many gross inaccuracies. I thank them for their effort.

If you have questions, comments, or complaints of any kind about anything I've written here, please write to nhomas@gmail.com.

[1] Indeed, the advice in here is applicable to, though not sufficient for, mixing even live-band music. The defining characteristic of electronic music, other than being made with electronics, is that it has no defining characteristics. It can be anything, and so a guide to mixing electronic music has to be a guide to mixing anything.

Contents

1 Sounds
  1.1 Frequency Domain
  1.2 Patterns of Frequency Distribution
    1.2.1 Tones
    1.2.2 The Human Voice
    1.2.3 Drums
    1.2.4 Cymbals
  1.3 Time Domain
  1.4 Loudness Perception
  1.5 Digital Audio
    1.5.1 Clipping
    1.5.2 Sampling Resolution
    1.5.3 Dynamic Range
    1.5.4 Standard Sampling Resolutions
    1.5.5 Sampling Rate

2 Preparation
  2.1 Monitors
  2.2 Volume Setting
  2.3 Plugins
  2.4 Ears
  2.5 Sound Selection

3 Mixer Usage
  3.1 Leveling
    3.1.1 Input Gain
    3.1.2 Headroom
    3.1.3 Level Riding
  3.2 Effects and Routing
    3.2.1 Inserts
    3.2.2 Auxiliary Sends
    3.2.3 Busses
    3.2.4 Master Bus
    3.2.5 Advanced Routing

4 Equalization
  4.1 Purposes
    4.1.1 Avoiding Masking
    4.1.2 Changing Sound Character
  4.2 Using a Parametric Equalizer
    4.2.1 Setting the Frequency
    4.2.2 Setting the Q and Gain
    4.2.3 Evaluating Your Results
    4.2.4 High Shelf/Low Shelf Filters
    4.2.5 Highpass/Lowpass Filters
  4.3 Typical EQ Uses
    4.3.1 General
    4.3.2 Kick Drums
    4.3.3 Basslines
    4.3.4 Snare Drums
    4.3.5 Cymbals
    4.3.6 Instruments
    4.3.7 Vocals

5 Compression
  5.1 Purposes
    5.1.1 Reducing Dynamics
    5.1.2 Shaping Percussive Sounds
    5.1.3 Creating Pumping Effects
    5.1.4 When Not to Use Compression
  5.2 How It Works
    5.2.1 Threshold, Ratio, and Knee
    5.2.2 Attack and Release
    5.2.3 Compressor Parameters
  5.3 Procedure for Setup
  5.4 More Compression
    5.4.1 Limiters
    5.4.2 Serial Compression
    5.4.3 Parallel Compression
    5.4.4 Sidechain Compression
    5.4.5 Gates
    5.4.6 Expanders
    5.4.7 Shaping Percussive Sounds
    5.4.8 Creating Pumping Effects
    5.4.9 Multiband Compression

6 Space Manipulation
  6.1 Panning
  6.2 Stereo Sounds
    6.2.1 Phase Cancellation
    6.2.2 Left/Right Processing
    6.2.3 Mid/Side Processing
  6.3 Delays
  6.4 Reverb
    6.4.1 Purposes
    6.4.2 How It Works
    6.4.3 Convolution Reverb
    6.4.4 Mixing With Reverb

7 Conclusion
  7.1 Putting It All Together
  7.2 Final Thoughts

Chapter 1: Sounds

Before diving into the details of mixing, we need to look at some properties of sounds in general. This section is background information, but it is necessary to understand its contents in order to grasp a lot of the basic principles of mixing.

A sound is a pressure wave traveling through the air. Any action which puts air into motion will create a sound. Our auditory system systematically groups the pressure waves that hit our ears into distinct sounds for ease of processing, much as our vision groups the photons that hit our eyes into objects. But, just as our vision can divide visual objects into smaller objects (a person can be divided into arms, legs, a head, etc.), our brains can analytically divide sounds into smaller sounds (for instance, the spoken word "cat" can be divided into a consonant "k", a vowel "ahh", and another consonant "t"). Similarly, just as our vision can group collections of small objects into larger objects (a collection of persons becomes a "crowd"), our brains can group collections of sounds into larger sounds (a collection of handclaps becomes "applause").

1.1 Frequency Domain

If you continue to subdivide physical objects into smaller and smaller pieces, you will eventually arrive at atoms, which cannot be further subdivided. There is a similarly indivisible unit of sound, and that is the frequency. All sounds can ultimately be reduced to a bunch of frequencies. The difference is that, where an object may be composed of billions of atoms, a sound typically consists of no more than thousands of frequencies. So, frequencies are a very practical way of analyzing sounds in the everyday context of electronic music.

What is a frequency, anyway? A frequency is simply a sine-wave-shaped disturbance in the air; an oscillation, in other words. Frequencies are typically considered in terms of the rate at which they oscillate, measured in cycles per second (Hz). Science tells us that the human ear can hear frequencies in the approximate range of 20Hz to 20,000Hz, though many people seem to be able to hear somewhat further in both directions.

In any case, this range of 20Hz-20,000Hz comfortably encompasses all of the frequencies that we commonly deal with in our day-to-day lives.

Unsurprisingly, different frequencies sound different, and have different effects on the human psyche. There is a continuum of changing flavor as you go across the frequency range. 60Hz and 61Hz have more or less the same flavor, but by the time you get up to 200Hz, you are in quite different territory indeed. It is worth noting that we perceive frequencies logarithmically. In other words, the difference between 40Hz and 80Hz is comparable to the difference between 2,000Hz and 4,000Hz. This power-of-two difference is called an octave. Humans can hear a frequency range of approximately ten octaves.

I will now attempt to describe the various flavors of the different frequency ranges. As I do, bear in mind that words are highly inadequate for this job. First, because we do not have words to refer to the flavors of sounds, so I must simply attempt to describe them and hope that you get my drift. Second, because, as I have said previously, all of these flavors blend into each other; there are no sharp divisions between them.[1] With all that in mind, here we go.

20Hz-40Hz "subsonics": These frequencies, residing at the extremes of human hearing, are almost never found in music, because they require extremely high volume levels to be heard, particularly if there are other sounds playing at the same time. Even then, they are more felt than heard. Most speakers can't reproduce them. That said, subsonics can have very powerful mental and physical effects on people. Even if the listener isn't aware that they're being subjected to them, they can experience feelings of unease, nausea, and pressure on the chest. Subsonics can move air in and out of the lungs at a very rapid rate, which can lead to shortness of breath. At 18Hz, which is the resonant frequency of the eyeball, people can start hallucinating. It is suspected that frequencies in this range may be present at many allegedly haunted locales, since they create feelings of unease. Furthermore, frequencies around 18Hz may be responsible for many ghost sightings. Incidentally, many horror movies use subsonics to create feelings of fear and disorientation in the audience.

40Hz-100Hz "sub-bass": This relatively narrow frequency range marks the beginning of musical sound, and it is what most people think of when they think of bass. It accounts for the deep booms of hip-hop and the hefty power of a kick drum. These frequencies are a full-body experience, and carry the weight of the music. Music lacking in sub-bass will feel lean and wimpy. Music with an excess of sub-bass will feel bloated and bulky.[2]

100Hz-300Hz "bass": Still carrying a hint of the feeling of the sub-bass range, this frequency range evokes feelings of warmth and fullness. It is body, stability, and comfort. It is also the source of the impact of drums. An absence of these frequencies makes music feel cold and uneasy. An excess of these frequencies makes music feel muddy and indistinct.

[1] This also implies that the precise frequency ranges given for each flavor are highly inexact and really somewhat arbitrary.
[2] It is a common beginner mistake to mix with far too much sub-bass. To do so may produce a pleasing effect in the short term, but in the long term it will become apparent that the excess of sub-bass is hurting the music by destroying its sense of balance and making it tiring to listen to.

300Hz-1,000Hz "lower midrange": This frequency range is rather neutral in character. It serves to anchor and stabilize the other frequency ranges; without it, the music will feel pinched and unbalanced.

1,000Hz-8,000Hz "upper midrange": These frequencies attract attention. The human ear is quite sensitive in this range, and so it is likely to pay attention to whatever you put in it. These frequencies are presence, clarity, and punch. An absence of upper midrange makes music feel dull and lifeless. An excess of upper midrange makes music feel piercing, overbearing, and tiring.

8,000Hz-20,000Hz "treble": Another extreme in the human hearing range. These frequencies are detail, sparkle, and sizzle. An absence of treble makes music feel muffled and boring. An excess of treble makes music harsh and uncomfortable to listen to. These frequencies, by their presence or absence, make music exciting or relaxing. Music that is meant to be exciting, such as dance music, contains large amounts of treble; music that is meant to be relaxing contains low amounts of treble. As people age, they gradually lose their ability to hear frequencies in this range.

So now we understand the effects of individual frequencies on the human psyche. But sounds rarely consist of single frequencies; they are composed of multitudes of frequencies, and the way in which said frequencies are organized also has an effect on the human psyche.

When multiple frequencies occur simultaneously in the same frequency range, their conflicting wavelengths cause periodic oscillations in volume known as beating. Beating is more noticeable in lower frequencies than in higher frequencies. In the sub-bass range, any beating at all becomes quite dominating and often disturbing, while in the treble range, frequencies are typically quite densely packed to no ill effect.

Beating is also the underlying principle of the formation of musical chords. Combinations of tones which produce subtle beating are considered consonant, while combinations of tones which produce pronounced beating are considered dissonant. When considering chords in terms of beating, it is important to note that beating occurs not only between the fundamental frequencies of the tones involved, but also between their harmonics. Thus, for instance, while two individual frequencies a major ninth apart will not produce beating, two tones a major ninth apart will, because their harmonics will produce beating.

Beating also contributes to the character of many non-tonal sounds. For instance, the sound of a cymbal is partially due to the beating of the countless frequencies which it contains. Similarly, the thumpy sound of the body of an acoustic kick drum is partially due to the beating of bass frequencies.
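If you want to see beating in the numbers rather than just hear it, here is a minimal sketch. Python with NumPy is my own choice here, and the 200Hz/203Hz pair is an arbitrary illustration; summing two sine waves spaced 3Hz apart produces a combined signal whose volume swells and dips three times per second.

```python
import numpy as np

sr = 44100                              # samples per second
t = np.arange(2 * sr) / sr              # two seconds of time values

# Two sine waves only 3Hz apart, summed into one signal.
mix = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 203 * t)

# Measure the level in 50ms windows: it swells and dips about
# three times per second, i.e. at the 3Hz difference frequency.
win = int(0.05 * sr)
for k in range(20):
    chunk = mix[k * win:(k + 1) * win]
    rms = np.sqrt(np.mean(chunk ** 2))
    print(f"{k * 0.05:4.2f}s  level={rms:.2f}")
```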

1.2 Patterns of Frequency Distribution

Having considered in general the psychological effects of individual frequencies and combinations of frequencies, let us now examine the specific frequency distribution patterns of common sounds. Obviously, it would be impossible to describe the frequency distribution patterns of every possible sound. Indeed, every frequency distribution describes one sound or another. So, in this section, we will simply examine the frequency distribution patterns of the sounds most commonly found in music. We will only examine four categories of sounds, but they cover a surprisingly large amount of ground; with them, we will be able to account for the majority of sounds found in most music.

1.2.1 Tones

The simplest frequency organization structure is the tone. Tones are very common in nature, and our brains are specially built to perceive them. A tone is a series of frequencies arranged in a particular, mathematically simple, pattern. The lowest frequency in the tone is called the fundamental, and the frequencies above it are called harmonics. The first harmonic is twice the frequency of the fundamental; the second harmonic is three times the frequency; and so forth. This extension could theoretically go on to infinity, but because the harmonics of a tone typically fall steadily in volume with increasing frequency, in practice they peter out eventually.

The character of a particular tone, often called its timbre, is partially determined by the relative volumes of the harmonics; these differences are a big part of what differentiates a clarinet from a violin, for instance. The reedy, hollow tone of a clarinet is partially due to a heavier emphasis on the odd-numbered harmonics, while a violin tone gets its character from a more even distribution of harmonics. The bright tone of a trumpet is due to the high volume of its treble-range upper harmonics, while the mellower tone of a french horn has much more subdued upper harmonics.

Tones are the bread and butter of much music. All musical instruments, except for percussion instruments, primarily produce tones. Synthesizers also mostly produce tones.

1.2.2 The Human Voice

The human voice produces tones, and thus could justifiably be lumped into the previous section. But there is a lot more to it than that, and since the human voice is such an important class of sound, central to so much music, it is worth examining more closely.

The human voice can make a huge variety of sounds, but the most important sounds for music are those that are used in speech and singing: specifically, vowels and consonants. A vowel is a tone. The specific vowel that is intoned is defined by the relative volumes of the different harmonics; the difference between an "ehh" and an "ahh" is a matter of harmonic balance.
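As an aside, the "fundamental plus harmonics" structure, and the way harmonic balance shapes timbre, is easy to sketch numerically. This is only an illustration (NumPy, a 110Hz fundamental, and the 1/n volume curve are all arbitrary assumptions on my part, not anything the guide prescribes):

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
f0 = 110.0                                  # an arbitrary fundamental (A2)

def tone(gains):
    """Sum the fundamental and its overtones; gains[n-1] is the volume of n*f0."""
    out = sum(g * np.sin(2 * np.pi * n * f0 * t)
              for n, g in enumerate(gains, start=1))
    return out / np.max(np.abs(out))        # normalize the peak to 1.0

# Same pitch, different harmonic balances -> different timbres.
full_tone   = tone([1 / n for n in range(1, 13)])              # every multiple, falling volume
hollow_tone = tone([1 / n if n % 2 == 1 else 0.05
                    for n in range(1, 13)])                     # odd multiples emphasized, hollower sound
```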

In speech, vowel tones rarely stay on one pitch; they slide up and down. This is why speech does not sound tonal to us, though it technically is. Singing is conceptually the same as speaking, with the difference being that the vowels are held out at constant pitches.

A consonant is a short, non-tonal noise, such as "t", "s", "d", or "k". Consonants are found in the upper midrange. The fact that consonants carry most of the information content of human speech may well account for the ear and brain's bias towards the upper midrange.

So, we can see that the human voice, as it is used in speech and singing, is composed of two parts: tonal vowels, and non-tonal consonants. That said, the human voice is very versatile, and many of its possible modes of expression are not covered by these two categories of sound. Whispering, for instance, replaces the tones of vowels with breathy, non-tonal noise, with consonants produced in the normal manner. Furthermore, many of the noises that are made, for instance, by beatboxers, defy analysis in terms of vowels and consonants.

1.2.3 Drums

So far we have examined tones and the human voice. The human voice is quite tonal in nature, so in a certain sense we are still looking at tones. Now we will look at drum sounds, which, though not technically tones, are still somewhat tonal in nature.

A drum consists of a membrane of some sort stretched across a resonating body. It produces sound when the membrane is struck. A drum produces a complex sound, the bulk of which resides in the bass and the lower midrange. This lower component of the sound, which I call the body, does not technically fit the frequency arrangement of a tone, but usually bears a greater or lesser resemblance to such an arrangement, and thus the sound of a drum is somewhat tonal.

In addition to the body component of the sound, which is created by the vibration of the membrane, part of the sound of a drum is created by the impact between the membrane and the striking object. This part of the sound, which I will refer to as the beater sound, has energy across the frequency spectrum, but is usually centered in the upper midrange and the treble.

1.2.4 Cymbals

Now, having examined tones in general, the human voice, and drums, we come to the first (and only) completely non-tonal sounds that we will examine: cymbals. Cymbals are thin metal plates that are struck, like drums, with beaters. The vibrations of the struck plates create extremely complex patterns of frequencies, hence the non-tonal nature of cymbals.

Cymbals have energy throughout the entire frequency spectrum, but the bulk of said energy is typically in the treble range, or in the midrange in the case of large cymbals such as gongs. There is also reason to believe that cymbals have significant sonic energy above the range of human hearing, since their energy shows no signs of petering out near 20kHz.

In any case, because cymbals have so much treble energy, they are a very exciting type of sound.

1.3 Time Domain

Thus far we have analyzed sounds in terms of frequencies, and indeed this type of analysis, called frequency domain analysis, is a very useful way to analyze them. But there is another way to analyze sounds that is important to understand for the purposes of mixing, which is in terms of their waveforms. This type of waveform-based analysis is called time domain analysis.

Time domain analysis essentially means looking at a sound not in terms of the sine waves that make it up, but in terms of the patterns of disturbance that it causes in whatever medium it is traveling through: air molecules, a human eardrum, a speaker cone, or the electrical signal in an audio cable, for instance. The intensity of the disturbance that the sound causes at any given instant is called its amplitude. The character of a sound is determined by its pattern of changing amplitude; its waveform, in other words.

When you combine two sounds (i.e., play them simultaneously through the same medium), their time-domain disturbances are added together; the instantaneous amplitude of the resulting sound at any given time is a simple mathematical sum of the instantaneous amplitudes of the separate sounds. This is why the final stage of mixing (i.e., combining the separate mixer tracks into one master track) is sometimes called summing. It literally is just a matter of taking the sum of everything.

It is important to understand that any sound can be analyzed both in the frequency domain and the time domain. You can look at a sound as a collection of sine waves, or you can look at it as a pattern of disturbance in a medium. Both perspectives are useful for different things.

1.4 Loudness Perception

Since loudness is such an important topic in mixing, it seems appropriate at this point to talk about the perception of loudness in general.

Loudness is measured in decibels (dB). Decibels are a relative, logarithmic measurement.

Decibels are a logarithmic measurement in that amplitude increases exponentially with decibel value. Specifically, every 20dB increase or decrease corresponds to a factor-of-ten increase or decrease in amplitude (and a 6dB change corresponds roughly to a factor of two). In other words, increasing a sound's level by 20dB multiplies its amplitude by ten. Increasing it by 40dB multiplies its amplitude by a hundred. Decreasing it by 60dB divides its amplitude by a thousand. And so forth.

Decibels are a relative measurement in that a measurement of decibels does not tell you precisely how loud a sound is; it can only tell you how loud it is relative to some reference amount, usually designated as 0dB.
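The decibel arithmetic above is easy to check in a couple of lines. This is just an illustrative sketch (Python and its standard math module are my own assumption, not something the guide requires): dB = 20 times the base-10 logarithm of the amplitude ratio, and the ratio is recovered as 10 to the power of dB/20.

```python
import math

def ratio_to_db(amplitude_ratio):
    """Convert an amplitude ratio to decibels (20*log10 convention)."""
    return 20 * math.log10(amplitude_ratio)

def db_to_ratio(db):
    """Convert a decibel change back to an amplitude ratio."""
    return 10 ** (db / 20)

print(ratio_to_db(10))     # 20.0  -> multiplying amplitude by ten is a 20dB boost
print(ratio_to_db(2))      # ~6.02 -> doubling amplitude is about a 6dB boost
print(db_to_ratio(-60))    # 0.001 -> cutting 60dB divides amplitude by a thousand
```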


So, for instance, a level of 3dB is three decibels louder than the reference level, and a level of -3dB is three decibels quieter than the reference level.

When discussing real-world sounds traveling through the air, loudness is most often measured in dBSPL, or decibels of sound pressure level. This is a unit of measure based on the decibel, with the reference level of 0dBSPL being the quietest sound that is audible by a young adult with undamaged hearing.[3] The threshold of pain is generally placed around 120dBSPL. This range of 0dBSPL to 120dBSPL gives us the practical dynamic range[4] of human hearing. 80dBSPL is a good listening level for music.

Loudness can be measured in two ways: it can be measured in terms of peak loudness, or in terms of average loudness. Peak loudness measures the amplitude of the highest instantaneous peaks in the sound. Average loudness measures the overall average amplitude level, taking into account all of the loud peaks and the quiet in-between spaces.[5] Peak loudness is good to know because peaks that are too loud will often cause audio equipment to overload. Average loudness is good to know because it reflects, more accurately than peak loudness, the human ear's actual perception of loudness. The level meters on most audio mixers measure peak loudness.

Average loudness, when measured as described above, will still not be a terribly accurate measurement of human loudness perception. Loudness perception is complicated by the fact that the ear has a bias towards certain frequency ranges and away from others. The ear is most insensitive in the subsonic range, and becomes progressively more sensitive into the upper midrange, after which its sensitivity rapidly rolls off. The sensitivity also varies with volume, with the ear being less sensitive to bass and treble at lower volumes. The precise sensitivity curves are given in Figure 1.1.

Figure 1.1: Sensitivity of the human ear across the audible frequency range.

[3] Because human hearing sensitivity varies with frequency, this "quietest audible sound" metric is measured at a frequency of 1kHz, where human hearing is most sensitive.
[4] The dynamic range of a system is the ratio between the quietest sound it can handle and the loudest sound it can handle.
[5] Average loudness is essentially (1/T) times the integral from 0 to T of a(t)^2 dt, where a(t) is the instantaneous amplitude of the sound over time and T is the length of the time interval being measured.
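To make the peak/average distinction concrete, here is a small sketch of both measurements (the square root of the mean-square idea from footnote [5], the usual RMS convention). Python/NumPy and the test signal are my own invented illustration:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr

# An arbitrary test signal: a steady quiet 100Hz tone plus a short loud click.
signal = 0.3 * np.sin(2 * np.pi * 100 * t)
signal[1000:1010] = 0.95                       # ten-sample spike

peak = np.max(np.abs(signal))                  # peak loudness
rms = np.sqrt(np.mean(signal ** 2))            # average (RMS) loudness

# Levels expressed in dB below the maximum representable amplitude of 1.0.
print(f"peak: {peak:.2f}  ({20 * np.log10(peak):6.2f} dB)")
print(f"rms:  {rms:.3f}  ({20 * np.log10(rms):6.2f} dB)")
# The click dominates the peak reading but barely moves the average reading.
```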

1.5 Digital Audio

Thus far we have only looked at how sounds work in the real world; we've looked at sounds in the form of pressure waves in the air, and in the form of analog electrical signals. We have not yet looked at how sounds are represented in the computer, in their digital, numerical representation. Digital sound behaves in more or less the same way as real-world, analog sound, but there are still a number of special considerations that apply, so it is worth examining the basic ideas behind it.

The defining characteristic of any kind of digital data, be it text, pictures, or movies, is that it is made of a bunch of numbers. Numbers are all that computers know how to work with. When computers work with audio, the situation is no different: they must figure out how to take the continuous time-domain waveform of a sound and reduce it to a series of numbers. They accomplish this by sampling the waveform. What this means is that, when you record an audio signal into your computer, it captures it by measuring the instantaneous amplitude of the waveform at regular intervals. These individual measurements are called samples. This process of sampling turns the continuous, analog waveform into a numeric, digital approximation that looks a lot like a staircase. Figure 1.2 illustrates the effect.

Figure 1.2: Analog to digital conversion.

1.5.1 Clipping

The numeric value of a sample represents its amplitude. One of the limitations of digital systems is that they have a sharp, absolute limit on the maximum amplitude of the signals that can be represented; the computer will only count so high. Any amplitudes that are higher than the maximum countable amplitude will simply be clipped off, as shown in Figure 1.3.

Figure 1.3: Digital clipping.

As you might guess, digital clipping generally sounds quite bad, and it is to be avoided in most circumstances.[6] Whenever you are working with digital audio, you must make sure that it never exceeds the maximum digital amplitude.

[6] Digital clipping may, in certain circumstances and styles, be considered aesthetically desirable, but in the vast majority of cases it is considered an artifact.
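Here is a minimal sketch of what hard clipping does to the numbers themselves (again Python/NumPy, purely as an illustration; real converters and DAWs do the equivalent internally when a signal exceeds their range):

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr

# A sine wave that is "too loud": its peaks reach 1.5, but the
# representable range in this example is only -1.0 .. +1.0.
loud = 1.5 * np.sin(2 * np.pi * 440 * t)

clipped = np.clip(loud, -1.0, 1.0)   # everything beyond the limit is flattened off

print(loud.max(), clipped.max())     # ~1.5 vs 1.0: the top of the wave is sheared away
```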

1.5.2 Sampling Resolution

Besides clipping, the process of analog-to-digital conversion can have a number of other detrimental effects on the quality of audio. Furthermore, processing audio when it is in digital form can further degrade the quality, due to rounding errors in the numerical digital processing algorithms. There are two attributes of a digital audio system that determine its fidelity: sampling rate[7] and sampling resolution. If both of these attributes are sufficiently good, then digital recording and processing will create little or no audible degradation of the sound quality.

The sampling resolution of a system is the numeric accuracy of the individual samples. The more possible numeric values for a sample, the higher the sampling resolution is. Because computers work in binary, sampling resolution is typically described in terms of bits. A 4-bit digital system has 16 possible numeric values for each sample.[8] An 8-bit system has 256 possible values. A 16-bit system has 65,536 possible values, and a 24-bit system has 16,777,216 possible values. In general, an n-bit system has 2^n possible numeric values for each sample.

A low sampling resolution will degrade the quality of the audio by introducing quantization noise. Quantization noise is the audible artifact that results from the rounding errors inherent in analog-to-digital conversion, as seen in Figure 1.2. It usually[9] manifests in the form of a low-volume hissing sound, somewhat similar to the sound heard in quiet sections on analog tapes and vinyl. This sound will mask subtle details in the sound and make sufficiently quiet sounds inaudible.

1.5.3 Dynamic Range

The higher the bit resolution of a digital system is, the quieter the quantization noise is. The level of the quantization noise is what determines the system's total dynamic range; that is, the ratio between the quietest possible sound and the loudest possible sound. The quietest possible sound is restricted by the level of the quantization noise, and the loudest possible sound is restricted by the threshold for clipping.

A digital system has a dynamic range of 6dB times the bit resolution. In other words, each bit of sampling resolution adds roughly 6dB of dynamic range. Thus, the dynamic range of a 16-bit system is about 96dB, and the dynamic range of a 24-bit system is about 144dB, larger than the dynamic range of human hearing.

Volume levels in the digital world are measured in full-scale decibels, or dBFS. The digital full-scale measurement system measures peak volume, not average volume. The 0dB reference point is set at the highest representable amplitude; in other words, 0dBFS is the loudness of the loudest possible sound. All other volume levels are negative; a sound with a level of -6dBFS has a peak level 6dB below the digital maximum, for instance.

[7] See Section 1.5.5 for a discussion of sampling rates.
[8] Figure 1.2 shows 4-bit sampling.
[9] With particularly simple signals, particularly quiet signals, and particularly low sampling resolutions, the quantization noise may manifest quite differently, and usually in a more disturbing way.
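The roughly-6dB-per-bit rule can be seen directly by quantizing a signal and comparing the rounding error to the signal itself. The sketch below is illustrative only (Python/NumPy, an idealized rounding quantizer), not a model of how any particular converter works:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)          # a full-scale test tone

def quantize(x, bits):
    """Round each sample to the nearest of ~2**bits evenly spaced levels."""
    levels = 2 ** (bits - 1)                  # step size is 1/levels
    return np.round(x * levels) / levels

for bits in (4, 8, 16):
    error = signal - quantize(signal, bits)   # the quantization noise
    snr_db = 20 * np.log10(np.sqrt(np.mean(signal ** 2)) /
                           np.sqrt(np.mean(error ** 2)))
    print(f"{bits:2d}-bit: signal-to-noise ratio ~ {snr_db:5.1f} dB")
# Each extra bit buys roughly 6dB; 16-bit lands near the ~96dB figure above.
```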

1.5.4 Standard Sampling Resolutions

There are two commonly used sampling resolutions: 16-bit and 24-bit. 16-bit is the resolution of audio CDs and most MP3s. It is typically used for the distribution of mixed-down music. Its dynamic range is sufficient for the vast majority of music.

In the actual mixing process, it is preferable to use 24-bit. 24-bit has more dynamic range than 16-bit. While the difference doesn't matter much for finished mixdowns, it can make a difference in the mixing process, because the extra dynamic range gives some slop room, allowing the rounding errors introduced by digital processing to occur without significant audible effects.

Some DAWs also have a 32-bit resolution. This usually refers to the so-called floating-point representation of digital audio, as opposed to the usual fixed-point representation, which is what we have discussed so far. 32-bit floating point and 24-bit fixed point are, in a certain sense, the same thing. Without going into the technical differences between the two, 32-bit floating-point audio has the same dynamic range as 24-bit fixed-point audio, with the added advantage that audio above the 0dBFS threshold will not clip. Instead, the computer will effectively take bits from the bottom and add them to the top. This raises the quantization noise, but also raises the maximum representable amplitude, resulting in a net effect of the same amount of dynamic range.

It is generally not a good idea to take advantage of floating point's ability to exceed the 0dBFS ceiling, because even in DAWs that fully support floating point, many plugins will convert their input audio to fixed point internally; when they do this, the audio will clip. So, even if you are working in floating point, it is best to act as if you were not, and keep all levels below 0dBFS at all times.

1.5.5 Sampling Rate

The sampling rate of a digital system is the number of samples per second that it uses to represent the audio. For instance, audio CDs use 44,100 samples per second. Sampling rates are measured in hertz (Hz), just like frequencies. Thus, the audio CD sampling rate might be written as 44,100Hz, or 44.1kHz.

Intuitively, you might expect that a higher sampling rate would yield higher-quality audio, and this intuition is correct. Specifically, sampling rate affects the frequency response of the digital system; that is, the range of frequencies that it can represent.

Digital systems have no minimum representable frequency; they can go all the way down to 0Hz. They do, however, have a maximum representable frequency, and it is determined by the sampling rate. Specifically, the maximum representable frequency is half of the sampling rate. Thus, with a sampling rate of 44.1kHz, the maximum representable frequency is 22.05kHz. This maximum frequency is referred to as the Nyquist frequency.

The most common sampling rates are 44.1kHz, 48kHz, 96kHz, and 192kHz. The lowest of these, 44.1kHz, is typically used for distributing finished mixes. Since this sampling rate can represent all audible frequencies, you might wonder why anyone would ever use a higher sampling rate. The answer is that, besides allowing higher frequencies to be represented, higher sampling rates can also make certain audio processes sound better, with fewer sonic artifacts. Such processes include equalization[10] and compression[11], certain aspects of synthesis, such as filtering and waveform synthesis, and certain aspects of sampling, such as repitching.

The drawback of higher sampling rates is that they imply higher CPU usage. For instance, going from 48kHz to 96kHz, you can expect most processes to use twice as much CPU, because they are processing twice as many samples in the same amount of time.

[10] See Section 4.
[11] See Section 5.
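As an aside, the "half the sampling rate" limit is easy to demonstrate numerically: a frequency above Nyquist doesn't simply disappear, it folds back down and masquerades as a lower one (aliasing). A small illustrative sketch, with Python/NumPy as an assumed tool:

```python
import numpy as np

sr = 44100                      # sampling rate
nyquist = sr / 2                # 22050Hz: highest representable frequency
n = np.arange(sr)               # one second of sample indices

f = 30000                       # a tone above Nyquist
sampled = np.sin(2 * np.pi * f * n / sr)

# The samples of a 30000Hz tone are indistinguishable from those of a
# (44100 - 30000) = 14100Hz tone: the energy has "aliased" downwards.
alias = np.sin(2 * np.pi * (sr - f) * n / sr)
print(np.allclose(sampled, -alias))   # True: same waveform, opposite sign
```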

Chapter 2: Preparation

In this section we will look at some things that you need to think about before you set out to mix a track.

2.1 Monitors

First and foremost, you will have a devil of a time trying to mix your track if you can't hear it properly. You will want a good output device. Speakers are preferable to headphones, because they give a better picture of the stereo image of the music. After acquiring a good pair of speakers, you will need to spend some time and money fine-tuning your room acoustics for ideal monitoring. Headphones are cheaper than speakers, and require no tuning of room acoustics to perform well. Even if you own a good pair of speakers, you will still want to check your mix on headphones, because they can allow you to hear certain fine details in the music that would not show up otherwise.

A fantastic monitoring system is not necessary for producing fantastic mixes, but it makes things easier. The worse your monitoring system is, the harder it will be to get good results, but it will always be possible.

2.2 Volume Setting

In order to get the best results out of your monitoring equipment, you will need to make sure that you're monitoring at a good volume. A good volume is not too quiet and not too loud. In general, it's best to err on the side of too quiet. There are many reasons to use moderation in your volume setting:

- If your volume is too loud, then your ears will quickly become fatigued, and you will lose your ability to make accurate judgments about the mix.

- If your volume is too quiet, then you will not be able to hear fine details in the music, and this will also impair your ability to make accurate judgments about the mix.

- Your ear's frequency response changes with volume. Louder music will seem to have more bass and treble. Thus, if you monitor too loudly, then you will mix your music with too little bass and treble, and if you monitor too quietly, then you will mix your music with too much bass and treble.

When working on drums and percussion tracks, and anything that needs to be really kicking and punchy, I would recommend working at a somewhat lower volume than you would for normal mixdown tasks. If you do this, you will probably end up with a punchier result. If you can make your drums sound punchy at a low volume, then they'll sound really punchy when you turn them up. On the other hand, getting your drums to sound punchy at a high volume is no challenge, and the results won't always translate to lower volumes.

2.3 Plugins

Another prerequisite to getting a really good mix is ensuring that your DAW[1] is equipped with good plugins. Not all plugins are made equal, and you need to make sure that you're using good ones. Some DAWs come bundled with usable plugins, but other DAWs do not. You need to know which camp your DAW falls into, and if it falls into the latter category, you need to get some good third-party plugins.

At the very least, you need to make sure that you have a really good equalizer, compressor, and reverb plugin. It's also worthwhile to have some analyzer plugins: specifically, a spectrum analyzer and a waveform viewer.[2] A spectrum analyzer allows you to see the frequency-domain characteristics of your sounds, and a waveform viewer allows you to see the time-domain characteristics of your sounds.

2.4 Ears

Your most important piece of gear, of course, is your ears. Develop a relationship with your ears that is based on trust and love.

Try to keep them in good shape. Don't abuse them with excessively loud sounds. That's the love part.

The trust part is this. You will not be able to successfully mix music unless you can have confidence in the things your ears tell you. You have to be able to take the attitude that "if it sounds good, it is good." All of the advice you read can guide you in your mixing, but every decision ultimately has to be an ear-based decision.

[1] Digital Audio Workstation, or DAW, is jargon for any music-making program, such as Ableton Live, Cubase, Pro Tools, or FL Studio.
[2] Smartelectronix's s(m)exoscope is an excellent free waveform viewer.

2.5 Sound Selection

This is the one thing that will make or break your mix. You have to make sure that you have selected sounds that will naturally fit well together. Essentially, you have to pick out your sounds and compose your track such that you minimize masking and fill out the frequency spectrum nicely, striking a balance between fullness and clarity. For more details on masking, see Section 4.1.1.

You will not get a good mix if you do not have good sound selection. Period. Mixing techniques can make your sounds work better together. They cannot make your sounds work together if they do not basically work together to begin with.

Chapter 3: Mixer Usage

Having spent some time working on prerequisites, we will now move into issues directly related to mixing. The most important tool for mixing is the mixer. Most DAWs today include mixers as a built-in basic feature. These mixers are traditionally modeled after analog hardware mixers, and share a lot of the same principles of operation. This guide assumes that you are using a software-based DAW mixer.

A mixer consists of a series of channel strips. Each of these channel strips will correspond to one of the sounds in your mix: a virtual instrument, a drum kit, or a recorded vocal performance, for instance. Each channel strip contains a variety of tools to manipulate the sound going into it. The purpose of the mixer is to perform these manipulations, and then mix together the sounds coming from each channel strip, creating one audio signal that is the sum (both in the intuitive and mathematical sense) of all of the separate audio signals.

3.1 Leveling

Each channel strip will prominently feature a level fader which controls the volume of the sound going into it (usually calibrated in terms of dBFS). The level faders are the most basic tool for balancing mixes. The process of adjusting the level faders to achieve a satisfactory balance is called leveling. This seems like a fairly easy thing to do, but it is surprisingly easy to get it wrong.

Leveling is easy to get wrong partially because it's so easy to overthink it. The more you think about the levels, the more your perception becomes distorted, and the more likely you are to get things wrong. Leveling is really pretty easy if you approach it the right way. In general, if you have a good sound selection, then all of your sounds will be audible in any case, and tiny differences in level should not be of great importance. So leveling is just a matter of getting everything approximately right without losing perspective.

The main guiding principle of leveling is that you should make the most important parts of your music the loudest. If you're writing dance music, you probably want the drums and the bassline loudest, or whichever sounds are carrying the main groove.

If you're writing pop music, you probably want the vocal line to be the loudest. If you're writing more left-of-field music, then you need to do some soul-searching and figure out which parts are the most important. Perhaps all of the parts are equally important, and you should level to achieve an even, unbiased presentation.

There are two general ways to approach leveling. The first approach is to just level as you go. This approach generally works fine in my experience, as long as you don't put too much thought into it. But if at any point you're not feeling satisfied with your levels, and you want to completely re-do them, there is a simple procedure for doing so.

To set your levels from scratch, start by dragging all of your faders down to zero. Then bring them up one by one, but put some thought into the order in which you bring them up. Generally speaking, you should bring them up in order of importance, so that the most important (and loudest) parts come up first. This way you ensure a successful balance between the core elements of your track before considering the less important elements.

3.1.1 Input Gain

Many mixers offer an input gain control, which allows you to adjust the volume of the input to a channel strip before any other processing occurs. This input gain control is useful for getting sounds that are far too loud or far too quiet "in the ballpark," so to speak, so that the level faders aren't shoved off into the extreme ends of their ranges.

3.1.2 Headroom

One important topic that we have yet to address is that of headroom. It is important when you are mixing to leave a certain amount of headroom; in other words, to not allow the level of your mix to exceed a certain peak loudness. For instance, if your mix never goes louder than -5dBFS, you would say that you have 5dB of headroom. There are two reasons to leave headroom in this manner: first, to avoid digital clipping with levels greater than 0dBFS, and second, to leave some space to perform mastering or finalizing processes (see Section 7.1).

How much headroom you need to leave is an open question, but in general, when working in 24-bit audio, it is better to err on the side of too much than on the side of too little. Anywhere between 3dB and 20dB of headroom should be fine. 6dB is a pretty good amount for music with a modest dynamic range, such as pop music or electronic dance music. For music with a wide dynamic range, you will want more headroom, to leave space for any unexpectedly large peaks.

In order to create a given amount of headroom, you will need to set your individual mixer tracks so that their levels are somewhat below the desired amount of headroom. If you want to leave 6dB of headroom, then you might set your loudest mixer tracks so that their levels do not exceed -9dBFS. Of course, this is only a starting point, and depending on the nature of the interactions between your mixer tracks, it may not work for your mix.
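If you ever want to check headroom numerically rather than by watching meters, the calculation is just the peak level in dBFS. A hypothetical sketch (Python/NumPy; the mixdown array simply stands in for whatever audio you have rendered):

```python
import numpy as np

def headroom_db(mixdown):
    """Headroom = how far the loudest peak sits below 0dBFS (full scale = 1.0)."""
    peak = np.max(np.abs(mixdown))
    peak_dbfs = 20 * np.log10(peak)    # negative for any signal below full scale
    return -peak_dbfs

# Example: a mix whose loudest peak reaches 0.56 of full scale
# has roughly 5dB of headroom.
mixdown = 0.56 * np.sin(2 * np.pi * 220 * np.arange(44100) / 44100)
print(round(headroom_db(mixdown), 1))   # ~5.0
```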

Naturally, your music will be quieter if it has a lot of headroom. Do not remove headroom because your music is too quiet; just turn up your monitoring volume. You will want to remove most or all of the headroom before you send your mix out into the world, but now is not the time to do that. You should only do so as one of the very last steps in the mixing process. See Section 7.1 for details.

3.1.3 Level Riding

One last thing to consider when leveling is the concept of level riding. If you ride your levels, then what that means is that, rather than having your level faders always stay at a fixed position, they move up and down over the course of the track to shape the dynamics and the balance of the music. In my experience, level riding is very useful and important for music with a wide dynamic range. It is usually unnecessary with less dynamic music, such as electronic dance music.

3.2 Effects and Routing

You can go pretty far using a mixer just to combine your various channel strips at different levels, but mixers can do so much more. As previously mentioned, channel strips have a variety of controls to manipulate the sounds going into them. These controls vary somewhat from mixer to mixer. You can be quite certain that you'll have a pan control (discussed in Section 6.1). You might also have a built-in equalizer; equalizers in general are discussed in Section 4.

3.2.1 Inserts

One universally available feature is that of inserts. An insert allows you to use an effect plugin to process the sound going through the channel strip. This opens up a world of possibilities, and the bulk of the remainder of this mixing guide is concerned with the usage of various insert effects. Popular insert effects include: equalizers (Section 4), compressors (Section 5), limiters (Section 5.4.1), gates (Section 5.4.5), delays (Section 6.3), stereo effects (Section 6.2), and distortion, chorus, flangers, phasers, filters, ring modulators, vocoders, pitch shifters, exciters, harmonizers, auto-tuners, and FSU plugins (not discussed).[1]

[1] Most of the insert effects that are not discussed are not discussed because they are used to create dramatic changes in sound, rather than subtle sonic enhancements, and therefore fall somewhat outside the scope of a guide to mixing.

3.2.2 Auxiliary Sends

Inserts are not the only way to make use of effect plugins. There is another method, known as auxiliary sends, or aux sends, which is useful in a slightly different set of situations. Insert effects are useful when you want to use an effect to process the sound of one channel. Aux sends are useful when you want to send several otherwise unrelated channels through an effect, or to blend a processed version of a channel with the normal, unprocessed version.

When you add an aux send to your project, every channel strip will have a volume control corresponding to that aux send. That volume control, if turned up, will allow you to send varying amounts of each channel to the aux send. The audio thus sent to the aux send will be processed through the effect and added to the mix. Auxiliary sends are, in mixing, most often used for reverb (Section 6.4) and delays (Section 6.3). They are also useful for performing parallel compression (Section 5.4.3).

Most DAWs provide two kinds of aux send: pre-fader and post-fader. These two types differ in their relationship to the main level fader of the channel. A pre-fader send happens before the fader, and a post-fader send happens after the fader. The practical effect of this is that changes in the level fader will not affect the send level of a pre-fader send, but they will affect the send level of a post-fader send. There are a variety of reasons to choose either, and it's best to make this decision on a case-by-case basis.

3.2.3 Busses

Normally channel strips take their audio input from some source elsewhere in the DAW: a software synthesizer, a track of recorded audio, etc. But channel strips can also take their input from other channel strips. A channel whose input consists of multiple other channels is sometimes called a bus or a group channel.

Busses are very useful. Essentially, what they allow you to do is to manipulate several channels as one. You can process them with the same effects, and you can control their levels as a unit, using the level fader on the bus.

A common use of busses is on drum kits. Suppose that you have a drum kit with a separate channel for each drum sound: kick, snare, three toms, and four cymbals. You could then make a bus called "drums", and route all of the drum sounds into that bus, so that they could be controlled as a unit.

You can also have hierarchies of bus groupings: channels that are grouped into busses, which are themselves grouped into busses. A refinement of the previous drum kit example would be to first create a "toms" bus and route all of the toms to it, and then a "cymbals" bus to which all of the cymbals are routed. Then your drum kit would be described by four channels: kick, snare, the toms bus, and the cymbals bus. You could then route all four to one big "drums" bus as before.

3.2.4 Master Bus

There is one special bus which is present in every mix, called the master bus. The master bus is the bus that everything else goes through: it's the final destination of all the audio. You can use the master bus to apply insert effects to the mix as a whole.

In general, you should leave the level fader on the master bus set to 0dBFS. In the context of a normal mixdown, there is no good reason to adjust it. There are a number of reasons you might want to adjust it, but in all cases there are better ways to do the same thing:

1. You might turn it up or down to adjust your monitoring level. Instead, you should adjust the volume using a hardware or software volume control outside your DAW.

2. You might turn it up to remove headroom at the end of the mixing process. Instead, you should use a limiter; see Section 7.1.

3. You might turn it down to add headroom. Instead, you should turn down all of the tracks going to the master bus by an equal amount, or turn down the input gain on the master bus, because if you add headroom by adjusting the master level fader, then the headroom adjustment will occur after any insert effects on the master bus, which is not desirable.

3.2.5 Advanced Routing

Many DAWs allow even more sophisticated signal flow ("routing") possibilities than the ones described above. For instance, it is often possible to send the output of a channel strip to multiple other channel strips.[2] Some DAWs have "anything to anywhere" routing, which means that you can send the output of any channel strip into any other channel strip with no restrictions, creating signal flow paths of arbitrary complexity.

[2] This is useful for performing techniques such as parallel compression (Section 5.4.3).
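To tie the chapter together, here is a toy model of what a mixer's summing and bus routing amount to numerically. Everything in it (the stand-in tracks, the gain values, the bus structure) is an invented illustration for this sketch, not a feature of any particular DAW:

```python
import numpy as np

def db_to_gain(db):
    """Convert a fader setting in dB to a linear gain multiplier."""
    return 10 ** (db / 20)

def sum_bus(channels, fader_db=0.0):
    """A bus: sum its input channels, then apply its own level fader."""
    return db_to_gain(fader_db) * sum(channels)

sr = 44100
t = np.arange(sr) / sr

# Stand-in "tracks" (in a real project these would come from synths or audio clips).
kick    = 0.8 * np.sin(2 * np.pi * 60 * t) * (np.sin(2 * np.pi * 2 * t) > 0)
snare   = 0.3 * np.random.randn(sr) * (np.sin(2 * np.pi * 1 * t) < 0)
cymbals = 0.1 * np.random.randn(sr)
bass    = 0.5 * np.sin(2 * np.pi * 110 * t)

# Hierarchical routing: a drums bus feeds the master bus alongside the bassline.
drums  = sum_bus([kick, snare, cymbals], fader_db=-3.0)
master = sum_bus([drums, bass], fader_db=0.0)

peak_dbfs = 20 * np.log10(np.max(np.abs(master)))
print(f"master peak: {peak_dbfs:.1f} dBFS")   # headroom = distance below 0dBFS
```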