Rosenthal G.G. (2010) Playbacks in Behavioral Experiments. In: Breed M.D. and Moore J. (eds.) Encyclopedia of Animal Behavior, volume 2, pp. 745-749. Oxford: Academic Press.
Playbacks in Behavioral Experiments
G. G. Rosenthal, Texas A&M University, College Station, TX, USA
© 2010 Elsevier Ltd. All rights reserved.

Introduction

Defined in an influential volume by Peter McGregor as the technique of rebroadcasting natural or synthetic signals to animals and observing their response, playback is to animal behavior what the polymerase chain reaction is to molecular biology. From neuroethology, to behavior genetics, to animal cognition, it is difficult to conceive of any area of animal behavior where playback experiments have not made a major contribution. Playbacks provide an analytical approach to studying how animals respond to stimuli, in which an experimenter can quantitatively manipulate some signal components while holding others constant. Hunters and herders have long used artificial stimuli to manipulate animal behavior, and the mirrors, painted models, and chemical swabs used since the early days of ethology could well be considered playback stimuli. The term playback, however, is generally applied to the electronic presentation of temporally patterned stimuli in an experimental setting. Hundreds of studies across numerous taxonomic categories have used playback of visual and acoustic cues, while a smaller number have presented electrical and vibrational stimuli. While visual, vibrational, and electrical playbacks are mainly used in laboratory settings to study perception, cognition, and communication, acoustic playbacks are frequently performed in the field, where they can be used to census individuals, lure them to capture, or quantify territory size.

Acoustic Playback

Playback of acoustic signals dates back to the end of the nineteenth century, when the newly invented gramophone was used for audio playback of conspecific signals to rhesus macaques; subjects would thrust their arms into the gramophone's horn in search of the other monkey.
By the mid-twentieth century, inexpensive and portable equipment for recording, analysis, and broadcast made sound playback broadly accessible to researchers. Sound playback is ubiquitous in studies of vocal communication in birds and anurans, and has also been widely used in fish, mammals, and insects. Most studies have focused on signaling in terrestrial environments, but there is a growing body of work on acoustic playback to aquatic animals.

Stimulus Preparation

The most straightforward type of stimulus in an acoustic playback is simply a recording of a natural vocalization. Numerous experiments have compared responses of subjects to conspecific versus heterospecific calls, local versus foreign song dialects, or vocalizations of familiar versus unfamiliar individuals. Playback of unmanipulated recordings can also be used to obtain demographic information, for example by transect counts of the number of males that respond to conspecific vocalizations. Playback of recorded sounds can also be used, in the absence of a receiver, to measure attenuation and degradation of signals. This is typically done by recording sound at a series of standard distances from a speaker. The experimenter can compare how different vocalizations are affected within the same environment, or alternatively compare attenuation and degradation of the same signal across environments. Natural recordings are often edited before presentation. Until the late twentieth century, natural sounds were recorded on magnetic tape, filtered with a variety of analog electronic devices, and signal components were repeated, excised, or rearranged by physically manipulating the audiotape. Contemporary editing is done on digitized sounds, using sound-editing software such as ProTools or Signal. At a minimum, editing involves application of band-pass frequency filters to minimize background noise and remove extraneous sounds.
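As a concrete sketch of this last step, a band-pass filter can be applied to a digitized recording to suppress energy outside a call's frequency range. The example below is a minimal illustration rather than a recipe from this article: it uses a simple brick-wall FFT filter and a hypothetical 2 kHz call contaminated by low-frequency hum.

```python
import numpy as np

def bandpass_fft(signal, low_hz, high_hz, rate_hz):
    """Brick-wall band-pass filter: zero all spectral components outside
    [low_hz, high_hz], then transform back to the time domain."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate_hz)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

# Hypothetical cleanup: a 2 kHz "call" contaminated by 60 Hz hum.
rate = 44100
t = np.arange(0, 0.5, 1.0 / rate)
call = np.sin(2 * np.pi * 2000 * t)
hum = 0.8 * np.sin(2 * np.pi * 60 * t)
cleaned = bandpass_fft(call + hum, 1500, 2500, rate)
```

In practice a tapered filter (e.g., a Butterworth design) would be preferred to avoid ringing artifacts at the band edges, but the brick-wall version keeps the idea transparent.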
Different components of a signal can be cut and pasted, for example to increase or decrease the interpulse interval in a repeated call, or to evaluate the effect of song syntax (the structure of distinct vocal elements, or 'syllables') on receiver response. Harmonic components of a signal can be removed, amplified, or attenuated, and signal components can be selectively accelerated or decelerated. These kinds of manipulations have proved invaluable in analyzing how receivers attend to signals. For example, in the túngara frog (see entry) Physalaemus pustulosus, males produce a two-component sexual advertisement call comprising a tonal frequency sweep (a 'whine') and one or more broad-band, high-energy harmonic 'chucks'. By adding and removing whines and chucks, researchers were able to determine that the whine is both necessary and sufficient to elicit a female response; females respond positively to whines alone, but fail to attend to isolated chucks. However, adding chucks to a call and increasing chuck number both increased the attractiveness of this compound signal.
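Cut-and-paste manipulations of temporal structure, such as the interpulse-interval changes described above, can be sketched in a few lines. The pulse waveform here is a stand-in (a Hanning window), not a real call; in practice the pulse would be excised from a digitized recording.

```python
import numpy as np

def pulse_train(pulse, n_pulses, interval_s, rate_hz):
    """Concatenate copies of one recorded pulse separated by silence, so
    interpulse interval can be varied while pulse shape is held constant."""
    gap = np.zeros(int(interval_s * rate_hz))
    pieces = []
    for i in range(n_pulses):
        pieces.append(pulse)
        if i < n_pulses - 1:
            pieces.append(gap)
    return np.concatenate(pieces)

rate = 22050
pulse = np.hanning(256)                  # stand-in for one excised call pulse
fast = pulse_train(pulse, 5, 0.05, rate)  # short interpulse interval
slow = pulse_train(pulse, 5, 0.10, rate)  # long interpulse interval
```

The two resulting stimuli differ only in temporal patterning, which is exactly the property such experiments aim to isolate.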
Synthetic stimuli, where acoustic signals are generated based on specified parameters, offer the most control and flexibility over stimulus design. Sound synthesis allows experimenters to decouple specific variables and create hypothetical, mathematically specified stimuli that do not exist in nature. Although analog synthesizers are still widely used by musicians, sound synthesis in animal behavior is now performed digitally using a variety of software packages. There are two major classes of approaches to sound synthesis. Tonal synthesis represents sounds as sums of sinusoidal functions varying in frequency, amplitude, and phase. Sinusoidal functions can be convolved with any number of mathematical functions to produce, for example, signals that ramp up in amplitude over time or vary in pulse repetition rate. The parameters in tonal synthesis are all based on the physical properties of the sound itself, and are independent of the signaler. By contrast, physically based synthesis, which is less widely used in animal communication studies, reconstructs sounds based on a model of the sound production system; for example, linguists make extensive use of models of the human vocal production apparatus in generating speech sounds for playback studies. Synthesized sounds are widely used in neurophysiological studies, where they can be used to determine neural responses to specific acoustic parameters. Morphing one signal into another permits investigation of categorical perception. For example, acoustic intergrades between the 'ba' and 'pa' phonemes in human speech are always perceived by subjects as one or the other. A particularly powerful application of sound synthesis involves the ability to generate entirely novel stimuli.
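As a sketch of tonal synthesis as described above, the following generates a whine-like downward frequency sweep with an amplitude onset ramp. All parameter values are hypothetical, and the phase is obtained by integrating the instantaneous frequency.

```python
import numpy as np

def tonal_sweep(f_start, f_end, dur_s, rate_hz, ramp_s=0.05):
    """Synthesize a tonal frequency sweep (a whine-like stimulus) with a
    linear amplitude onset ramp; every parameter is explicit and
    signal-based, independent of any production mechanism."""
    t = np.arange(0, dur_s, 1.0 / rate_hz)
    # Instantaneous frequency changes linearly from f_start to f_end;
    # phase is its cumulative integral.
    freq = f_start + (f_end - f_start) * t / dur_s
    phase = 2 * np.pi * np.cumsum(freq) / rate_hz
    env = np.minimum(1.0, t / ramp_s)          # onset amplitude ramp
    return env * np.sin(phase)

whine = tonal_sweep(900, 400, 0.35, 44100)     # hypothetical parameters
```

Because every property is a named parameter, one variable (e.g., sweep rate) can be varied while all others are held exactly constant across stimuli.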
Several studies of túngara frogs, mentioned earlier, have assayed female mating preferences for the inferred calls of ancestral taxa, which are synthesized using acoustic parameters inferred by phylogenetic reconstruction.

Stimulus Presentation

The output device for audio playback is typically a commercially available loudspeaker, including underwater speakers for aquatic systems. The most important consideration is that the frequency response curve of the speaker be relatively flat over the range of frequencies and sound pressure levels being played back. For stimuli outside the range of human hearing (like infrasonic or near-infrasonic calls in elephants), speakers have to be specially modified. Ultrasonic playback (e.g., to bats or mice) requires specially designed electrostatic speakers. Playback of acoustic signals is typically sequential, with stimulus order varied and interstimulus intervals designed so as to minimize order effects. Simultaneous-choice experiments, in which calls are paired antiphonally on opposite sides of an arena, are particularly common in laboratory experiments on frogs and insects. Studies of acoustic localization or interference can have multiple sounds playing out of multiple audio channels into an array of speakers. Depending on the receiver, a variety of assays are used to measure receiver response. For example, males typically respond vocally to acoustic signals of other males, and these responses can often be quantified automatically with the appropriate software. In some species, receivers will exhibit specific postural changes in response to an acoustic signal (e.g., a copulation solicitation display involving raising of the tail in female birds). More general response measures include phonotaxis (approach to a speaker), habituation/dishabituation approaches (particularly useful in psychophysical assays of just-meaningful differences), and changes in locomotor activity (e.g., number of perch changes in birds).
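One simple way to vary stimulus order so as to minimize order effects, as mentioned above for sequential playback, is a rotation-based Latin-square scheme. This is a generic counterbalancing sketch, not a design prescribed by this article.

```python
def counterbalanced_orders(stimuli, n_subjects):
    """Rotate the stimulus list one position per subject (a simple
    Latin-square scheme), so every stimulus occupies every serial
    position equally often across subjects."""
    k = len(stimuli)
    return [stimuli[s % k:] + stimuli[:s % k] for s in range(n_subjects)]

# Hypothetical three-stimulus design run on six subjects.
orders = counterbalanced_orders(["consp", "hetero", "control"], 6)
```

With six subjects and three stimuli, each stimulus appears first, second, and third exactly twice, so position effects cannot be confounded with stimulus identity.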
Electrical and Vibrational Playback

Electrical signals are produced by specialized electric organs in gymnotiform and mormyriform fishes, and substrate-borne vibrational, or seismic, signals have been most extensively studied in hemipteran insects. Despite vast differences in signal production and transmission, these can be parsed into frequency, temporal, and amplitude spectra just as acoustic signals can. For electrical playback, recorded or synthetic signals are transmitted via an amplifier to paired electrodes, often at either end of a plastic pipe that serves as a shelter. Electrical signals emitted in response to stimuli can then be recorded by the experimenter. For vibrational signals, the amplifier is connected to an electromagnet which vibrates the substrate, typically a plant, and vibrational responses are recorded with an accelerometer. Playback of white noise allows the experimenter to determine the response function of the substrate, and accelerometer recordings of playback stimuli provide an assessment of signal fidelity.

Playback of Visual Stimuli

Presentation of moving images to research animals dates back to the 1960s. As with acoustic playback, rhesus macaques provided proof of concept of visual playback, in this case by attending preferentially to ciné stimuli over stills or mirror images. Most studies of visual playback have focused on presenting subjects with moving images of other animals (conspecifics, closely related heterospecifics, predators, or prey) performing behaviors of interest to the experimenter.
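Returning briefly to vibrational playback: the substrate response function mentioned there can be estimated from a white-noise playback as the per-frequency ratio of the accelerometer recording's spectrum to the input spectrum. The 'substrate' below is a hypothetical stand-in that simply halves the signal at all frequencies.

```python
import numpy as np

def response_function(played, recorded, rate_hz):
    """Estimate substrate gain per frequency bin as the ratio of the
    accelerometer recording's spectrum to the white-noise input's."""
    in_spec = np.fft.rfft(played)
    out_spec = np.fft.rfft(recorded)
    freqs = np.fft.rfftfreq(len(played), d=1.0 / rate_hz)
    # Guard against division by near-zero input bins.
    gain = np.abs(out_spec) / np.maximum(np.abs(in_spec), 1e-12)
    return freqs, gain

rng = np.random.default_rng(1)
noise = rng.standard_normal(4096)
recorded = 0.5 * noise               # hypothetical substrate: flat halving
freqs, gain = response_function(noise, recorded, 1000)
```

A real plant stem would show strong frequency-dependent peaks and notches in `gain`; averaging over repeated noise playbacks would reduce estimation variance.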
Stimulus Preparation

The ability to manipulate visual stimuli has gone hand in hand with advances in technology available to researchers; one early study, for example, evaluated the importance of temporal structure in Anolis displays by comparing the aggressive response of subjects to films played forwards and backwards. Analog video-editing techniques like chroma-keying (colloquially known as green-screening, where a selected color range can be changed or overlain with another video stimulus) allowed experimenters to standardize background features and isolate specific behaviors. Such approaches allowed a range of tests on how visual cues in isolation elicit a particular behavioral response. For example, roosters produce ground alarm calls in response to a video of a ground predator on an adjacent monitor, and aerial alarm calls in response to a video of a looming hawk on an overhead monitor. The ability to digitize video represented an important methodological advance. Numerous studies in the 1990s applied frame-by-frame manipulation of video sequences. This is a tedious process in which a video sequence is digitized, individual frames are imported into an image-editing program, and each frame is individually altered. Numerous studies used this approach to independently decouple stimulus behavior and morphology. At 30 frames per second for the NTSC analog video standard used in most of the Western Hemisphere and 25 frames per second for the PAL standard used in most other countries, this represents a time-intensive process even for brief sequences; and each procedure results in only a single-parameter manipulation in a single exemplar. Frame-manipulated video is prone to producing a number of artifacts, including spatial discontinuities and aliasing effects (visual distortions caused by undersampling) between a manipulated trait and background features.
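The frame-by-frame workflow described above can be illustrated schematically: a video is treated as an array of frames, and a single morphological trait (here a hypothetical brightened 'ornament' patch) is altered in every frame, one frame at a time.

```python
import numpy as np

def manipulate_frames(video, patch, scale):
    """Frame-by-frame manipulation sketch: brighten a fixed 'ornament'
    region in every frame, leaving the rest of each frame untouched."""
    out = video.copy()
    r0, r1, c0, c1 = patch
    for f in range(out.shape[0]):          # one edit per frame, mirroring
        out[f, r0:r1, c0:c1] *= scale      # the hand-editing workflow
    return out

# One second of hypothetical PAL-rate grayscale video: 25 frames, 48x64.
clip = np.full((25, 48, 64), 0.2)
edited = manipulate_frames(clip, (10, 20, 10, 20), 2.0)
```

At 25 or 30 frames per second, even a 10 s sequence means hundreds of individually edited frames, which is why this approach was so labor-intensive when done by hand.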
Moreover, it is difficult to manipulate two-dimensional projections of animals performing behaviors in three dimensions. Despite these concerns, frame manipulation can preserve much of the spatiotemporal complexity of an original video sequence while allowing broad flexibility in morphological manipulations. It is particularly appropriate for creating stimuli of short duration in which motion is confined to the plane of the screen. Most contemporary visual-playback studies use some form of synthetic computer animation, a familiar feature of popular films and television shows. Like synthetic acoustic, electrical, and vibrational signals, synthetic animations are mathematical descriptions of a set of features chosen by the experimenter. Visual signals are fundamentally distinct, however, in that while these other signals involve energy generated by the signaler, visual signals typically involve the manipulation of incident light by the signaler, for example sunlight reflecting off feathers or skin. Further, visual perception depends critically on stimulus contrast with background elements. Synthetic acoustic and other generated signals are typically presented in isolation, or occasionally coupled with masking noise or interfering cues. With synthetic animation, the experimenter needs to generate an entire visual scene, specifying the color, intensity, and spatiotemporal distribution of both the light regime and the visual background. Like synthesized sounds, synthetic animations are parameter-based. This makes it possible to measure features of interest on animals (like the relative size of morphological ornaments) and in their habitats (like the temporal frequency distribution of moving background vegetation) and then apply these to a synthetic model. This also allows the experimenter to generate morphologies and behaviors outside the range of natural variation; for example, chimeric individuals bearing traits of more than one species, or inferred ancestral states.
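One way to apply measured data to a synthetic model, in the spirit of the parameter-based approach described above, is to draw stimulus parameters from the measured natural distribution. The ornament sizes below are invented for illustration; extrapolation beyond the natural range corresponds to the out-of-range morphologies just mentioned.

```python
import random

def stimulus_parameters(measured_sizes, n_stimuli, extrapolate=1.0):
    """Draw ornament-size parameters for synthetic stimuli around the
    measured natural mean; extrapolate > 1 widens the sampled range
    beyond natural variation (hypothetical scheme)."""
    mean = sum(measured_sizes) / len(measured_sizes)
    half_range = (max(measured_sizes) - min(measured_sizes)) / 2 * extrapolate
    return [random.uniform(mean - half_range, mean + half_range)
            for _ in range(n_stimuli)]

random.seed(0)
natural = [4.1, 4.8, 5.0, 5.5, 6.2]        # hypothetical measured ornaments
within = stimulus_parameters(natural, 10)
beyond = stimulus_parameters(natural, 10, extrapolate=2.0)
```

Sampling many parameter values, rather than copying a single photographed individual, is one way to avoid building a stimulus set around one idiosyncratic exemplar.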
A very large number of parameters is needed to specify even a simple visual scene; for example, the appearance of a single point on an animal's skin depends on how the color, brightness, opacity, and shininess of the point interact with light as a function of intensity, wavelength, and incident angle, all of which in turn vary over space and time depending on the animal's orientation, position, and posture relative to light sources, other objects in the scene, and the receiver. Since it is infeasible for experimenters to collect quantitative data on every parameter of a scene, some parameters are often fixed to arbitrary values, while others are based on individual exemplars. For example, body patterns are often based on digital photos of a single individual. Complex motor patterns, meanwhile, are often derived by superimposing a synthetic model over video footage of a live animal. This technique, called rotoscoping, dates back to early twentieth-century ciné animations.

Stimulus Presentation

Visual stimuli are generally presented on cathode-ray-tube (CRT) video or computer monitors. Flat-screen monitors, which provide a limited viewing angle, have proved to be less generally appropriate. Color fidelity, spatial and temporal resolution, and the lack of depth cues can pose problems in interpreting responses to video stimuli; these issues are discussed in more detail below. By contrast with acoustic playback experiments, where sounds are routinely broadcast to animals in the field, there is only one published study of video playback in the field, in which a monitor placed in Anolis lizard territories elicited stereotypical responses from males and females. Video playback in the field is problematic, since ambient light tends to make it difficult to detect images on a video monitor, and since detection of a video is contingent on the receiver being in the line of sight of the monitor. Robots (see entry) may be more appropriate for field playback studies.
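The depth-cue problem noted above has a simple geometric core: on a flat screen, visual angle confounds size and distance, so distinct size/distance pairs can be indistinguishable to the subject. A minimal calculation (units are arbitrary but must match):

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size
    viewed at a given distance, in the same (arbitrary) units."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

near = visual_angle_deg(1.0, 10.0)   # small object, close up
far = visual_angle_deg(2.0, 20.0)    # twice as large, twice as far
```

The two hypothetical objects subtend identical visual angles, which is why an on-screen image carries no intrinsic information about absolute size without additional depth cues such as occlusion.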
In the laboratory, video stimuli are typically presented adjacent to an arena, cage, or aquarium that restricts the animal from moving behind the monitor, thus increasing the likelihood that visual stimuli are detected. As with audio playback, presentation may be sequential, with stimuli presented in succession, or simultaneous, with stimuli usually presented on opposite sides of the arena. Response is assayed by the performance of specific behaviors directed at a particular stimulus; for example, Anolis lizards perform a characteristic head-bob display in synchrony with a simulated intruder on video, while chickens produce a ground alarm call when confronted with a video of a raccoon. Many video playback experiments, however, rely on simple proximity measures as an assay of preference. By itself, proximity does not provide information about the behavioral context in which an animal is responding, raising concerns that animals may be attending to artifacts in stimulus representation (see section Signal Fidelity).

Multimodal Playbacks

In almost all communication systems, receivers are likely to attend to multiple sensory modalities. Given the large number of playback studies in each individual modality, it is surprising that relatively few studies have used a multimodal approach. In both pigeons and túngara frogs, female receivers respond more strongly to a combination of visual and acoustic cues than to either cue alone. In the frogs, a combination of synthetic and edited-video stimuli was used to show that females attend specifically to the form and inflation pattern of the male vocal sac: a moving rectangle of the same size was no more effective than a blank screen at eliciting phonotaxis. A similar result is obtained by combining visual cues with substrate-borne vibration playback.
For both wolf spiders and their predators, a combination of video and vibrational cues is more likely to elicit a response than either cue alone.

Interactive Playbacks

In nature, communication is an interactive process. Signalers dynamically adjust signal output, signal type, and signal parameters, depending on changes in orientation, position, and behavior in both intended receivers and eavesdroppers (e.g., predators or sexual rivals). Interactive playback attempts to mimic this property of senders: signal presentation is determined by receiver behavior. Typically, the experimenter specifies a set of rules (e.g., matching or escalating calls emitted by the subject in an aggressive display). Interactive playbacks are often manual, with the experimenter identifying a subject behavior and playing a signal in response. Since one of the benefits of playback is that subjects are presented with consistent, repeatable sets of stimuli, interactivity in playback experiments may not always be desirable, particularly if experimenter subjectivity or error is an important factor. A potentially more rigorous approach is to have real-time signal-processing algorithms automatically determine the interaction. This approach has been used successfully with acoustic and electrical signals; more recently, real-time tracking of subject behavior has been used, akin to a video-game controller, to determine the behavior of an animated stimulus on screen.

Potential Hazards of Playback Techniques

Playbacks offer a degree of control and precision that is unavailable from observational studies or direct manipulation of live exemplars; by their very nature, therefore, they are prone to a number of potential pitfalls that may limit the external validity of experimental results.
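An automated interactive rule of the kind discussed above (e.g., escalating in response to subject calls) can be sketched as a simple state machine; the escalation levels and the one-step de-escalation rule here are hypothetical choices for illustration, not a protocol from this article.

```python
def interactive_rule(subject_events, escalation_steps):
    """Rule-based interactive playback sketch: each detected subject
    call escalates the broadcast stimulus one level; silence
    de-escalates one level (hypothetical rule)."""
    level = 0
    log = []
    for called in subject_events:
        if called:
            level = min(level + 1, len(escalation_steps) - 1)
        else:
            level = max(level - 1, 0)
        log.append(escalation_steps[level])
    return log

stimuli = ["soft", "medium", "loud"]
log = interactive_rule([True, True, False, True], stimuli)
```

Because the rule is explicit code rather than an experimenter's moment-to-moment judgment, it removes the subjectivity concern raised above, at the cost that no two subjects receive identical stimulus sequences.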
As noted in the preceding section, the appropriateness of interactivity is a matter of some debate: a signal that is not contingent on subject behavior may elicit artifactual responses, while one that is interactive may make it more difficult to compare responses across trials. Two main issues have been the focus of attention: pseudoreplication, whereby playbacks fail to adequately sample natural signal variation, and signal fidelity, whereby playbacks fail to represent signals appropriately.

Pseudoreplication

With regard to playback experiments, pseudoreplication was defined by McGregor and colleagues as the use of an n (sample size) in a statistical test that is not appropriate to the hypothesis being tested. Many early acoustic playback experiments used, for example, a single recorded exemplar per species when studying conspecific recognition in a territorial context. Without adequately sampling responses to multiple exemplars, it is impossible to discern whether differences in response are due to differences in stimulus classes or to idiosyncratic differences among individuals. This problem can be addressed by using multiple natural exemplars and performing appropriate statistical analyses. Synthetic stimuli, which are generated from specified parameters, offer the opportunity to eliminate this idiosyncratic variation. Parameters can be modeled on data sampled from multiple natural signals. Even synthetic stimuli, however, leave open the possibility that responses depend on interactions between a manipulated parameter and a parameter that is arbitrarily fixed. For example, the attractiveness of a repeated acoustic mating signal might depend on an interaction between pulse rate and dominant frequency. Merely holding dominant frequency constant and varying pulse rate would provide an incomplete picture of how sexual selection acts on the
signal. This problem is particularly difficult with video animations, where the number of possible parameters is very large; in practice, many animations use sample data for a tiny fraction of model parameters (typically morphological traits), and behavior and texture are modeled after single exemplars.

Signal Fidelity

Signal fidelity has garnered particular attention in visual playback studies, but it is an important concern in other modalities as well. This is particularly the case for sound in aquatic systems, where signals in small aquaria are distorted by reverberation and resonance. In the field, the directionality of sound and the transmission of sound through acoustic microenvironments may also alter acoustic signals in artifactual ways. Video playback involves representing a three-dimensional signal on a two-dimensional surface, breaking a continuous visual stimulus into discrete, pixelated still frames at temporal intervals on the order of 33 ms, and collapsing spectral radiance and irradiance functions into red, green, and blue outputs on a monitor. In the absence of real depth cues, a large object far away subtends the same visual angle as a small object close up. Occlusion cues (static objects at varying apparent distances from the foreground) can provide depth information in a two-dimensional image. The standard refresh rate of most video monitors (25-30 Hz) is just above the flicker-fusion threshold for humans. Many animals, particularly birds, have higher flicker-fusion frequencies. Depending on monitor type and the species being tested, subjects may perceive a series of static slides as opposed to a continuous image. Newer computer monitors can refresh at over 120 Hz, which is suitable for most species. Color fidelity is perhaps the most intractable problem with video outputs for playback to non-human animals.
The red, green, and blue phosphors or pixels of television or computer monitors are tuned to match the sensitivities of human red, green, and blue cone photoreceptors. By differentially stimulating each photoreceptor class, a video image is able to represent a wide range of illusory colors, or metamers. The correct perception of these metamers, however, is contingent on matching the three output classes to the sensitivities of receivers. Spectral tuning varies widely among and even within species. Researchers have developed a methodology for adjusting monitor color output to match the sensitivity of known photoreceptor classes in a study species; problems arise, however, when testing animals with more than three color receptors. Colors can, however, be simulated by carefully selected color filters over the output screen. A harder problem is posed by the many highly visual animals, including many birds, fishes, and arthropods, that perceive color into the ultraviolet. Since video monitors do not emit directed ultraviolet light, current technology lacks a way to represent UV signal components in an experimental setting. Despite these caveats, playback across modalities has been an indispensable tool for understanding communication systems. Numerous studies have quantitatively compared responses to playback versus live stimuli; when transmission among live animals is prevented in modalities other than the one being played back, playback stimuli are typically as effective as live animals in eliciting responses. Moreover, there are few, if any, studies where predictions made by playback experiments have been directly refuted by observational work or by complementary physiological or molecular measures.

Conclusion

Playback techniques have grown hand in hand with available technology. Acoustic, electrical, and vibrational playback collectively represent a mature technique for experimentally manipulating emitted signals.
Visual signals are both more complex and more contingent on the environment in which they are produced and perceived, but ongoing advances in image acquisition, analysis, and presentation continue to expand the scope of questions that can be addressed using video playback.

See also: Acoustic Signals; Electrical Signals; Experimental Design: Basic Concepts; Olfactory Signals; Robotics in the Study of Animal Behavior; Vibrational Communication; Visual Signals.

Further Reading

Hopp SL, Owren MJ, and Evans CS (eds.) Animal Acoustic Communication: Sound Analysis and Research Methods. Berlin: Springer-Verlag.
Kroodsma DE, Byers BE, Goodale E, Johnson S, and Liu WC (2001) Pseudoreplication in playback experiments, revisited a decade later. Animal Behaviour 61: 1029-1033.
McGregor PK (ed.) (1992) Playback and Studies of Animal Communication. New York, NY: Plenum Press.
Oliveira RF, McGregor PK, Schlupp I, and Rosenthal GG (eds.) (2000) Special Issue: Video Playback Techniques in Behavioural Research. Acta Ethologica 3, s. 1.