Assaf Nir

DJ Darwin - a genetic approach to creating beats

Final project report, course 67842 'Introduction to Artificial Intelligence'

Abstract

In this document we present two applications that incorporate AI techniques, specifically genetic algorithms, in order to create computer-generated musical 'beats' that are suited to a user's taste. These 'beats', which are actually drum rhythms, are the product of maximizing a 'fitness function' over a space of beats. While both applications share an evolution-simulation engine, one is completely automatic, while the other incorporates user interaction. Respectively, the fitness function is either explicit and pre-determined, or implicit and changing over time.

Background - Music and Computer Science

The connection between music and computer science and mathematics is natural and inherent. One example of this connection is harmony, which ties the perception and pleasantness of chords to (amongst others) mathematical properties. Another aspect of music connecting it to CS is representation. Even before technology allowed the approximate representation of sound (from the music box to today's silly-kids-on-bus ringtones), musical notation had been there: a lasting abstraction of music. Although some aspects of music cannot be described as symbols (Satchmo's swing, Billie Holiday's sorrow, Django Reinhardt's missing-fingers technique), musical notation is what allows us to listen today to a song composed by Mozart years ago, without having to deal with all that messy business of time travel etc.

An added bonus of musical notation is that it is fit for computer representation. Unlike sound recording, which must deal with the highly feared Fourier transform (and with the
even more highly feared Audiofiles), notation representation is as easy as bits in arrays (the actual playback, of course, is a bit more complicated). For this reason, computers have been given the task of dealing with music almost from the start. As early as 1951, Australian musician-programmers used the CSIR Mk 1 (later renamed CSIRAC) to play music. Today, with fast microprocessors allowing real-time playback, and the MIDI protocol providing a simple way for the computer to record and control various musical instruments, creating music with the computer has become as easy as... well, creating music with musical instruments.

One problem remains: creating music is hard. It requires practice, talent and patient neighbors. Even for able musicians, who create beautiful pieces inspiring love or popular pieces making Christmas #1, even for these few the secret of creating a good song has not been mastered. For this reason (and more), many attempts have been made to make the computer more than just a storage device: if music can be represented as bits, it is only natural that the computer manipulate them.

Overview

Modern music pieces can be roughly broken down into 3 components: melody, which is the main musical line (e.g. vocals, solo guitar, saxophone); harmony, which is the musical 'background' (e.g. rhythm guitar, brass section, piano chords); and the beat, or rhythm section, which is mainly the percussion (e.g. drums). In our project we chose to concentrate on the beat, since it is intuitive and thus fit for musical novices, easy to manipulate and evaluate, and just makes you want to dance.

Beats, being the main rhythmic line behind most music, classically come from a set family of standard beats, such as samba, waltz, 16-beat, shuffle etc., or some small variation on them. This 'fixation' on specific beats has somewhat faded in modern electronic music, either through the use of digital sequencers or through methods of sampling.
While the former yields computer-generated patterns according to some user-set parameters, and the latter is determined a priori by users, neither uses the power of computing to enhance creativity, explore new ideas or help fine-tune the beats according to the user's taste.

Our goal was this: construct a program that
allows a user to generate a beat to his taste through exploration of new ideas. Using a genetic-algorithm framework, we can define the user's taste as the fitness function, allowing him to determine de facto which beats within a population survive and breed, and which are doomed. Avoiding the need to formally define a fitness function allows even novice users, armed with musical taste alone, to generate pleasing and interesting beats. Additionally, supporting formalized fitness functions lets the user run the program in 'auto-pilot' mode, letting it simulate the evolutionary process by itself. By thinking of beats as a population, users are exposed to different directions in which their beats can evolve. Musical heuristics, based on music theory and evolutionary guidelines, are used both in the algorithm (the generation, breeding and mutation of the population) and in the GUI. This allows a fruitful experience and good results.

In broad terms, the flow of a user's experience is as follows. The user starts with a population of 6 different beats, each a representative of its 'breed' within the general population. Each beat is defined by its DNA, which holds information on which instruments (e.g. drum sounds) are played when. These beats can be generated from a file (allowing the user to set a general direction or genre), randomly (allowing the user to add wanted 'noise') or by some form of random symmetry (a heuristic approach that usually generates pleasing beats). After listening to and comparing these beats, the user chooses the proportion of each in the population: a high proportion for the beats he likes, and a low (or zero) proportion for those he dislikes. The user also sets how dominant each beat will be in breeding, that is, how much its children will resemble it. Once these are set, the user can breed his beats, thus creating 6 new 'offspring' to work with.
The user can choose either to breed the population once and view (listen to) the results, or to let the program simulate an evolutionary process comprising many 'rounds' of breeding, according to a parameterized fitness function. The first option allows exploration and aids creativity, but demands much user interaction and may be slow in reaching a wanted result. The second option allows fast progress towards a goal without the need for user interaction, but requires the user to define what he wants. Thus, alternating between the methods allows the user to exploit the advantages of each.
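To illustrate how the proportion and dominance settings could drive a breeding round, here is a minimal Java sketch. It is not taken from the project's Engine.java: the class name ParentSelectionSketch, the methods pickParent and mix, the boolean[instrument][step] grid used for a beat, and the reading of 'dominance' as a per-note inheritance probability are all assumptions made for illustration.

```java
import java.util.Random;

/** Hypothetical sketch: choosing and mixing parents from user-assigned weights. */
final class ParentSelectionSketch {

    /**
     * Roulette-wheel selection: index i is returned with probability
     * proportions[i] / sum(proportions), so a beat with proportion 0 is never picked.
     */
    static int pickParent(double[] proportions, Random rng) {
        double total = 0;
        for (double p : proportions) {
            if (p < 0) throw new IllegalArgumentException("negative proportion");
            total += p;
        }
        if (total <= 0) throw new IllegalArgumentException("all proportions are zero");
        double r = rng.nextDouble() * total;
        int lastPositive = -1;
        for (int i = 0; i < proportions.length; i++) {
            if (proportions[i] <= 0) continue;   // skip beats the user rejected
            lastPositive = i;
            r -= proportions[i];
            if (r < 0) return i;
        }
        return lastPositive;                      // only reached through round-off
    }

    /**
     * Dominance-weighted mixing (an assumed reading of "dominance"): each note of
     * the child is copied from parent a with probability domA / (domA + domB),
     * otherwise from parent b. Both parents must have the same shape.
     */
    static boolean[][] mix(boolean[][] a, boolean[][] b,
                           double domA, double domB, Random rng) {
        if (domA < 0 || domB < 0 || domA + domB <= 0) {
            throw new IllegalArgumentException("dominance must be non-negative, not both zero");
        }
        double pA = domA / (domA + domB);
        boolean[][] child = new boolean[a.length][];
        for (int i = 0; i < a.length; i++) {
            child[i] = new boolean[a[i].length];
            for (int s = 0; s < a[i].length; s++) {
                child[i][s] = rng.nextDouble() < pA ? a[i][s] : b[i][s];
            }
        }
        return child;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double[] proportions = {0.4, 0.4, 0.1, 0.1, 0.0, 0.0}; // six beats, two rejected
        int mother = pickParent(proportions, rng);
        int father = pickParent(proportions, rng);
        System.out.println("Breeding beats " + mother + " and " + father);
    }
}
```

Repeating pickParent and mix six times would produce one new generation of six offspring. Under this reading, a beat with a high proportion is chosen as a parent more often, and a beat with high dominance passes on more of its notes.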
The GUI: beat population (left) and control panel (right)

Heuristic approaches

While the basic evolutionary concept is simple, good results require good heuristics. To achieve musically pleasant results fast, different heuristics, drawn from music theory, have been implemented in all stages of the algorithm. We give an example of such a heuristic from each stage:

Representation: the genes contain both active and inactive chromosomes (that is, instruments). This idea is taken from the biological world, where most nucleotides in the DNA are not expressed. Through mutation, instruments can shift from one state to the other, allowing 'silent' instruments to suddenly appear, and vice versa. This allows the user to explore more ideas and reach his 'goal' beat.

Population initialization: the starting state of the population can easily be set by the user. He can choose any combination of beats, whether random, symmetric, or from a file, and set their relative proportions in the population. Choosing a good starting state leads to better results faster. Moreover, with options such as activating a user-triggered mutation of a
beat, the user can artificially intervene and tweak the population at any stage.

Breeding and mutation: the user is given the ability to tweak many significant parameters that affect the way breeding and mutation occur. The different parameters shape the rate, pace, character and tendencies of the evolutionary process. Since the user controls these parameters via the GUI, the shaping of parameters can be thought of as heuristics provided by the user that aid in the maximization of the (artificial) fitness function, just as in other search paradigms.

The fitness function: this is the user's preference of style; he provides the target. As in biological evolution, this fitness function changes with time, and is affected by the life forms themselves; it is very natural that the user changes his opinion or preference, and may be guided by the 'population' of beats presented to him. Letting the user choose between slow hands-on and fast auto-pilot evolution allows for a faster and better path to the desired beat.

Analysis

Since our fitness function is based on users' taste, it is hard to perform a raw analysis of aspects such as runtime or optimality, especially in the hands-on mode. Instead, we present several interesting phenomena that were encountered while working, testing, or just playing around with the program. Classic runtime analyses under different settings of the auto-pilot mode are also presented. Finally, qualitative differences between the two modes are discussed.
Fusion

Consider the following population and breeding parameters:

As we can see, the population is composed mainly of two pre-defined beats ('James Brown' and 'Samba'), while the remaining, smaller portion of the population is composed of 4 'noise' beats, random or symmetric. With the mutation rates set to medium-low and using 'by instrument' crossover breeding, after several hands-on evolutionary steps we get the following results:

Both visual and auditory analysis of these results show that each beat in the new population is, as expected, a unique noisy mix of the two pre-defined beats (see appendix for beat files of results 2 and 5). In other words, the new population displays
beats that are fused together from 'James Brown' and 'Samba'. Listening to these samples confirms just this: they sound as if James Brown had played Samba.

Differences in crossover types

The 3 different types of crossover (that is, the rule by which the parents' genes form the offspring's), namely dissection, by instrument and by note, determine how 'long' the leaps within the search space are. While no type reduces the actual state space (since mutations allow all states to be reached), choosing the crossover type can de facto reduce the feasible states, and thus 'narrow' the search. Choosing dissection (e.g. the offspring gets a continuous half of each parent's genes) keeps the evolved population similar to the original one, while choosing 'by note' is very exploratory and produces novel results. Differences between the types are of great importance, especially in the auto-pilot mode. These are examples of results after 10 evolutionary cycles with different crossover types on the James Brown-Samba example:

(panels: Dissect, By instrument, By note)

Convergence and local maxima

Even though the initial population contains very different beats, the population often converges to the same beat. Of course, convergence itself, its speed, and the population variation near convergence differ greatly depending on parameter values. For instance, low mutation rates and dissection crossover may converge after as few as 10 cycles. At the other end, high mutation rates and by-note crossover may not converge even after a few hundred cycles: beats will be similar, but still contain variations.
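To make the three crossover types discussed in this section concrete, here is a minimal Java sketch of each on a simple grid representation. It is an illustration under stated assumptions, not the project's Engine.java or CrossoverType.java: the class name CrossoverSketch, the boolean[instrument][step] grid, and the reading of 'a continuous half' as the first half of the time steps are all hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of three crossover types on a grid beat[instrument][step]. */
final class CrossoverSketch {

    /** Dissection: first half of the steps from parent a, second half from parent b. */
    static boolean[][] dissect(boolean[][] a, boolean[][] b) {
        boolean[][] child = new boolean[a.length][];
        for (int i = 0; i < a.length; i++) {
            int steps = a[i].length;
            child[i] = new boolean[steps];
            for (int s = 0; s < steps; s++) {
                child[i][s] = s < steps / 2 ? a[i][s] : b[i][s];
            }
        }
        return child;
    }

    /** By instrument: each whole instrument row is inherited from one random parent. */
    static boolean[][] byInstrument(boolean[][] a, boolean[][] b, Random rng) {
        boolean[][] child = new boolean[a.length][];
        for (int i = 0; i < a.length; i++) {
            child[i] = (rng.nextBoolean() ? a[i] : b[i]).clone();
        }
        return child;
    }

    /** By note: every single note is inherited independently from a random parent. */
    static boolean[][] byNote(boolean[][] a, boolean[][] b, Random rng) {
        boolean[][] child = new boolean[a.length][];
        for (int i = 0; i < a.length; i++) {
            child[i] = new boolean[a[i].length];
            for (int s = 0; s < a[i].length; s++) {
                child[i][s] = rng.nextBoolean() ? a[i][s] : b[i][s];
            }
        }
        return child;
    }

    public static void main(String[] args) {
        boolean[][] a = {{true, true, true, true}, {true, false, true, false}};
        boolean[][] b = {{false, false, false, false}, {true, true, false, false}};
        Random rng = new Random();
        System.out.println("dissect:       " + Arrays.deepToString(dissect(a, b)));
        System.out.println("by instrument: " + Arrays.deepToString(byInstrument(a, b, rng)));
        System.out.println("by note:       " + Arrays.deepToString(byNote(a, b, rng)));
    }
}
```

In this sketch, dissection preserves long stretches of each parent and by-instrument preserves whole rows, while by-note can recombine the parents at every cell. This matches the observation that by-note makes the longest leaps in the search space.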
In hands-on mode, escaping convergence (if wanted) is easy: when approaching convergence, the user can simply hand-pick new beats, randomize them or manually insert mutations. Thus, the user can manually escape local maxima. In auto-pilot mode this is harder (though convergence might be wanted if auto-pilot is chosen). Convergence in this mode depends greatly on the evaluators chosen and their settings. Local maxima, abundant or scarce, can sometimes be evaded using random noise. A detailed analysis is enclosed.

Convergence and local maxima

To test the auto-evaluators, two criteria were taken into consideration:
1. Satisfying the intended purpose (e.g. note density = 50%)
2. Population diversity.

The significance of the 1st criterion is straightforward; however, with some evaluators the purpose is abstract and not naturally quantified. In such cases, the evaluator was tested on avoiding 'rubbish' (example below). The control group tested is an empty evaluator, which assigns all beats the same fitness and thus turns the process random. Evaluators were tested on the 3 types of crossover: dissection, by instrument and by note.

Examples of 'rubbish':
The main findings:
1. "Simply" defined evaluators work well.
2. Long auto-pilot runs (2000 rounds) result with high probability (usually more than 50%) in complete rubbish.
3. Abstract evaluators usually succeed in avoiding complete rubbish, and are slightly better at that than the control group.
4. Crossovers: dissect seems the most effective. Simple evaluators reached their goal in a relatively short time (50 rounds), and complete rubbish was usually avoided.

Conclusions:

As an aiding tool, evaluators seem to work. Simple tasks are achievable, and the algorithm usually avoids converging to rubbish beats (meaning they are worthwhile). On the other hand, it seems that writing more advanced evaluators would be a lot more complicated, and should probably involve some sort of learning algorithm. One such possible algorithm would be to create a database of all beats evaluated by the user, and then try to predict his evaluations accordingly (using sub-gradient descent or the like).
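As an illustration of what a "simply" defined evaluator might look like, here is a minimal Java sketch of a note-density evaluator and a weighted average of evaluators. It does not reproduce the project's Evaluator.java, EvalNoteDensity.java or EvalWeightedAverage.java: the interface BeatEvaluator, the class names, the boolean[instrument][step] grid and the scoring formula 1 - |density - target| are assumptions chosen for clarity.

```java
import java.util.List;

/** Hypothetical evaluator interface: a higher fitness means a better beat. */
interface BeatEvaluator {
    double fitness(boolean[][] beat);
}

/** Scores a beat by how close its fraction of played notes is to a target density. */
final class NoteDensityEvaluator implements BeatEvaluator {
    private final double target;

    NoteDensityEvaluator(double target) {
        if (target < 0 || target > 1) {
            throw new IllegalArgumentException("target must be in [0, 1]");
        }
        this.target = target;
    }

    @Override
    public double fitness(boolean[][] beat) {
        int played = 0;
        int cells = 0;
        for (boolean[] instrument : beat) {
            for (boolean note : instrument) {
                cells++;
                if (note) played++;
            }
        }
        double density = cells == 0 ? 0.0 : (double) played / cells;
        return 1.0 - Math.abs(density - target); // equals 1.0 exactly on target
    }
}

/** Combines several evaluators into one score by a weighted average. */
final class WeightedAverageEvaluator implements BeatEvaluator {
    private final List<BeatEvaluator> parts;
    private final double[] weights;

    WeightedAverageEvaluator(List<? extends BeatEvaluator> parts, double[] weights) {
        if (parts.isEmpty() || parts.size() != weights.length) {
            throw new IllegalArgumentException("need one weight per evaluator");
        }
        double total = 0;
        for (double w : weights) {
            if (w < 0) throw new IllegalArgumentException("negative weight");
            total += w;
        }
        if (total <= 0) throw new IllegalArgumentException("weights sum to zero");
        this.parts = List.copyOf(parts);
        this.weights = weights.clone();
    }

    @Override
    public double fitness(boolean[][] beat) {
        double sum = 0;
        double totalWeight = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * parts.get(i).fitness(beat);
            totalWeight += weights[i];
        }
        return sum / totalWeight;
    }
}

final class EvaluatorDemo {
    public static void main(String[] args) {
        boolean[][] beat = {{true, false, true, false}, {false, true, false, true}};
        BeatEvaluator halfFull = new NoteDensityEvaluator(0.5);
        System.out.println("fitness for 50% density target: " + halfFull.fitness(beat));
    }
}
```

An evaluator of this kind has a goal that is directly measurable, which may be why such evaluators reached their targets quickly in the tests. An abstract goal such as 'sounds good' has no comparably simple formula.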
Appendix

1. James Brown Samba example beats:

2. Fusion auditory beat examples (can be loaded in the program):

3. Packages and Classes (further documentation in the code):
   a. Beats: contains beats in .txt format, can be loaded into the program
   b. Evaluators: contains evaluator functions for auto-pilot mode:
      i. EvalContainsBeat.java
      ii. EvalInstrumentDensity.java
      iii. EvalNoteDensity.java
      iv. EvalOffBeat.java
      v. EvalSymmetry.java
      vi. Evaluator.java
      vii. EvalWeightedAverage.java
   c. Graphics: contains graphics for the GUI
   d. Main: contains program logic, GUI, engine etc.:
      i. BeatBase.java - beat manipulation (e.g. reading, writing, creating etc.)
      ii. CrossoverType.java - enum
      iii. Engine.java - the main logic behind breeding and evolution
      iv. GenoBeat.java - main class
      v. GUI.java - user interface
      vi. inittype.java - enum
      vii. Mtrack.java - represents a single track
      viii. Mutator.java - mutates tracks
      ix. RandomUtils.java - set of randomness-related utilities
      x. SubTrack.java - represents an instrument within a class
      xi. TrackPlayer.java - plays tracks
      xii. VisPanel.java - beat visualization

4. Evaluator test statistics:
Result columns, left to right: Note density, Offbeat, Symmetry, Inst density, Contains^, control. Rows are numbers of rounds; each row lists its non-empty cells in column order.

Parameters: Note mutation: half; Inst mutation: half; Symmetry- 3; Offbeat 3\40
* high chance of very low diversity
^ contains guauanco.txt

Dissect test
10: no*, no
50: yes*, yes
100: yes*, yes
200: yes*, Yes*, ok
500: yes*, ok, ok, Yes*, ok, semi
2000: ok, semi, ok, semi

By inst. test
10: no, no
50: no*, Yes*
100: no*, Yes*
200: no*, ok, Yes*, ok, ok
500: yes*, ok, semi, Yes*, semi, semi
2000: semi, semi, semi

By Note test
10: no, yes
50: no*, Yes*
100: no*, Yes*
200: no*, ok, ok, Yes*, ok
500: yes*, semi, semi, Yes*, ok, semi
2000: semi, semi, semi