Implementation of Expressive Performance Rules on the WF-4RIII by Modeling a Professional Flutist Performance Using NN


2007 IEEE International Conference on Robotics and Automation, Roma, Italy, 10-14 April 2007

Jorge Solis, Kei Suefuji, Koichi Taniguchi, Takeshi Ninomiya, Maki Maeda and Atsuo Takanishi

Abstract — In this paper, the methodology for automatically generating an expressive performance on the anthropomorphic flutist robot is detailed. A feed-forward network trained with the error back-propagation algorithm was implemented to model the expressiveness of a professional flutist's performance. In particular, the note duration and vibrato were considered as performance rules (sources of variation) to enhance the expressiveness of the robot's performance. From the mechanical point of view, the vibrato and lung systems were re-designed to effectively control the proposed music performance rules. An experimental setup was proposed to verify the effectiveness of generating a new expressive score from a model created from the performance of a professional flutist. As a result, the flutist robot was able to automatically produce an expressive performance similar to the human one from a nominal score.

I. INTRODUCTION

The fact that music can be used as a means for expression and communication is often acknowledged. Yet this is one of the least understood aspects of music, at least as far as scientific explanation goes. Performers introduce deviations from the nominal values specified in the score, and these deviations characterize their own performance. It is known that several performances of the same score often differ significantly, depending on the performer's expressive intentions. Studies in music performance use the word expressiveness to indicate the systematic presence of deviations from the musical notation as a means of communication between musician and listener [1]. Such deviations represent the added value of a performance and are part of the reason that music is interesting to listen to and sounds alive. In fact, a score played with the exact values indicated in it lacks musical meaning and is perceived as dull, like a text read without any prosodic inflection. Indeed, human performers never respect tempo, timing, and loudness notations exactly; some deviations are always introduced. A performance played according to the appropriate rules imposed by a specific musical praxis will be called natural. In order to understand how humans can express emotions while performing music, several researchers have tried to emulate human music performance by proposing computational models [2] and by developing mechanical systems which nearly simulate the physiology of the organs involved in the performance of musical instruments [3-5].

Manuscript received August 31, 2006. A part of this research was done at the Humanoid Robotics Institute (HRI), Waseda University. This research was supported (in part) by a Grant-in-Aid for the WABOT-HOUSE Project by Gifu Prefecture. We would like to thank Ms. Akiko Sato, a professional flutist, for her valuable advice during the development of the flutist robot. Jorge Solis is with the Mechanical Engineering Department, Waseda University, 3-4-1 Ookubo, Shinjuku-ku, 169-8555 Tokyo, Japan (e-mail: solis@kurenai.waseda.jp). Kei Suefuji, Koichi Taniguchi, Takeshi Ninomiya and Maki Maeda are with the Graduate School of Science and Engineering, Waseda University. Atsuo Takanishi is with the Mechanical Engineering Department and the Humanoid Robotics Institute, Waseda University, 3-4-1 Ookubo, Shinjuku-ku, 169-8555 Tokyo, Japan (e-mail: takanisi@waseda.jp).
From the computational point of view, the analysis of such systematic deviations has led to the formulation of different models that try to describe their structure and aim at explaining where, how, and why a performer modifies (sometimes unconsciously) what is indicated by the notation of the score. In recent years, several researchers in the computer music field have focused on Artificial Intelligence (AI) approaches for developing automatic performance systems, in order to capture the knowledge applied when performing a score in the form of rules. In order to develop an automatic performance system, mainly two approaches have been proposed: analysis-by-synthesis and analysis-by-measurement [6]. Such approaches convert a music score into an expressive musical performance by applying rules (typically involving time, sound and timbre deviations). Every rule tries to predict some deviation that a human performer inserts, by quantitatively describing the deviations to be applied to a musical score. As a result, more attractive and human-like performances can be generated and simulated.

From the engineering point of view, several researchers have been developing musical performance robots that nearly imitate the function of the organs used to play musical instruments. In particular, the authors have been developing an anthropomorphic flutist robot which imitates human flute playing by emulating the human motor control required to play the flute [3]. The main idea of this approach is to emulate human dexterity and to coordinate, by mechanical means, the movements of each of the organs involved in flute playing. For this purpose, the synchronization of all the simulated organs of the flutist robot is realized by reading the timing clock signal from the MIDI data generated by a PC sequencer and by generating an interrupt every 5 ms on the PC controller [7].
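As a rough illustration of this 5 ms synchronization scheme, the sketch below shows a fixed-period control tick dispatching to per-organ controllers. This is a minimal sketch in standard C++, not the robot's actual controller: the real system uses a hardware timer interrupt on the PC, and the organ-controller callbacks here are hypothetical stand-ins.

    // Sketch: a 5 ms control tick that keeps the robot's simulated organs
    // synchronized to the MIDI timing clock (hypothetical callbacks).
    #include <chrono>
    #include <functional>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<std::function<void(int)>> organs;   // lips, lungs, fingers, ...
        const auto period = std::chrono::milliseconds(5);
        auto next = std::chrono::steady_clock::now();
        for (int tick = 0; tick < 12000; ++tick) {      // e.g. one minute of playing
            for (auto& organ : organs)
                organ(tick);                            // advance each organ to this tick
            next += period;
            std::this_thread::sleep_until(next);        // wake again 5 ms later
        }
        return 0;
    }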

Even though the Waseda Flutist robot has demonstrated that it can nearly imitate the performance of an intermediate flutist, the robot's performance still lacks the human-like expressiveness that is desirable for achieving a more natural performance. Up to now, we have focused on the analysis-by-measurement method to enhance the expressiveness of the flutist robot's performance, where the performance of a professional flutist is analyzed (based on the Fast Fourier Transform [8]) to extract musical parameters such as pitch, volume and tempo. However, this approach cannot provide enough information to describe how performers actually add expression to their performances. In addition, every time the flutist robot is programmed to perform a new score, a recording of that score performed by a professional flutist is required. Therefore, in order to enhance the expressiveness of the flutist robot's performance, an analysis-by-synthesis approach was implemented to model a human performance. In particular, an Artificial Neural Network (ANN) was implemented to model the musical expressiveness of a professional flutist. By using such a model, a set of performance rules can be extracted to produce an expressive musical performance. Specifically, the note duration and vibrato were considered as the principal sources of variation required for an expressive performance. The extracted performance rules were then implemented in the performance control system of the Waseda Flutist Robot No. 4 Refined III (WF-4RIII). This new version of the flutist robot improves the vibrato and lung systems to effectively control the musical parameters extracted from the resulting performance rules. From the mechanical point of view, the lung system was re-designed to enable better control of the air while breathing (to effectively add deviations to the tempo), and the vibrato system was re-designed to emulate a human-like vocal cord more closely (to effectively control the amplitude and frequency of the vibration added to the air beam).

This paper is organized as follows. In Section II, the way of modeling an expressive performance of a professional flutist using an ANN is detailed. In the following section, the improvements to the mechanical system of the WF-4RIII are described. Finally, a set of experiments was carried out to verify whether the flutist robot could effectively enhance the expressiveness of its performance.

II. MODELING HUMAN MUSIC PERFORMANCE

A. Analysis of Human Performance

In the computer music field, research on music performance was quite intensive in the 20th century, particularly in its last decades. As a result, several automatic performance systems have been developed to convert a music score into an expressive musical performance. As previously mentioned, mainly two strategies have been used in the design of such performance systems: analysis-by-synthesis and analysis-by-measurement. Rules based on the analysis-by-measurement method are derived from measurements of real performances, usually recorded on audio CDs or played on MIDI-enabled instruments connected to a computer [9]. Often the data are processed statistically, such that the rules reflect typical rather than individual deviations from a deadpan performance, even though individual deviations may be musically highly relevant. The second method implies that the intuitive, non-verbal knowledge and experience of an expert musician are translated into performance rules. These rules explicitly describe musically relevant factors; the most important example is the KTH rule system [10]. Machine learning is another active research stream: Katayose [11] used artificial intelligence inductive algorithms to infer performance rules from recorded performances, and similar approaches were proposed by Arcos [12] and Suzuki [13]. Several other methodologies for approximating human performances have been developed using a fuzzy logic approach [14], multiple regression analysis [15], or neural network techniques [16].
Up to now, the implementation of Artificial Intelligence (AI) approaches has demonstrated that high-quality, human-like monophonic performances can be generated from examples of human performers [2]. However, all of these systems have been tested only on computer systems or MIDI-enabled instruments, which lack the unique experience of a live performance. Therefore, we proposed to implement an AI approach in the performance control system of the WF-4RIII, which can provide the unique experience of a live performance. In particular, a feed-forward neural network was implemented to model the musical expressiveness of a professional flutist's performance. From such a model, a set of musical performance rules can be created and then used by the flutist robot to produce an expressive performance, even from a different nominal score (Figure 1). In this paper, the musical performance rules considered are the note duration and the vibrato (duration and frequency). In order to train the ANN, the teaching signal was obtained by analyzing these musical parameters in a recording of a professional flutist's performance. In the following sub-sections, the way of analyzing the human performance is described.

Fig. 1. The flutist robot may produce an expressive performance by extracting the performance rules modeled from a professional flutist.

B. Note Duration

As one of the principal characteristics of an expressive performance, deviations of tempo are added by performers to express emotions. For this purpose, we have proposed to analyze the duration of each note in the performance of a professional flutist. In order to analyze this musical parameter, we recorded an expressive performance of a professional flutist. The recording was sampled at 44.1 kHz with a resolution of 16 bits and was then analyzed with the short-time Fourier transform (STFT). We experimentally found that a frame size of 4096 points (frequency resolution of 10.77 Hz) with a Hanning window gave a good compromise between resolution and processing speed. By computing the STFT, the note duration can be obtained by comparing the volume of the fundamental frequency between two adjacent frames. The flow diagram used to determine the duration of a note is shown in Fig. 2: when a note is found, the amplitude of its fundamental frequency is obtained and then compared with the amplitude of the fundamental frequency in contiguous frames to detect when the note has changed.

Fig. 2. Algorithm used to determine the note duration.
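A minimal sketch of this adjacent-frame comparison is given below, assuming the magnitude spectra have already been computed by a Hanning-windowed 4096-point STFT (the STFT itself is omitted, and the drop threshold dropRatio is an illustrative assumption, not a value from the paper):

    // Sketch of the note-duration measurement: track the fundamental's
    // magnitude across adjacent STFT frames and report how long it persists.
    #include <cstddef>
    #include <vector>

    const double kSampleRate = 44100.0;
    const std::size_t kFrameSize = 4096;              // ~10.77 Hz per bin

    std::size_t binFor(double freqHz) {               // map a frequency to its FFT bin
        return static_cast<std::size_t>(freqHz * kFrameSize / kSampleRate + 0.5);
    }

    // frames[t][k] = magnitude of bin k in frame t. Returns the duration, in
    // frames, of a note starting at frame `onset` whose fundamental is f0Hz.
    std::size_t noteDurationFrames(const std::vector<std::vector<double>>& frames,
                                   std::size_t onset, double f0Hz,
                                   double dropRatio = 0.5) {
        const std::size_t k = binFor(f0Hz);
        const double ref = frames[onset][k];          // fundamental amplitude at onset
        std::size_t t = onset + 1;
        // The note is considered "held" while the fundamental keeps comparable energy.
        while (t < frames.size() && frames[t][k] > dropRatio * ref)
            ++t;
        return t - onset;                             // duration in 4096-sample frames
    }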

C. Vibrato: Duration and Frequency

Like the note duration, the vibrato plays a key role in producing an expressive performance. Vibrato gives a pleasing flexibility, tenderness and richness to the tone; in flute playing, it is mainly used to add warmth and expressiveness to notes. The principal parameters of the vibrato are the rate and the width of modulation: the first is related to how fast the vibrato is played, while the second refers to how sharp or flat (how far from the nominal pitch) the note is played. Therefore, we proposed to extract the vibrato duration and frequency from the performance of a professional flutist. In order to compute them, a notch filter was first applied to the original sound to reduce its noise. Then, for each note, the frequency deviation in cents was calculated using (1), where f is the instantaneous frequency and f_o,average is the note's average fundamental frequency:

    f_cent = 1200 log2( f / f_o,average ).    (1)

By computing the frequency in cents for each note of a score, one can easily analyze the duration and frequency of each harmonic of a note (in our case, harmonics up to the 3rd were considered). As an example, the analysis of the duration and frequency of the vibrato of the note A4 is shown in Fig. 3.

Fig. 3. Analysis of the vibrato from the performance of a human player.

As a result of the analysis of the note duration and vibrato, the professional flutist's performance can be characterized, and most of the musical parameters can be obtained (such as pitch, note volume, note off, note on and vibrato duration/frequency); the method for modeling the human performance from these parameters is described in the next section.
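Before turning to the model, note that Eq. (1) translates directly into code; a brief sketch (f and f0avg in Hz):

    // Eq. (1): deviation of an instantaneous frequency f from the note's
    // average fundamental f0avg, expressed in cents.
    #include <cmath>

    double centsDeviation(double f, double f0avg) {
        return 1200.0 * std::log2(f / f0avg);   // +1200 cents = one octave sharp
    }
    // A vibrato trace is the sequence of centsDeviation() values over the
    // note's frames: its oscillation rate gives the vibrato frequency, and
    // the span over which the oscillation persists gives the vibrato duration.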
D. Artificial Neural Networks

In order to create an expressive performance, a feed-forward neural network trained with the error back-propagation algorithm was implemented using Borland C++ Builder. Feed-forward neural networks are the most widely used models in many practical applications [16]. Such a network is divided into layers: input, hidden and output (Figure 4a). The input layer consists of just the inputs to the network. The hidden layer consists of any number of neurons placed in parallel. Each neuron (Figure 4b) performs a weighted summation of its inputs, which is then passed through a non-linear function known as an activation or transfer function. Mathematically, the functionality of a hidden neuron is described by (2), where u_i is the internal state of the neuron, h_i is a threshold value and n is the number of inputs. The internal state u_i is given by (3), where x_j are the inputs to the neuron and w_i,j are the weights between neurons i and j. The transfer function f used to compute the final network output is defined in (4):

    X_i = f( u_i - h_i ),    (2)

    u_i = sum_{j=1..n} w_i,j x_j,    (3)

    f(x) = 1 / (1 + e^(-a x)).    (4)

In our application, we modeled an expressive performance of a professional flutist, where the note duration and vibrato (duration and frequency) were considered as the performance rules responsible for local deviations during the flute performance. The model was created by using a nominal score as input and the considered performance rules as outputs (Figure 4a). In order to train the ANN, the back-propagation algorithm was used. This kind of supervised learning incorporates an external teacher, so that each output unit is told what its desired response to the input signals ought to be. During learning, the weight vectors W^k_ij are updated using (5), where E(t) is the error between the output value and the desired one, η is the learning rate and k indexes the layer:

    W^k_ij(t) = W^k_ij(t-1) - η ∂E(t)/∂W^k_ij.    (5)

The ANN was trained to learn the performance rules obtained from the analysis of the professional flutist's performance. One of the critical issues when designing a neural network is generalization, which helps prevent overfitting. Overfitting occurs when a network has memorized the training set but has not learned to generalize to new inputs. In this paper, as a first approach, we avoided this situation by defining a small number of hidden-layer units and by limiting the number of learning steps (fewer than 10,000).

Fig. 4. a) Graphical representation of a feed-forward NN trained with the error back-propagation algorithm; b) Representation of one unit in an NN.
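The following sketch implements Eqs. (2)-(5) for a single-hidden-layer network with nominal-score features as inputs and the three rule values (note duration, vibrato duration, vibrato frequency) as outputs. It is a minimal illustration in standard C++, not the Borland C++ Builder code used on the robot; the layer sizes, weight initialization and sigmoid gain a = 1 are assumptions.

    // Minimal feed-forward network with error back-propagation, Eqs. (2)-(5).
    #include <cmath>
    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    struct Layer {
        std::vector<std::vector<double>> W;  // W[i][j]: weight from input j to neuron i
        std::vector<double> h;               // thresholds h_i of Eq. (2)
        std::vector<double> out;

        Layer(int nIn, int nOut) : W(nOut, std::vector<double>(nIn)), h(nOut), out(nOut) {
            for (auto& row : W)                          // small random initial weights
                for (double& w : row) w = 0.2 * (std::rand() / (double)RAND_MAX) - 0.1;
        }
        static double f(double x) { return 1.0 / (1.0 + std::exp(-x)); }  // Eq. (4), a = 1

        const std::vector<double>& forward(const std::vector<double>& x) {
            for (std::size_t i = 0; i < W.size(); ++i) {
                double u = 0.0;                          // internal state u_i, Eq. (3)
                for (std::size_t j = 0; j < x.size(); ++j) u += W[i][j] * x[j];
                out[i] = f(u - h[i]);                    // activation X_i, Eq. (2)
            }
            return out;
        }
    };

    struct Net {
        Layer hid, outL;                                 // score features in, 3 rules out
        Net(int nIn, int nHid) : hid(nIn, nHid), outL(nHid, 3) {}

        std::vector<double> predict(const std::vector<double>& x) {
            return outL.forward(hid.forward(x));
        }

        // One supervised step: the "external teacher" is the rule value measured
        // from the flutist's recording; every weight follows Eq. (5).
        void train(const std::vector<double>& x, const std::vector<double>& t, double eta) {
            const std::vector<double> hv = hid.forward(x);
            const std::vector<double> y  = outL.forward(hv);

            std::vector<double> dOut(y.size()), dHid(hv.size(), 0.0);
            for (std::size_t i = 0; i < y.size(); ++i)   // output deltas (sigmoid')
                dOut[i] = (y[i] - t[i]) * y[i] * (1.0 - y[i]);
            for (std::size_t j = 0; j < hv.size(); ++j) {
                for (std::size_t i = 0; i < y.size(); ++i) dHid[j] += dOut[i] * outL.W[i][j];
                dHid[j] *= hv[j] * (1.0 - hv[j]);        // back-propagated deltas
            }
            for (std::size_t i = 0; i < y.size(); ++i) { // Eq. (5), output layer
                for (std::size_t j = 0; j < hv.size(); ++j) outL.W[i][j] -= eta * dOut[i] * hv[j];
                outL.h[i] += eta * dOut[i];              // thresholds move opposite to weights
            }
            for (std::size_t i = 0; i < hv.size(); ++i) { // Eq. (5), hidden layer
                for (std::size_t j = 0; j < x.size(); ++j) hid.W[i][j] -= eta * dHid[i] * x[j];
                hid.h[i] += eta * dHid[i];
            }
        }
    };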

III. WASEDA FLUTIST ROBOT NO. 4 REFINED III

The WF-4RIII was developed this year and has a total of 43 DOFs, which reproduce the lips, neck, lungs, arms, fingers, vibrato mechanism and eyes required for the flute playing performance (Figure 5). Compared with the previous version, the WF-4RII [17], this new version has the same number of degrees of freedom but mainly improves the design of the vibrato and lung systems, in order to implement the performance rules described above so that effective control of the note duration and vibrato can be achieved.

Fig. 5. The Waseda Flutist Robot No. 4 Refined III.

A. New Vibrato Mechanism

The vibrato mechanism implemented on the WF-4RII consisted of a voice coil motor which directly pressed a tube to add vibrations to the air beam passing through it [17]. However, humans use a more complex mechanism to produce vibrato; it is believed that the human vocal cords play the main role in producing it. In fact, observations of laryngeal movement while playing a wind instrument, made with a laryngo-fiberscope, show that the shape of a flutist's vocal cords differs with the level of expertise [18]. As shown in Fig. 6, the laryngoscopic view of the vocal folds demonstrates the differences among players. Therefore, we believe that control of the aperture of the glottis plays a key role in producing a human-like vibrato, which in turn helps produce a performance with expressiveness. Thus, a new vibrato mechanism with a shape similar to the human vocal cords has been designed for the WF-4RIII (Figure 7a). The vocal cord part was fabricated from Septon, a thermoplastic rubber by Kuraray Co., Ltd., chosen for its stiffness and flexibility. In order to control the amplitude and frequency of the glottal aperture, a DC motor connected to the vocal folds through a pair of gears and a link is used (Figure 7b). As a result, the new vibrato system reproduces the vibration of human vocal cords quite closely.

B. New Lung Mechanism

The previous lung system, on the WF-4RII, was implemented using two vane mechanisms controlled by an AC motor [17]. The breathing process was controlled by a pair of valve mechanisms located behind the robot. Even though the mechanical noise was effectively reduced, some problems remained. In particular, the air conversion efficiency was too low (51%), which means that some air was lost between the lungs and the lips. Furthermore, the time required for inhalation was too long (2.36 s). Such problems make accurate control of the note duration difficult while playing the flute. Therefore, a new lung mechanism using a bellows system located inside an acrylic container was designed for the WF-4RIII. The bellows have no contact with the container, to assure high airtightness (Figure 8a).

In order to increase the inhalation speed, a crank mechanism was used, controlled by an AC servo motor attached to a link which moves a shaft connected to the bellows plate (Fig. 8b). A rubber seal was attached along the shaft. This new design improved the airtightness, achieving an 85% air conversion efficiency, and reduced the time required for the inhalation process to 0.64 s.

Fig. 6. Laryngoscopic view of the vocal folds of different flutists.

Fig. 7. a) New vibrato mechanism; b) 3D mechanism (top view).

Fig. 8. a) New lung system of the WF-4RIII; b) Lung mechanism detail.

IV. EXPERIMENTS & RESULTS

The experiments presented in this paper focus on verifying the usefulness of implementing performance rules, modeled from an expressive performance of a professional flutist, to enhance the expressiveness of the WF-4RIII's performance. For that purpose, we investigated whether such an expressive performance model could be used by the flutist robot to automatically predict a new expressive score different from the one used to train the neural network (assuming a score with a similar style). The proposed experiment was divided into three steps: first, an expressive performance model was created from the recorded performance of a professional flutist; then, we verified how well the model fits the training data; finally, we confirmed how well the created model can predict a different score with expressiveness.

Therefore, we recorded a professional flutist performing the Sonata No. 4 in C Major by Handel, from which the note duration and vibrato were extracted using the proposed algorithms. These musical parameters were then used to train the feed-forward neural network (Figure 4b). In order to verify how well the created model fits the training data, the obtained performance rules were used to create music data, which was converted into MIDI format so that it could be reproduced on a computer system. This performance was recorded and compared with the professional flutist's performance. In order to compare the differences between the two performances, the correlation coefficient, a quantity that indicates how well the predicted data fit the original data, was computed. The comparison yielded a high correlation coefficient for all the considered musical parameters (0.86, 0.93 and 0.86 for the note duration, vibrato duration and vibrato frequency, respectively). From this result, we can conclude that the implemented ANN can effectively be used for modeling the expressiveness of a professional flutist.

Finally, we used the expressive performance model produced above to automatically predict the required deviations for a different score (of similar musical style), in this case Le Cygne, composed by Camille Saint-Saëns. The nominal score was input to the expressive performance model, and the performance rules were automatically created. The outputs of the ANN were used to produce the music data, which was then converted into MIDI format. The MIDI file was input to the WF-4RIII's performance control system. The robot's performance was recorded and then compared with the professional flutist's performance, from which the note duration and vibrato parameters were extracted. In addition, the MIDI file was played on a MIDI device connected to a computer system and compared with the professional flutist's performance. The musical parameters obtained from both performances are shown in Fig. 9.
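The comparison metric is the standard Pearson correlation coefficient; a brief sketch of how it can be computed over, for example, the per-note duration sequences of the two performances:

    // Pearson correlation coefficient between two equal-length sequences,
    // e.g. note durations of the human recording vs. the model's output.
    #include <cmath>
    #include <cstddef>
    #include <numeric>
    #include <vector>

    double pearson(const std::vector<double>& a, const std::vector<double>& b) {
        const std::size_t n = a.size();               // assumes a.size() == b.size()
        const double ma = std::accumulate(a.begin(), a.end(), 0.0) / n;
        const double mb = std::accumulate(b.begin(), b.end(), 0.0) / n;
        double num = 0.0, va = 0.0, vb = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            num += (a[i] - ma) * (b[i] - mb);
            va  += (a[i] - ma) * (a[i] - ma);
            vb  += (b[i] - mb) * (b[i] - mb);
        }
        return num / std::sqrt(va * vb);              // 1.0 = perfect agreement
    }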
Regarding the note duration, the robot's performance closely imitated the behavior found in the human one (correlation coefficient = 0.81). Regarding the vibrato, some differences between the human and robot performances were found; however, the correlation coefficients are still acceptable (0.71 and 0.72 for the vibrato duration and frequency, respectively). Regarding the comparison between the professional flutist's performance and the one reproduced on a MIDI device, a high correlation coefficient was found for all the considered musical parameters (0.86, 0.85 and 0.81 for the note duration, vibrato duration and vibrato frequency, respectively). In order to understand the differences between the results obtained from the flutist robot's performance and the one reproduced on a MIDI device, a t-test statistical analysis was performed: no statistically significant difference was detected for the note duration and vibrato duration parameters (p > 0.05), while a statistically significant difference was detected for the vibrato frequency (p < 0.05).
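The paper does not specify which t-test variant was used; as one plausible form, a two-sample Welch t statistic over the per-note parameter values could be computed as below (the p-value lookup against the t distribution is omitted from this sketch):

    // Two-sample Welch t statistic for comparing two sets of measurements.
    #include <cmath>
    #include <numeric>
    #include <vector>

    double welchT(const std::vector<double>& a, const std::vector<double>& b) {
        auto meanVar = [](const std::vector<double>& v, double& m, double& s2) {
            m = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
            s2 = 0.0;
            for (double x : v) s2 += (x - m) * (x - m);
            s2 /= (v.size() - 1);                      // unbiased sample variance
        };
        double ma, mb, va, vb;
        meanVar(a, ma, va);
        meanVar(b, mb, vb);
        return (ma - mb) / std::sqrt(va / a.size() + vb / b.size());
    }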

This means that the AI approach can be used to predict an expressive score even when the musical rules are used to produce a live performance on the WF-4RIII, without considerable differences. From the results presented above, the implementation of the feed-forward neural network enabled the WF-4RIII to automatically predict the performance rules (note duration and vibrato) required to produce an expressive performance from a nominal score, by using an expressive performance model generated from a professional flutist performing a different score (of similar style).

Fig. 9. Comparison of the professional flutist's performance vs. the WF-4RIII.

V. CONCLUSIONS & FUTURE WORK

In this paper, the development of the WF-4RIII was detailed. From the computational point of view, a feed-forward neural network trained with the error back-propagation algorithm was implemented to create an expressive performance model from a professional flutist's performance. As a result, an expressive performance was automatically produced from a nominal score and performed by the WF-4RIII. From the mechanical point of view, the vibrato and lung mechanisms were re-designed to effectively control the music performance rules during the robot's performance. In particular, a human-like vocal cord was designed, and the lung system was re-designed to improve the airtightness and to increase the inhalation speed. Although the WF-4RIII was able to automatically generate an expressive performance, further improvements are required in the learning process of the ANN as well as in the performance control system. Regarding the first issue, we will implement more efficient methods [16] to avoid overfitting (i.e. model selection, early stopping, etc.). Regarding the performance control system, a feedback signal must be considered during the learning process, so that the flutist robot can also autonomously improve its own performance. Therefore, as future work, we propose to implement feedback-error learning based on the implemented neural networks.

REFERENCES

[1] A. Gabrielsson, "Music performance," in The Psychology of Music, 2nd ed., New York: Academic, 1997, pp. 35-47.
[2] R. L. de Mantaras and J. L. Arcos, "AI and music: From composition to expressive performance," AI Magazine, 2002, pp. 43-58.
[3] J. Solis, K. Chida, K. Suefuji, and A. Takanishi, "The development of the anthropomorphic flutist robot at Waseda University," International Journal of Humanoid Robots, 2006, vol. 30(2), pp. 127-151.
[4] M. Kajitani, "Development of musician robots," Journal of Robotics and Mechatronics, 1989, vol. 1(3), pp. 254-255.
[5] K. Shibuya, "Analysis of human KANSEI and development of a violin playing robot," in Workshop of the IEEE/RSJ Int. Conference on Intelligent Robots and Systems: Musical Performance Robots and Its Applications, 2006, Beijing, China.
[6] R. Bresin, Virtual Virtuosity: Studies in Automatic Music Performance, Ph.D. thesis, Kungl Tekniska Högskolan, 2000, p. 32.
[7] K. Chida, I. Okuma, S. Isoda, Y. Saisu, K. Wakamatsu, K. Nishikawa, J. Solis, H. Takanobu, and A. Takanishi, "Development of a new anthropomorphic flutist robot WF-4," in Proc. of the IEEE International Conference on Robotics and Automation, 2004, pp. 152-157.
[8] J. Solis, K. Chida, K. Suefuji, K. Taniguchi, S. M. Hashimoto, and A. Takanishi, "Imitation of human flute playing by the anthropomorphic flutist robot WF-4RII," Computer Music Journal, 2006, vol. 30(4).
[9] N. P. Todd, "A model of expressive timing in tonal music," Music Perception, 1995, vol. 3, pp. 1940-1949.
[10] A. Friberg, V. Colombo, L. Frydén, and J. Sundberg, "Performance rules for computer-controlled contemporary keyboard music," Computer Music Journal, 1991, vol. 15(2), pp. 49-55.
[11] H. Katayose and S. Inokuchi, "Learning performance rules in a music interpretation system," Computers and the Humanities, 1993, vol. 27, pp. 31-40.
[12] J. L. Arcos and R. L. de Mantaras, "An interactive case-based reasoning approach for generating expressive music," Applied Intelligence, 2001, vol. 14(1), pp. 115-129.
[13] T. Suzuki, T. Tokunaga, and H. Tanaka, "A case based approach to the generation of musical expression," in Proc. IJCAI, 1999, pp. 642-648.
[14] R. Bresin, G. D. Poli, and R. Ghetta, "A fuzzy approach to performance rules," in Proc. XI Colloq. on Musical Informatics, 1995, pp. 163-168.
[15] O. Ishikawa, Y. Aono, H. Katayose, and S. Inokuchi, "Extraction of musical performance rules using a modified algorithm of multiple regression analysis," in Proc. KTH Symp. on Grammars for Music Performance, 2000, pp. 348-351.
[16] C. M. Bishop, Neural Networks for Pattern Recognition. Great Britain: Oxford University Press, 2004, pp. 116-121.
[17] J. Solis, K. Suefuji, K. Chida, K. Taniguchi, and A. Takanishi, "The mechanical improvements of the anthropomorphic flutist robot WF-4RII to increase the sound clarity and to enhance the interactivity with humans," in Proc. of the 16th CISM-IFToMM Symposium on Robot Design, Dynamics, and Control, 2006, pp. 247-254.
[18] S. Mukai, "Laryngeal movement while playing wind instruments," in Proc. of the International Symposium on Musical Acoustics, 1992, pp. 239-241.