Automatic Selection and Concatenation System for Jazz Piano Trio Using Case Data


Proceedings of the 48th ISCIE International Symposium on Stochastic Systems Theory and Its Applications, Fukuoka, Nov. 4-5, 2016

Automatic Selection and Concatenation System for Jazz Piano Trio Using Case Data

Takeshi Hori, Kazuyuki Nakamura, Shigeki Sagayama
Graduate School of Advanced Mathematical Sciences, Meiji University
4-21-1 Nakano-ku, Tokyo, 164-8525 Japan
E-mail: {cs51003, knaka, sagayama}@meiji.ac.jp

Abstract

In this paper, we discuss a computational model of an automatic jazz session system that is statistically trainable using lead sheets and jazz session data, and we provide a prototype implementation based on this model. In contrast to most previous jazz session systems, which required heuristic rules and human labeling of training data to estimate the musical intentions of human players, we propose a statistically trainable mathematical model of a jazz session that uses a stochastic state transition model to approximate a musical trajectory model. Based on the model, we developed a prototype jazz session system that concatenates case data from real jazz session recordings, in order to show the validity of our model. The system consists of a training phase and a concatenating phase. In the training phase, the system learns parameters for classifying the piano, bass, and drums data using non-negative matrix factorization, and calculates trigram chain probabilities and co-occurrence probabilities between the piano, bass, and drums. In the concatenating phase, the system estimates the musical states of the bass and drums from MIDI-format piano input, searches for and selects suitable musical data from the case data, and concatenates the musical data after matching the key between the input piano and the bass. In a comparative evaluation experiment using MIDI-format data concatenated by the above methods, our system was found to generate natural-sounding jazz piano trio data, which supports the validity of the proposed model.
1 Introduction

We previously developed an automatic accompaniment system called Eurydice [1] that allows tempo changes and note insertion/deviation/substitution errors in human performance as well as repeats and skips. Although this system can deal with various errors, the score information must be fixed in automatic accompaniment systems like Eurydice, since such systems need to follow the player's performance and match it against the score information. Previous studies on automatic accompaniment systems have been proposed from a variety of perspectives, such as adding expression [2] and supporting children's practice [3]. On the other hand, to deal with improvised performance and to realize interaction between players' performances, as in jazz, there have been various studies on jazz session systems in which improvised performances are allowed. As the next step, we are also working on an automated jazz session system that can follow improvised human performances, as an extension of our system.

A jazz session frequently consists of improvised parts, in which the players improvise on a score by tracking the other players' performances and estimating their intentions based on their previous performances and the score information. Therefore, a jazz session system needs to learn the relationship between observable musical features and human intentions in order to generate a musical performance that cooperates with the other instruments. In contrast to most previous jazz session systems [4-6], which required heuristic rules and human labeling of training data to estimate the musical intentions of human players, we proposed a statistically trainable mathematical model of a jazz session based on a stochastic state transition model, using lead sheets and musical performance data, to develop an automated jazz session system [7]. Figure 1 shows our conceptual model. This paper describes an automatic jazz-piano-trio-generating system that demonstrates the validity of our session model.

Fig. 1: Modeling for musical sessions

2 Modeling

Our approach is to realize a statistically trainable session system that imitates the relationship between the input and output of human jazz sessions without heuristic rules or human data labeling. To train these relationships statistically, we need to define a musical performance mathematically. First, since we perceive musicality in connected sounds over a certain span, music can be represented by a series of short-time feature vectors that include all features of the music. If a piece of musical information is expressed as a point in a space, the time-series information represented by such vectors can be defined as a trajectory in that space. We define this space as the musical event space, in which the feature vectors theoretically comprise all musical events.

Next, extending this trajectory model from individual performances to sessions, we can regard the musical performance of a session as a set of individual trajectories in the musical event space. Hence, a good session is modeled as a set of well-intertwined trajectories in this space. Figure 2 illustrates the idea of our model. Since we can obtain a dataset of well-intertwined trajectories from actual performance data, a session system that learns the essence of this entanglement from good performance data can realize human-like interplay.

Fig. 2: Trajectory model and concept of an automatic session system (piano and bass)

2.1 Practical problems for realization

To train statistically based on such a trajectory model, there are at least four problems caused by data sparsity.

1. The defined space is too high-dimensional to train with limited data, since the musical event space encloses all musical events (such as the number of notes, velocity, and the activities of a player).
2. It is difficult to train the relationships between trajectories from limited data because the trajectories are continuous.
3. We need to make clear the relationship between the observed multidimensional features and musical performances.
4. The system needs to generate the other instruments' parts.

We can arrange these problems under features, trajectory, mapping, and rendition. In an ideal case where an unlimited amount of data is available, we would be able to deal with these problems directly; in practice, however, we have a limited amount of data. We therefore first developed a prototype system to deal with the above problems, and then considered whether they can be solved by the following methods in this research.

Problem                                      Method
Features                                     Dimensionality reduction
Trajectory                                   Discretization (interpolation)
Mapping between features of each player      Musical mapping
Mapping between features and performances    Sample-based (substitution using case data)
Rendition                                    Sample-based (substitution using case data)

3 Outline of the prototype system

The objective of the prototype system is to show the validity of our proposed mathematical model of a jazz session. As the configuration in this paper, we adopted a jazz piano trio consisting of piano, bass, and drums. The system first receives MIDI-format data consisting only of piano performance data, and then outputs MIDI-format piano trio data in which the bass and drums performances are synthesized from case data. Although we modeled a musical performance as a trajectory in the space of all musical events, in which the session system needs to train on a set of intertwined trajectories, discretization lets us approximate the trajectory model with a stochastic state transition model, so that we can approximately estimate the trajectories corresponding to an input trajectory expressed by discretized points (Figure 3).

3.1 Features: Style parameter

We selected parameters closely related to the jazz session on the basis of musical knowledge and defined them as style parameters, in order to set effective axes in the session model. These parameters are used to track musical performances and to calculate the degree of similarity between instruments. We defined 68 parameters that are extractable from the music performance at every unit time (every bar in this paper) as follows:

Piano-specific features: the number of notes composed of diatonic chords, and the character of notes such as tension notes, avoid notes, and blue notes; the range between the highest and lowest tones.
Bass-specific features: the range between the highest and lowest tones.
Drums-specific features: the number of notes of each of the hi-hat cymbal, snare drum, and crash cymbal.
Common features: the number of notes, the number of simultaneous sounds, and the average velocities; the ratio of the above features between adjacent time spans; the ratio of the sum of the above features throughout the music; the ratio of off-beat notes to all notes in the unit time.

Fig. 3: Concept of case-data-based automatic session system approximated by a stochastic state transition model

3.2 Trajectory and Musical mapping: NMF, trigram

Although we need to learn the co-occurrence of the instruments' trajectories in order to search for a matching performance between instruments, the relationship between trajectories is non-linear, since the axes of musical features defined by the style parameters differ from each other. This problem is classified as a non-linear identification problem. In recent years, deep neural networks (DNNs) have often been used to learn non-linear co-occurrence statistically; for simplicity, however, we estimated the matching performance data by assuming linearity within the same instrument's space and searching for the nearest neighbor under a Euclidean distance criterion. On the other hand, since the assumption of linearity is a strong constraint, we considered not only the whole space but also subspaces. To segment the whole space into subspaces, we used a clustering method.

To determine an effective clustering method, we previously compared three methods: k-means clustering, the Gaussian mixture model (GMM), and non-negative matrix factorization (NMF) [8]. Since NMF clustering yielded the highest prediction accuracy in those experiments, we used NMF clustering for discretization.

3.2.1 NMF

NMF is a method for factorizing a non-negative matrix into a pair of non-negative matrices with a lower rank [8].
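As a rough illustration of such a factorization, the following sketch factorizes a small non-negative matrix with the multiplicative updates for the generalized Kullback-Leibler divergence given by Lee and Seung [8], and then assigns each column (one bar) to the basis with the largest activation. It is a minimal pure-Python sketch, not the authors' implementation: the function names (`nmf_kl`, `assign_classes`), the random initialization, the iteration count, and the small constant `eps` are assumptions, and the activations are taken directly from the factorization rather than from a generalized inverse of the basis matrix.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kl_divergence(x, approx, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(X || approx)."""
    total = 0.0
    for x_row, a_row in zip(x, approx):
        for xv, av in zip(x_row, a_row):
            if xv > 0:
                total += xv * math.log(xv / (av + eps))
            total += av - xv
    return total


def nmf_kl(x, rank, iters=200, seed=0, eps=1e-12):
    """Factorize X (rows: features, columns: bars) as H U, all non-negative."""
    rng = random.Random(seed)  # hypothetical choice: seeded random start
    n, m = len(x), len(x[0])
    h = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    u = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        approx = matmul(h, u)
        for i in range(n):  # multiplicative update of the basis matrix H
            for k in range(rank):
                num = sum(x[i][j] / (approx[i][j] + eps) * u[k][j] for j in range(m))
                h[i][k] *= num / (sum(u[k]) + eps)
        approx = matmul(h, u)
        for k in range(rank):  # multiplicative update of the activation U
            col_sum = sum(h[i][k] for i in range(n)) + eps
            for j in range(m):
                num = sum(x[i][j] / (approx[i][j] + eps) * h[i][k] for i in range(n))
                u[k][j] *= num / col_sum
    return h, u


def assign_classes(u):
    """Class of each column: index of the basis with the largest activation."""
    return [max(range(len(u)), key=lambda k: u[k][j]) for j in range(len(u[0]))]


# Toy data: 4 features x 4 bars (illustrative values only).
toy = [[4.0, 2.0, 0.0, 0.0], [2.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 3.0], [0.0, 0.0, 2.0, 6.0]]
basis, activation = nmf_kl(toy, rank=2)
bar_classes = assign_classes(activation)
```

Using the activation argmax as a cluster label is what turns a continuous trajectory of feature vectors into a discrete class series, one class per bar.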
A non-negative original matrix $X$ is factorized as

$$X_{i,j} \approx \sum_k H_{i,k} U_{k,j},$$

where $k$ denotes an index of the basis. $H$ is called the basis matrix and $U$ represents the activation. As the criterion for the approximation, a distance measure between $X$ and $HU$ is generally selected from the Euclidean distance, the generalized Kullback-Leibler divergence, and the Itakura-Saito divergence. We used the generalized Kullback-Leibler divergence, since it achieved the highest prediction accuracy among these measures for this model in our previous research. In the case of the generalized Kullback-Leibler divergence, the probability distribution assumed between the observed data and the approximating matrix $HU$ is a continuous Poisson distribution. The distance based on the generalized Kullback-Leibler divergence, $D(X \| HU)$, is expressed by

$$D(X \| HU) = \sum_{i,j} \left[ X_{i,j} \log \frac{X_{i,j}}{\sum_k H_{i,k} U_{k,j}} - \left( X_{i,j} - \sum_k H_{i,k} U_{k,j} \right) \right].$$

On the other hand, the logarithm of the continuous Poisson distribution, $\log \mathrm{Po}(X \mid HU)$, is

$$\log \mathrm{Po}(X \mid HU) = \sum_{i,j} \left[ X_{i,j} \log \sum_k H_{i,k} U_{k,j} - \log X_{i,j}! - \sum_k H_{i,k} U_{k,j} \right].$$

With respect to $HU$, minimizing the divergence is equivalent to maximizing this log-likelihood, so the two formulations coincide. Each parameter $h_{ik}$, $u_{kj}$ can be calculated by the following multiplicative update equations, derived with the auxiliary function technique:

$$h_{ik} \leftarrow h_{ik} \frac{\sum_j \frac{x_{ij}}{\hat{x}_{ij}} u_{kj}}{\sum_j u_{kj}}, \qquad u_{kj} \leftarrow u_{kj} \frac{\sum_i \frac{x_{ij}}{\hat{x}_{ij}} h_{ik}}{\sum_i h_{ik}},$$

where $\hat{x}_{ij} = \sum_k h_{ik} u_{kj}$. The actual activation matrices $U$ were obtained as follows, using the original matrix $X$ and the generalized inverse $H^{+}$ of the obtained basis matrix $H$:

$$U_{\mathrm{train}} = H_{\mathrm{train}}^{+} X_{\mathrm{train}}, \qquad U_{\mathrm{test}} = H_{\mathrm{train}}^{+} X_{\mathrm{test}}.$$

Then the class numbers $c_k$ were assigned by

$$c_k(x_j) = \arg\max_k u_{kj}.$$

Since the high-dimensional musical event space is segmented into subspaces by NMF clustering, the time-series characteristics of the trajectories are approximated by class series. Among the various methods for tracking the time-series characteristics of a musical performance expressed by a stochastic state transition model, such as trigrams and hidden Markov models (HMMs), we used the trigram, because its prediction accuracy was better than that of HMMs in the simply discretized model according to our previous research.

3.2.2 Trigram

The trigram is an N-gram model with $N = 3$. In an N-gram, given $n$ states $\{s_1, s_2, \dots, s_n\}$, the chain probability is given as follows:

$$P(s_1, s_2, \dots, s_n) = \prod_{i=1}^{n} P(s_i \mid s_{i-N+1}, \dots, s_{i-1}).$$

For a trigram, the number of occurrences of the transition from $i-2$ to $i$ is expressed by $N(s_{i-2}^{i})$, and the chain probability is given as

$$P(s_i \mid s_{i-2}^{i-1}) = \frac{N(s_{i-2}^{i})}{N(s_{i-2}^{i-1})}.$$

In this system, the trigram chain probabilities are used to estimate the classes of the bass and drums but not of the piano, since batch processing means we do not need to estimate the piano performance.

3.2.3 Musical mapping

Since the system needs to estimate the input-piano, bass, and drums classes from the input piano data when subspaces are considered, it also needs to assume the class numbers of the bass and drums corresponding to the estimated input-piano class at every bar, using the chain probabilities and co-occurrence probabilities. The class numbers of the instruments in measure number $t$ are expressed as $c_t^{\mathrm{Piano}}$, $c_t^{\mathrm{Bass}}$, and $c_t^{\mathrm{Drums}}$. Similarly, the chain probabilities based on the trigram are

$$P(c_t^{\mathrm{Bass}} \mid c_{t-1}^{\mathrm{Bass}}, c_{t-2}^{\mathrm{Bass}}), \qquad P(c_t^{\mathrm{Drums}} \mid c_{t-1}^{\mathrm{Drums}}, c_{t-2}^{\mathrm{Drums}}).$$

The co-occurrence probabilities between the piano and the bass/drums are

$$P(c_t^{\mathrm{Bass}} \mid c_t^{\mathrm{Piano}}), \qquad P(c_t^{\mathrm{Drums}} \mid c_t^{\mathrm{Piano}}).$$

Consequently, the bass/drums class numbers at time $t$ are estimated from the product of these probabilities as

$$\hat{c}_t^{\mathrm{Bass}} = \arg\max_{c_t^{\mathrm{Bass}}} P(c_t^{\mathrm{Bass}} \mid c_t^{\mathrm{Piano}}) \, P(c_t^{\mathrm{Bass}} \mid c_{t-1}^{\mathrm{Bass}}, c_{t-2}^{\mathrm{Bass}}),$$

$$\hat{c}_t^{\mathrm{Drums}} = \arg\max_{c_t^{\mathrm{Drums}}} P(c_t^{\mathrm{Drums}} \mid c_t^{\mathrm{Piano}}) \, P(c_t^{\mathrm{Drums}} \mid c_{t-1}^{\mathrm{Drums}}, c_{t-2}^{\mathrm{Drums}}).$$

Using the above probabilities, the system can estimate the bass and drums classes while taking into account both their time series and their co-occurrence with the piano.

3.3 Rendition

Although the system generates synthesized music from case data selected as being similar to the input piano data, the mapping between the features of the instruments is often missing in the case data because of data sparsity. To compensate for this problem, as mentioned above, we assumed linearity within the same instrument's space: the system first searches for the nearest piano case data under a Euclidean distance criterion and substitutes it for the input based on this linearity, and then extracts the bass/drums performance data that co-occur with that piano data. To find the best case data, we compared the following four criteria and evaluated their validity.

1. Random choice: randomly choose a bar from all songs.
2. Nearest neighbor: choose the nearest-neighbor case in the entire case data in the Euclidean distance sense.
3. NMF: choose the nearest-neighbor case within the cluster the input belongs to.
4. Co-occurrence + trigram: choose the case with the highest joint probability of co-occurrence and trigram. Co-occurrence is used across the input and the other instruments (i.e., a matching constraint), and the trigram of classes is used along each individual instrument (i.e., a time-series constraint).

The difference is that Nearest neighbor searches for the nearest-neighbor data in the whole space, whereas NMF and Co-occurrence + trigram use the subspaces obtained by clustering. Moreover, whereas NMF considers matching classes for the piano only, Co-occurrence + trigram estimates the bass/drums classes and searches for matching data under the constraint that the classes of all instruments are the same.
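The class estimation of Section 3.2.3 and the fourth selection criterion can be sketched roughly as follows. This is an illustrative sketch under several assumptions rather than the authors' implementation: the function names, the dictionary layout of a case (`features`, `piano_class`, `bass_class`, `bar`), the additive smoothing constant `smooth`, the use of co-occurrence alone for the first two bars, and the fallback to a global nearest-neighbor search when no case matches are all hypothetical choices.

```python
from collections import Counter


def train_trigram(classes):
    """Count class trigrams and their two-class contexts in one sequence."""
    tri, ctx = Counter(), Counter()
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        tri[(a, b, c)] += 1
        ctx[(a, b)] += 1
    return tri, ctx


def train_cooccurrence(piano, other):
    """Count how often each piano class co-occurs with each other-instrument class."""
    pair, single = Counter(), Counter()
    for p, o in zip(piano, other):
        pair[(p, o)] += 1
        single[p] += 1
    return pair, single


def estimate_classes(piano, tri, ctx, pair, single, n_classes, smooth=1e-3):
    """Per bar, pick the class maximizing co-occurrence x trigram probability."""
    est = []
    for t, p in enumerate(piano):
        def score(c):
            cooc = (pair[(p, c)] + smooth) / (single[p] + smooth * n_classes)
            if t < 2:  # assumption: no trigram context for the first two bars
                return cooc
            prev = (est[t - 2], est[t - 1])
            chain = (tri[prev + (c,)] + smooth) / (ctx[prev] + smooth * n_classes)
            return cooc * chain
        est.append(max(range(n_classes), key=score))
    return est


def select_case(query, cases, piano_class, bass_class):
    """Nearest piano case (squared Euclidean distance) among class-matching cases."""
    pool = [c for c in cases
            if c["piano_class"] == piano_class and c["bass_class"] == bass_class]
    pool = pool or cases  # assumption: fall back to the whole case data
    return min(pool, key=lambda c: sum((a - b) ** 2
                                       for a, b in zip(query, c["features"])))
```

Restricting the nearest-neighbor search to class-matching cases is what distinguishes this criterion from the plain nearest-neighbor criterion: a case that is closer in the piano feature space is skipped if its bass class disagrees with the estimated one.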

4 Automatic selection and concatenation system for jazz piano trio: SCSJ

Based on the above discussion, we developed an automatic system (SCSJ) that receives MIDI-format data consisting only of piano performance data and outputs MIDI-format piano trio data in which bass/drums performances from the case data are concatenated. SCSJ comprises two phases (the training phase and the concatenating phase) and three parts (the training part, the analyzing part, and the output part). Figure 4 shows a process chart of SCSJ.

Fig. 4: Process chart for automatic selection and concatenation system for jazz piano trio

4.1 Training phase

In the training phase, SCSJ first estimates the key, because modulation often occurs in a jazz session. Next, using style parameters based on the estimated key and some observable features of the training data, SCSJ obtains basis matrices for every instrument with the NMF algorithm, so that SCSJ can estimate a class number at every bar from these basis matrices. Additionally, SCSJ calculates trigram chain probabilities and co-occurrence probabilities between the class numbers of the piano and those of the other instruments. These parameters (basis matrices, chain probabilities, and co-occurrence probabilities) are stored in a database for estimating the bass/drums classes from MIDI-format piano input data. The part that calculates these parameters is called the training part.

4.2 Concatenating phase

The concatenating phase mainly consists of two parts: the analyzing part and the output part. First, MIDI-format data consisting only of piano data is input into SCSJ. As in the training phase, SCSJ first estimates the key because of modulation. In the analyzing part, SCSJ estimates the class number of the input data using the basis matrices. Next, SCSJ assumes the class numbers of the bass/drums corresponding to the estimated class of the input data at every bar, using the chain probabilities and co-occurrence probabilities. Moreover, for every bar, SCSJ searches the case data for the bar whose performance data will be used for synthesis: a bar that has the same class as the input data and the nearest style parameters in terms of Euclidean distance.

Fig. 5: Chord progression of Autumn Leaves
Whereas the bass/drums data is searched among cases with the same piano class as the input piano in the case of <2, it is searched among cases whose piano class matches the input and whose bass/drums classes match the estimated ones in the case of 2. When the keys of the input piano data and the selected bass performance do not match, the bass is transposed to match the key of the input data. Finally, the input piano and the selected/shifted bass and drums data are concatenated, so that the piano trio data is generated and output in MIDI format in the output part.

4.3 Key estimation

Modulation is often carried out in jazz. For instance, part of the chord progression of Autumn Leaves (key: G minor), a jazz standard, is given as {Cm7, F7, B♭maj7, E♭maj7, Am7(♭5), D7, Gm}. We should estimate the key to generate a matching performance because of such modulation. Key estimation in this system is rule-based. One method for analyzing modulation is to utilize the IIm7-V7 motion commonly used in jazz. We can find Cm7-F7 as the IIm7-V7 of the key of B♭ major. On the other hand, Am7(♭5)-D7 is likewise the IIm7-V7 of the key of G minor. As a result, we can see that this chord progression contains a modulation from B♭ major to G minor. Figure 5 shows the result of chord analysis for Autumn Leaves. The chord names written in black are the chord progression of Autumn Leaves, and red represents the estimated key names. This song therefore has a structure that repeats, several times, a partial modulation between B♭ major (relative key) and G minor (tonic key) via a pivot chord (a diatonic chord common to multiple keys) such as E♭maj7.
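The II-V rule described above can be illustrated with a small sketch. It is only a simplified stand-in for the rule-based key estimation, under stated assumptions: chord symbols are plain ASCII strings such as `Bbmaj7` or `Am7b5`, a minor-seventh chord followed by a dominant seventh a perfect fourth higher is read as IIm7-V7 of a major key, and the same motion starting from a half-diminished (`m7b5`) chord is read as belonging to a minor key. The names `parse_chord` and `find_two_five` are hypothetical.

```python
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
PC_TO_NAME = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def parse_chord(symbol):
    """Split a chord symbol such as 'Bbmaj7' into (root pitch class, quality)."""
    root = NOTE_TO_PC[symbol[0]]
    rest = symbol[1:]
    if rest.startswith("b"):
        root, rest = (root - 1) % 12, rest[1:]
    elif rest.startswith("#"):
        root, rest = (root + 1) % 12, rest[1:]
    return root, rest


def find_two_five(chords):
    """Return (index, implied key) for every II-V pair in a chord list."""
    parsed = [parse_chord(c) for c in chords]
    found = []
    for i in range(len(parsed) - 1):
        (root2, quality2), (root5, quality5) = parsed[i], parsed[i + 1]
        # The V7 chord must be a dominant seventh a perfect fourth above II.
        if quality5 != "7" or (root5 - root2) % 12 != 5:
            continue
        tonic = PC_TO_NAME[(root2 - 2) % 12]  # II lies a whole tone above I
        if quality2 == "m7":
            found.append((i, tonic + " major"))
        elif quality2 == "m7b5":  # simplification: half-diminished II => minor key
            found.append((i, tonic + " minor"))
    return found


autumn_leaves = ["Cm7", "F7", "Bbmaj7", "Ebmaj7", "Am7b5", "D7", "Gm"]
print(find_two_five(autumn_leaves))  # [(0, 'Bb major'), (4, 'G minor')]
```

On the Autumn Leaves excerpt this reports B♭ major at the first bar and G minor at the fifth, matching the modulation described in the text; real lead sheets would need further rules, since minor keys also use a plain IIm7.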

In addition, some cyclic chord progressions (such as I-VIm7-IIm7-V7), chord functions (such as tonic, dominant, and subdominant), and secondary dominants (a dominant chord that treats a diatonic chord as a temporary tonic) are utilized as modulation patterns to estimate the key. The estimated key is used not only to extract style parameters such as tension notes, but also to match the key between the input piano data and the selected bass data, since the key of the input data often differs from that of the case data. The pitch of the selected bass data is shifted onto the diatonic scale of the estimated key of the input data so that it matches the input data.

5 Experimental evaluation

As the configuration, we used fifteen NMF-clustered classes to generate music; this number of classes was chosen to maintain an accuracy rate over 80% according to previous research. Chain and conditional probabilities were calculated from the NMF-based parameters. We used cross-validation for training and generation: the input piano data was extracted from one of 13 songs (1: ALL OF ME, 2: AUTUMN LEAVES, 3: BLUE MONK, 4: BYE BYE BLACKBIRD, 5: YOU'D BE SO NICE TO COME HOME TO, 6: HOW INSENSITIVE, 7: MOANIN', 8: NIGHT AND DAY, 9: ROUND MIDNIGHT, 10: SOFTLY AS IN A MORNING SUNRISE, 11: STELLA BY STARLIGHT, 12: WALTZ FOR DEBBY, 13: THE DAYS OF WINE AND ROSES), and the other songs were used for training. Research participants evaluated random segments of approximately 45 seconds, omitting the intro.

We used a five-grade evaluation by two students with more than ten years of musical experience and two students with no musical experience. Figure 6 shows the result of the comparative evaluation of the four search methods: random choice, nearest neighbor, NMF, and Co-occurrence + trigram. The x axis expresses the music number (based on the above numbering), and the y axis shows the mean opinion scores on the five-grade scale. We can see that Co-occurrence + trigram received the highest evaluation. Meanwhile, BLUE MONK and HOW INSENSITIVE were evaluated lower than the others. Since the tonic I7 is often regarded as a dominant in blues, the lower evaluation of BLUE MONK was due to key estimation errors. Similarly, the lower evaluation of HOW INSENSITIVE was also due to key estimation errors: because there is no tonic key, the tonality floats throughout this song. Moreover, in this research, since we extracted style parameters and analyzed in units of a bar, the system could hardly capture fine feature changes by clustering.

Fig. 6: Comparative evaluation by five-grade evaluation

6 Conclusions

We modeled a session as a trajectory model, approximated it as a stochastic state transition model, and developed SCSJ based on this mathematical model. Although this system is a prototype that uses case data for simplicity, it supports the validity of our proposed statistically trainable model, since co-occurrence and trigram (with NMF clustering) performed better than the other methods in the listening evaluation. As future work, we plan to reconstruct the system using continuous mixture HMMs and DNNs. Although we approximated time-series characteristics with class series in this model for robustness, with HMMs we will deal with clustering and time-series characteristics jointly rather than individually. In addition, we want to approximate the trajectory in more detail with LSTM and DBN. Generating bass and drums performances without case data is also an important issue. Furthermore, non-parametric Bayesian inference might be effective for further reducing the number of parameters, and to deal with acoustic input, we need to research multi-pitch analysis.

References

[1] E. Nakamura, R. Takeda, R. Yamamoto, Y. Saito, S. Sako, and S. Sagayama: Score Following Handling Performances with Arbitrary Repeats and Skips and Automatic Accompaniment, IPSJ Journal, Vol.54, No.4, pp.1338-1349, 2013.
[2] H. Katayose, K. Okudaira, and M. Hashida: sfp: A Piano Performance Interface Using Expressive Performance Template, IPSJ Journal, Vol.44, No.11, pp.2728-2736, 2003.
[3] C. Oshima, K. Nishimoto, and M. Suzuki: A Piano Duo Performance Support System to Motivate Children's Practice at Home, IPSJ Journal, Vol.46, No.1, pp.157-171, 2005.
[4] S. Wake, H. Kato, N. Saiwaki, and S. Inokuchi: Cooperative Musical Partner System Using Tension Parameter: JASPER (Jam Session Partner), Trans. IPS Japan, Vol.35, No.7, pp.1469-1481, 1994.
[5] M. Goto, I. Hidaka, H. Matsumoto, Y. Kuroda, and Y. Muraoka: A Jazz Session System for Interplay among All Players - VirJa Session (Virtual Jazz Session System), Proc. ICMC, pp.346-349, 1996.
[6] M. Hamanaka, M. Goto, H. Asoh, and N. Otsu: Guitarist Simulator: A Jam Session System Statistically Learning Player's Reactions, IPSJ Journal, Vol.45, No.3, pp.698-709, 2004.
[7] T. Hori, K. Nakamura, and S. Sagayama: Statistically Trainable Model of Jazz Session: Computational Model, Music Rendering Features and Case Data Utilization, IPSJ SIG Technical Report, Vol.2016-MUS-112, No.18, 2016.
[8] D. D. Lee and H. S. Seung: Algorithms for Non-negative Matrix Factorization, Advances in Neural Information Processing Systems, Vol.13, pp.556-562, 2001.