Analysis-and-manipulation approach to pitch and duration of musical instrument sounds without distorting timbral characteristics

Transcription:

Analysis-and-manipulation approach to pitch and duration of musical instrument sounds without distorting timbral characteristics
Takehiro Abe, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Department of Intelligence Science and Technology, Kyoto University, Japan
National Institute of Advanced Industrial Science and Technology (AIST)
Demonstration: http://winnie.kuis.kyoto-u.ac.jp/~abe/dafx-08/

Motivation
Conventional music listening: selecting from a limited playlist; only listening after pressing play. This is a passive and limited listening experience.
Active music listening: selecting according to users' requirements; changing music to suit users' feelings. This is active and exploratory listening, in which the user can change instruments, volume, and timbre.
Instrument equalizers have been developed: Drumix [Yoshii 07] (drums only) and Itoyama's EQ [Itoyama 08] (all).
Our equalizer: replacing an arbitrary part with the user's favorite timbre.

Demonstration (trial equalizer)
Controls: part buttons and genre buttons.
Content: MIDI sound, synthesized piano sound, jazz sound (synthesis).
The equalizer's sounds are synthesized from real sounds, except for the MIDI sounds.

Requirements for our equalizer
1. Sound separation from polyphonic audio, to extract the musical instrument sound that users want to replace. This is well studied.
2. Sound manipulation of separated sounds without timbral distortion, to play arbitrary phrases. The application of separated sounds is not well studied. This is our research target.
Problem: the manipulated sound differs from the sound excited by a real instrument.
Objective: synthesizing monotones excited by the same instrument from multiple musical instrument sounds.

Our definition of timbral features
ASA's definition: "The quality of a sound that distinguishes it from others of the same pitch and volume" [ASA 60].
Our definition (a concrete definition based on [Grey 77]): the quality of a sound that consists of three features other than pitch and volume.
1. The relative amplitudes of harmonic peaks
2. The inharmonic component
3. Temporal envelopes of harmonic peaks
We use the tonal model, which can analyze these features [Itoyama 08].

Manipulation of pitch and duration
Manipulating a sound while leaving the timbral features unchanged is not appropriate.
Timbre has pitch dependency [Marozeau 03]. We use a pitch-dependent feature function to model this dependency.
Audio examples: seed (440 Hz), reference (880 Hz), phase vocoder (880 Hz), our method (880 Hz).
Attack, decay, and vibrato features are shared by sounds of the same instrument. We preserve the attack and decay segments and the vibrato feature.
Audio examples: seed and reference (length), sinusoidal model (length4), our method (length4). The sinusoidal-model example is annotated with: attack segment, high frequency, vibrato feature.

Overview of our manipulation method
Step 1, Analysis: separate harmonic and inharmonic structures and extract timbral features.
Step 2, Manipulation: manipulate pitch, duration, and energy of the inharmonic structure.
Step 3, Synthesis: synthesize harmonic and inharmonic signals and add them.
[Figure: harmonic structure and inharmonic structure, plotted against frequency.]

Analysis to obtain three features
The tonal model consists of a harmonic model and an inharmonic model.
Feature 1, spectral structure: the power of the harmonics is expressed as a Gaussian mixture model.
Feature 2: the inharmonic model represents the spectrogram of the inharmonic component.
Feature 3, temporal structure: the envelope is expressed as a nonparametric model.
[Figure: panels for Features 1 to 3; axes: frequency, duration, envelope.]

Pitch manipulation
Manipulating the spectral envelope by multiplying the pitch $\mu(r)$ by a desired ratio, giving $\mu'(r)$.
Obtaining timbral features from the pitch-dependent feature function: $v \rightarrow v'$.

Pitch-dependent feature function
The pitch-dependent feature function approximates timbral features over pitches by a polynomial function. It is applied to:
the power of the harmonics ($v_n$);
the ratio of harmonic energy to inharmonic energy ($w_H / w_I$).
[Figure: three plots against fundamental frequency in Hz: power of the 1st harmonic, power of the 4th harmonic, and the ratio of harmonic energy to inharmonic energy $w_H / w_I$.]

Duration manipulation
Manipulating the temporal envelope $E(r)$ by expanding or shrinking it between onset ($r_{on}$) and offset ($r_{off}$).
Detection equation:
$$\frac{dE(r)}{dr} < \varepsilon, \qquad E(r) > Th$$
The segments before the detected onset and after the detected offset are preserved; the segment between them is expanded.
[Figure: temporal envelope $E(r)$ with detected $r_{on}$ and $r_{off}$; regions labelled preserve, expand, preserve.]

Preserving the vibrato
The pitch $\mu(r)$ is analyzed and synthesized by a sinusoidal model.
[Figure: original pitch trajectory, analysis, smoothing, and synthesized trajectory, with the vibrato preserved.]

Synthesis from harmonics and inharmonics
Harmonic signal ($s_H$): synthesized using a sinusoidal model.
Inharmonic signal ($s_I$): obtained from the inharmonic model, weighted by the inharmonic energy ($w_I'$).
Output signal ($s$): obtained by adding these two signals.
A primed (') parameter is a manipulated parameter.

Equations for the harmonic signal
Harmonic signal:
$$s_H(t) = \sum_n A_n(t) \exp[j\phi_n(t)]$$
Instantaneous amplitude:
$$A_n(t) = w_H \, v_n' \, E_n(t)$$
Instantaneous phase:
$$\phi_n(t) = \phi_n(0) + \int_0^t \mu'(\tau) \, d\tau$$
Here $w_H$ is the harmonic energy, $v_n'$ is the power of the harmonics, $E_n(t)$ is the temporal envelope, and $\mu'(\tau)$ is the pitch.

Evaluation in pitch manipulation
Baseline method: a sophisticated sinusoidal model, that is, our method without the pitch-dependent feature function.
Criteria:
Spectral distance: evaluates the difference in the harmonic component.
Mel-frequency cepstral coefficient (MFCC) distance: a quantitative auditory measurement that evaluates differences in the harmonic and inharmonic components.
$$D = \sum_{f,t} \left( C_{real}(f,t) - C_{syn}(f,t) \right)^2 / T$$
where $C_{real}$ and $C_{syn}$ are the spectrum or MFCC of the real sound and of the synthesized sound, and $T$ is the number of frames.
Conditions:
32 instruments from RWC-MDB (forte, normal articulation).
3 individuals for each instrument.
10-fold cross validation (10%:90% = [evaluation data]:[learning data]).
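The distance criterion above can be sketched as follows: squared differences between real and synthesized features are summed over all feature indices and frames, then divided by the number of frames. The function name `feature_distance` and the list-of-frames layout are hypothetical, and the sketch assumes both sounds have already been converted to the same number of frames and features (spectrum bins or MFCCs); how the slides align the two sounds is not stated.

```python
def feature_distance(c_real, c_syn):
    """Sum of squared feature differences over all frames, divided by T.

    c_real, c_syn: lists of frames; each frame is a list of feature values
    (spectrum bins or MFCCs). Equal shapes are an assumption of this sketch.
    """
    if not c_real or len(c_real) != len(c_syn):
        raise ValueError("both inputs need the same, non-zero number of frames")
    total = 0.0
    for frame_real, frame_syn in zip(c_real, c_syn):
        if len(frame_real) != len(frame_syn):
            raise ValueError("frames must have the same number of features")
        total += sum((a - b) ** 2 for a, b in zip(frame_real, frame_syn))
    return total / len(c_real)  # T = number of frames


# Hypothetical two-frame, two-feature example.
d = feature_distance([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [3.0, 1.0]])
```

The same function serves for both criteria; only the features passed in change.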

Quality in pitch manipulation
Average over the musical instruments: spectral distance reduced by 64.7%; MFCC distance reduced by 32.3%.
There was good improvement across the musical instruments as a whole.
[Figure: spectral distance and MFCC distance for each musical instrument; the fagot (bassoon) is marked for discussion.]

Discussion on good improvement
The result for the fagot:
Baseline: distances increased with the absolute number of manipulated semitones.
Ours: distances were stable.
The result demonstrated the validity of our method, which considers the pitch dependency of timbre.
[Figure: spectrum difference and MFCC distance against manipulated semitones, from low pitch to high pitch; blue: our method, red: baseline method.]

Discussion on poor improvement
There was poor improvement for instrument sounds that have a large inharmonic component in the attack segment.
The result for the mandolin:
Spectral distance: the relative amplitudes of harmonic peaks of a synthesized sound are close to those of a real sound.
MFCC distance: the distribution of the inharmonic component of a synthesized sound differs from that of a real sound. The ratio $w_H / w_I$ alone is insufficient to capture the pitch dependency of the inharmonic component.
[Figure: spectrum difference and MFCC difference against manipulated semitones, from low pitch to high pitch; blue: our method, red: baseline method.]

Conclusion
Objective: manipulating the pitch and duration of a musical instrument sound, using multiple instrument sounds, without distorting timbral characteristics.
Approach: we defined and analyzed timbral features. In pitch manipulation, we model the pitch dependency of timbre with a pitch-dependent feature function. In duration manipulation, we preserve the attack, the decay, and the vibrato.
Future work:
Incorporating other dependencies (e.g., volume).
Evaluating our method for duration manipulation.
Applying our method to musical instrument parts separated from the polyphonic audio signals of commercial CD recordings.
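A minimal sketch of the pitch-dependent feature function named in the approach: a polynomial is fitted by least squares to a timbral feature observed at several pitches, then evaluated at a pitch that was not observed. The helper names (`fit_polynomial`, `evaluate`, `pitch_axis`), the use of a log2 frequency axis relative to 440 Hz, the polynomial degree, and the training values are all assumptions for illustration; the transcription only states that a polynomial function is used.

```python
import math


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, m):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, m):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(ata[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def pitch_axis(f0_hz, ref_hz=440.0):
    """Assumed pitch axis: octaves relative to a reference frequency."""
    return math.log2(f0_hz / ref_hz)


# Hypothetical training data: power of one harmonic at three pitches.
f0s = [220.0, 440.0, 880.0]
powers = [0.8, 0.6, 0.5]
coeffs = fit_polynomial([pitch_axis(f) for f in f0s], powers, degree=1)
power_at_622 = evaluate(coeffs, pitch_axis(622.0))
```

In this sketch one such function would be fitted per feature (each harmonic's power and the harmonic-to-inharmonic energy ratio), and the manipulated feature value is read off at the target pitch.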

Pitch manipulation demo for piano
Real sounds: seed (440 Hz), reference (880 Hz).
Synthesized sounds (880 Hz): phase vocoder, STRAIGHT, sinusoidal model, our method (2).

Duration manipulation demo for violin
Real sound: seed and reference (length).
Synthesized sounds (length4): sinusoidal model, our method.

Pitch manipulation demo for trumpet
Real sounds: seed (440 Hz), reference (880 Hz).
Synthesized sounds (880 Hz): phase vocoder, STRAIGHT, sinusoidal model, our method (2).

Notes: (1) from MARSYAS; (2) does not use the reference sound as learning data.
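The duration manipulation used for these demos can be sketched as follows: the attack before the onset and the decay after the offset are copied unchanged, and only the envelope between them is resampled to the new length. The function name `stretch_envelope`, the use of linear interpolation, and the example envelope are assumptions for illustration; the frame indices `r_on` and `r_off` are taken as already detected.

```python
def stretch_envelope(env, r_on, r_off, ratio):
    """Expand or shrink env[r_on:r_off] by ratio, preserving attack and decay.

    env: temporal envelope, one value per frame.
    r_on, r_off: frame indices bounding the segment to resize (assumed given).
    """
    attack, middle, decay = env[:r_on], env[r_on:r_off], env[r_off:]
    if len(middle) < 2:
        raise ValueError("need at least two frames between onset and offset")
    new_len = max(2, round(len(middle) * ratio))
    resized = []
    for i in range(new_len):
        # Map output frame i to a fractional position in the original segment.
        pos = i * (len(middle) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(middle) - 1)
        frac = pos - lo
        resized.append((1.0 - frac) * middle[lo] + frac * middle[hi])
    return attack + resized + decay


# Hypothetical envelope: 2 attack frames, 4 middle frames, 2 decay frames.
envelope = [0.0, 0.5, 1.0, 0.9, 0.8, 0.7, 0.3, 0.0]
longer = stretch_envelope(envelope, r_on=2, r_off=6, ratio=2)
```

Resizing only the middle segment keeps the attack and decay at their original lengths, which is the property the slides identify as necessary to avoid timbral distortion.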