Drum Stroke Computing: Multimodal Signal Processing for Drum Stroke Identification and Performance Metrics

Jordan Hochenbaum 1,2
New Zealand School of Music 1, PO Box 2332, Wellington 6140, New Zealand
jhochenbaum@calarts.edu

Ajay Kapur 1,2
California Institute of the Arts 2, McBean Parkway, Valencia, CA
akapur@calarts.edu

ABSTRACT
In this paper we present a multimodal system for analyzing drum performance. In the first example we perform automatic drum hand recognition, utilizing a technique that automatically labels training data using direct sensors, and only indirect sensors (e.g. a microphone) for testing. Left/right drum hand recognition is achieved with an average accuracy of 84.95% for two performers. Secondly, we provide a study investigating performance metrics that depend on multimodal analysis.

Keywords
Multimodality, drum stroke identification, surrogate sensors, surrogate data training, machine learning, music information retrieval, performance metrics

1. INTRODUCTION AND MOTIVATION
Combining machine learning techniques with percussive musical interface/instrument design is an emerging area of research that has seen many applications in recent years. Tindale investigated drum timbre recognition [17] and later applied similar techniques to turn regular drum triggers into expressive controllers for physical models [16]. Other examples even enable human-machine interaction with mechanical percussionists that can listen to human performers and improvise in real time [19]. In terms of signal processing, there are now robust onset detection algorithms [1, 4] enabling one to accurately identify when musical events occur. Researchers have also been actively investigating other areas of musical performance such as tempo estimation [2, 7], beat tracking [3, 9], and percussive instrument segmentation [8]. Combining many of these techniques, researchers have explored the task of automatic transcription of drum and percussive performance [6, 7, 12, 18]. While great advances have been made in the aforementioned tasks, the majority of research into drum interaction scenarios combining musical interfaces/instruments and machine learning has been concerned with the segmentation or isolation of individual drums from a recorded audio signal. While mono- and polyphonic drum segmentation is a major aspect of tasks such as automatic drum transcription, a key feature of drum performance (one that has yet to be explored in the current drum analysis literature) pertains to the physiological space of drum performance. Not only is it important to know when and which drum is played in a pattern, but also which hand is striking the drum. In this research we investigate this question, and propose a multimodal signal processing system for the automatic labeling and classification of left- and right-hand drum strikes from a monophonic audio source. There are many real-world cases where drum stroke recognition is important.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. NIME'12, May 21-23, 2012, University of Michigan, Ann Arbor. Copyright remains with the author(s).
In fact, most traditional exercises that practicing drummers study emphasize specific left- and right-hand patterns (more information on this can be found in section 2.1.3). In automatic transcription scenarios, a key element that has been missing up until now is transcribing which hand performed a particular drum hit. In order to fully understand one's performance it is important to know how the player moves around the drum(s), the nuances and differences present in the strikes of their independent hands, and the possible stylistic signifiers resulting from the physical aspects of their individual hand strikes. This presents a large problem, as it is nearly impossible to determine which hand is hitting a drum from a monophonic audio recording alone. Using direct sensors such as accelerometers on the performer's hands, however, we can capture extremely accurate ground truth about the movements of the performer's hands. This comes at the cost of being invasive and possibly hindering performance. In a typical controlled machine-learning situation we can of course place constraints on the data-capturing scenario. One solution would be to record only left-hand strikes, and then separately record right-hand strikes, labeling them accordingly when performing feature extraction. We are interested, however, in capturing each hand not only in isolation but in the context of actual performance and practice scenarios; the interplay between left- and right-hand playing is therefore of utmost importance. Another option would be to manually label each audio event as being either from the left or right hand, based on a priori knowledge of a specific pattern played. As many data-capturing scenarios (including the ones in this research) involve specific patterns to be played, this is a common but time-consuming approach to labeling drum training performance data. Additionally, this approach does not accommodate the inevitable playing mistakes in a performance, which require manual adjustment when labeling the training data. We are also interested in investigating the improvisatory elements of drum performance, which makes the task of manually labeling hand patterns nearly impossible. To overcome these challenges, this research turns to an exciting new technique inspired by Surrogate Sensing [14] to enable the automatic labeling of drum hand patterns for classification.

One of the earliest studies of drum performance showed how physical factors such as the start height of a stick could impact the resulting amplitudes and durations of the sound produced [10]. More recently, Dahl showed correlations between strike velocity and the height and shape of the stroke in user studies [2]. Dolhansky et al. modeled the shape of a percussive stroke to turn mobile phones with accelerometers into physically-inspired percussive instruments [5]. There are many ways in which people have attempted to analyze the gesture of drum performance and its effect on the dynamic and timbre spaces; Tindale et al. provide a good overview of sensor capturing methodologies in [15]. The research mentioned above and countless other examples confirm the strong link between the physical space in which a performer's actions exist and the fingerprint imparted on the musical output. To this end, we begin to investigate these ties in this paper by looking not only at drum-hand recognition, but also at statistical measures afforded by multimodal analysis of acoustical instrument output paired with NIMEs.

The remainder of this paper is as follows: in section 2 we provide an overview of our data collection and analysis framework, including our implementation of surrogate data training for automatic hand labeling of training data. We show our drum hand recognition results in section 3 and performance metrics in section 4. Finally, our conclusions are discussed in section 5.

2. SYSTEM DESIGN AND IMPLEMENTATION
In this section we describe the data capturing and analysis system used in the drum-stroke recognition experiment. From a high-level view, the drum-hand recognition experiment employs a three-step process consisting of a data collection phase, an analysis phase, and finally the testing and machine-learning phase, as illustrated in Figure 1.

Figure 1: Overview of the drum hand recognition system (audio and left/right accelerometer streams feed onset detection, surrogate data training, and feature extraction for hand-pattern recognition).

2.1 Data Collection
A primary goal in this research was to make sure that the techniques used could be applied easily in a variety of scenarios, from live performance to assisted learning in music schools. It was also a goal to empower musicians to run the experiments themselves. Other methodologies were considered, including high-speed video camera tracking. While high-speed video tracking could be a useful solution for hand tracking, and has been used by others for similar tasks such as bow performance tracking [13, 21], we desired a solution that was more affordable than typical high-speed cameras and that needed little to no calibration. Additionally, the research was concerned with investigating surrogate data training, and so the following software and sensor system was used.

2.1.1 Nuance
Nuance is a program written by the authors as a general-purpose multi-track recording solution for multimodal data sources. Geared towards machine learning and musical data mining, Nuance enables nearly any musical sensor system and instrument communicating over serial, MIDI and/or OSC to synchronously capture its data to disk in .wav audio format. Once recorded, the data is ready to be analyzed in other platforms such as Matlab, Marsyas, ChucK, etc. In our experiments all data was recorded as uncompressed .wav files at a fixed audio sampling rate. As audio-rate sampling is typically higher than common instrument sensor systems, sensor data is up-sampled to audio rate using a sample-and-hold step function driven by the audio clock. This ensures synchronicity between audio and sensor data and enables the treatment of the sensor data as normal audio during analysis. Interacting with Nuance follows a similar paradigm to common multi-track digital audio workstations, and provides an easy drag-and-drop user interface.

2.1.2 Sensor System
In our experiments Nuance was used to synchronously record three axes of motion from two accelerometers placed on the hands of the performers, as well as a single mono microphone recording the acoustic drum signal. The ADXL335 tri-axis accelerometer was used, along with a Shure SM57 for recording the audio output of the snare drum. The microphone was placed in a typical configuration, approximately 1-3 inches from the rim of the snare, slightly angled down and across the head of the drum. The two accelerometers were placed on the topsides of the performers' hands and connected to a wireless transmitter. An Arduino Fio inside the transmitting device receives data from each axis of the accelerometers with 10-bit resolution, and transmits the sensor readings to a nearby computer over wireless XBee (ZigBee) RF communication. This data was recorded directly over a serial connection with the receiving XBee module using Nuance.
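To make the sample-and-hold upsampling described in section 2.1.1 concrete, the following is a minimal Python sketch of a zero-order hold driven by the audio clock. This is an illustration rather than Nuance's actual code; the sensor and audio rates (100 Hz and 44.1 kHz) and all names are assumptions for the example.

```python
import numpy as np

def sample_and_hold(sensor, sensor_rate, audio_rate):
    """Zero-order hold: repeat each sensor sample until the next one
    arrives, so the sensor stream lines up with the audio clock."""
    n_audio = int(len(sensor) * audio_rate / sensor_rate)
    t_audio = np.arange(n_audio) / audio_rate            # audio-clock tick times
    # For every audio tick, index the most recent sensor sample.
    idx = np.minimum((t_audio * sensor_rate).astype(int), len(sensor) - 1)
    return sensor[idx]

# Illustrative rates only: a 100 Hz accelerometer axis held at 44.1 kHz.
accel_axis = np.random.uniform(-1.0, 1.0, 300)           # ~3 s of sensor data
held = sample_and_hold(accel_axis, 100, 44100)
print(held.shape)                                        # (132300,)
```

Once held at audio rate, the sensor stream can be written to a .wav track and treated exactly like the microphone channel during analysis.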
2.1.3 Data Set
The system described in sections 2.1.1 and 2.1.2 was used to record a total of 2917 snare-drum hits from two performers. Performer one was at a beginner level, whereas performer two was an intermediate/advanced percussionist. The performers were instructed to play four fundamental drum exercises from the Percussive Arts Society 1 International Drum Rudiments; these included the Single Stroke Roll (referred to as D1 throughout the remainder of the paper), the Double Stroke Open Roll (D2), the Single Paradiddle (D3), and the Double Paradiddle (D4) (Figure 2). Each exercise was recorded for roughly 3 minutes, resulting in a total of 1467 hits (736 right hand / 731 left hand) for performer one and 1450 hits (726 right hand / 724 left hand) for performer two. In preliminary testing the performers played purely improvisationally; however, a more regimented routine was played during the final data collection process to enable other research into specific performance metrics and rudiment classification. Figure 2 details the drum rudiments performed.

Figure 2: The four drum rudiments performed (D1-D4).

1 The PAS is the world's largest international percussion organization.

2.2 Analysis Framework
In the following sections we discuss the analysis framework that was used to extract features for the left/right hand classification experiments and metrics tracking.

2.2.1 Surrogate Data Training
One of the biggest hurdles for musical supervised machine learning is obtaining and labeling a training data set large enough for true results. As described in section 1, manually labeling the training data is not an efficient process, nor does it easily deal with errors that are common in the data collection phase. By using a process that can automatically label training data, the training regimen can be more loosely defined, even allowing the performer to improvise (unless there is a specific desire to record particular patterns, as in our case). Common disturbances in the data collection process, such as performance mistakes, which normally must be accounted for manually by the researchers, are also no longer an issue. We turn to a new technique inspired by Surrogate Sensors [14] enabling us to quickly record and label each hit in our audio recordings by using known information from direct sensors (accelerometers) to navigate unknown information in the data from our indirect sensor (microphone). The direct sensors provide the benefit of near-perfect ground truth, making the technique extremely robust (see section 2.2.3). The method is also transferable to other sensors and modalities; the particular implementation in this research is described in the following section on onset detection.

Figure 3: Overview of the onset detection algorithm (per-axis preprocessing - DC removal and rectification - followed by left/right axis summing, jerk computation, envelope extraction and smoothing, onset curve calculation, and peak picking).

2.2.2 Onset Detection
A triple-axis accelerometer was placed on each of the performers' hands while recording the data sets. The ultimate goal was to use gesture onsets in the independent hands' accelerometers to navigate and label the note onsets in the audio recordings. As shown in Figure 3, each axis (per accelerometer) is first preprocessed in Matlab by removing the DC offset and applying full-wave rectification. Each accelerometer's three axes are then summed and averaged to collapse the data streams into a single dimension. Next, jerk is calculated for each accelerometer, followed by a threshold function to remove spurious jitter. To further smooth the signals before onset detection is applied, the envelopes of the signals are extracted and smoothed with a Gaussian window. The onset curve is then calculated and peak-picked at local maxima. Lastly, onset detection was also performed on the audio recording, and the onset locations (in seconds) of all three streams (one audio, two accelerometer) are stored in independent vectors. More detailed information on the onset detection algorithm can be found in [11].
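A hedged Python sketch of the accelerometer branch of this pipeline is given below (the paper's implementation is in Matlab, following [11]). The jitter threshold, Gaussian width, and peak-picking parameters are illustrative assumptions, not the paper's exact values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks, hilbert

def accel_onsets(axes, sr, jerk_thresh=0.05, smooth_sigma=200):
    """Detect gesture onsets from one accelerometer's three axes.
    `axes` is an (N, 3) array sampled at `sr` Hz."""
    x = axes - axes.mean(axis=0)           # remove DC offset per axis
    x = np.abs(x)                          # full-wave rectification
    x = x.mean(axis=1)                     # sum/average the three axes
    jerk = np.abs(np.diff(x)) * sr         # rate-of-change (jerk) proxy
    jerk[jerk < jerk_thresh] = 0.0         # threshold out spurious jitter
    env = np.abs(hilbert(jerk))            # envelope extraction
    curve = gaussian_filter1d(env, smooth_sigma)   # Gaussian smoothing
    peaks, _ = find_peaks(curve, height=curve.max() * 0.2,
                          distance=int(0.05 * sr))  # pick local maxima
    return peaks / sr                      # onset times in seconds
```

The same peak-picking idea, applied to a spectral-flux style detection function, would yield the audio onset vector.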
2.2.3 Onset Detection Accuracy
Table 1 shows the onset classification accuracy of the accelerometers prior to correction. The high accuracy (99%) of the accelerometers makes them a great candidate for surrogate-labeling the audio onsets as either left- or right-hand onsets. The onset vectors were also exported as .txt files and imported into a beat-tracking application called BeatRoot [3] to visualize and (manually) correct any errors in the detected accelerometer onsets. It should be noted that this correction step was not strictly necessary, as the few falsely detected onsets were not enough to significantly impact the data; however, we desired 100% ground truth, and so any false-positive and false-negative onsets were corrected in BeatRoot prior to feature extraction.

Table 1: Accelerometer onset detection accuracy (precision, recall, and F-measure) for both performers.

2.2.4 Feature Extraction
After onset detection, features were extracted in Matlab by taking the accelerometer onset positions for each hand and searching for the nearest detected onset (within a threshold determined by the frequency and tempo of the strikes) among the audio onsets. The strike in the audio file is then windowed to contain the entire single hit and various features are extracted. The feature vector is labeled with the appropriate class (1 = Right, 2 = Left) and exported as an .arff file for machine learning analysis in Weka [20]. For each strike a 14-dimension feature vector is calculated containing: RMS, Spectral Rolloff, Spectral Centroid, Brightness, Regularity, Roughness, Skewness, Kurtosis, Spread, Attack Slope, Attack Time, Zero Crossing, MFCC (0th coeff.), and the Onset Difference Time (ODT) between the detected audio and corresponding accelerometer onsets.
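The surrogate labeling step, pairing each accelerometer onset with its nearest audio onset, might look like the following sketch (Python rather than the paper's Matlab; `max_dist` and all names are assumptions, whereas the paper derives its threshold from the strike frequency and tempo).

```python
import numpy as np

def label_audio_onsets(audio_onsets, left_onsets, right_onsets, max_dist=0.05):
    """Assign each accelerometer onset to its nearest audio onset and
    label that audio onset 1 (right) or 2 (left)."""
    audio_onsets = np.asarray(audio_onsets)
    labeled = []
    for hand_onsets, label in ((right_onsets, 1), (left_onsets, 2)):
        for t in hand_onsets:
            i = int(np.argmin(np.abs(audio_onsets - t)))   # nearest audio onset
            odt = audio_onsets[i] - t                      # Onset Difference Time
            if abs(odt) <= max_dist:
                labeled.append((float(audio_onsets[i]), label, float(odt)))
    return sorted(labeled)                                 # (onset time, class, ODT)
```

Each labeled onset would then be windowed out of the audio file and the remaining thirteen features computed over that single hit.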

3. DRUM HAND RECOGNITION
After the data was collected, it was imported into Weka for supervised learning. The primary focus of this experiment was to investigate whether a machine could be trained to reliably classify which hand was used to strike a snare drum.

3.1 Classification
Five classifiers were used in our tests: a Multilayer Perceptron back-propagation artificial neural network, the J48 decision tree classifier, Naive Bayes, a support vector machine trained using Sequential Minimal Optimization (SMO), and Logistic Regression. 10-fold cross-validation was used in all tests with a 12-dimension feature subset (the attack time and onset difference features were removed for this experiment).

3.2 Results and Discussion
This section describes the outcomes of our classification tests. As this is a binary classification scenario (a hit can be classified as either left or right hand), the chance classification baseline is 50%. Using the entire data set and 10-fold cross-validation, the best results were achieved using the multilayer perceptron (MLP) for both performers. MLP yielded a classification accuracy of 84.93% for performer one and 84.96% for performer two. All of the algorithms appear to do a decent job of generalizing over the entire data set and provide similar classification results with smaller subsets of the feature vector.

Table 2: Classification accuracy (%) using all data for both performers (MLP, SMO, Naive Bayes, Logistic Regression, and J48).

While it is clear that some classifiers seem to generalize quite well in all cases, simpler probabilistic classifiers such as Naive Bayes seem to benefit greatly from having a larger training set that covers a wider variance in feature data.
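The paper runs these classifiers in Weka; a rough Python equivalent of the 10-fold cross-validation setup, using scikit-learn stand-ins for the five Weka algorithms, might look like the sketch below. The hyperparameters and the placeholder feature matrix `X` / label vector `y` are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: (n_hits, 12) feature matrix; y: labels (1 = right, 2 = left).
X = np.random.rand(200, 12)                      # placeholder data
y = np.random.randint(1, 3, 200)

classifiers = {
    "MLP": MLPClassifier(max_iter=2000),         # ~ Weka MultilayerPerceptron
    "SMO": SVC(kernel="linear"),                 # ~ SMO-trained SVM
    "Naive Bayes": GaussianNB(),
    "Logistic": LogisticRegression(max_iter=1000),
    "J48": DecisionTreeClassifier(),             # ~ C4.5 / J48
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=cv)
    print(f"{name}: {scores.mean():.2%}")
```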
4. PERFORMANCE METRICS
Automatic drum hand recognition proposes exciting new possibilities, including: more nuanced automatic drum transcription, preservation of performance technique from master musicians long after life, new controller data for live performance, and insightful information and metrics during regimented practice and musical training. However, the information from direct sensors can also be used in conjunction with indirect sensors to provide insightful new performance metrics and features. In this section we look at new features and how they may add to our ability to describe and deduce meaningful information from musical performance.

4.1 Onset Differences
In traditional drum performance analysis, temporal information such as timing deviations and onsets of drum hits is normally investigated by analyzing an audio recording. Researchers have not only investigated the physical onset times (in audio) but have also looked at the perceptual onset and attack times (often called PAT) in order to measure when sounds are actually heard [5]. Here we consider the physical onset times from sensors on the actual performer in relation to the onset times recorded simultaneously in the acoustical output, in what we call the Onset Difference Time, or ODT.

Figure 4: Onset difference times for the first 60 sec. of D1 (performer 1 top, performer 2 bottom).

In Figure 4 we can see the onset difference times (in seconds) between the left (+) and right (o) hand accelerometer onsets and their audio onset times. A horizontal line at 0 would mean a perfect match (zero difference) in onset times, and an observation of the graphs shows that performer two (bottom) had a generally lower onset differentiation than performer one (top). Performer two was in fact the more experienced drummer, suggesting a greater consistency between physical and acoustical onsets in this particular exercise. Observing Figure 4, it is also apparent that in this 60-second pass of D1 the onset difference times of performer two's individual hands were more closely related (in terms of mean onset difference) than those of performer one's.

Table 3: Average onset difference statistics (min/rush, max/lag, mean, and standard deviation) for both performers.

Table 3 and Figure 5 show averages from both hands and all data sets D1-D4. Min, which we call "rush", is calculated as the average amount by which the accelerometer onsets were earlier than the audio onsets; as such, it is calculated only over negative onset difference times. On average, when performer two's physical strike onsets rushed the audio onsets, they did so less drastically than performer one's. Again, this may be attributed to the fact that performer two was a more experienced player with tighter timing than performer one. Max, or "lag", is calculated as the average amount by which the accelerometer onsets were later than their paired audio onsets (positive difference times). Coincidentally, both performers' lag differences were extremely similar. Mean is the average onset difference time calculated over the entire vector of onset differences for each performer. Again, performer two performed with less distance between physical and audio onset times. Interestingly, performer two's standard deviation was slightly larger than performer one's, meaning that the amount of dispersion from the performer's mean performance was greater.
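These statistics are straightforward to compute from the per-hit onset difference times. The sketch below mirrors the definitions just given (negative differences average to "rush", positive to "lag"); the sample values are invented for illustration.

```python
import numpy as np

def onset_difference_stats(odt):
    """Summarize a vector of onset difference times (accelerometer onset
    minus paired audio onset, in seconds) into the Table 3 statistics."""
    odt = np.asarray(odt, dtype=float)
    rush = odt[odt < 0].mean() if (odt < 0).any() else 0.0  # avg of early strikes
    lag = odt[odt > 0].mean() if (odt > 0).any() else 0.0   # avg of late strikes
    return {"min (rush)": rush, "max (lag)": lag,
            "mean": odt.mean(), "std": odt.std()}

print(onset_difference_stats([-0.012, 0.004, -0.003, 0.009, -0.001]))
```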

Figure 5: Bar graph visualizing the Table 3 metrics for performers P1 and P2.

5. CONCLUSIONS AND FUTURE WORK
In this paper we investigated two ways in which multimodal signal processing and sensor systems can benefit percussive computation. In the first case study we used direct sensors (accelerometers) on a performer to automatically annotate training data, teaching the computer to perform drum stroke recognition from indirect sensors (a single microphone). Averaging the best results from the two performers, the multilayer perceptron achieved 84.95% accuracy, showing that it is possible for the computer to identify whether a performer hit a drum with their left or right hand. Once trained with the direct sensors, the computer can non-invasively transcribe the physical attributes of a percussionist's performance, adding important nuance to future automatic music transcription. Additionally, automatic drum-hand recognition will be useful in many pedagogical scenarios such as rudiment identification, accuracy measurement, and other performance metrics. In live performance contexts, where it may be desirable to trigger musical events, processes, and/or visualizations based on particular sequences of strikes, drum stroke recognition using non-invasive methods will also be extremely powerful.

In the second case study we looked at statistical measures as performance metrics obtainable using a multimodal system. Our preliminary findings comparing data from typical direct and indirect sensors, such as accelerometers and microphones (respectively), reconfirm the importance of looking at both the audio space and the physical space (simultaneously) when investigating musical performance. Research often chooses one or the other for analysis; however, the space in between is one we are excited to look at more closely in the future.

In the future we look forward to expanding the data set with a larger pool of performers, as well as investigating how well the techniques generalize to different snare drums (and eventually other drums in the drum set). It would also be particularly useful to add a third stroke class to the test set for when a player performs more complex patterns, including striking with both hands. The authors are particularly interested in performance metrics tracking, and so the techniques discussed in this paper will serve as a foundation for continued research into performance metrics tracking, allowing performers to evaluate their playing in live performance and in the practice room.

At the core of much of this is the trade-off between direct and indirect sensors. Indirect sensors such as microphones have proven to be extremely useful and reliable sources for music information retrieval, with the benefit of not hindering performance. At the same time, they lack certain physical attributes that are only possible to obtain by placing more invasive direct sensors on the performer and/or instrument/NIME. In one sense we hope this research brings wider attention to part of a novel technique called surrogate sensing, which reduces the negative impact of invasive sensors by constraining their dependency to the training phase of a musical system or experiment. At the same time, there is much work ahead, and the future will certainly still hold an important place for direct sensors in these scenarios. We look forward to a future where direct sensors such as accelerometers are small and light enough to be embedded within a drum stick without altering performance in any way, and also one where a trained machine can play back a recording from great musicians of the past and automatically transcribe the magical nuances of their performances for future generations.

6. ACKNOWLEDGMENTS
We'd like to thank Adam Tindale and George Tzanetakis for their previous work in drum performance analysis and surrogate sensing.

7. REFERENCES
[1] Bello, J. et al. A Tutorial on Onset Detection in Music Signals. IEEE Transactions on Speech and Audio Processing. 13, 5 (2005).
[2] Dahl, S. On the Beat: Human Movement and Timing in the Production and Perception of Music. Royal Institute of Technology.
[3] Dixon, S. Evaluation of the Audio Beat Tracking System BeatRoot. Journal of New Music Research. 36, 1 (2007).
[4] Dixon, S. Onset Detection Revisited.
Proceedings of the 9th International Conference on Digital Audio Effects (DAFx-06) (Montreal, Canada, 2006).
[5] Dolhansky, B. et al. Designing an Expressive Virtual Percussive Instrument. Proceedings of the Sound and Music Computing Conference (2011).
[6] Fitzgerald, D. Automatic Drum Transcription and Source Separation. Doctoral dissertation (Jun. 2004).
[7] Gillet, O. and Richard, G. Transcription and Separation of Drum Signals From Polyphonic Music. IEEE Transactions on Audio, Speech, and Language Processing. 16, 3 (Mar. 2008).
[8] Goto, M. and Muraoka, Y. A sound source separation system for percussion instruments. Transactions of the Institute of Electronics, Information and Communication Engineers (1994).
[9] Goto, M. and Muraoka, Y. Real-time beat tracking for drumless audio signals: chord change detection for musical decisions. Speech Communication. 27, 3-4 (1999).
[10] Henzie, C.A. Amplitude and duration characteristics of snare drum tones. Indiana University.
[11] Lartillot, O. et al. A Unifying Framework for Onset Detection, Tempo Estimation, and Pulse Clarity Prediction. 11th International Conference on Digital Audio Effects (Espoo, Finland, Sep. 2008).
[12] Paulus, J.K. and Klapuri, A.P. Conventional and periodic N-grams in the transcription of drum sequences. International Conference on Multimedia and Expo (ICME '03) Proceedings (Jul. 2003), vol. 2.
[13] Schoonderwaldt, E. et al. Combining accelerometer and video camera: reconstruction of bow velocity profiles. Proceedings of the 2006 Conference on New Interfaces for Musical Expression (Paris, France, 2006).
[14] Tindale, A. et al. Training Surrogate Sensors in Musical Gesture Acquisition Systems. IEEE Transactions on Multimedia. 13, 1 (Feb. 2011).
[15] Tindale, A.R. et al. A comparison of sensor strategies for capturing percussive gestures. Proceedings of the 2005 Conference on New Interfaces for Musical Expression (Singapore, 2005).
[16] Tindale, A.R. A hybrid method for extended percussive gesture. Proceedings of the 7th International Conference on New Interfaces for Musical Expression (New York, NY, USA, 2007).
[17] Tindale, A.R. et al. Retrieval of percussion gestures using timbre classification techniques. ISMIR (2004).
[18] Tzanetakis, G. et al. Subband-based Drum Transcription for Audio Signals. (Oct. 2005), 1-4.
[19] Weinberg, G. and Driscoll, S. Robot-human interaction with an anthropomorphic percussionist. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (New York, NY, USA, 2006).
[20] Witten, I.H. et al. Data Mining: Practical Machine Learning Tools and Techniques, Third Edition. Morgan Kaufmann.
[21] Zhang, B. et al. Visual analysis of fingering for pedagogical violin transcription. Proceedings of the 15th International Conference on Multimedia (New York, NY, USA, 2007).


A Study of Synchronization of Audio Data with Symbolic Data. Music254 Project Report Spring 2007 SongHui Chon A Study of Synchronization of Audio Data with Symbolic Data Music254 Project Report Spring 2007 SongHui Chon Abstract This paper provides an overview of the problem of audio and symbolic synchronization.

More information

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications Matthias Mauch Chris Cannam György Fazekas! 1 Matthias Mauch, Chris Cannam, George Fazekas Problem Intonation in Unaccompanied

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;

More information

Image Processing Using MATLAB (Summer Training Program) 6 Weeks/ 45 Days PRESENTED BY

Image Processing Using MATLAB (Summer Training Program) 6 Weeks/ 45 Days PRESENTED BY Image Processing Using MATLAB (Summer Training Program) 6 Weeks/ 45 Days PRESENTED BY RoboSpecies Technologies Pvt. Ltd. Office: D-66, First Floor, Sector- 07, Noida, UP Contact us: Email: stp@robospecies.com

More information

Classification of Timbre Similarity

Classification of Timbre Similarity Classification of Timbre Similarity Corey Kereliuk McGill University March 15, 2007 1 / 16 1 Definition of Timbre What Timbre is Not What Timbre is A 2-dimensional Timbre Space 2 3 Considerations Common

More information

TOWARDS A GENERATIVE ELECTRONICA: HUMAN-INFORMED MACHINE TRANSCRIPTION AND ANALYSIS IN MAXMSP

TOWARDS A GENERATIVE ELECTRONICA: HUMAN-INFORMED MACHINE TRANSCRIPTION AND ANALYSIS IN MAXMSP TOWARDS A GENERATIVE ELECTRONICA: HUMAN-INFORMED MACHINE TRANSCRIPTION AND ANALYSIS IN MAXMSP Arne Eigenfeldt School for the Contemporary Arts Simon Fraser University Vancouver, Canada arne_e@sfu.ca Philippe

More information

Video-based Vibrato Detection and Analysis for Polyphonic String Music

Video-based Vibrato Detection and Analysis for Polyphonic String Music Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International

More information

OBSERVED DIFFERENCES IN RHYTHM BETWEEN PERFORMANCES OF CLASSICAL AND JAZZ VIOLIN STUDENTS

OBSERVED DIFFERENCES IN RHYTHM BETWEEN PERFORMANCES OF CLASSICAL AND JAZZ VIOLIN STUDENTS OBSERVED DIFFERENCES IN RHYTHM BETWEEN PERFORMANCES OF CLASSICAL AND JAZZ VIOLIN STUDENTS Enric Guaus, Oriol Saña Escola Superior de Música de Catalunya {enric.guaus,oriol.sana}@esmuc.cat Quim Llimona

More information

Lyrics Classification using Naive Bayes

Lyrics Classification using Naive Bayes Lyrics Classification using Naive Bayes Dalibor Bužić *, Jasminka Dobša ** * College for Information Technologies, Klaićeva 7, Zagreb, Croatia ** Faculty of Organization and Informatics, Pavlinska 2, Varaždin,

More information

Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models

Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models Composer Identification of Digital Audio Modeling Content Specific Features Through Markov Models Aric Bartle (abartle@stanford.edu) December 14, 2012 1 Background The field of composer recognition has

More information

Creating a Feature Vector to Identify Similarity between MIDI Files

Creating a Feature Vector to Identify Similarity between MIDI Files Creating a Feature Vector to Identify Similarity between MIDI Files Joseph Stroud 2017 Honors Thesis Advised by Sergio Alvarez Computer Science Department, Boston College 1 Abstract Today there are many

More information

Automatic Rhythmic Notation from Single Voice Audio Sources

Automatic Rhythmic Notation from Single Voice Audio Sources Automatic Rhythmic Notation from Single Voice Audio Sources Jack O Reilly, Shashwat Udit Introduction In this project we used machine learning technique to make estimations of rhythmic notation of a sung

More information

Torsional vibration analysis in ArtemiS SUITE 1

Torsional vibration analysis in ArtemiS SUITE 1 02/18 in ArtemiS SUITE 1 Introduction 1 Revolution speed information as a separate analog channel 1 Revolution speed information as a digital pulse channel 2 Proceeding and general notes 3 Application

More information