A real time study of plosives in Glaswegian using an automatic measurement algorithm

Size: px
Start display at page:

Download "A real time study of plosives in Glaswegian using an automatic measurement algorithm"

Transcription

1 A real time study of plosives in Glaswegian using an automatic measurement algorithm Jane Stuart Smith, Tamara Rathcke, Morgan Sonderegger University of Glasgow; University of Kent, McGill University NWAV42, Pittsburgh, October, 2013

2 A real time study of plosives in Glaswegian using an automatic measurement algorithm Background the voicing contrast in Scottish English Methodology Glasgow real time corpus automatic phonetic measurement improving the algorithm algorithm performance Preliminary results

3 Background Scottish English is typically observed to show voiceless plosives with shorter aspiration than Southern varieties of English (e.g. Wells 1982)

4 Background Docherty et al (2011): Scottish English border 159 speakers 4 locations: Scottish/English; East/West 4662 tokens of voiced and voiceless plosives read wordlists

5 Background Docherty et al (2011): Younger speakers showed longer aspiration, measured as Voice Onset Time (VOT) less prevoicing than older speakers apparent time change? physiological constraints?

6 Background Docherty et al (2011): Scottish speakers at Eastern end (Eyemouth), showed shorter aspiration/vot than speakers at the Western end of the Border (Gretna) Eyemouth speakers also show more Scottish features (rhoticity; SVLR) fine grained aspect of plosive production subject to subtle sociolinguistic control (cf phonetic imitation studies, e.g. Nielsen 2011)

7 Research Question Is the voicing contrast in plosives changing in real time in Scottish English?

8 Research Question Is the voicing contrast in plosives changing in real time in Scottish English? sample of different ages recorded at different points in time sufficient number of tokens hand labelling VOT in spontaneous speech is very time consuming!

9 Fine phonetic variation and sound change: A real time study of Glaswegian Oct 2011 Sept 2014

10 A real time corpus of Glaswegian vernacular ideal structure Decade of recording Old Middle aged Young s 6 m, 6 f 6 m, 6 f 6 m, 6 f 1980s 6 m, 6 f 6 m, 6 f 6 m, 6 f 1990s 6 m, 6 f 6m, 6 f 6m, 6 f 2000s 6 m, 6 f 6m, 6 f 6m, 6 f

11 Sample for this paper Decade of recording Old Middle aged s 2 f 2 f 2 f 1980s 1990s 2000s 2 f 2 f 2 f Young 10 15

12 Sample for this paper Decade of recording 1970s 1980s 1990s Old f (sociolinguistic interview; oral history interview) Middle aged f (sociolinguistic interview) Young f (sociolinguistic interview) 2000s 2 f (oral history) 2 f (conversation) 2 f (conversation) Sources (with thanks): Labov; Macaulay; M74 Project; Glasgow Media Project

13 Corpus for this study LABB CAT (Fromont and Hay; previously ONZEMiner) Storage of time aligned transcripts Detailed contextualized searches Preliminary segmentation by forced alignment using HTK in LABB CAT

14 Methodology plosives voiceless /p t k/; voiced /b d g/ stressed syllable initial Automatic measurement algorithm Positive VOT voiceless plosives

15 Methodology plosives voiceless /p t k/; voiced /b d g/ stressed syllable initial Automatic measurement algorithm Positive VOT voiceless plosives voiced plosives (partial) release = burst + frication Negative VOT Closure duration

16 Automatic VOT measurement Manuallylabeled VOTs Training Goal: Minimize VOT prediction error on unseen data Classifier Classifier input, for a new stop: Where to start looking for VOT (search boundary) 62 acoustic feature functions Output: Predicted VOT boundaries Sonderegger & Keshet (2012), JASA Henry, Sonderegger, Keshet (2012), Interspeech

17 Feature functions: Based on cues used by human annotators Example: Mean of high frequency energy between burst and voicing onsets minus its mean before the burst onset Algorithm learns: High for good burst/voicing onset pair, low otherwise

18 Previous results: Positive VOT On 4 datasets: Trainable: Optimal performance with examples Accurate: Performance near intertranscriber agreement Intertranscriber Auto/manual Intertranscriber Auto/manual Switchboard Big Brother 2 ms 5 ms 10 ms Sonderegger & Keshet (2012), JASA

19 Procedure Training data: 100 tokens for 5 speakers First round of manual correction Code 1: correct Code 2: close, worth manually correcting Codes 3 8: completely wrong Algorithm altered Another round of manual correction

20 Manual correction (all plosives n = 4491) 100% 90% 80% 70% 60% 50% 40% 30% 20% Code 8 Code 7 Code 6 Code 5 Code 4 close Code 3 and Code 2 easily Code 1 corrected 10% 0% 70 O f01 70 O f02 70 M f01 70 M f02 00 O f01 00 O f02 70 Y f01 70 Y f02 00 M f01 00 M f02 00 Y f01 00 Y f s 1930s 1940s 1960s 1960s Decade of birth 1990s correct

21 Manual correction (all plosives n = 4491) 100% 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% 70 O f01 70 O f02 70 M f01 70 M f02 00 O f01 00 O f02 70 Y f01 70 Y f02 00 M f01 00 M f02 00 Y f01 00 Y f s 1930s 1940s 1960s 1960s Decade of birth 1990s background noise Code 8 Code 7 Code 6 overlapping Code 5 speakers Code 4 Code 3 Code 2 Code 1 wrong forcedalignment

22 Manual correction (all plosives n = 4491) 100% 90% 80% strongly reduced 70% 60% 50% 40% 30% 20% Code 8 fricative or Code 7 Code 6 approximant Code 5 Code 4 Code 3 wrong Code 2 but Code 1 unclear why 10% 0% 70 O f01 70 O f02 70 M f01 70 M f02 00 O f01 00 O f02 70 Y f01 70 Y f02 00 M f01 00 M f02 00 Y f01 00 Y f s 1930s 1940s 1960s 1960s Decade of birth 1990s

23 Prediction results N = 4491; 12 speakers Code 1: correct: 52% Code 2: close: 15% Codes 3 8: wrong: 33%

24 Prediction results by voicing voiced: Code 1: correct: 45% Code 2: close: 18% Codes 3 8: wrong: 37% voiceless Code 1: correct: 61% Code 2: close: 12% Codes 3 8: wrong: 25%

25 Preliminary results voiced voiceless n= 3012 Voicing p <

26 Voiced plosives /b/ 1970s 2000s release phase may be getting longer /d/ very short = burst longer = VOT /g/ n= 1669

27 Voiceless plosives: /p/ n = s 2000s

28 Voiceless plosives: /p/ OLD MIDDLE AGED YOUNG n = s 2000s

29 Voiceless plosives: /p/ OLD MIDDLE AGED YOUNG p < n = s 2000s

30 Voiceless plosives: /t/ OLD MIDDLE AGED YOUNG p < n = s 2000s

31 Voiceless plosives: /k/ OLD MIDDLE AGED YOUNG p < n = s 2000s

32 Discussion Methodology large number of tokens (6125 > 3012 usable) processed in a short time 52% correct close to previous results in Sonderegger and Keshet (2012) for Switchboard/Big Brother voiced plosives need more parameters promising for sociolinguistic analysis

33 Discussion Preliminary results real time change? Voicing contrast is robust shift in phonetic realization from voicing to VOT/aspiration? age grading? No consistency in VOT duration according to age group Some younger speakers show much shorter VOTs than much older speakers (and vice versa)

34 Next steps Improve algorithm for voiced plosives: Positive VOT Negative VOT Closure duration % voicing during closure More speakers

35 GULP GLASGOW UNIVERSITY LABORATORY OF PHONETICS Feedback gratefully received Jane.Stuart

Phone-based Plosive Detection

Phone-based Plosive Detection Phone-based Plosive Detection 1 Andreas Madsack, Grzegorz Dogil, Stefan Uhlich, Yugu Zeng and Bin Yang Abstract We compare two segmentation approaches to plosive detection: One aproach is using a uniform

More information

Improving Frame Based Automatic Laughter Detection

Improving Frame Based Automatic Laughter Detection Improving Frame Based Automatic Laughter Detection Mary Knox EE225D Class Project knoxm@eecs.berkeley.edu December 13, 2007 Abstract Laughter recognition is an underexplored area of research. My goal for

More information

Semester A, LT4223 Experimental Phonetics Written Report. An acoustic analysis of the Korean plosives produced by native speakers

Semester A, LT4223 Experimental Phonetics Written Report. An acoustic analysis of the Korean plosives produced by native speakers Semester A, 2017-18 LT4223 Experimental Phonetics Written Report An acoustic analysis of the Korean plosives produced by native speakers CHEUNG Man Chi Cathleen Table of Contents 1. Introduction... 3 2.

More information

Week 6 - Consonants Mark Huckvale

Week 6 - Consonants Mark Huckvale Week 6 - Consonants Mark Huckvale 1 Last Week Vowels may be described in terms of phonology, phonetics, acoustics and audition. There are about 20 phonological choices for vowels in English. The Cardinal

More information

Automatic Laughter Detection

Automatic Laughter Detection Automatic Laughter Detection Mary Knox Final Project (EECS 94) knoxm@eecs.berkeley.edu December 1, 006 1 Introduction Laughter is a powerful cue in communication. It communicates to listeners the emotional

More information

Expressive Singing Synthesis based on Unit Selection for the Singing Synthesis Challenge 2016

Expressive Singing Synthesis based on Unit Selection for the Singing Synthesis Challenge 2016 Expressive Singing Synthesis based on Unit Selection for the Singing Synthesis Challenge 2016 Jordi Bonada, Martí Umbert, Merlijn Blaauw Music Technology Group, Universitat Pompeu Fabra, Spain jordi.bonada@upf.edu,

More information

/s/-stop Blends: Phonetically Consistent Minimal Pairs for Easier Elicitation

/s/-stop Blends: Phonetically Consistent Minimal Pairs for Easier Elicitation /s/-stop Blends: Phonetically Consistent Minimal Pairs for Easier Elicitation Eric Reid, M.S., CCC-SLP Workshop Number PS 5 CSHA 2016 Annual Convention and Exhibition /s/ + Kate = skate OR /s/ + gate =

More information

Myanmar (Burmese) Plosives

Myanmar (Burmese) Plosives Myanmar (Burmese) Plosives Three-way voiceless contrast? Orthographic Contrasts Bilabial Dental Alveolar Velar ပ သ တ က Series 2 ဖ ထ ခ ဘ ဗ သ (allophone) ဒ ဓ ဂ ဃ Myanmar script makes a three-way contrast

More information

Automatic Laughter Detection

Automatic Laughter Detection Automatic Laughter Detection Mary Knox 1803707 knoxm@eecs.berkeley.edu December 1, 006 Abstract We built a system to automatically detect laughter from acoustic features of audio. To implement the system,

More information

Music Radar: A Web-based Query by Humming System

Music Radar: A Web-based Query by Humming System Music Radar: A Web-based Query by Humming System Lianjie Cao, Peng Hao, Chunmeng Zhou Computer Science Department, Purdue University, 305 N. University Street West Lafayette, IN 47907-2107 {cao62, pengh,

More information

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC

APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC Vishweshwara Rao, Sachin Pant, Madhumita Bhaskar and Preeti Rao Department of Electrical Engineering, IIT Bombay {vishu, sachinp,

More information

Auditory Illusions. Diana Deutsch. The sounds we perceive do not always correspond to those that are

Auditory Illusions. Diana Deutsch. The sounds we perceive do not always correspond to those that are In: E. Bruce Goldstein (Ed) Encyclopedia of Perception, Volume 1, Sage, 2009, pp 160-164. Auditory Illusions Diana Deutsch The sounds we perceive do not always correspond to those that are presented. When

More information

Measuring oral and nasal airflow in production of Chinese plosive

Measuring oral and nasal airflow in production of Chinese plosive INTERSPEECH 2015 Measuring oral and nasal airflow in production of Chinese plosive Yujie Chi 1, Kiyoshi Honda 1, Jianguo Wei 1, *, Hui Feng 1, Jianwu Dang 1, 2 1 Tianjin Key Laboratory of Cognitive Computation

More information

Processing Linguistic and Musical Pitch by English-Speaking Musicians and Non-Musicians

Processing Linguistic and Musical Pitch by English-Speaking Musicians and Non-Musicians Proceedings of the 20th North American Conference on Chinese Linguistics (NACCL-20). 2008. Volume 1. Edited by Marjorie K.M. Chan and Hana Kang. Columbus, Ohio: The Ohio State University. Pages 139-145.

More information

English Phonetics and Phonology. 1. Voiced and voiceless plosives. Voiced and voiceless plosives: Word-initial position

English Phonetics and Phonology. 1. Voiced and voiceless plosives. Voiced and voiceless plosives: Word-initial position English Phonetics and Phonology 1. Voiced and voiceless plosives Lecture 6: English consonants in detail KAMIYAMA, Takeki takeki.kamiyama@univ-paris8.fr Word-initial position Observe the consonant at the

More information

English Consonants - how can we classify them? Phonetics and Phonology. English Consonants - how can we classify them?

English Consonants - how can we classify them? Phonetics and Phonology. English Consonants - how can we classify them? English Consonants - how can we classify them? Phonetics and Phonology Lecture 7: English consonants in detail KAMIYAMA, Takeki takeki.kamiyama@univ-paris8.fr Three main properties: VOICE PLACE of articulation

More information

First Step Towards Enhancing Word Embeddings with Pitch Accents for DNN-based Slot Filling on Recognized Text

First Step Towards Enhancing Word Embeddings with Pitch Accents for DNN-based Slot Filling on Recognized Text First Step Towards Enhancing Word Embeddings with Pitch Accents for DNN-based Slot Filling on Recognized Text Sabrina Stehwien, Ngoc Thang Vu IMS, University of Stuttgart March 16, 2017 Slot Filling sequential

More information

A repetition-based framework for lyric alignment in popular songs

A repetition-based framework for lyric alignment in popular songs A repetition-based framework for lyric alignment in popular songs ABSTRACT LUONG Minh Thang and KAN Min Yen Department of Computer Science, School of Computing, National University of Singapore We examine

More information

Brain-Computer Interface (BCI)

Brain-Computer Interface (BCI) Brain-Computer Interface (BCI) Christoph Guger, Günter Edlinger, g.tec Guger Technologies OEG Herbersteinstr. 60, 8020 Graz, Austria, guger@gtec.at This tutorial shows HOW-TO find and extract proper signal

More information

Automatic Piano Music Transcription

Automatic Piano Music Transcription Automatic Piano Music Transcription Jianyu Fan Qiuhan Wang Xin Li Jianyu.Fan.Gr@dartmouth.edu Qiuhan.Wang.Gr@dartmouth.edu Xi.Li.Gr@dartmouth.edu 1. Introduction Writing down the score while listening

More information

The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval

The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval The MAMI Query-By-Voice Experiment Collecting and annotating vocal queries for music information retrieval IPEM, Dept. of musicology, Ghent University, Belgium Outline About the MAMI project Aim of the

More information

LING 202 Lecture outline W Sept 5. Today s topics: Types of sound change Expressing sound changes Change as misperception

LING 202 Lecture outline W Sept 5. Today s topics: Types of sound change Expressing sound changes Change as misperception LING 202 Lecture outline W Sept 5 Today s topics: Types of sound change Expressing sound changes Change as misperception 1 Discussion: Group work from last time Take the list of stronger and weaker sounds

More information

Sonority as a Primitive: Evidence from Phonological Inventories Ivy Hauser University of North Carolina

Sonority as a Primitive: Evidence from Phonological Inventories Ivy Hauser  University of North Carolina Sonority as a Primitive: Evidence from Phonological Inventories Ivy Hauser (ihauser@live.unc.edu, www.unc.edu/~ihauser/) University of North Carolina at Chapel Hill West Coast Conference on Formal Linguistics,

More information

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas

Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications. Matthias Mauch Chris Cannam György Fazekas Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications Matthias Mauch Chris Cannam György Fazekas! 1 Matthias Mauch, Chris Cannam, George Fazekas Problem Intonation in Unaccompanied

More information

The odds of eternal optimization in OT

The odds of eternal optimization in OT The odds of eternal optimization in OT Paul Boersma, University of Amsterdam http://www.fon.hum.uva.nl/paul/ December 13, 2000 It is often suggested that if all sound change were due to optimizations of

More information

Retrieval of textual song lyrics from sung inputs

Retrieval of textual song lyrics from sung inputs INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Retrieval of textual song lyrics from sung inputs Anna M. Kruspe Fraunhofer IDMT, Ilmenau, Germany kpe@idmt.fraunhofer.de Abstract Retrieving the

More information

A New "Duration-Adapted TR" Waveform Capture Method Eliminates Severe Limitations

A New Duration-Adapted TR Waveform Capture Method Eliminates Severe Limitations 31 st Conference of the European Working Group on Acoustic Emission (EWGAE) Th.3.B.4 More Info at Open Access Database www.ndt.net/?id=17567 A New "Duration-Adapted TR" Waveform Capture Method Eliminates

More information

Computer Coordination With Popular Music: A New Research Agenda 1

Computer Coordination With Popular Music: A New Research Agenda 1 Computer Coordination With Popular Music: A New Research Agenda 1 Roger B. Dannenberg roger.dannenberg@cs.cmu.edu http://www.cs.cmu.edu/~rbd School of Computer Science Carnegie Mellon University Pittsburgh,

More information

homework solutions for: Homework #4: Signal-to-Noise Ratio Estimation submitted to: Dr. Joseph Picone ECE 8993 Fundamentals of Speech Recognition

homework solutions for: Homework #4: Signal-to-Noise Ratio Estimation submitted to: Dr. Joseph Picone ECE 8993 Fundamentals of Speech Recognition INSTITUTE FOR SIGNAL AND INFORMATION PROCESSING homework solutions for: Homework #4: Signal-to-Noise Ratio Estimation submitted to: Dr. Joseph Picone ECE 8993 Fundamentals of Speech Recognition May 3,

More information

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes

Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu

More information

Pitfalls and Windfalls in Corpus Studies of Pop/Rock Music

Pitfalls and Windfalls in Corpus Studies of Pop/Rock Music Introduction Hello, my talk today is about corpus studies of pop/rock music specifically, the benefits or windfalls of this type of work as well as some of the problems. I call these problems pitfalls

More information

A Phonetic Analysis of Natural Laughter, for Use in Automatic Laughter Processing Systems

A Phonetic Analysis of Natural Laughter, for Use in Automatic Laughter Processing Systems A Phonetic Analysis of Natural Laughter, for Use in Automatic Laughter Processing Systems Jérôme Urbain and Thierry Dutoit Université de Mons - UMONS, Faculté Polytechnique de Mons, TCTS Lab 20 Place du

More information

Experiments on musical instrument separation using multiplecause

Experiments on musical instrument separation using multiplecause Experiments on musical instrument separation using multiplecause models J Klingseisen and M D Plumbley* Department of Electronic Engineering King's College London * - Corresponding Author - mark.plumbley@kcl.ac.uk

More information

Sonority as a Primitive: Evidence from Phonological Inventories

Sonority as a Primitive: Evidence from Phonological Inventories Sonority as a Primitive: Evidence from Phonological Inventories 1. Introduction Ivy Hauser University of North Carolina at Chapel Hill The nature of sonority remains a controversial subject in both phonology

More information

Acoustic Prosodic Features In Sarcastic Utterances

Acoustic Prosodic Features In Sarcastic Utterances Acoustic Prosodic Features In Sarcastic Utterances Introduction: The main goal of this study is to determine if sarcasm can be detected through the analysis of prosodic cues or acoustic features automatically.

More information

Semi-automated extraction of expressive performance information from acoustic recordings of piano music. Andrew Earis

Semi-automated extraction of expressive performance information from acoustic recordings of piano music. Andrew Earis Semi-automated extraction of expressive performance information from acoustic recordings of piano music Andrew Earis Outline Parameters of expressive piano performance Scientific techniques: Fourier transform

More information

Plosive voicing acoustics and voice quality in Yerevan Armenian

Plosive voicing acoustics and voice quality in Yerevan Armenian Plosive voicing acoustics and voice quality in Yerevan Armenian Scott Seyfarth and Marc Garellek Abstract Yerevan Armenian is a variety of Eastern Armenian with a three-way voicing contrast that includes

More information

However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene

However, in studies of expressive timing, the aim is to investigate production rather than perception of timing, that is, independently of the listene Beat Extraction from Expressive Musical Performances Simon Dixon, Werner Goebl and Emilios Cambouropoulos Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 Vienna, Austria.

More information

Singer Traits Identification using Deep Neural Network

Singer Traits Identification using Deep Neural Network Singer Traits Identification using Deep Neural Network Zhengshan Shi Center for Computer Research in Music and Acoustics Stanford University kittyshi@stanford.edu Abstract The author investigates automatic

More information

CS229 Project Report Polyphonic Piano Transcription

CS229 Project Report Polyphonic Piano Transcription CS229 Project Report Polyphonic Piano Transcription Mohammad Sadegh Ebrahimi Stanford University Jean-Baptiste Boin Stanford University sadegh@stanford.edu jbboin@stanford.edu 1. Introduction In this project

More information

Sarcasm Detection in Text: Design Document

Sarcasm Detection in Text: Design Document CSC 59866 Senior Design Project Specification Professor Jie Wei Wednesday, November 23, 2016 Sarcasm Detection in Text: Design Document Jesse Feinman, James Kasakyan, Jeff Stolzenberg 1 Table of contents

More information

Speech To Song Classification

Speech To Song Classification Speech To Song Classification Emily Graber Center for Computer Research in Music and Acoustics, Department of Music, Stanford University Abstract The speech to song illusion is a perceptual phenomenon

More information

Syllabling on instrument imitation: case study and computational segmentation method

Syllabling on instrument imitation: case study and computational segmentation method Syllabling on instrument imitation: case study and computational segmentation method Jordi Janer Music Technology Group, Pompeu Fabra University, Barcelona jjaner at iua.upf.edu - http://www.mtg.upf.edu

More information

AUD 6306 Speech Science

AUD 6306 Speech Science AUD 3 Speech Science Dr. Peter Assmann Spring semester 2 Role of Pitch Information Pitch contour is the primary cue for tone recognition Tonal languages rely on pitch level and differences to convey lexical

More information

Transcription of the Singing Melody in Polyphonic Music

Transcription of the Singing Melody in Polyphonic Music Transcription of the Singing Melody in Polyphonic Music Matti Ryynänen and Anssi Klapuri Institute of Signal Processing, Tampere University Of Technology P.O.Box 553, FI-33101 Tampere, Finland {matti.ryynanen,

More information

Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues

Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues Kate Park katepark@stanford.edu Annie Hu anniehu@stanford.edu Natalie Muenster ncm000@stanford.edu Abstract We propose detecting

More information

Automatic Laughter Segmentation. Mary Tai Knox

Automatic Laughter Segmentation. Mary Tai Knox Automatic Laughter Segmentation Mary Tai Knox May 22, 2008 Abstract Our goal in this work was to develop an accurate method to identify laughter segments, ultimately for the purpose of speaker recognition.

More information

Using Genre Classification to Make Content-based Music Recommendations

Using Genre Classification to Make Content-based Music Recommendations Using Genre Classification to Make Content-based Music Recommendations Robbie Jones (rmjones@stanford.edu) and Karen Lu (karenlu@stanford.edu) CS 221, Autumn 2016 Stanford University I. Introduction Our

More information

Video-based Vibrato Detection and Analysis for Polyphonic String Music

Video-based Vibrato Detection and Analysis for Polyphonic String Music Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International

More information

Lecture 9 Source Separation

Lecture 9 Source Separation 10420CS 573100 音樂資訊檢索 Music Information Retrieval Lecture 9 Source Separation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw Music & Audio Computing Lab, Research

More information

PSYCHOLOGICAL AND CROSS-CULTURAL EFFECTS ON LAUGHTER SOUND PRODUCTION Marianna De Benedictis Università di Bari

PSYCHOLOGICAL AND CROSS-CULTURAL EFFECTS ON LAUGHTER SOUND PRODUCTION Marianna De Benedictis Università di Bari PSYCHOLOGICAL AND CROSS-CULTURAL EFFECTS ON LAUGHTER SOUND PRODUCTION Marianna De Benedictis marianna_de_benedictis@hotmail.com Università di Bari 1. ABSTRACT The research within this paper is intended

More information

Expressive performance in music: Mapping acoustic cues onto facial expressions

Expressive performance in music: Mapping acoustic cues onto facial expressions International Symposium on Performance Science ISBN 978-94-90306-02-1 The Author 2011, Published by the AEC All rights reserved Expressive performance in music: Mapping acoustic cues onto facial expressions

More information

Load Frequency Control Structure for Ireland and Northern Ireland

Load Frequency Control Structure for Ireland and Northern Ireland Load Frequency Control Structure for Ireland and Northern Ireland EirGrid TSO & TSO consultation on a proposal for the determination of LFC blocks in accordance with Article 141(2) of the Commission Regulation

More information

AN ACOUSTIC-PHONETIC APPROACH TO VOCAL MELODY EXTRACTION

AN ACOUSTIC-PHONETIC APPROACH TO VOCAL MELODY EXTRACTION 12th International Society for Music Information Retrieval Conference (ISMIR 2011) AN ACOUSTIC-PHONETIC APPROACH TO VOCAL MELODY EXTRACTION Yu-Ren Chien, 1,2 Hsin-Min Wang, 2 Shyh-Kang Jeng 1,3 1 Graduate

More information

Further Topics in MIR

Further Topics in MIR Tutorial Automatisierte Methoden der Musikverarbeitung 47. Jahrestagung der Gesellschaft für Informatik Further Topics in MIR Meinard Müller, Christof Weiss, Stefan Balke International Audio Laboratories

More information

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis

Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Automatic characterization of ornamentation from bassoon recordings for expressive synthesis Montserrat Puiggròs, Emilia Gómez, Rafael Ramírez, Xavier Serra Music technology Group Universitat Pompeu Fabra

More information

Topics in Computer Music Instrument Identification. Ioanna Karydi

Topics in Computer Music Instrument Identification. Ioanna Karydi Topics in Computer Music Instrument Identification Ioanna Karydi Presentation overview What is instrument identification? Sound attributes & Timbre Human performance The ideal algorithm Selected approaches

More information

A Beat Tracking System for Audio Signals

A Beat Tracking System for Audio Signals A Beat Tracking System for Audio Signals Simon Dixon Austrian Research Institute for Artificial Intelligence, Schottengasse 3, A-1010 Vienna, Austria. simon@ai.univie.ac.at April 7, 2000 Abstract We present

More information

Experiments with Fisher Data

Experiments with Fisher Data Experiments with Fisher Data Gunnar Evermann, Bin Jia, Kai Yu, David Mrva Ricky Chan, Mark Gales, Phil Woodland May 16th 2004 EARS STT Meeting May 2004 Montreal Overview Introduction Pre-processing 2000h

More information

Narrative Theme Navigation for Sitcoms Supported by Fan-generated Scripts

Narrative Theme Navigation for Sitcoms Supported by Fan-generated Scripts Narrative Theme Navigation for Sitcoms Supported by Fan-generated Scripts Gerald Friedland, Luke Gottlieb, Adam Janin International Computer Science Institute (ICSI) Presented by: Katya Gonina What? Novel

More information

Automatic Labelling of tabla signals

Automatic Labelling of tabla signals ISMIR 2003 Oct. 27th 30th 2003 Baltimore (USA) Automatic Labelling of tabla signals Olivier K. GILLET, Gaël RICHARD Introduction Exponential growth of available digital information need for Indexing and

More information

Timbre perception

Timbre perception Harvard-MIT Division of Health Sciences and Technology HST.725: Music Perception and Cognition Prof. Peter Cariani Timbre perception www.cariani.com Timbre perception Timbre: tonal quality ( pitch, loudness,

More information

OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES

OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES OBJECTIVE EVALUATION OF A MELODY EXTRACTOR FOR NORTH INDIAN CLASSICAL VOCAL PERFORMANCES Vishweshwara Rao and Preeti Rao Digital Audio Processing Lab, Electrical Engineering Department, IIT-Bombay, Powai,

More information

Automatic music transcription

Automatic music transcription Music transcription 1 Music transcription 2 Automatic music transcription Sources: * Klapuri, Introduction to music transcription, 2006. www.cs.tut.fi/sgn/arg/klap/amt-intro.pdf * Klapuri, Eronen, Astola:

More information

Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection

Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection Kadir A. Peker, Ajay Divakaran, Tom Lanning Mitsubishi Electric Research Laboratories, Cambridge, MA, USA {peker,ajayd,}@merl.com

More information

Acoustic and musical foundations of the speech/song illusion

Acoustic and musical foundations of the speech/song illusion Acoustic and musical foundations of the speech/song illusion Adam Tierney, *1 Aniruddh Patel #2, Mara Breen^3 * Department of Psychological Sciences, Birkbeck, University of London, United Kingdom # Department

More information

Phonology. Submission of papers

Phonology. Submission of papers Phonology Phonology is concerned with all aspects of phonology and related disciplines. Each volume contains three issues, published in May, August and December. Preference is given to papers which make

More information

MANDARIN SINGING VOICE SYNTHESIS BASED ON HARMONIC PLUS NOISE MODEL AND SINGING EXPRESSION ANALYSIS

MANDARIN SINGING VOICE SYNTHESIS BASED ON HARMONIC PLUS NOISE MODEL AND SINGING EXPRESSION ANALYSIS MANDARIN SINGING VOICE SYNTHESIS BASED ON HARMONIC PLUS NOISE MODEL AND SINGING EXPRESSION ANALYSIS Ju-Chiang Wang Hung-Yan Gu Hsin-Min Wang Institute of Information Science, Academia Sinica Dept. of Computer

More information

Powerful Software Tools and Methods to Accelerate Test Program Development A Test Systems Strategies, Inc. (TSSI) White Paper.

Powerful Software Tools and Methods to Accelerate Test Program Development A Test Systems Strategies, Inc. (TSSI) White Paper. Powerful Software Tools and Methods to Accelerate Test Program Development A Test Systems Strategies, Inc. (TSSI) White Paper Abstract Test costs have now risen to as much as 50 percent of the total manufacturing

More information

Projektseminar: Sentimentanalyse Dozenten: Michael Wiegand und Marc Schulder

Projektseminar: Sentimentanalyse Dozenten: Michael Wiegand und Marc Schulder Projektseminar: Sentimentanalyse Dozenten: Michael Wiegand und Marc Schulder Präsentation des Papers ICWSM A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews

More information

CS 1674: Intro to Computer Vision. Intro to Recognition. Prof. Adriana Kovashka University of Pittsburgh October 24, 2016

CS 1674: Intro to Computer Vision. Intro to Recognition. Prof. Adriana Kovashka University of Pittsburgh October 24, 2016 CS 1674: Intro to Computer Vision Intro to Recognition Prof. Adriana Kovashka University of Pittsburgh October 24, 2016 Plan for today Examples of visual recognition problems What should we recognize?

More information

Computational analysis of rhythmic aspects in Makam music of Turkey

Computational analysis of rhythmic aspects in Makam music of Turkey Computational analysis of rhythmic aspects in Makam music of Turkey André Holzapfel MTG, Universitat Pompeu Fabra, Spain hannover@csd.uoc.gr 10 July, 2012 Holzapfel et al. (MTG/UPF) Rhythm research in

More information

TOWARDS IMPROVING ONSET DETECTION ACCURACY IN NON- PERCUSSIVE SOUNDS USING MULTIMODAL FUSION

TOWARDS IMPROVING ONSET DETECTION ACCURACY IN NON- PERCUSSIVE SOUNDS USING MULTIMODAL FUSION TOWARDS IMPROVING ONSET DETECTION ACCURACY IN NON- PERCUSSIVE SOUNDS USING MULTIMODAL FUSION Jordan Hochenbaum 1,2 New Zealand School of Music 1 PO Box 2332 Wellington 6140, New Zealand hochenjord@myvuw.ac.nz

More information

Automatic music transcription

Automatic music transcription Educational Multimedia Application- Specific Music Transcription for Tutoring An applicationspecific, musictranscription approach uses a customized human computer interface to combine the strengths of

More information

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES

MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES MUSICAL INSTRUMENT IDENTIFICATION BASED ON HARMONIC TEMPORAL TIMBRE FEATURES Jun Wu, Yu Kitano, Stanislaw Andrzej Raczynski, Shigeki Miyabe, Takuya Nishimoto, Nobutaka Ono and Shigeki Sagayama The Graduate

More information

Topic 10. Multi-pitch Analysis

Topic 10. Multi-pitch Analysis Topic 10 Multi-pitch Analysis What is pitch? Common elements of music are pitch, rhythm, dynamics, and the sonic qualities of timbre and texture. An auditory perceptual attribute in terms of which sounds

More information

Laughter and Topic Transition in Multiparty Conversation

Laughter and Topic Transition in Multiparty Conversation Laughter and Topic Transition in Multiparty Conversation Emer Gilmartin, Francesca Bonin, Carl Vogel, Nick Campbell Trinity College Dublin {gilmare, boninf, vogel, nick}@tcd.ie Abstract This study explores

More information

Automatic Construction of Synthetic Musical Instruments and Performers

Automatic Construction of Synthetic Musical Instruments and Performers Ph.D. Thesis Proposal Automatic Construction of Synthetic Musical Instruments and Performers Ning Hu Carnegie Mellon University Thesis Committee Roger B. Dannenberg, Chair Michael S. Lewicki Richard M.

More information

MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC

MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC MELODIC AND RHYTHMIC CONTRASTS IN EMOTIONAL SPEECH AND MUSIC Lena Quinto, William Forde Thompson, Felicity Louise Keating Psychology, Macquarie University, Australia lena.quinto@mq.edu.au Abstract Many

More information

CHAPTER 8 CONCLUSION AND FUTURE SCOPE

CHAPTER 8 CONCLUSION AND FUTURE SCOPE 124 CHAPTER 8 CONCLUSION AND FUTURE SCOPE Data hiding is becoming one of the most rapidly advancing techniques the field of research especially with increase in technological advancements in internet and

More information

MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION

MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION MEASURING LOUDNESS OF LONG AND SHORT TONES USING MAGNITUDE ESTIMATION Michael Epstein 1,2, Mary Florentine 1,3, and Søren Buus 1,2 1Institute for Hearing, Speech, and Language 2Communications and Digital

More information

Behavioral and neural identification of birdsong under several masking conditions

Behavioral and neural identification of birdsong under several masking conditions Behavioral and neural identification of birdsong under several masking conditions Barbara G. Shinn-Cunningham 1, Virginia Best 1, Micheal L. Dent 2, Frederick J. Gallun 1, Elizabeth M. McClaine 2, Rajiv

More information

Seminar CHIST-ERA Istanbul : 4 March 2014 Kick-off meeting : 27 January 2014 (call IUI 2012)

Seminar CHIST-ERA Istanbul : 4 March 2014 Kick-off meeting : 27 January 2014 (call IUI 2012) project JOKER JOKe and Empathy of a Robot/ECA: Towards social and affective relations with a robot Seminar CHIST-ERA Istanbul : 4 March 2014 Kick-off meeting : 27 January 2014 (call IUI 2012) http://www.chistera.eu/projects/joker

More information

Note : Answer all questions.

Note : Answer all questions. I BEGE-102/EEG-02 I BACHELOR'S DEGREE PROGRAMME O Term-End Examination %-1 December, 2009 C\J ELECTIVE COURSE-ENGLISH BEGE-102/EEG-02 : THE STRUCTURE OF MODERN ENGLISH Time : 3 hours Maximum Marks : 100

More information

A Transfer Learning Based Feature Extractor for Polyphonic Sound Event Detection Using Connectionist Temporal Classification

A Transfer Learning Based Feature Extractor for Polyphonic Sound Event Detection Using Connectionist Temporal Classification INTERSPEECH 17 August, 17, Stockholm, Sweden A Transfer Learning Based Feature Extractor for Polyphonic Sound Event Detection Using Connectionist Temporal Classification Yun Wang and Florian Metze Language

More information

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca

More information

Drum Source Separation using Percussive Feature Detection and Spectral Modulation

Drum Source Separation using Percussive Feature Detection and Spectral Modulation ISSC 25, Dublin, September 1-2 Drum Source Separation using Percussive Feature Detection and Spectral Modulation Dan Barry φ, Derry Fitzgerald^, Eugene Coyle φ and Bob Lawlor* φ Digital Audio Research

More information

IMPROVED MELODIC SEQUENCE MATCHING FOR QUERY BASED SEARCHING IN INDIAN CLASSICAL MUSIC

IMPROVED MELODIC SEQUENCE MATCHING FOR QUERY BASED SEARCHING IN INDIAN CLASSICAL MUSIC IMPROVED MELODIC SEQUENCE MATCHING FOR QUERY BASED SEARCHING IN INDIAN CLASSICAL MUSIC Ashwin Lele #, Saurabh Pinjani #, Kaustuv Kanti Ganguli, and Preeti Rao Department of Electrical Engineering, Indian

More information

Multimodal databases at KTH

Multimodal databases at KTH Multimodal databases at David House, Jens Edlund & Jonas Beskow Clarin Workshop The QSMT database (2002): Facial & Articulatory motion Clarin Workshop Purpose Obtain coherent data for modelling and animation

More information

Instructions for producing camera-ready manuscript using MS-Word for publication in conference proceedings *

Instructions for producing camera-ready manuscript using MS-Word for publication in conference proceedings * Instructions for producing camera-ready manuscript using MS-Word for publication in conference proceedings * First Author and Second Author University Department, University Name, Address City, State ZIP/Zone,

More information

WAKE-UP-WORD SPOTTING FOR MOBILE SYSTEMS. A. Zehetner, M. Hagmüller, and F. Pernkopf

WAKE-UP-WORD SPOTTING FOR MOBILE SYSTEMS. A. Zehetner, M. Hagmüller, and F. Pernkopf WAKE-UP-WORD SPOTTING FOR MOBILE SYSTEMS A. Zehetner, M. Hagmüller, and F. Pernkopf Graz University of Technology Signal Processing and Speech Communication Laboratory, Austria ABSTRACT Wake-up-word (WUW)

More information

Mixed Linear Models. Case studies on speech rate modulations in spontaneous speech. LSA Summer Institute 2009, UC Berkeley

Mixed Linear Models. Case studies on speech rate modulations in spontaneous speech. LSA Summer Institute 2009, UC Berkeley Mixed Linear Models Case studies on speech rate modulations in spontaneous speech LSA Summer Institute 2009, UC Berkeley Florian Jaeger University of Rochester Managing speech rate How do speakers determine

More information

1. Introduction NCMMSC2009

1. Introduction NCMMSC2009 NCMMSC9 Speech-to-Singing Synthesis System: Vocal Conversion from Speaking Voices to Singing Voices by Controlling Acoustic Features Unique to Singing Voices * Takeshi SAITOU 1, Masataka GOTO 1, Masashi

More information

Brian C. J. Moore Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England

Brian C. J. Moore Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England Asymmetry of masking between complex tones and noise: Partial loudness Hedwig Gockel a) CNBH, Department of Physiology, University of Cambridge, Downing Street, Cambridge CB2 3EG, England Brian C. J. Moore

More information

D. BARD, J. NEGREIRA DIVISION OF ENGINEERING ACOUSTICS, LUND UNIVERSITY

D. BARD, J. NEGREIRA DIVISION OF ENGINEERING ACOUSTICS, LUND UNIVERSITY Room Acoustics (1) D. BARD, J. NEGREIRA DIVISION OF ENGINEERING ACOUSTICS, LUND UNIVERSITY Outline Room acoustics? Parameters Summary D. Bard, J. Negreira / May 2018 Basics All our life happens (mostly)

More information

Zero Crossover Dynamic Power Synchronization Technology Overview

Zero Crossover Dynamic Power Synchronization Technology Overview Technical Note Zero Crossover Dynamic Power Synchronization Technology Overview Background Engineers have long recognized the power benefits of zero crossover (Figure 1) over phase angle (Figure 2) power

More information

Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting

Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting Dalwon Jang 1, Seungjae Lee 2, Jun Seok Lee 2, Minho Jin 1, Jin S. Seo 2, Sunil Lee 1 and Chang D. Yoo 1 1 Korea Advanced

More information

Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues

Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues Kate Park, Annie Hu, Natalie Muenster Email: katepark@stanford.edu, anniehu@stanford.edu, ncm000@stanford.edu Abstract We propose

More information

The Human Features of Music.

The Human Features of Music. The Human Features of Music. Bachelor Thesis Artificial Intelligence, Social Studies, Radboud University Nijmegen Chris Kemper, s4359410 Supervisor: Makiko Sadakata Artificial Intelligence, Social Studies,

More information

User-Specific Learning for Recognizing a Singer s Intended Pitch

User-Specific Learning for Recognizing a Singer s Intended Pitch User-Specific Learning for Recognizing a Singer s Intended Pitch Andrew Guillory University of Washington Seattle, WA guillory@cs.washington.edu Sumit Basu Microsoft Research Redmond, WA sumitb@microsoft.com

More information