A real time study of plosives in Glaswegian using an automatic measurement algorithm
1 A real time study of plosives in Glaswegian using an automatic measurement algorithm. Jane Stuart-Smith, Tamara Rathcke, Morgan Sonderegger. University of Glasgow; University of Kent; McGill University. NWAV42, Pittsburgh, October 2013
2 Outline: Background (the voicing contrast in Scottish English); Methodology (Glasgow real time corpus; automatic phonetic measurement; improving the algorithm; algorithm performance); Preliminary results
3 Background: Scottish English is typically observed to show voiceless plosives with shorter aspiration than Southern varieties of English (e.g. Wells 1982).
4 Background: Docherty et al. (2011), Scottish English border: 159 speakers; 4 locations (Scottish/English; East/West); 4662 tokens of voiced and voiceless plosives; read wordlists.
5 Background: Docherty et al. (2011): younger speakers showed longer aspiration, measured as Voice Onset Time (VOT), and less prevoicing than older speakers. Apparent time change? Physiological constraints?
6 Background: Docherty et al. (2011): Scottish speakers at the Eastern end of the Border (Eyemouth) showed shorter aspiration/VOT than speakers at the Western end (Gretna). Eyemouth speakers also show more Scottish features (rhoticity; SVLR). Fine-grained aspects of plosive production are subject to subtle sociolinguistic control (cf. phonetic imitation studies, e.g. Nielsen 2011).
8 Research Question: Is the voicing contrast in plosives changing in real time in Scottish English? This requires a sample of different ages recorded at different points in time, and a sufficient number of tokens. Hand labelling VOT in spontaneous speech is very time consuming!
9 Fine phonetic variation and sound change: A real time study of Glaswegian, Oct 2011 to Sept 2014
10 A real time corpus of Glaswegian vernacular: ideal structure (decade of recording by age group)
1970s: Old 6 m, 6 f; Middle aged 6 m, 6 f; Young 6 m, 6 f
1980s: Old 6 m, 6 f; Middle aged 6 m, 6 f; Young 6 m, 6 f
1990s: Old 6 m, 6 f; Middle aged 6 m, 6 f; Young 6 m, 6 f
2000s: Old 6 m, 6 f; Middle aged 6 m, 6 f; Young 6 m, 6 f
11 Sample for this paper (decade of recording by age group)
1970s: Old 2 f; Middle aged 2 f; Young 2 f
1980s: none
1990s: none
2000s: Old 2 f; Middle aged 2 f; Young 2 f
Young 10 15
12 Sample for this paper (decade of recording by age group)
1970s: Old 2 f (sociolinguistic interview; oral history interview); Middle aged 2 f (sociolinguistic interview); Young 2 f (sociolinguistic interview)
1980s: none
1990s: none
2000s: Old 2 f (oral history); Middle aged 2 f (conversation); Young 2 f (conversation)
Sources (with thanks): Labov; Macaulay; M74 Project; Glasgow Media Project
13 Corpus for this study: LaBB-CAT (Fromont and Hay; previously ONZE Miner): storage of time-aligned transcripts; detailed contextualized searches; preliminary segmentation by forced alignment using HTK in LaBB-CAT.
15 Methodology: plosives: voiceless /p t k/; voiced /b d g/; stressed, syllable initial. Automatic measurement algorithm: Positive VOT (voiceless plosives; voiced plosives (partial): release = burst + frication); Negative VOT; Closure duration.
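The three measures named on this slide can be illustrated with a small sketch. It assumes each plosive has already been annotated with three time points in seconds; the `StopToken` class and its field names are hypothetical and are not taken from the talk or from the authors' software.

```python
from dataclasses import dataclass


@dataclass
class StopToken:
    """Landmarks for one plosive, in seconds (hypothetical field names)."""
    closure_onset: float   # start of the oral closure
    burst_onset: float     # start of the release burst
    voicing_onset: float   # start of vocal fold vibration


def vot_ms(token: StopToken) -> float:
    """VOT in ms: positive if voicing starts after the burst, negative if before it."""
    return (token.voicing_onset - token.burst_onset) * 1000.0


def closure_duration_ms(token: StopToken) -> float:
    """Duration of the closure phase in ms, from closure onset to burst onset."""
    return (token.burst_onset - token.closure_onset) * 1000.0


# Invented example values, for illustration only.
aspirated = StopToken(closure_onset=0.100, burst_onset=0.160, voicing_onset=0.185)
prevoiced = StopToken(closure_onset=0.100, burst_onset=0.160, voicing_onset=0.120)
print(vot_ms(aspirated), vot_ms(prevoiced), closure_duration_ms(aspirated))
```

In this sketch a prevoiced stop simply yields a negative value, which is the usual sign convention for VOT.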
16 Automatic VOT measurement: Training: manually labeled VOTs are used to train a classifier. Goal: minimize VOT prediction error on unseen data. Classifier input, for a new stop: where to start looking for VOT (search boundary); 62 acoustic feature functions. Output: predicted VOT boundaries. Sonderegger & Keshet (2012), JASA; Henry, Sonderegger, Keshet (2012), Interspeech.
17 Feature functions: based on cues used by human annotators. Example: mean of high-frequency energy between burst and voicing onsets, minus its mean before the burst onset. Algorithm learns: high for a good burst/voicing onset pair, low otherwise.
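The example feature function can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes high-frequency energy has already been computed per analysis frame, and the function name, frame indexing and sample values are all hypothetical.

```python
from statistics import fmean


def hf_energy_contrast(hf_energy, burst, voicing):
    """Mean high-frequency energy between the burst and voicing onsets,
    minus its mean before the burst onset.

    hf_energy: per-frame high-frequency energy (assumed precomputed)
    burst, voicing: candidate onset positions as frame indices
    """
    if not 0 < burst < voicing <= len(hf_energy):
        raise ValueError("need 0 < burst < voicing <= number of frames")
    return fmean(hf_energy[burst:voicing]) - fmean(hf_energy[:burst])


# Invented frames: 4 of closure, 3 of burst/aspiration, 3 of vowel.
frames = [0.1, 0.1, 0.1, 0.1, 0.9, 0.8, 0.9, 0.3, 0.3, 0.3]
good_pair = hf_energy_contrast(frames, burst=4, voicing=7)
bad_pair = hf_energy_contrast(frames, burst=2, voicing=9)
print(good_pair > bad_pair)
```

A well-placed onset pair brackets only the noisy release, so the feature is larger than for a misplaced pair. The algorithm described in the talk combines 62 such functions with learned weights, which this sketch does not attempt.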
18 Previous results: Positive VOT. On 4 datasets: Trainable: optimal performance with examples. Accurate: performance near intertranscriber agreement. [Table: intertranscriber vs auto/manual agreement for Switchboard and Big Brother at 2 ms, 5 ms and 10 ms; values not preserved.] Sonderegger & Keshet (2012), JASA.
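One way to read the 2 ms, 5 ms and 10 ms columns is as tolerance thresholds: the share of tokens whose automatic and manual VOT differ by no more than that amount. That reading is an assumption, and the sketch below uses invented measurements rather than any result from the talk.

```python
def fraction_within(auto_ms, manual_ms, tolerance_ms):
    """Share of tokens whose automatic and manual VOT differ by at most tolerance_ms."""
    if not auto_ms or len(auto_ms) != len(manual_ms):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for a, m in zip(auto_ms, manual_ms) if abs(a - m) <= tolerance_ms)
    return hits / len(auto_ms)


# Invented VOT values in ms, for illustration only.
auto_vot = [22.0, 48.0, 61.0, 15.0]
manual_vot = [21.0, 52.0, 75.0, 15.5]
agreement = {tol: fraction_within(auto_vot, manual_vot, tol) for tol in (2, 5, 10)}
print(agreement)
```

The same function applied to two human annotators' measurements would give the intertranscriber figure to compare against.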
19 Procedure: Training data: 100 tokens for 5 speakers. First round of manual correction: Code 1: correct; Code 2: close, worth manually correcting; Codes 3-8: completely wrong. Algorithm altered. Another round of manual correction.
20-22 Manual correction (all plosives, n = 4491). [Figure: stacked percentage bars (0% to 100%) of correction Codes 1 to 8, one bar per speaker (e.g. 70 O f01, 00 M f02), arranged by decade of birth. Annotations: Code 1: correct; Code 2: close and easily corrected; reasons given for the wrong codes: background noise, overlapping speakers, wrong forced alignment, strongly reduced, fricative or approximant, wrong but unclear why.]
23 Prediction results: N = 4491; 12 speakers. Code 1 (correct): 52%; Code 2 (close): 15%; Codes 3-8 (wrong): 33%.
24 Prediction results by voicing: Voiced: Code 1 (correct): 45%; Code 2 (close): 18%; Codes 3-8 (wrong): 37%. Voiceless: Code 1 (correct): 61%; Code 2 (close): 12%; Codes 3-8 (wrong): 25%.
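The three-way summary on these slides can be derived from per-token correction codes as sketched below. The grouping of codes follows the Procedure slide; the function name and the toy list of 100 codes are hypothetical and only chosen to mirror the overall percentages.

```python
from collections import Counter


def summarize_codes(codes):
    """Collapse correction codes 1-8 into rounded percentages of correct/close/wrong."""
    counts = Counter()
    for code in codes:
        if code == 1:
            counts["correct"] += 1
        elif code == 2:
            counts["close"] += 1
        elif 3 <= code <= 8:
            counts["wrong"] += 1
        else:
            raise ValueError(f"unexpected correction code: {code}")
    total = sum(counts.values())
    return {label: round(100 * counts[label] / total)
            for label in ("correct", "close", "wrong")}


# Toy data: 100 tokens, not the real corpus.
toy_codes = [1] * 52 + [2] * 15 + [3] * 20 + [8] * 13
summary = summarize_codes(toy_codes)
print(summary)
```

Running the same function separately on voiced and voiceless tokens would give a by-voicing breakdown of the kind shown on slide 24.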
25 Preliminary results. [Figure: voiced vs voiceless plosives; n = 3012; effect of voicing significant (p value not preserved).]
26 Voiced plosives. [Figure: /b/, /d/, /g/; 1970s vs 2000s; n = 1669.] Release phase may be getting longer: very short = burst; longer = VOT.
27-29 Voiceless plosives: /p/. [Figure: panels OLD, MIDDLE AGED, YOUNG; 1970s vs 2000s; p and n values not preserved.]
30 Voiceless plosives: /t/. [Figure: panels OLD, MIDDLE AGED, YOUNG; 1970s vs 2000s; p and n values not preserved.]
31 Voiceless plosives: /k/. [Figure: panels OLD, MIDDLE AGED, YOUNG; 1970s vs 2000s; p and n values not preserved.]
32 Discussion: Methodology: large number of tokens (6125, of which 3012 usable) processed in a short time; 52% correct, close to previous results in Sonderegger and Keshet (2012) for Switchboard/Big Brother; voiced plosives need more parameters; promising for sociolinguistic analysis.
33 Discussion: Preliminary results: Real time change? The voicing contrast is robust; shift in phonetic realization from voicing to VOT/aspiration? Age grading? No consistency in VOT duration according to age group; some younger speakers show much shorter VOTs than much older speakers (and vice versa).
34 Next steps: Improve the algorithm for voiced plosives: positive VOT; negative VOT; closure duration; % voicing during closure. More speakers.
35 GULP GLASGOW UNIVERSITY LABORATORY OF PHONETICS Feedback gratefully received Jane.Stuart
More informationBrian C. J. Moore Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England
Asymmetry of masking between complex tones and noise: Partial loudness Hedwig Gockel a) CNBH, Department of Physiology, University of Cambridge, Downing Street, Cambridge CB2 3EG, England Brian C. J. Moore
More informationD. BARD, J. NEGREIRA DIVISION OF ENGINEERING ACOUSTICS, LUND UNIVERSITY
Room Acoustics (1) D. BARD, J. NEGREIRA DIVISION OF ENGINEERING ACOUSTICS, LUND UNIVERSITY Outline Room acoustics? Parameters Summary D. Bard, J. Negreira / May 2018 Basics All our life happens (mostly)
More informationZero Crossover Dynamic Power Synchronization Technology Overview
Technical Note Zero Crossover Dynamic Power Synchronization Technology Overview Background Engineers have long recognized the power benefits of zero crossover (Figure 1) over phase angle (Figure 2) power
More informationAutomatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting
Automatic Commercial Monitoring for TV Broadcasting Using Audio Fingerprinting Dalwon Jang 1, Seungjae Lee 2, Jun Seok Lee 2, Minho Jin 1, Jin S. Seo 2, Sunil Lee 1 and Chang D. Yoo 1 1 Korea Advanced
More informationLaughbot: Detecting Humor in Spoken Language with Language and Audio Cues
Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues Kate Park, Annie Hu, Natalie Muenster Email: katepark@stanford.edu, anniehu@stanford.edu, ncm000@stanford.edu Abstract We propose
More informationThe Human Features of Music.
The Human Features of Music. Bachelor Thesis Artificial Intelligence, Social Studies, Radboud University Nijmegen Chris Kemper, s4359410 Supervisor: Makiko Sadakata Artificial Intelligence, Social Studies,
More informationUser-Specific Learning for Recognizing a Singer s Intended Pitch
User-Specific Learning for Recognizing a Singer s Intended Pitch Andrew Guillory University of Washington Seattle, WA guillory@cs.washington.edu Sumit Basu Microsoft Research Redmond, WA sumitb@microsoft.com
More information