Automatic transcription is not neutral. Wyke Stommel, Tom Koole, Tessa van Charldorp, Sandra van Dulmen and Antal van den Bosch ADVANT

Automated annotation and analysis. Tom Koole Wyke Stommel Tessa van Charldorp Antal van den Bosch Sandra van Dulmen

Video is exploding. 300 hours of video are uploaded to YouTube alone every minute.

Automatic and manual annotation/transcription

Transcript in conversation analysis (CA)
- Transcripts and video examined in conjunction
- Transcription conventions developed over 45 years (Jefferson)
- Including pauses, overlap, intonation, breathing (in and out), clicks, laughing, crying, etc.
- Increasingly, embodied behavior in the transcript (Mondada)

Example CA transcript
01 Nancy: = I don know it sounds kinda cra:zy =
02 Hyla: = hh [hhhh] =
03 Nancy: [bu: t] =
04 Hyla: = Jista liddle.
05 Nancy: We: : : ll,
06 (0.3)
07 Nancy: e may me feel bet[ter anywa (h) y] =
08 Hyla: [nhhhhhhhhhhhh] =
09 (Hyla): = hk hhhhh
10 (0.4)
11 Nancy: So:.
12 ( )
13 Nancy: W[hat time, ] eh hnh] =

Technology for the benefit of transcription?
- Accelerates the process (speech recognition, image recognition?)
- Possible to work with large(r) corpora
- Objective measurement (e.g., silences)
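The "objective measurement" of silences mentioned above can be sketched as follows. This is a minimal illustration, not the authors' tooling: it assumes a recognizer that returns word-level timestamps as `(word, start, end)` tuples, and the names `silences`, `with_pauses` and the `min_gap` threshold are hypothetical. Note that the threshold itself is a choice made by the analyst.

```python
from typing import List, Tuple

# Assumed (hypothetical) recognizer output: (word, start_seconds, end_seconds).
Word = Tuple[str, float, float]


def silences(words: List[Word], min_gap: float = 0.2) -> List[Tuple[int, float]]:
    """Return (index of preceding word, gap in seconds) for every gap >= min_gap."""
    gaps = []
    for i in range(len(words) - 1):
        gap = words[i + 1][1] - words[i][2]  # next start minus current end
        if gap >= min_gap:
            gaps.append((i, round(gap, 1)))
    return gaps


def with_pauses(words: List[Word], min_gap: float = 0.2) -> str:
    """Render the words with CA-style timed pauses such as '(0.7)'."""
    gap_after = dict(silences(words, min_gap))
    out = []
    for i, (word, _start, _end) in enumerate(words):
        out.append(word)
        if i in gap_after:
            out.append(f"({gap_after[i]:.1f})")
    return " ".join(out)


# Invented timings, for illustration only.
example = [
    ("did", 0.0, 0.2),
    ("you", 0.2, 0.35),
    ("see", 0.35, 0.6),
    ("it", 0.6, 0.8),
    ("and", 1.5, 1.7),
]
print(with_pauses(example))
```

A measurement like this is only as good as the word boundaries the recognizer reports, and it says nothing about who is silent or what the silence does in the interaction.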

Technology is not neutral
1) Theory-driven
2) Shows restricted set of aspects of interaction
3) Steers research questions/agendas

Ochs 1979
Transcription is theory-driven:
- Transcriptions are the researcher's data
- Transcription is a selective process reflecting theoretical goals and definitions
=> Automatic transcription: theory- and technology-driven

1A. [NB-Assassination1:00:01:30:AUTO]
69 oh god long week
70 oh my god
71 i've decided sober i want you to have a t. v.
72
73 i won't either
73.5 (0.7)
74 like uh you know (0.1) that's where they
75 we took off on our charter flight that same spot
76 did you see it
77 (0.8)
78 and they took him and here uh you
79 know i wouldn't
80 watch it
81 i think it's so ridiculous i mean it's (0.4) it's a horrible
82 thing but my god (0.1) play up that's thing it's it's (.)
83 horrible
84 die people that
84.5 (0.3)
85 why is it a native american people think well they're no good
85.5 (0.5)
86 well they aren't very good some of

1B. [NB-Assassination1:00:01:30:JEFFERSON]
69 Lot: Oh: Go:d a lo:ng wee[k. Yeah.]
70 Emm: [O h : my] God
71 I'm (.) glad it's over I won't even turn the teevee
72 o[n.
73 Lot: [I won eether.
74 Emm: aoh no. They drag it out so THAT'S WHERE THEY
75 WE TOOK OFF on ar chartered flight that sa:me spot
76 didju see it
77 (0.7)
78 Emm: hh when they took him in[the airpla:ne,]
79 Lot: [n : N o : : :. ] Hell I wouldn ev n
80 wa:tch it.
81 Lot: I think it's so ridiculous. I mean it's hhh it's a hôrrible
82 thing but my: Go:d. play up that thing it it's jst
83 hôrri[b l e. ]
84 Emm: [It'll] drive people nu:ts.
85 Lot: Why id ï-en makes Americ n people think why ther no goo:d.
86 Emm: Mm: Well they aren't very good some of m,

IBM Attila speech recognition: poor audio from the 1960s (Moore 2015)
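The gap between the automatic and the Jefferson version can be made explicit with a word-level alignment. The sketch below uses Python's standard `difflib`; the function name `word_diff` is hypothetical, and the manual line has been reduced to plain orthography by hand ("nu:ts" becomes "nuts", speaker label and brackets dropped), which is itself a simplification that discards exactly the detail a CA transcript is meant to keep.

```python
import difflib
from typing import List, Tuple


def word_diff(auto: str, manual: str) -> List[Tuple[str, List[str], List[str]]]:
    """List the word spans where the two transcripts disagree."""
    a = auto.lower().split()
    m = manual.lower().split()
    matcher = difflib.SequenceMatcher(a=a, b=m, autojunk=False)
    return [
        (tag, a[i1:i2], m[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Line 84 of the extract; the manual side is an assumed orthographic reduction.
auto_84 = "die people that"
manual_84 = "it'll drive people nuts"

for tag, auto_words, manual_words in word_diff(auto_84, manual_84):
    print(tag, auto_words, "->", manual_words)
```

Such a diff only covers lexical mismatches. Overlap, stretched sounds, emphasis and speaker identification are not represented in the automatic output at all, so they cannot show up as differences.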

Technology for transcription
Steers research questions/agendas (Bolden 2015)
- Favouring work on high-quality recordings
- Favouring text-search RQs (lexical items, discourse markers) as opposed to overlap, phonetic aspects, silences, etc.
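The kind of text-search research question that an automatic transcript favours might look like the sketch below. It counts a few lexical items and discourse markers in orthographic lines taken from the automatic extract, with the timed pauses left out; the marker list and the function name `count_markers` are illustrative assumptions, not part of the original study.

```python
import re
from collections import Counter
from typing import Iterable, List

# A few lines from the automatic transcript, as plain orthographic text.
lines = [
    "oh god long week",
    "oh my god",
    "like uh you know that's where they",
    "well they aren't very good some of",
]

# Hypothetical set of items to search for.
MARKERS = ["oh", "well", "you know", "uh"]


def count_markers(lines: Iterable[str], markers: List[str]) -> Counter:
    """Count whole-word occurrences of each marker across all lines."""
    lines = list(lines)
    counts = Counter()
    for marker in markers:
        pattern = re.compile(r"\b" + re.escape(marker) + r"\b", re.IGNORECASE)
        for line in lines:
            counts[marker] += len(pattern.findall(line))
    return counts


print(count_markers(lines, MARKERS))
```

The search is trivial to run over a large corpus, which is the attraction. It has no access to overlap, phonetic detail or silences, so questions about those aspects cannot even be asked of this data.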

Not neutral but useful?
Bolden 2015: "Going from an automatically produced transcript with its missing speaker identifications, arbitrary line segmentation, word identification errors, etc. to even a simple orthographic transcript where these shortcomings are corrected appears to be a very time-consuming task, without the analytic payoffs of the careful listening required for producing a CA transcript. It is, of course, possible that future versions of this software will address some of these problems and make automated transcription more cost effective."

Thank you