Bridging the Gap Between Humans and Machines: Lessons from Spoken Language Prof. Roger K. Moore

Transcription:

Slide 1: Bridging the Gap Between Humans and Machines: Lessons from Spoken Language
Prof. Roger K. Moore
Chair of Spoken Language Processing, Dept. Computer Science, University of Sheffield
(Visiting Prof., Dept. Phonetics, University College London)
(Visiting Prof., Bristol Robotics Lab.)
EU-FP7-EASEL
DIGIHUM-2017, Helsinki, 4th May 2017

Slide 2: Rich History of Technological Development
Von Kempelen's talking machine (1791)
Radio Rex (1922)
Parametric Artificial Talker (1953)
Speak 'n Spell (1983)
Interactive Talking Doll (1987)

Slide 3: Rich History of Technological Development
Marconi SR128 (1982)
Apple's Siri (2011)
Dragon Naturally Speaking (1997)
Voice dictation on SmartPhone (2007)

Slide 4: Rich History of Technological Development
Apple's Siri (2011)
Speech-to-Speech Translation

Slide 5: Rich History of Technological Development
Amazon Echo (2015)
Speech-to-Speech Translation
Google Home (2016)

Slide 6: Amazing Progress

Slides 7-9: Amazing Progress
Command and Control Systems
Dictation Systems
Interactive Voice Response (IVR) Systems
Voice-Enabled Personal Assistants
Embodied Conversational Agents (ECAs)
Autonomous Social Agents

Slide 10: A Glimpse of the Future

Slide 11: Are We There Yet?
Moore, R. K., Li, H., & Liao, S.-H. (2016). Progress and prospects for spoken language technology: what ordinary people think. INTERSPEECH (pp. 3007-3011). San Francisco, CA.

Slide 12: Are We There Yet?

Slide 13: Are We There Yet?

Slide 14: What's the Problem?
Variable, Ambiguous, Meaningful, Emotional!, Contaminated
Examples: "I do not know" / "I dn uh"; "fork handles" / "four candles"; "This nudist play will wreck a nice beach"

Slide 15: What's the Problem?
[Figure: graph with axes Usability and Flexibility; labels: Structured Dialog, Add NL/Dialog, Habitability Gap, Like a Human. Graph courtesy of Mike Phillips (CEO, Mobeus Corporation).]

Slide 16: Masahiro Mori

Slides 18-19: The State-of-the-Art
There is steady year-on-year technical progress
Recent years have seen significant market penetration and public awareness
Improvements come from:
    increase in available computer power
    corpus-driven modelling (deep learning)
    public benchmark testing
Progress has not come about as a result of deep insights into human spoken language
Spoken language technology is:
    fragile (in real conditions)
    expensive (to port to new applications / languages)
    shallow (it doesn't understand language)

Slide 20: Standard SLP Architecture
Introduction and Overview of W3C Speech Interface Framework
http://www.w3.org/tr/voice-intro/

Slide 21: Standard SLP Architecture
Behaviourist: STIMULUS, RESPONSE
Introduction and Overview of W3C Speech Interface Framework
http://www.w3.org/tr/voice-intro/

Slide 22: What is Language?
Cummins, F. (2011). Periodic and aperiodic synchronization in skilled action. Frontiers in Human Neuroscience, 5(170), 1-9.

Slide 23: What is Language?
Ostensive-Inferential Recursive Mind-Reading
Scott-Phillips, T. (2015). Speaking Our Minds: Why human communication is different, and how language evolved to make it special. London, New York: Palgrave MacMillan.

Slide 24: Human-Human Languaging = Ostensive-Inferential Recursive Mind-Reading
Moore, R. K. (2016). Introducing a pictographic language for envisioning a rich variety of enactive systems with different degrees of complexity. Int. J. Advanced Robotic Systems, 13(74).

Slide 25: Human-Agent Languaging = Ostensive-Inferential Recursive Mind-Reading
Moore, R. K. (2016). Introducing a pictographic language for envisioning a rich variety of enactive systems with different degrees of complexity. Int. J. Advanced Robotic Systems, 13(74).

Slide 26: What is Language Like?
Cummins, F. (2011). Periodic and aperiodic synchronization in skilled action. Frontiers in Human Neuroscience, 5(170), 1-9.

Slide 27: Houston, we (may) have a problem
Spoken language interaction between human beings is founded on shared experiences, representations and priors
The assumption of continuity between a fully coded communication system at one end, and language at the other, is simply not justified.
So, is there a fundamental limit to the language-based interaction that can take place between mismatched partners?
Moore, R. K. (2016). Is spoken language all-or-nothing? Implications for future speech-based human-machine interaction. In K. Jokinen & G. Wilcock (Eds.), Dialogues with Social Robots: Enablements, Analyses, and Evaluation. Springer Lecture Notes in Electrical Engineering (LNEE).

Slides 28-29: Getting it Right
Wired: Do you think it's possible to bridge the uncanny valley?
Mori: Yes, but why try? I think it's better to design things like Honda's Asimo, which stops right before it gets to be uncanny.

Slide 30: Getting it Right
http://consequentialrobotics.com/miro/

Slide 31: Getting it Right
http://www.dcs.shef.ac.uk/~roger/MarkowitzCh12manuscript.pdf
Moore, R. K. (2015). From talking and listening robots to intelligent communicative machines. In J. Markowitz (Ed.), Robots That Talk and Listen (pp. 317-335). Boston, MA: De Gruyter.

Slide 32: A Glimpse of the Future?

Slide 33: Thank You
Any questions?
http://www.dcs.shef.ac.uk/~roger

Slide 34: VIHAR-2017
1st International Workshop on Vocal Interactivity in-and-between Humans, Animals and Robots
25-26 August 2017, University of Skövde, Sweden
http://vihar-2017.vihar.org