Pre-Translation for Neural Machine Translation

Jan Niehues, Eunah Cho, Thanh-Le Ha and Alex Waibel
KIT - Institute for Anthropomatics and
2016-12-15
KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu

Motivation
- Neural machine translation sets the state of the art
  - End-to-end neural network approach to machine translation
- Comparison to SMT
  - Significant improvements
    - Automatic metrics
    - Manual evaluation
  - More fluent translation

Motivation
- NMT has different problems
  - Small vocabulary
  - Problems translating rare words
      English:     the goalie parried
      NMT:         der Gott
      NMT (gloss): the god
- Combine SMT and NMT
- Simplify the task of NMT

Outline
- Motivation
- MT approaches
- Idea
- Pipeline
- Mixed Input
- Evaluation
- Conclusion

Statistical Machine Translation (SMT)
- Build translations from blocks of source and target words (phrase pairs)
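To make the idea of building a translation from phrase pairs concrete, here is a minimal sketch. The phrase table entries and the function name `translate_monotone` are hypothetical, and the greedy left-to-right matching is a simplification: a real phrase-based decoder scores many segmentations and reorderings with translation and language models, which this toy does not attempt.

```python
# Toy phrase table (hypothetical entries, for illustration only).
PHRASE_TABLE = {
    ("the", "goalie"): ["der", "Torwart"],
    ("the",): ["der"],
    ("parried",): ["pariert"],
}


def translate_monotone(source_tokens, phrase_table, max_len=3):
    """Greedy longest-match, left-to-right phrase translation (no reordering)."""
    output = []
    i = 0
    while i < len(source_tokens):
        # Try the longest source phrase starting at position i first.
        for n in range(min(max_len, len(source_tokens) - i), 0, -1):
            phrase = tuple(source_tokens[i:i + n])
            if phrase in phrase_table:
                output.extend(phrase_table[phrase])
                i += n
                break
        else:
            # No phrase pair covers this word: copy it through unchanged.
            output.append(source_tokens[i])
            i += 1
    return output


print(translate_monotone("the goalie parried".split(), PHRASE_TABLE))
```

Adding a new entry to `PHRASE_TABLE` is enough to cover a new word or phrase, with no retraining.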

Neural Machine Translation (NMT)
- Neural network to predict the most probable target sequence
- Jointly train model
- Large improvements in translation quality

Neural Machine Translation (NMT)
- Fixed vocabulary size
- Byte pair encoding (Sennrich et al. 2016)
  - Represent all words with n sub-words
  - Start with character representation
  - Join most common bi-gram sequence to new symbol
- Example:
    the_goalie_parried
    t h e _ g o a l ie _ p a r r ie d
    t h e _ g o a l ie _ p a r r ied
    t h e _ g o a l ie _ pa r r ied
    the _ go al ie _ par ried
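The merge procedure described above can be sketched in a few lines. This is a toy version under stated assumptions: it works on a list of separate words rather than on a sentence with `_` boundaries, breaks ties by first occurrence, and the function names (`learn_bpe`, `apply_merge`, `segment`) are illustrative and not taken from any particular toolkit.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` by one joined symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge operations from a list of words."""
    # Start with the character representation of every word.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs (bi-grams), weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Join the most common bi-gram into a new symbol.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into sub-words by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


# Replaying the first two merges from the slide example on "parried":
print(segment("parried", [("i", "e"), ("ie", "d")]))
```

The number of merge operations fixes the vocabulary size, and because segmentation can always fall back to single characters, no input word is out of vocabulary.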

Difference SMT/NMT
- SMT:
  - Handle large vocabulary
  - Easily extensible
    - Add translation via new phrase pairs
- NMT:
  - Joint model
  - Long context
  - Better generalization due to word embeddings

Pre-Translation
- Combine advantages of both approaches
  - Exploit the advantages of SMT
  - Successful combination of other approaches
- Idea: Use SMT as input to NMT
  - Encode words using byte pair encoding
  - Use translation of words not in NMT vocabulary

Related Work
- Combination of SMT and rule-based MT (Dugast et al., 2007; Simard et al., 2007)
- Automatic post-editing (Junczys-Dowmunt and Grundkiewicz, 2016)
- Preprocessing for PBMT
  - Compound splitting
  - Pre-reordering
- Handling of rare words in NMT (Luong et al., 2014; Sennrich et al., 2015)

Pipeline
- Input: Source sentence
- Translate using PBMT
- Translate from PBMT German to German using NMT
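The data flow of the pipeline variant can be sketched as a composition of two translators. Both `toy_pbmt` and `toy_nmt_de_de` are hypothetical stand-ins (a word lookup and an identity function), not the systems used in the talk. The point is only that the second system receives the PBMT output and never sees the English source.

```python
from typing import Callable, List

Translator = Callable[[List[str]], List[str]]


def pipeline_translate(source: List[str], pbmt: Translator, nmt_de_de: Translator) -> List[str]:
    """Pipeline variant: PBMT translates first, then NMT rewrites the PBMT output."""
    pbmt_german = pbmt(source)      # English -> PBMT German
    return nmt_de_de(pbmt_german)   # PBMT German -> German


def toy_pbmt(tokens: List[str]) -> List[str]:
    # Hypothetical word-by-word lookup standing in for a real PBMT system.
    lexicon = {"the": "der", "goalie": "Torwart", "parried": "pariert"}
    return [lexicon.get(t, t) for t in tokens]


def toy_nmt_de_de(tokens: List[str]) -> List[str]:
    # Hypothetical stand-in for the German-to-German NMT model: returns its input.
    return list(tokens)


print(pipeline_translate("the goalie parried".split(), toy_pbmt, toy_nmt_de_de))
```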

Mixed Input
- Input: Source sentence
- Translate using PBMT
- Combine source and PBMT translation
- Translate joined text using NMT

Mixed Input
- Implementation:
  - Join source sentence and PBMT translation
      the goalie der Torwart
  - RNN states encode source and PBMT translation
  - Language-specific word embeddings
      E_the E_goalie D_der D_Torwart
  - BPE for word encoding
      E_the E_go E_al E_ie D_der D_Tor D_wart
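A minimal sketch of how such a joined input sequence could be assembled. It assumes the sub-word segmentation has already been applied and that the language marker is added afterwards as a prefix on every sub-word, which is what the `E_go E_al E_ie` example suggests. The helper names `tag` and `build_mixed_input` are hypothetical.

```python
def tag(tokens, prefix):
    """Mark each token with a language prefix so each language gets its own embeddings."""
    return [f"{prefix}_{t}" for t in tokens]


def build_mixed_input(source_subwords, pbmt_subwords):
    """Concatenate source and PBMT sub-words into one NMT input sequence."""
    return tag(source_subwords, "E") + tag(pbmt_subwords, "D")


mixed = build_mixed_input(["the", "go", "al", "ie"], ["der", "Tor", "wart"])
print(" ".join(mixed))
```

Because the prefixes keep the two vocabularies disjoint, a string that exists in both languages is looked up as two different symbols.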

Training
- Training data:
  - Parallel corpus
  - PBMT translation of corpus
- Problem: PBMT tends to overfit on the training data
  - Filter singletons from phrase table
  - Successfully used in other models
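Singleton filtering can be sketched as a count threshold. This assumes that a singleton is a phrase pair extracted exactly once from the training data, and that the extracted pairs are available as a flat list. The function name and the sample pairs are hypothetical.

```python
from collections import Counter


def filter_singletons(extracted_pairs):
    """Keep only phrase pairs that were extracted more than once."""
    counts = Counter(extracted_pairs)
    return {pair: c for pair, c in counts.items() if c > 1}


# Hypothetical extracted phrase pairs (source phrase, target phrase).
pairs = [
    ("the goalie", "der Torwart"),
    ("the goalie", "der Torwart"),
    ("the goalie parried with his knee", "der Torwart pariert mit seinem Knie"),
]
print(filter_singletons(pairs))
```

Long pairs seen only once let PBMT reproduce its own training sentences almost verbatim. Dropping them makes the PBMT output on the training data look more like its output on unseen text.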

Experiments
- Training data: WMT EN-DE data
- PBMT: In-house translation system
- NMT: Nematus
  - BPE with 40K operations

Results English - German
System                   Dev/Valid           Test
                         tst2014   tst2015   tst2016
NMT                      20.79     23.34     27.65
NMT Ensemble             21.42     24.03     28.89
PBMT                     19.76     21.80     26.42
Advanced PBMT            21.62     23.34     28.13
Pipeline                 20.56     22.04     26.75
Pipeline Advanced        21.76     22.92     27.61
Mix                      21.88     24.11     28.04
Mix Advanced             22.53     24.37     29.62
Mix Advanced Ensemble    23.16     25.35     30.67
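To read the gains off the table, the short script below subtracts a baseline row from a system row. The scores are copied from the results above, while the dictionary layout and the `gain` helper are only an illustration.

```python
TEST_SETS = ("tst2014", "tst2015", "tst2016")

# Scores copied from the results table.
SCORES = {
    "NMT": (20.79, 23.34, 27.65),
    "NMT Ensemble": (21.42, 24.03, 28.89),
    "Pipeline Advanced": (21.76, 22.92, 27.61),
    "Mix Advanced": (22.53, 24.37, 29.62),
    "Mix Advanced Ensemble": (23.16, 25.35, 30.67),
}


def gain(system, baseline):
    """Score difference (system minus baseline) per test set, rounded to two decimals."""
    return {
        t: round(s - b, 2)
        for t, s, b in zip(TEST_SETS, SCORES[system], SCORES[baseline])
    }


print(gain("Mix Advanced", "NMT"))
print(gain("Mix Advanced Ensemble", "NMT Ensemble"))
print(gain("Pipeline Advanced", "NMT"))
```

The mixed-input systems score higher than the corresponding NMT baselines on all three sets, whereas the advanced pipeline beats single NMT only on tst2014.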

Result by Word Frequency [figure]

Examples
English:     Then with a shot which the goalie parried with his knee in the 35th minute.
PBMT:        Dann mit einem Schuss, die der Torwart pariert mit seinem Knie in der 35. Minute.
NMT:         Dann mit einem Schuss, den der Gott mit seinem Knie in der 35. Minute.
Pre:         Dann mit einem Schuss, das der Torwart mit seinem Knie in der 35. Minute pariert.
Pre (gloss): Then with a shot, that the goalie with his knee in the 35th minute parried.

Examples
English:     ... a riot in the stadium.
PBMT:        ... einen Aufruhr im Stadion.
NMT:         ... einen Riot im Stadion.
Pre:         ... einen Aufruhr im Station.
Pre (gloss): ... a riot in_the stadium.

Alignment [figure]

Conclusion
- Combine advantages of NMT and SMT
  - Improve handling of rare words
- Easy handling of different input streams
- Increase overall translation performance
- Further work: Do we need to do a full translation?

Thanks