Pre-Translation for Neural Machine Translation
  Jan Niehues, Eunah Cho, Thanh-Le Ha and Alex Waibel
  KIT - Institute for Anthropomatics and
  2016-12-15
  KIT, University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
  www.kit.edu

Motivation
  Neural machine translation sets the state of the art
    End-to-end neural network approach to machine translation
  Comparison to SMT
    Significant improvements
      Automatic metrics
      Manual evaluation
    More fluent translations

Motivation
  NMT has different problems
    Small vocabulary
    Problems translating rare words
      English:     the goalie parried
      NMT:         der Gott
      NMT (gloss): the god
  Combine SMT and NMT
    Simplify the task of NMT

Outline
  Motivation
  MT approaches
  Idea
  Pipeline
  Mixed Input
  Evaluation
  Conclusion

Statistical Machine Translation (SMT)
  Build translations from blocks of source and target words (phrase pairs)

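As a rough illustration of building a translation from phrase pairs, the following minimal sketch covers the source left to right with the longest phrase found in a toy table. The table entries and the function name `translate_monotone` are hypothetical; a real phrase-based system additionally scores competing phrase pairs, allows reordering, and uses a language model, none of which is modelled here.

```python
# Toy phrase table (hypothetical entries, for illustration only).
PHRASE_TABLE = {
    ("the", "goalie"): ["der", "Torwart"],
    ("parried",): ["parierte"],
    ("the",): ["der"],
}


def translate_monotone(source_tokens, phrase_table, max_len=3):
    """Greedily cover the source, left to right, with the longest known phrase."""
    output, i = [], 0
    while i < len(source_tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(source_tokens) - i), 0, -1):
            phrase = tuple(source_tokens[i:i + n])
            if phrase in phrase_table:
                output.extend(phrase_table[phrase])
                i += n
                break
        else:
            # No phrase pair matches: copy the unknown word through unchanged.
            output.append(source_tokens[i])
            i += 1
    return output


translation = translate_monotone("the goalie parried".split(), PHRASE_TABLE)
print(" ".join(translation))
```

Adding a new entry to `PHRASE_TABLE` is enough to teach this toy system a new translation, which is the extensibility property that phrase-based systems are known for.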
Neural Machine Translation (NMT)
  Neural network predicts the most probable target sequence
  Jointly trained model
  Large improvements in translation quality

Neural Machine Translation (NMT)
  Fixed vocabulary size
  Byte pair encoding (Sennrich et al. 2016)
    Represent all words with n sub-words
    Start with character representation
    Merge the most frequent symbol bigram into a new symbol
    Example (successive merges):
      the_goalie_parried
      t h e _ g o a l ie _ p a r r ie d
      t h e _ g o a l ie _ p a r r ied
      t h e _ g o a l ie _ pa r r ied
      the _ go al ie _ par ried

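The merge procedure described above can be sketched in a few lines. This is a simplified illustration in the spirit of byte pair encoding, not the reference implementation: it works on a word-frequency dictionary, omits the end-of-word marker, and breaks ties between equally frequent pairs lexicographically. The function names and the toy frequencies are assumptions made for the example.

```python
from collections import Counter


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` by the concatenated symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Start from characters and apply `num_merges` merge operations."""
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        # Most frequent pair; ties are broken lexicographically (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


# Toy corpus statistics (made up for the example).
merges, vocab = learn_bpe({"the": 5, "goalie": 1, "parried": 1}, num_merges=3)
print(merges)
print(sorted(vocab))
```

With these toy counts the frequent word "the" collapses into a single symbol first, and "ie" is merged next because it occurs in both "goalie" and "parried".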
Difference SMT/NMT
  SMT:
    Handles large vocabulary
    Easily extensible
      Add translations via new phrase pairs
  NMT:
    Joint model
    Long context
    Better generalization due to word embeddings

Pre-Translation
  Combine advantages of both approaches
    Exploit the advantages of SMT
    Combinations of other approaches have been successful
  Idea: Use SMT output as input to NMT
    Encode words using byte pair encoding
    Use the SMT translation for words not in the NMT vocabulary

Related Work
  Combination of SMT and rule-based MT (Dugast et al., 2007; Simard et al., 2007)
  Automatic post-editing (Junczys-Dowmunt and Grundkiewicz, 2016)
  Preprocessing for PBMT
    Compound splitting
    Pre-reordering
  Handling of rare words in NMT (Luong et al., 2014; Sennrich et al., 2015)

Pipeline
  Input: source sentence
  Translate using PBMT
  Translate from PBMT German to German using NMT

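The data flow of the pipeline can be sketched as a composition of two translators. Everything named here (`make_pipeline`, `pipeline_training_pairs`, and the toy stand-in systems) is hypothetical; the sketch only assumes that the second system is trained on pairs of PBMT output and reference translation, so that it learns to map PBMT German to German.

```python
from typing import Callable, List, Tuple

Translator = Callable[[str], str]


def make_pipeline(pbmt: Translator, nmt_de_to_de: Translator) -> Translator:
    """Chain the two systems: source -> PBMT draft -> NMT output."""
    def translate(source: str) -> str:
        draft = pbmt(source)           # step 1: source sentence -> PBMT German
        return nmt_de_to_de(draft)     # step 2: PBMT German -> German
    return translate


def pipeline_training_pairs(
    parallel: List[Tuple[str, str]], pbmt: Translator
) -> List[Tuple[str, str]]:
    """Build (PBMT output, reference) pairs for training the second system."""
    return [(pbmt(source), reference) for source, reference in parallel]


# Toy stand-ins for the two systems (not real models).
def toy_pbmt(source: str) -> str:
    return {"the goalie parried": "der Torwart pariert"}.get(source, source)


def toy_nmt(draft: str) -> str:
    return draft.replace("pariert", "parierte")


translate = make_pipeline(toy_pbmt, toy_nmt)
pairs = pipeline_training_pairs(
    [("the goalie parried", "der Torwart parierte")], toy_pbmt
)
print(translate("the goalie parried"))
print(pairs)
```

In this setup the second system never sees the source sentence, only the PBMT draft, which is the limitation the mixed-input variant addresses.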
Mixed Input
  Input: source sentence
  Translate using PBMT
  Combine source and PBMT translation
  Translate joined text using NMT

Mixed Input
  Implementation:
    Join source sentence and PBMT translation
      the goalie der Torwart
    RNN states encode source and PBMT translation
    Language-specific word embeddings
      E_the E_goalie D_der D_Torwart
    BPE for word encoding
      E_the E_go E_al E_ie D_der D_Tor D_wart

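A minimal sketch of how such a joined input could be constructed: each token is split into sub-words and prefixed with a language marker, so that source and PBMT tokens map to separate embeddings. The helper names, the toy segmentation table, and the order of operations (segment first, then prefix) are assumptions chosen to reproduce the example above; the slides do not specify how the authors' system does it.

```python
# Hypothetical sub-word segmentations standing in for a trained BPE model.
TOY_SEGMENTS = {
    "goalie": ["go", "al", "ie"],
    "Torwart": ["Tor", "wart"],
}


def segment(token):
    """Split a token into sub-words; unknown tokens are kept whole."""
    return TOY_SEGMENTS.get(token, [token])


def tag(tokens, prefix):
    """Prefix each token with a language marker such as E_ or D_."""
    return [f"{prefix}_{token}" for token in tokens]


def mixed_input(source_tokens, pbmt_tokens):
    """Concatenate the tagged source and the tagged PBMT translation."""
    source_pieces = [p for token in source_tokens for p in segment(token)]
    pbmt_pieces = [p for token in pbmt_tokens for p in segment(token)]
    return tag(source_pieces, "E") + tag(pbmt_pieces, "D")


joined = mixed_input(["the", "goalie"], ["der", "Torwart"])
print(" ".join(joined))
```

Because the markers make the two vocabularies disjoint, a word that is spelled the same in both languages still receives two different embeddings.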
Training
  Training data:
    Parallel corpus
    PBMT translation of corpus
  Problem: PBMT tends to overfit on the training data
    Filter singletons from the phrase table
    Successfully used in other models

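Singleton filtering can be sketched as a count threshold over extracted phrase pairs. The sketch assumes that a singleton is a phrase pair extracted only once from the training data; the function name, the `min_count` parameter, and the toy pairs are hypothetical.

```python
from collections import Counter


def filter_singletons(pair_counts, min_count=2):
    """Keep only phrase pairs seen at least `min_count` times."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Toy list of extracted phrase pairs (made up for the example).
extracted = [
    ("the goalie", "der Torwart"),
    ("the goalie", "der Torwart"),
    ("parried with his knee", "pariert mit seinem Knie"),
]
pair_counts = Counter(extracted)
filtered = filter_singletons(pair_counts)
print(filtered)
```

Dropping pairs that occur only once removes long, memorized phrases, so the PBMT output on the training data looks more like its output on unseen text.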
Experiments
  Training data: WMT EN-DE data
  PBMT: in-house translation system
  NMT: Nematus
    BPE with 40K operations

Results English - German
  Evaluation sets (Dev/Valid, Test): tst2014, tst2015, tst2016

  System                  tst2014  tst2015  tst2016
  NMT                       20.79    23.34    27.65
  NMT Ensemble              21.42    24.03    28.89
  PBMT                      19.76    21.80    26.42
  Advanced PBMT             21.62    23.34    28.13
  Pipeline                  20.56    22.04    26.75
  Pipeline Advanced         21.76    22.92    27.61
  Mix                       21.88    24.11    28.04
  Mix Advanced              22.53    24.37    29.62
  Mix Advanced Ensemble     23.16    25.35    30.67

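To read the table more easily, the differences on tst2016 can be computed directly from the reported scores. The snippet below only does this arithmetic; the dictionary and function names are hypothetical, and the choice of baselines for the comparison is ours rather than the authors'.

```python
# tst2016 scores copied from the results table.
TST2016 = {
    "NMT": 27.65,
    "NMT Ensemble": 28.89,
    "Advanced PBMT": 28.13,
    "Mix": 28.04,
    "Mix Advanced": 29.62,
    "Mix Advanced Ensemble": 30.67,
}


def gain(system, baseline, scores=TST2016):
    """Score difference of `system` over `baseline`, rounded to two decimals."""
    return round(scores[system] - scores[baseline], 2)


single_gain = gain("Mix Advanced", "NMT")
ensemble_gain = gain("Mix Advanced Ensemble", "NMT Ensemble")
print(single_gain, ensemble_gain)
```

On these numbers the mixed-input system with the advanced PBMT input is about two points above the single NMT system, and the gap stays close to that size when both sides are ensembles.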
Result by Word Frequency
  [Figure]

Examples
  English:     Then with a shot which the goalie parried with his knee in the 35th minute.
  PBMT:        Dann mit einem Schuss, die der Torwart pariert mit seinem Knie in der 35. Minute.
  NMT:         Dann mit einem Schuss, den der Gott mit seinem Knie in der 35. Minute.
  Pre:         Dann mit einem Schuss, das der Torwart mit seinem Knie in der 35. Minute pariert.
  Pre (gloss): Then with a shot, that the goalie with his knee in the 35th minute parried.

Examples
  English:     ... a riot in the stadium.
  PBMT:        ... einen Aufruhr im Stadion.
  NMT:         ... einen Riot im Stadion.
  Pre:         ... einen Aufruhr im Station.
  Pre (gloss): ... a riot in_the stadium.

Alignment
  [Figure]

Conclusion
  Combine advantages of NMT and SMT
    Improved handling of rare words
    Easy handling of different input streams
    Increased overall translation performance
  Further work:
    Do we need to do a full translation?

Thanks