Searching for the Best Machine Translation Combination

Matīss Rikters Searching for the Best Machine Translation Combination Tartu, Estonia 22.03.2017


Page 1: Searching for the Best Machine Translation Combination

Matīss Rikters

Searching for the Best Machine Translation Combination

Tartu, Estonia

22.03.2017

Page 2: Searching for the Best Machine Translation Combination

Contents

Machine Translation

Hybrid Machine Translation

Methods I used

• A count-based language model for candidate selection from whole translations

• Combining translations of sentence chunks

• Combining translations of linguistically motivated chunks

• A character-level neural language model for candidate selection

A graphical implementation of the methods

Translation of multiword expressions

Other academic activities

Future plans

Page 3: Searching for the Best Machine Translation Combination

Machine Translation

• Machine translation (MT) is a sub-field of natural language processing that investigates the use of computers to translate text from one language to another

• Statistical MT (SMT) consists of subcomponents that are separately engineered to learn how to translate from vast amounts of translated text

• Rule-based MT (RBMT) is based on linguistic information covering the main semantic, morphological, and syntactic regularities of source and target languages

• Neural MT (NMT) consists of a large neural network in which weights are trained jointly to maximize the translation performance

Page 4: Searching for the Best Machine Translation Combination

Automatic Evaluation of MT: BLEU

• One of the first metrics to report high correlation with human judgments

• One of the most popular in the field

• The closer MT is to a professional human translation, the better it is

• Scores a translation on a scale of 0 to 100
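As a rough illustration of how a BLEU-style score on the 0 to 100 scale can be computed, here is a minimal sketch. It assumes a single reference, whitespace tokenization, n-grams up to length 4, and no smoothing; the function names are made up for this example, and published scores are normally computed at corpus level with a standard tool.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU sketch: clipped n-gram precision and brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Penalize hypotheses that are shorter than the reference.
    brevity = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)
```

A hypothesis identical to the reference scores 100, and the score drops as fewer n-grams match.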

Page 5: Searching for the Best Machine Translation Combination

Literature Review: Hybrid Machine Translation

Statistical rule generation

• Rules for RBMT systems are generated from training corpora

Multi-pass

• Process data through RBMT first, and then through SMT

Multi-system hybrid MT

• Multiple MT systems run in parallel

    • SMT + RBMT (Ahsan and Kolachina, 2010)

    • Confusion Networks (Barrault, 2010) + Neural Network Model (Freitag et al., 2015)

    • SMT + EBMT + TM + NE (Santanu et al., 2014)

    • Recursive sentence decomposition (Mellebeek et al., 2006)

Page 6: Searching for the Best Machine Translation Combination

Combining Translations

Combining whole translations

• Translate the full input sentence with multiple MT systems

• Choose the best translation as the output

Combining translations of sentence chunks

• Split the sentence into smaller chunks

• The chunks are the top-level subtrees of the syntax tree of the sentence

• Translate each chunk with multiple MT systems

• Choose the best translated chunks and combine them
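The two combination strategies can be sketched as follows. This is only an illustration under stated assumptions: the "systems" are hypothetical stand-in functions rather than real MT APIs, the scorer is a toy function where a lower value means a better candidate (as with language model perplexity), and the chunks are passed in ready-made instead of being taken from a syntax tree.

```python
from typing import Callable, Dict, List

Scorer = Callable[[str], float]      # lower score = better candidate
Translator = Callable[[str], str]    # hypothetical stand-in for an MT system

def select_best(candidates: List[str], score: Scorer) -> str:
    """Return the candidate with the lowest score (e.g. LM perplexity)."""
    return min(candidates, key=score)

def translate_whole(sentence: str, systems: Dict[str, Translator], score: Scorer) -> str:
    """Translate the full sentence with every system and keep the best output."""
    return select_best([translate(sentence) for translate in systems.values()], score)

def translate_chunks(chunks: List[str], systems: Dict[str, Translator], score: Scorer) -> str:
    """Pick the best translation per chunk, then join the winners in order."""
    return " ".join(translate_whole(chunk, systems, score) for chunk in chunks)

# Toy stand-ins (not real MT systems or a real language model).
systems = {
    "A": lambda text: f"A({text})",
    "B": lambda text: f"B({text})",
}

def toy_score(translation: str) -> float:
    # Pretend system A is better on short inputs and system B on long ones.
    if translation.startswith("A"):
        return float(len(translation))
    return 20.0

whole = translate_whole("in different languages .", systems, toy_score)
combined = translate_chunks(["Recently", "in different languages ."], systems, toy_score)
```

With these toy definitions, the chunk-level result mixes output from both stand-in systems, which is the point of selecting per chunk rather than per sentence.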

Page 7: Searching for the Best Machine Translation Combination

Candidate Selection

KenLM (Heafield, 2011) calculates probabilities based on the observed entry with the longest matching history $w_f^n$:

$$p(w_n \mid w_1^{n-1}) = p(w_n \mid w_f^{n-1}) \prod_{i=1}^{f-1} b(w_i^{n-1})$$

where the probability $p(w_n \mid w_f^{n-1})$ and the backoff penalties $b(w_i^{n-1})$ are given by an already-estimated language model. Perplexity is then calculated using this probability:

$$2^{-\frac{1}{N} \sum_{i=1}^{N} \log_2 q(x_i)}$$

where, given an unknown probability distribution $p$ and a proposed probability model $q$, the model is evaluated by determining how well it predicts a separate test sample $x_1, x_2, \dots, x_N$ drawn from $p$.
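The backoff lookup and the perplexity computation described above can be sketched in a few lines. The tables below are invented toy values in the spirit of an ARPA file (log10 probabilities and log10 backoff weights), not output of KenLM, and the sketch skips sentence boundary tokens and unknown-word handling. Perplexity does not depend on the logarithm base, so base 10 is used throughout.

```python
# Toy ARPA-style tables (hypothetical values): log10 probabilities and
# log10 backoff weights keyed by n-gram tuples.
LOG_PROB = {
    ("the",): -1.0, ("cat",): -2.0, ("sat",): -2.0,
    ("the", "cat"): -0.5, ("cat", "sat"): -0.3,
}
LOG_BACKOFF = {("the",): -0.2, ("cat",): -0.1, ("the", "cat"): -0.4}

def log_prob(word, history):
    """log10 p(word | history), using the longest matching history."""
    penalty = 0.0
    for start in range(len(history) + 1):
        context = tuple(history[start:])
        if context + (word,) in LOG_PROB:
            return LOG_PROB[context + (word,)] + penalty
        # Entry missing: pay the backoff weight of this context, then shorten it.
        penalty += LOG_BACKOFF.get(context, 0.0)
    raise KeyError(word)  # unseen word; a real model would map it to <unk>

def perplexity(tokens, order=3):
    """Perplexity of a token sequence under the toy model."""
    total = sum(
        log_prob(tokens[i], tokens[max(0, i - order + 1):i])
        for i in range(len(tokens))
    )
    return 10 ** (-total / len(tokens))
```

For candidate selection, each candidate translation would be scored this way and the one with the lowest perplexity kept.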

Page 8: Searching for the Best Machine Translation Combination

Whole Translations

Workflow, in order:

• Sentence tokenization

• Translation with online MT APIs (Google Translate, Bing Translator, LetsMT)

• Selection of the best translation

• Output

Page 9: Searching for the Best Machine Translation Combination

Chunks

Workflow, in order:

• Sentence tokenization

• Syntactic analysis

• Sentence chunking

• Translation with online MT APIs (Google Translate, Bing Translator, LetsMT)

• Selection of the best chunks

• Sentence recomposition

• Output

Page 10: Searching for the Best Machine Translation Combination

Linguistically Motivated Chunks

CICLing 2016

An advanced approach to chunking

• Traverse the syntax tree bottom-up, from right to left

• Add a word to the current chunk if

    • The current chunk is not too long (sentence word count / 4)

    • The word is non-alphabetic or only one symbol long

    • The word begins with a genitive phrase («of »)

• Otherwise, initialize a new chunk with the word

• When chunking results in too many chunks, repeat the process, allowing more words in a chunk (more than sentence word count / 4)

Candidate selection

• 12-gram LM trained with

    • KenLM

    • DGT-Translation Memory corpus (Steinberger, 2011): 3.1 million legal-domain sentences

• Sentences scored with the query program from KenLM

Test data

• 1581 random sentences from the JRC-Acquis corpus

• ACCURAT balanced evaluation corpus
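The chunking rules can be illustrated with a heavily simplified sketch that works on a flat token list instead of a syntax tree, so its output will differ from the tree-based method. It assumes that any one of the three conditions is enough to attach a word to the current chunk, and it introduces a hypothetical `max_chunks` parameter because the threshold for "too many chunks" is not stated here.

```python
def chunk_tokens(tokens, max_chunks=4):
    """Group tokens into chunks, scanning from right to left.

    Flat-token simplification: the real method walks a syntax tree.
    max_chunks is a hypothetical threshold for "too many chunks".
    """
    limit = max(1, len(tokens) // 4)  # initial chunk length: word count / 4
    while True:
        chunks, current = [], []
        for word in reversed(tokens):
            attach = (
                len(current) < limit        # current chunk is not too long
                or not word.isalpha()       # non-alphabetic, e.g. punctuation
                or len(word) == 1           # only one symbol long
                or word.lower() == "of"     # begins a genitive phrase
            )
            if current and not attach:
                chunks.append(current)      # close the chunk, start a new one
                current = []
            current.append(word)
        if current:
            chunks.append(current)
        if len(chunks) <= max_chunks or limit >= len(tokens):
            # Undo the right-to-left order of chunks and of words inside them.
            return [list(reversed(c)) for c in reversed(chunks)]
        limit += 1  # too many chunks: allow longer chunks and repeat

sentence = ("Recently there has been an increased interest in the automated "
            "discovery of equivalent expressions in different languages .")
chunks = chunk_tokens(sentence.split())
```

The retry loop mirrors the last rule: when the first pass yields more chunks than allowed, the length limit is relaxed and the pass is repeated.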

Page 11: Searching for the Best Machine Translation Combination

Linguistically Motivated Chunks

Simple chunks

• Recently

• there

• has been an increased interest in the automated discovery of equivalent expressions in different languages

• .

Linguistically motivated chunks

• Recently there has been an increased interest

• in the automated discovery of equivalent expressions

• in different languages .

Page 12: Searching for the Best Machine Translation Combination

Neural Language Models

[Figure: two plots of perplexity and BLEU against training epoch (0.11 to 1.77). The first plot shows Perplexity, BLEU-HY, and a linear trend line for BLEU-HY; the second shows Perplexity, BLEU, and a linear trend line for BLEU.]

Page 13: Searching for the Best Machine Translation Combination

Some Results

System                                                     BLEU
Whole translations – G+B (Rikters 2015)                    17.70
Simple chunks – G+B (Rikters and Skadiņa 2016a)            17.95
Linguistic chunks – G+B (Rikters and Skadiņa 2016b)        18.29
Linguistic chunks – G+B+H+Y (Rikters and Skadiņa 2016b)    19.21
+ Char-RNN neural language model (Rikters 2016d)           19.51

Baselines    BLEU
Bing         17.43
Google       17.63
Hugo.lv      17.14
Yandex       16.04

G = Google, B = Bing, H = Hugo.lv, Y = Yandex

Page 14: Searching for the Best Machine Translation Combination

Interactive MS MT (Rikters 2016a)

[Figure: screen flow of the interactive tool, with the following screens: Start page; Translate with online systems; Input translations to combine; Input translated chunks; Input source sentence; Settings; Translation results.]

Page 15: Searching for the Best Machine Translation Combination

Translation of Multi-Word Expressions (MWEs)

Monolingual MWE extraction and annotation: find and mark MWE candidates in corpora

• Pre-process monolingual texts with TreeTagger

• Extract MWE candidate lists from corpora

• Mark MWE candidates in text

MWE alignment

• Find translation equivalents for monolingual MWE candidates with MPAligner

SMT experiments

• Adding data to the parallel corpora

• Adding a second translation table

• Adding a sixth feature to the translation table

• Using the Jaccard index for translation probabilities

• Using a Levenshtein distance-based similarity metric for translation probabilities

Method                              BLEU
Baseline                            62.23
Baseline + MWE training data        62.10
Baseline + 2nd translation table    62.04
Baseline + 6th feature              62.37
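The two similarity measures named in the SMT experiments can be sketched as below. How exactly they were turned into translation probabilities is not described here, so this only shows the measures themselves; treating the inputs as sets of items and normalizing the edit distance by the longer string are assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty inputs are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)

def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (item_a != item_b),  # substitution or match
            ))
        previous = current
    return previous[-1]

def levenshtein_similarity(a, b):
    """Map edit distance to a 0..1 similarity (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
```

Both functions accept strings (compared character by character) or token lists (compared word by word), and both return 1.0 for identical inputs.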

Page 16: Searching for the Best Machine Translation Combination

MWEs in Neural Machine Translation

[Figure: training and validation data for English-Latvian and English-Czech. Labels as extracted: 2.5M, 1xMWE, 2.5M, 2xMWE, 5M, 2xMWE, 5M; 1M, 1xMWE, 1M, 2xMWE, 2M, 2xMWE, 0.5M.]

Page 17: Searching for the Best Machine Translation Combination

Publications

• Matīss Rikters. "Multi-system machine translation using online APIs for English-Latvian." The Fourth Workshop on Hybrid Approaches to Translation (2015)

• Matīss Rikters and Inguna Skadiņa. "Syntax-based multi-system machine translation." The 10th edition of the Language Resources and Evaluation Conference (2016a)

• Matīss Rikters and Inguna Skadiņa. "Combining machine translated sentence chunks from multiple MT systems." The 17th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2016 (2016b)

• Matīss Rikters. "K-translate – interactive multi-system machine translation." 12th International Baltic Conference on Databases and Information Systems (2016a)

• Matīss Rikters. "Searching for the Best Translation Combination Across All Possible Variants." The 7th Conference on Human Language Technologies - the Baltic Perspective (2016b)

• Matīss Rikters. "Interactive Multi-System Machine Translation with Neural Language Models." IOS Press Ebook (2016c)

• Matīss Rikters. "Neural Network Language Models for Candidate Scoring in Hybrid Multi-System Machine Translation." The Sixth Workshop on Hybrid Approaches to Translation (2016d)

Page 18: Searching for the Best Machine Translation Combination

Publications in Progress

• Matīss Rikters and Ondřej Bojar. "Handling Multi-Word Expressions in Neural Machine Translation"

Page 19: Searching for the Best Machine Translation Combination

Code on GitHub

http://ej.uz/ChunkMT

http://ej.uz/SyMHyT

http://ej.uz/MSMT

http://ej.uz/chunker

http://ej.uz/NeuralLM

Page 20: Searching for the Best Machine Translation Combination

Other Academic Activities

Teaching

• Supervised multiple course, qualification, and bachelor theses

• Average grade 8.67

• Student curator

Attended Summer / Winter Schools

• Machine Translation Marathon 2015

• Deep Learning For Machine Translation 2015

• ParseME 2nd Training School

• Neural Machine Translation Marathon 2016

Page 21: Searching for the Best Machine Translation Combination

Future Work

• Complete experiments and inspect results for English-Estonian

• Win WMT17 news translation task

    • At least for English-Latvian

    • At least beat Tilde

• Perform chunking on the target side

• Get chunks from dependency parses

• Complete PhD thesis draft

• Pass final exams

• Experiment with other types of LMs for candidate selection

    • Factored Language Models (POS tag + lemma)

    • Convolutional Neural Network Language Models

• Perform candidate selection using MT quality estimation

    • QuEst++ (Specia et al., 2015)

    • SHEF-NN (Shah et al., 2015)

Page 22: Searching for the Best Machine Translation Combination

References

Ahsan, A., and P. Kolachina. "Coupling Statistical Machine Translation with Rule-based Transfer and Generation." AMTA: The Ninth Conference of the Association for Machine Translation in the Americas. Denver, Colorado (2010).

Barrault, Loïc. "MANY: Open source machine translation system combination." The Prague Bulletin of Mathematical Linguistics 93 (2010): 147-155.

Heafield, Kenneth. "KenLM: Faster and smaller language model queries." Proceedings of the Sixth Workshop on Statistical Machine Translation. Association for Computational Linguistics, 2011.

Kim, Yoon, et al. "Character-aware neural language models." arXiv preprint arXiv:1508.06615 (2015).

Mellebeek, Bart, et al. "Multi-engine machine translation by recursive sentence decomposition." (2006).

Mikolov, Tomas, et al. "Recurrent neural network based language model." INTERSPEECH. Vol. 2. 2010.

Petrov, Slav, et al. "Learning accurate, compact, and interpretable tree annotation." Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2006.

Skadiņš, Raivis, Kārlis Goba, and Valters Šics. "Improving SMT for Baltic Languages with Factored Models." Proceedings of the Fourth International Conference Baltic HLT 2010, Frontiers in Artificial Intelligence and Applications, Vol. 219 (2010): 125-132.

Rikters, M., and I. Skadiņa. "Syntax-based multi-system machine translation." LREC 2016 (2016a).

Rikters, M., and I. Skadiņa. "Combining machine translated sentence chunks from multiple MT systems." CICLing 2016 (2016b).

Santanu, Pal, et al. "USAAR-DCU Hybrid Machine Translation System for ICON 2014." The Eleventh International Conference on Natural Language Processing, 2014.

Schwenk, Holger, Daniel Déchelotte, and Jean-Luc Gauvain. "Continuous space language models for statistical machine translation." Proceedings of the COLING/ACL on Main conference poster sessions. Association for Computational Linguistics, 2006.

Shah, Kashif, et al. "SHEF-NN: Translation Quality Estimation with Neural Networks." Proceedings of the Tenth Workshop on Statistical Machine Translation. 2015.

Specia, Lucia, G. Paetzold, and Carolina Scarton. "Multi-level Translation Quality Prediction with QuEst++." 53rd Annual Meeting of the Association for Computational Linguistics and Seventh International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing: System Demonstrations. 2015.

Steinberger, Ralf, et al. "DGT-TM: A freely available translation memory in 22 languages." arXiv preprint arXiv:1309.5226 (2013).

Steinberger, Ralf, et al. "The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages." arXiv preprint cs/0609058 (2006).

Page 23: Searching for the Best Machine Translation Combination

Aitäh! (Thank you!)