Document Context Neural Machine Translation with Memory Networks

Sameen Maruf, Gholamreza Haffari

Faculty of Information Technology

Monash University

July 17, 2017

Overview

1 Introduction

2 Document MT as Structured Prediction

3 Document NMT with MemNets

4 Experiments and Analysis

5 Conclusion

6 References

Introduction

Why document-level machine translation?

Most MT models translate sentences independently

Discourse phenomena are ignored, e.g. pronominal anaphora and lexical consistency, which may have long-range dependencies

Statistical MT attempts at document-level MT did not yield significant empirical improvements [Hardmeier and Federico, 2010, Gong et al., 2011, Garcia et al., 2014]

Previous context-aware NMT models use only local context and report deteriorated performance when using target-side context [Jean et al., 2017, Wang et al., 2017, Bawden et al., 2018]

We incorporate global source and target document contexts

Document MT as Structured Prediction

Two types of factors: $f_\theta(y_t; x_t, x_{-t})$ and $g_\theta(y_t; y_{-t})$

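Training objective: maximise $P(y_1, \ldots, y_{|d|} \mid x_1, \ldots, x_{|d|})$

$\Longrightarrow$ Maximise the pseudo-likelihood

$$\arg\max_{\theta} \prod_{t=1}^{|d|} P_\theta(y_t \mid x_t, y_{-t}, x_{-t}) \qquad (1)$$

where $f_\theta$ and $g_\theta$ are subsumed in $P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$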
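
A minimal sketch of objective (1) in log space, assuming a hypothetical scorer `sentence_log_prob(t, src_doc, trg_doc)` that returns $\log P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$. The scorer name, its signature and the toy stand-in are illustrative assumptions, not taken from the slides.

```python
import math
from typing import Callable, Sequence

# Hypothetical signature: (t, source_doc, target_doc) -> log P(y_t | x_t, y_-t, x_-t)
SentenceScorer = Callable[[int, Sequence[str], Sequence[str]], float]


def log_pseudo_likelihood(src_doc: Sequence[str],
                          trg_doc: Sequence[str],
                          sentence_log_prob: SentenceScorer) -> float:
    """Sum of per-sentence conditional log-probabilities over one document."""
    if len(src_doc) != len(trg_doc):
        raise ValueError("source and target documents must be sentence-aligned")
    # The product over t in Eq. (1) becomes a sum of logs.
    return sum(sentence_log_prob(t, src_doc, trg_doc) for t in range(len(src_doc)))


def toy_scorer(t: int, src_doc: Sequence[str], trg_doc: Sequence[str]) -> float:
    """Toy stand-in for a trained model: every sentence gets probability 0.5."""
    return math.log(0.5)
```

Training would maximise this quantity summed over all documents; each term conditions on the whole source document and on every other target sentence.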

Challenge: During test time, the target document is not given

Coordinate Ascent (i.e., Iterative Decoding)
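
One way to picture coordinate ascent over the target document is sketched here. The context-free initialisation, the fixed number of passes, the in-place greedy update and both translation callbacks (`translate_sentence`, `translate_in_context`) are assumptions made for illustration; the slides only name the technique.

```python
from typing import Callable, List, Sequence


def iterative_decode(src_doc: Sequence[str],
                     translate_sentence: Callable[[str], str],
                     translate_in_context: Callable[[int, Sequence[str], Sequence[str]], str],
                     num_passes: int = 2) -> List[str]:
    """Coordinate ascent: update one target sentence at a time, holding the rest fixed."""
    # Assumed initialisation: translate every sentence without document context.
    trg_doc = [translate_sentence(x) for x in src_doc]
    for _ in range(num_passes):
        changed = False
        for t in range(len(src_doc)):
            # Re-decode y_t given x_t, the other sources, and the current y_-t.
            new_y = translate_in_context(t, src_doc, trg_doc)
            if new_y != trg_doc[t]:
                trg_doc[t] = new_y
                changed = True
        if not changed:
            break  # no coordinate moved, so further passes cannot change anything
    return trg_doc
```

Each inner step only needs a model of $P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$, which is exactly what the pseudo-likelihood objective trains.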

Document NMT with MemNets

$\Longrightarrow P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$

Memory-to-Context:

$$s_{t,j} = \mathrm{GRU}\left(s_{t,j-1},\, E_T[y_{t,j-1}],\, c_{t,j},\, c_t^{src},\, c_t^{trg}\right)$$
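
A rough pure-Python sketch of the Memory-to-Context update: the two memory read-outs $c_t^{src}$ and $c_t^{trg}$ are concatenated with the previous target embedding and the attention context, and fed to a GRU step. The gate convention used here (one common variant), the omission of bias terms and all parameter names are assumptions, not details from the slides.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matvec(W: Matrix, v: Sequence[float]) -> List[float]:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(params: Tuple[Matrix, ...], h_prev: List[float], x: List[float]) -> List[float]:
    """One GRU step without biases: h = (1 - z) * h_prev + z * h_tilde."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_tilde = [math.tanh(a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, gated))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_tilde)]


def memory_to_context_step(params, s_prev, y_prev_emb, c_attn, c_src, c_trg):
    """Decoder state update with both document memories as extra GRU inputs."""
    x = list(y_prev_emb) + list(c_attn) + list(c_src) + list(c_trg)
    return gru_step(params, s_prev, x)
```

Dropping `c_src` or `c_trg` from the concatenation gives the source-only or target-only variants.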

Memory-to-Output:

$$y_{t,j} \sim \mathrm{softmax}\left(W_y \cdot r_{t,j} + W_{ym} \cdot c_t^{src} + W_{yt} \cdot c_t^{trg} + b_y\right)$$
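
The Memory-to-Output variant can be sketched in the same spirit: the memory read-outs bypass the decoder state and are added directly to the pre-softmax scores. Shapes and the toy usage below are assumptions for illustration only.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(W: Matrix, v: Sequence[float]) -> List[float]:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(a - m) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]


def memory_to_output(W_y: Matrix, W_ym: Matrix, W_yt: Matrix, b_y: Sequence[float],
                     r: Sequence[float], c_src: Sequence[float],
                     c_trg: Sequence[float]) -> List[float]:
    """Distribution over the target vocabulary for one decoding step."""
    logits = [a + b + c + d for a, b, c, d in zip(matvec(W_y, r),
                                                  matvec(W_ym, c_src),
                                                  matvec(W_yt, c_trg),
                                                  b_y)]
    return softmax(logits)
```

Setting `W_ym` or `W_yt` to zero recovers the target-only or source-only configuration, and zeroing both gives a context-free output layer.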

Use only source, target, or both external memories

Use Memory-to-Context/Memory-to-Output architectures for incorporating the different contexts

Experiments and Analysis

Experimental Setup

Training/dev/test corpora statistics:

        corpus            #docs (hundreds)   #sents (thousands)   avg doc len
Fr→En   Ted-Talks         10/1.2/1.5         123/15/19            123/128/124
Et→En   Europarl v7       150/10/18          209/14/25            14/14/14
De→En   News-Commentary   49/0.9/1.6         191/2/3              39/23/19

Evaluation Metrics: BLEU, METEOR

Baselines:

Context-free baseline (S-NMT)

Local source context baselines: [Jean et al., 2017] & [Wang et al., 2017]

Memory-to-Context Results

BLEU         Fr→En   De→En   Et→En
S-NMT        20.85    9.18   20.42
S-NMT+src    21.91   10.2    22.1
S-NMT+trg    21.74    9.97   21.94
S-NMT+both   22      10.54   22.32
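
To read the Memory-to-Context numbers as gains over the context-free baseline, a small helper like the following can be used. The BLEU values are copied from this slide; the function name and the ASCII language-pair keys are illustrative assumptions.

```python
# BLEU scores from the Memory-to-Context results slide.
BLEU_M2C = {
    "Fr-En": {"S-NMT": 20.85, "S-NMT+src": 21.91, "S-NMT+trg": 21.74, "S-NMT+both": 22.0},
    "De-En": {"S-NMT": 9.18, "S-NMT+src": 10.2, "S-NMT+trg": 9.97, "S-NMT+both": 10.54},
    "Et-En": {"S-NMT": 20.42, "S-NMT+src": 22.1, "S-NMT+trg": 21.94, "S-NMT+both": 22.32},
}


def gains_over_baseline(scores, baseline="S-NMT"):
    """Absolute BLEU difference of each system against the baseline, rounded to 2 dp."""
    base = scores[baseline]
    return {name: round(value - base, 2)
            for name, value in scores.items() if name != baseline}


if __name__ == "__main__":
    for pair, scores in BLEU_M2C.items():
        print(pair, gains_over_baseline(scores))
```

On all three language pairs the system with both memories shows the largest gain among the Memory-to-Context variants.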

Memory-to-Output Results

BLEU         Fr→En   De→En   Et→En
S-NMT        20.85    9.18   20.42
S-NMT+src    21.8     9.98   21.5
S-NMT+trg    21.76   10.04   21.82
S-NMT+both   21.77   10.23   22.2

Main Results

BLEU                  Fr→En   De→En   Et→En
[Jean et al., 2017]   21.95   10.26   21.67
[Wang et al., 2017]   21.87   10.14   22.06
S-NMT+src             21.91   10.2    22.1
S-NMT+both            22      10.54   22.32

Example translation

Source: qimonda taidab lissaboni strateegia eesmarke.

Target: qimonda meets the objectives of the lisbon strategy.

S-NMT: <UNK> is the objectives of the lisbon strategy.

+Src Mem: the millennium development goals are fulfilling the millennium goals of the lisbon strategy.

+Trg Mem: in writing. - (ro) the lisbon strategy is fulfilling the objectives of the lisbon strategy.

+Both Mems: qimonda fulfils the aims of the lisbon strategy.

[Wang et al., 2017]: <UNK> fulfils the objectives of the lisbon strategy.

Example translation (contd.)

Source: ... et riigis kehtib endiselt lukasenka diktatuur, mis rikub inim- ning etnilise vahemuse oigusi.
Target: ... this country is still under the dictatorship of lukashenko, breaching human rights and the rights of ethnic minorities.
S-NMT: ... the country still remains in a position of lukashenko to violate human rights and ethnic minorities.
+Src Mem: ... the country still applies to the brutal dictatorship of human and ethnic minority rights.
+Trg Mem: ... the country still keeps the <UNK> dictatorship that violates human rights and ethnic rights.
+Both Mems: ... the country still persists in lukashenko’s dictatorship that violate human rights and ethnic minority rights.
[Wang et al., 2017]: ... there is still a regime in the country that is violating the rights of human and ethnic minority in the country.

Conclusion

Proposed a model which incorporates the global source and target document contexts

Proposed effective training and decoding methodologies for our model

Future Work: Investigate document-context NMT models which incorporate specific discourse-level phenomena

References

Hardmeier, C. and Federico, M. (2010). Modelling pronominal anaphora in statistical machine translation. International Workshop on Spoken Language Translation.

Gong, Z., Zhang, M., and Zhou, G. (2011). Cache-based document-level statistical machine translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Garcia, E. M., Espana-Bonet, C., and Marquez, L. (2014). Document-level machine translation as a re-translation process. Procesamiento del Lenguaje Natural, 53:103-110.

Jean, S., Lauly, S., Firat, O., and Cho, K. (2017). Does Neural Machine Translation Benefit from Larger Context? arXiv:1704.05135.

Wang, L., Tu, Z., Way, A., and Liu, Q. (2017). Exploiting Cross-Sentence Context for Neural Machine Translation. Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Bawden, R., Sennrich, R., Birch, A., and Haddow, B. (2018). Evaluating Discourse Phenomena in Neural Machine Translation. Proceedings of NAACL-HLT 2018.
