saliency-driven word alignment interpretation for nmtsaliency-driven word alignment interpretation...

41
Saliency-driven Word Alignment Interpretation for NMT Shuoyang Ding Hainan Xu Philipp Koehn The Fourth Conference on Machine Translation Florence, Italy August 1st, 2019

Upload: others

Post on 08-Sep-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Shuoyang Ding Hainan Xu Philipp Koehn

The Fourth Conference on Machine Translation Florence, Italy

August 1st, 2019

Page 2: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Revisiting Six Challenges• poor out-of-domain performance

• poor low-resource performance

• low frequency words

• long sentences

• attention is not word alignment

• large beam does not help

2

[Koehn and Knowles 2017]

Page 3: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Revisiting Six Challenges• poor out-of-domain performance

• poor low-resource performance

• low frequency words

• long sentences

• attention is not word alignment • large beam does not help

3

[Koehn and Knowles 2017]

Page 4: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

A Model Interpretation Problem

4

Page 5: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

A Model Interpretation Problem

5

Page 6: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Related Findings Outside MT

• “Attention is not Explanation”[Jain and Wallace NAACL 2019]

• “Is Attention Interpretable?” (Spoiler: No) [Serrano and Smith ACL 2019]

• We also have empirical results that corroborate these findings.

• … and we have method that works better!

6

Page 7: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency: Identifying Important Features

Page 8: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Recap

8

Page 9: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Recap

9

Page 10: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Focus on solten

10

Page 11: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Perturbation

11

Page 12: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Perturbation

12

Page 13: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Assumption

13

The output score is more sensitive to perturbations in important features.

Page 14: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

E.g.

14

Page 15: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

E.g.

15

Page 16: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

E.g.

16

Page 17: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Saliency

17

Page 18: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Saliency

18

when :

Page 19: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Saliency

19

Page 20: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

What’s good about this?

1. Derivatives are easy to obtain for any DL toolkit

2. Model-agnostic

3. Adapts with the choice of output words

20

Page 21: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Prior Work on Saliency

• Widely used and studied in Computer Vision! [Simonyan et al. 2013][Springenberg et al. 2014][Smilkov et al. 2017]

• Also in a few NLP work for qualitative analysis [Aubakirova and Bansal 2016][Li et al. 2016][Ding et al. 2017][Arras et al. 2016;2017][Mudrakarta et al. 2018]

21

Page 22: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

SmoothGrad• Gradients are very local measure of sensitivity.

• Highly non-linear models may have pathological points where the gradients are noisy.

• Solution: calculate saliency for multiple copies of the same input corrupted with gaussian noise, and average the saliency of copies.

22

[Smilkov et al. 2017]

Page 23: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Establishing Saliency for Words

Page 24: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

“Feature” in Computer Vision

24

Photo Credit: Hainan Xu

Page 25: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

“Feature” in NLP

25

It’s straight-forward to compute saliency for a single dimension of the word embedding.

Page 26: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

“Feature” in NLP

26

But how to compose the saliency of each dimension into the saliency of a word?

Page 27: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Our Proposal

27

Consider word embedding look-up as a dot product between the embedding matrix and an one-hot vector.

Page 28: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Our Proposal

28

The 1 in the one-hot vector denotes the identity of the input word.

Page 29: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Our Proposal

29

Let’s perturb that 1 like a real value! i.e. take gradients with regard to the 1.

Page 30: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Our Proposal

30

∑i

ei ⋅∂y∂ei

(−∞, ∞)range:

Page 31: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Experiment

Page 32: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Evaluation

• Evaluation of interpretations is tricky!

• Fortunately, there’s human judgments to rely on.

• Need to do force decoding with NMT model.

32

Page 33: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Setup

• Architecture: Convolutional S2S, LSTM, Transformer (with fairseq default hyper-parameters)

• Dataset: Following Zenkel et al. [2019], which covers de-en, fr-en and ro-en.

• SmoothGrad hyper-parameters: N=30 and σ=0.15

33

Page 34: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Baselines• Attention weights

• Smoothed Attention: forward pass on multiple corrupted input samples, then average the attention weights over samples

• [Li et al. 2016]: compute element-wise absolute value of embedding gradients, then average over embedding dimensions

• [Li et al. 2016] + SmoothGrad

34

Page 35: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Convolutional S2S on de-en

AER

15

20

25

30

35

40

45

Atte

ntio

n

Smoo

thed

Att

entio

n

Li+G

rad

Li+S

moo

thG

rad

Our

s+G

rad

Our

s+Sm

ooth

Gra

d

fast

-alig

n

Zenk

el e

t al.

[201

9]

GIZ

A++

35

Page 36: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Attention on de-en

36

AER

15

25

35

45

55

65

Conv

LSTM

Tran

sfor

mer

fast

-alig

n

Zenk

el e

t al.

[201

9]

GIZ

A++

Page 37: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Ours+SmoothGrad on de-en

37

AER

15

25

35

45

55

65

Conv

LSTM

Tran

sfor

mer

fast

-alig

n

Zenk

el e

t al.

[201

9]

GIZ

A++

Page 38: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Li vs. Ours

38

Page 39: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Li vs. Ours

39

Page 40: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Conclusion

Page 41: Saliency-driven Word Alignment Interpretation for NMTSaliency-driven Word Alignment Interpretation for NMT SmoothGrad • Gradients are very local measure of sensitivity. • Highly

Saliency-driven Word Alignment Interpretation for NMT

Conclusion• Saliency + proper word-level score formulation is a

better interpretation method than attention

• NMT models do learn interpretable alignments. We just need to properly uncover them!

41

Paper Code Slides

https://github.com/shuoyangd/meerkat