A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation Jun Sun , Min Zhang , Chew Lim Tan

Post on 20-Jan-2016

Page 1:

A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation

Jun Sun┼, Min Zhang╪, Chew Lim Tan┼

Page 2:

Outline

Introduction

Non-contiguous Tree Sequence Modeling

Rule Extraction

Non-contiguous Decoding: the Pisces Decoder

Experiments

Conclusion

Page 3:

Contiguous and Non-contiguous Bilingual Phrases

[Figure: word-aligned parse tree pair. Source: 到 (at) 他 (he) 出场 (show up) 的 ('s) 时候 (time), tagged VV PN VV DEC NN under VP, NP, CP and IP nodes. Target: "when he shows up", tagged WRB PRP VBZ RP under SBAR, S and VP nodes. Callouts mark the contiguous translational equivalences and the non-contiguous translational equivalence.]

Page 4:

Previous Work on Non-contiguous phrases

(-) Zhang et al. (2008) acquire non-contiguous phrasal rules from contiguous tree sequence pairs, and find them useless in real syntax-based translation systems.

(+) Wellington et al. (2006) statistically report that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints.

(+) Bod (2007) also finds that discontinuous phrasal rules yield a significant improvement in a linguistically motivated STSG-based translation model.

Page 5:

Previous Work on Non-contiguous phrases (cont.)

[Figure: the word-aligned tree pair for 到 他 出场 的 时候 / "when he shows up", next to the sub-tree pair that keeps only 到 (at) and 时候 (time) on the source side and "when" on the target side. Labels on the slide: "Non-contiguous" and "Contiguous tree sequence pair". The sub-tree pair corresponds to the rule below.]

VP(VV(到),NP(CP[0],NN(时候))) → SBAR(WRB(when),S[0])
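To make the notion of a non-contiguous phrasal rule concrete, the following minimal sketch reads the frontier of each side of the rule VP(VV(到),NP(CP[0],NN(时候))) / SBAR(WRB(when),S[0]) and checks whether a substitution site sits between two lexical leaves. The helper names `frontier` and `has_gap_between_words` are hypothetical, and the parsing is an assumption that only covers the TAG(word) and TAG[i] notation used on this slide.

```python
import re

# Matches a substitution site such as CP[0] or a lexical leaf such as VV(到).
LEAF = re.compile(r"(\w+)\[(\d+)\]|(\w+)\(([^(),\[\]]+)\)")

def frontier(side):
    """Return the leaves of a bracketed rule side, left to right."""
    leaves = []
    for m in LEAF.finditer(side):
        if m.group(1) is not None:
            leaves.append(("site", m.group(1), int(m.group(2))))
        else:
            leaves.append(("word", m.group(3), m.group(4).strip()))
    return leaves

def has_gap_between_words(side):
    """True if some substitution site lies between two lexical leaves."""
    kinds = [kind for kind, _, _ in frontier(side)]
    if kinds.count("word") < 2:
        return False
    first = kinds.index("word")
    last = len(kinds) - 1 - kinds[::-1].index("word")
    return "site" in kinds[first:last + 1]

source_side = "VP(VV(到),NP(CP[0],NN(时候)))"
target_side = "SBAR(WRB(when),S[0])"
print(frontier(source_side))
print(has_gap_between_words(source_side), has_gap_between_words(target_side))
```

On the source side the site CP[0] separates 到 from 时候, so the lexical part of the rule is non-contiguous even though the rule comes from a contiguous tree pair.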

Page 6:

Previous Work on Non-contiguous phrases (cont.)

[Figure: the tree pairs from the previous slide (到 他 出场 的 时候 / "when he shows up", the sub-tree pair 到 ... 时候 / "when", and the inner pair 他 出场 的 / "he shows up") shown beside a new input with an extra aspect marker 了 (AS, aligned to NULL) after 到: 到 (at) 了 (NULL) 他 (he) 出场 (show up) 的 ('s) 时候 (time). The sub-tree over 到 了 ... 时候 is marked "No match in rule set".]

Page 7:

Proposed Non-contiguous phrases Modeling

[Figure: the same tree pairs as on the previous slide, plus a rule extracted from non-contiguous tree sequence pairs: the source tree sequence VV(到) ... NN(时候), with a gap between the two trees, paired with WRB(when).]

Page 8:

Contributions

The proposed model extracts translation rules not only from contiguous tree sequence pairs but also from non-contiguous tree sequence pairs (with gaps). With the help of non-contiguous tree sequences, the model can capture non-contiguous phrases without being constrained to rules that need a large applicable context, and it strengthens non-contiguous constituent modeling.

A decoding algorithm for non-contiguous phrase modeling

Page 9:

Outline

Introduction

Non-contiguous Tree Sequence Modeling

Rule Extraction

Non-contiguous Decoding: the Pisces Decoder

Experiments

Conclusion

Page 10:

SncTSSG

Synchronous Tree Substitution Grammar (STSG, Chiang, 2006)

Synchronous Tree Sequence Substitution Grammar (STSSG, Zhang et al. 2008)

Synchronous non-contiguous Tree Sequence Substitution Grammar (SncTSSG)
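The three grammars differ in what one side of a rule may be: a single tree (STSG), a sequence of adjacent trees (STSSG), or a sequence of trees that may be separated by gaps (SncTSSG). The sketch below encodes that distinction. The `TreeSequence` class and `grammar_class` function are hypothetical names, and the example bracketings are an approximate reading of the rule figures in this deck, not the authors' data format.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TreeSequence:
    """One side of a rule: bracketed trees plus the positions of gaps.

    gaps holds indices i such that a gap separates trees[i] and trees[i + 1].
    """
    trees: Tuple[str, ...]
    gaps: Tuple[int, ...] = ()

def grammar_class(source, target):
    """Name the smallest of the three grammars that admits this rule."""
    if source.gaps or target.gaps:
        return "SncTSSG"
    if len(source.trees) > 1 or len(target.trees) > 1:
        return "STSSG"
    return "STSG"

tree_rule = (TreeSequence(("NG(钢笔)",)), TreeSequence(("NN(pen)",)))
sequence_rule = (TreeSequence(("NG(钢笔)", "VG(给)")),
                 TreeSequence(("VBP(give)", "NP(DT(the),NN(pen))")))
gapped_rule = (TreeSequence(("VO(VG(给),R(我))",)),
               TreeSequence(("VBP(give)", "PP(TO(to),PRP(me))"), gaps=(0,)))

for rule in (tree_rule, sequence_rule, gapped_rule):
    print(grammar_class(*rule))
```

Each grammar admits every rule of the one before it, which is why the function returns the smallest class that fits.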

Page 11:

Word Aligned Parse Tree and Two Parse Tree Sequences

[Figure, three panels: 1. Word-aligned bi-parsed tree; 2. Two structures; 3. Two tree sequences. Panel 1 shows the source tree Ts over 把 (NULL) 钢笔 (pen) 给 (give) 我 (me) 。 (.), tagged P NG VG R WJ under VBA, VO and S nodes, the alignment A, and the target tree Tt over "Give the pen to me .", tagged VBP DT NN TO PRP PUNC. under NP, PP, VP and S nodes. Panel 2 shows a subtree rooted at VBA and the substructure obtained by abstracting some of its leaves. Panel 3 shows the source tree sequence 给 (give) 我 (me), tagged VG R, and the target tree sequence "Give" *** "to me", tagged VBP and TO PRP under PP, where *** marks the gap.]

Page 12:

Contiguous Translation Rules

[Figure: two rules extracted from the word-aligned tree pair Ts / A / Tt of 把 钢笔 给 我 。 and "Give the pen to me .". r1 pairs a source VBA substructure that keeps 把 (NULL) and 我 (me) with a target VP substructure that keeps "to me"; the two substitution sites are indexed 1 2 on the source side and 2 1 on the target side. r2 pairs the source tree sequence NG(钢笔), VG(给) with the target tree sequence VBP(give), NP(DT(the), NN(pen)).]

r1. Contiguous Tree-to-Tree Rule

r2. Contiguous Tree Sequence Rule

Page 13:

Non-contiguous Translation Rules

[Figure: two rules extracted from the same word-aligned tree pair. ncTSr1 pairs a source VBA substructure over 把 (NULL) and 钢笔 (pen), tagged P NG, in which VO is a substitution site, with a target VP substructure over "the pen", tagged DT NN under NP, in which VBP and PP are substitution sites; the index 1 marks the linked sites. ncTSr2 pairs the source side 给 (give) 我 (me), tagged VG R under VO, with the target tree sequence "give" *** "to me" (TO PRP under PP), where *** marks the gap.]

r1. Non-contiguous Tree-to-Tree Rule

r2. Non-contiguous Tree Sequence Rule

Page 14:

Outline

Introduction

Non-contiguous Tree Sequence Modeling

Rule Extraction

Non-contiguous Decoding: the Pisces Decoder

Experiments

Conclusion

Page 15:

A word-aligned parse tree pair

[Figure: source tree Ts over 把 (NULL) 钢笔 (pen) 给 (give) 我 (me) 。 (.), alignment A, and target tree Tt over "Give the pen to me .".]

Page 16:

Example for contiguous rule extraction (1)

[Figure: the word-aligned tree pair, with the extracted rule NG(钢笔 (pen)) paired with NN(pen).]

Page 17:

Example for contiguous rule extraction (2)

[Figure: the word-aligned tree pair, with the extracted rule VG(给 (give)) paired with VBP(give).]

Page 18:

Example for contiguous rule extraction (3)

[Figure: the word-aligned tree pair, with the extracted rule pairing the source VBA subtree over 把 (NULL) 钢笔 (pen) 给 (give) 我 (me) with the target VP subtree over "Give the pen to me".]

Page 19:

Example for contiguous rule extraction (4)

[Figure: the word-aligned tree pair, with the rule from the previous slide abstracted into substructures: the source VBA substructure keeps 把 (NULL) and 我 (me), the target VP substructure keeps "the" and "to me", and the two substitution sites are indexed 1 2 on the source side and 2 1 on the target side.]

Abstract into substructures
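The contiguous examples above can be checked against the usual alignment-consistency criterion from phrase extraction: a source span and its target span form a pair only if no word inside the target span is linked to a source word outside the source span. The slides do not state the criterion, so treating it as the selection rule is an assumption. The sketch also assumes the source word order 把 钢笔 给 我 。 and reads the links off the slide glosses; the function names are hypothetical.

```python
SRC = ["把", "钢笔", "给", "我", "。"]
TGT = ["Give", "the", "pen", "to", "me", "."]
# (source index, target index) links taken from the glosses on the slides;
# 把 is aligned to NULL, and leaving "the" and "to" unaligned is an assumption.
ALIGN = {(1, 2), (2, 0), (3, 4), (4, 5)}

def target_span(src_lo, src_hi, align=ALIGN):
    """Smallest target span covering every link from source[src_lo..src_hi]."""
    linked = [t for s, t in align if src_lo <= s <= src_hi]
    return (min(linked), max(linked)) if linked else None

def is_consistent(src_lo, src_hi, align=ALIGN):
    """True if no word inside the target span links outside the source span."""
    span = target_span(src_lo, src_hi, align)
    if span is None:
        return False
    lo, hi = span
    return all(src_lo <= s <= src_hi for s, t in align if lo <= t <= hi)

for lo, hi in [(1, 1), (2, 2), (1, 2), (2, 3)]:
    print(" ".join(SRC[lo:hi + 1]), target_span(lo, hi), is_consistent(lo, hi))
```

The spans 钢笔, 给 and 钢笔 给 pass the check, while 给 我 fails because "pen" lies between "Give" and "me" on the target side. That failing case is the kind of pair the non-contiguous rules are meant to cover.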

Page 20:

Example for non-contiguous rule extraction (1)

[Figure: the word-aligned tree pair, with a rule extracted from non-contiguous tree sequence pairs: the source side 给 (give) 我 (me), tagged VG R under VO, paired with the target tree sequence "give" *** "to me" (TO PRP under PP), where *** marks the gap.]

Page 21:

Example for non-contiguous rule extraction (2)

[Figure: the word-aligned tree pair, with a rule abstracted into substructures from non-contiguous tree sequence pairs: a source VBA substructure over 把 (NULL) and 钢笔 (pen), tagged P NG, in which VO is a substitution site, paired with a target VP substructure over "the pen", tagged DT NN under NP, in which VBP and PP are substitution sites. The index 1 marks the linked sites.]

Abstract into substructures from non-contiguous tree sequence pairs
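The non-contiguous examples can be approximated by letting the target side of a source span break into several segments wherever a target word is linked to a source word outside the span. The sketch below is a simplification under stated assumptions: it works on word positions only, assumes the source word order 把 钢笔 给 我 。 with links read off the slide glosses, and ignores the parse trees that the model uses to decide where unaligned words such as "to" attach. The function names are hypothetical.

```python
SRC = ["把", "钢笔", "给", "我", "。"]
TGT = ["Give", "the", "pen", "to", "me", "."]
# (source index, target index) links from the slide glosses; unaligned words
# ("the", "to") are an assumption of this sketch.
ALIGN = {(1, 2), (2, 0), (3, 4), (4, 5)}

def target_segments(src_lo, src_hi, align=ALIGN):
    """Target segments for source[src_lo..src_hi], split at outside links."""
    inside = {t for s, t in align if src_lo <= s <= src_hi}
    outside = {t for s, t in align if not (src_lo <= s <= src_hi)}
    if not inside:
        return []
    runs, run = [], []
    for t in range(min(inside), max(inside) + 1):
        if t in outside:
            if run:
                runs.append(run)
            run = []
        else:
            run.append(t)
    if run:
        runs.append(run)
    # Trim unaligned words at the edges of each run.
    segments = []
    for run in runs:
        linked = [t for t in run if t in inside]
        if linked:
            segments.append((linked[0], linked[-1]))
    return segments

def render(segments, gap="***"):
    """Join the segments into a string, marking each gap."""
    return f" {gap} ".join(" ".join(TGT[lo:hi + 1]) for lo, hi in segments)

print(render(target_segments(1, 2)))
print(render(target_segments(2, 3)))
```

For 钢笔 给 the target side stays in one piece, while 给 我 yields two segments around the position of "pen". The rule on the slide additionally keeps "to", because the PP node in the target tree covers "to me".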

Page 22:

Outline

Introduction

Non-contiguous Tree Sequence Modeling

Rule Extraction

Non-contiguous Decoding: the Pisces Decoder

Experiments

Conclusion

Page 23:

The Pisces Decoder

Pisces conducts the search with the following two modules:

The first is a CFG-based chart parser, used as a pre-processor that maps an input sentence to a parse tree Ts (for details of the chart parser, see Charniak (1997)).

The second is a span-based tree decoder with three phases:

Contiguous decoding (the same as in Zhang et al. 2008)

Source side non-contiguous translation

Tree sequence reordering in Target side

Page 24:

Source side non-contiguous translation

Source gap insertion

[Figure: source tree over 在 (in) 近期 (recent) 的 调查 (survey) 中, with nodes PP, P, LCP, LC, NP, DNP, NT, DEG and NN, together with the two rules below.]

NP(DNP(NT(近期),DEG(的)),NN(调查)) → NP(DT(the),JJ(recent),NNS(surveys))

P(在) … LC(中) → IN(in)

Right insertion: IN(in) NP(...)

Left insertion: NP(...) IN(in)

[Figure: the two resulting target tree sequences, built from IN(in) and NP(DT(the), JJ(recent), NNS(surveys)).]
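As a rough illustration of source gap insertion, the sketch below takes the target side of the gapped rule P(在) … LC(中) → IN(in) and the translation of the NP that fills the source gap, and builds both candidate orders. Reading "right insertion" as placing the gap translation after the rule's target side, and "left insertion" as placing it before, is an interpretation of the slide; the function name `gap_insertions` is hypothetical, and a real decoder would work on scored tree sequences instead of token lists.

```python
def gap_insertions(rule_target, gap_translation):
    """Return both candidate orders as token lists.

    rule_target: target words produced by the gapped source rule.
    gap_translation: target words produced for the span inside the source gap.
    """
    return {
        "right": rule_target + gap_translation,  # gap translation placed after
        "left": gap_translation + rule_target,   # gap translation placed before
    }

rule_target = ["in"]                              # from P(在) … LC(中) → IN(in)
gap_translation = ["the", "recent", "surveys"]    # from the NP rule on the slide

for side, tokens in gap_insertions(rule_target, gap_translation).items():
    print(side, " ".join(tokens))
```

Both candidates would be kept as hypotheses, leaving the choice to the model score.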

Page 25:

Tree sequence reordering in Target side

Binarize each span into a left span and a right span.

Generate new translation hypotheses for this span by inserting the candidate translations of the right span into each gap in the candidates of the left span.

Generate translation hypotheses for this span by inserting the candidate translations of the left span into each gap in the candidates of the right span.

[Figure: a candidate hypothesis for a target span with gaps, built from a left span and a right span.]
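The two generation steps can be sketched as follows, with a hypothesis represented as a token list in which a sentinel marks each gap. This is a minimal sketch under assumptions: the representation, the `GAP` sentinel and the function names are hypothetical, and scoring, pruning and plain concatenation of contiguous hypotheses are left out.

```python
GAP = "***"

def fill_each_gap(host, filler):
    """Yield one new hypothesis per gap in host, with filler replacing that gap."""
    for i, token in enumerate(host):
        if token == GAP:
            yield host[:i] + filler + host[i + 1:]

def combine(left_hyps, right_hyps):
    """Hypotheses for a span built from the candidates of its two sub-spans."""
    out = []
    for left in left_hyps:
        for right in right_hyps:
            out.extend(fill_each_gap(left, right))  # right span into gaps of left
            out.extend(fill_each_gap(right, left))  # left span into gaps of right
    return out

# Candidates for the left span 钢笔 and the right span 给 我 (illustrative).
left_hyps = [["the", "pen"]]
right_hyps = [["give", GAP, "to", "me"]]
print(combine(left_hyps, right_hyps))
```

Here only the right span's candidate has a gap, so the single result is the left span's translation placed inside it.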

Page 26:

Modeling

[Equation: the model definition, with a legend for the source/target sentence, the source/target parse tree, a non-contiguous source/target tree sequence, the source/target spans, and the feature function hm.]

Page 27:

Features

The bi-phrasal translation probabilities

The bi-lexical translation probabilities

The target language model

The # of words in the target sentence

The # of rules utilized

The average tree depth in the source side of the rules adopted

The # of non-contiguous rules utilized

The # of reorderings caused by the utilization of the non-contiguous rules
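Feature functions hm like these are commonly combined as a weighted sum in a log-linear model, with the weights tuned by minimum error rate training. Assuming that form here, a hypothesis score could be computed as in the sketch below. The feature names, weights and values are hypothetical and chosen only for illustration; they are not taken from the experiments.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values; a higher score is better."""
    return sum(weights[name] * features[name] for name in weights)

# Hypothetical weights and feature values, for illustration only.
weights = {"lm": 0.5, "word_count": -1.0, "nc_rule_count": 0.25}
hyp_a = {"lm": -4.0, "word_count": 3.0, "nc_rule_count": 0.0}
hyp_b = {"lm": -3.0, "word_count": 3.0, "nc_rule_count": 2.0}

best = max([hyp_a, hyp_b], key=lambda h: loglinear_score(h, weights))
print(loglinear_score(hyp_a, weights), loglinear_score(hyp_b, weights))
```

Counting the non-contiguous rules and the reorderings they cause as separate features lets tuning decide how much the decoder should rely on gapped rules.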

Page 28:

Outline

Introduction

Non-contiguous Tree Sequence Modeling

Rule Extraction

Non-contiguous Decoding: the Pisces Decoder

Experiments

Conclusion

Page 29:

Experimental settings

Training Corpus: Chinese-English FBIS corpus

Development Set: NIST MT 2002 test set

Test Set: NIST MT 2005 test set

Evaluation Metrics: case-sensitive BLEU-4

Parser: Stanford Parser (Chinese/English)

Evaluation: mteval-v11b.pl

Language Model: SRILM 4-gram

Minimum error rate training: (Och, 2003)

Model Optimization: Only allow gaps in one side

Page 30:

Model comparison in BLEU

Table 1: Translation results of different models (cBP refers to contiguous bilingual phrases without syntactic structural information, as used in Moses)

System   Model     BLEU
Moses    cBP       23.86
Pisces   STSSG     25.92
Pisces   SncTSSG   26.53

Page 31:

Rule combination

Table 2: Performance of different rule combinations

ID   Rule Set                      BLEU
1    cR (STSSG)                    25.92
2    cR w/o ncPR                   25.87
3    cR w/o ncPR + tgtncR          26.14
4    cR w/o ncPR + srcncR          26.50
5    cR w/o ncPR + src&tgtncR      26.51
6    cR + tgtncR                   26.11
7    cR + srcncR                   26.56
8    cR + src&tgtncR (SncTSSG)     26.53

cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules)

ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes

srcncR: non-contiguous rules with gaps in the source side

tgtncR: non-contiguous rules with gaps in the target side

src&tgtncR : non-contiguous rules with gaps in either side

Page 32:

Bilingual Phrasal Rules

Table 3: Performance of bilingual phrasal rules

System   Rule Set              BLEU
Moses    cBP                   23.86
Pisces   cBP                   22.63
Pisces   cBP + tgtncBP         23.74
Pisces   cBP + srcncBP         23.93
Pisces   cBP + src&tgtncBP     24.24

cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules)

ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes

srcncBP: non-contiguous phrasal rules with gaps in the source side

tgtncBP: non-contiguous phrasal rules with gaps in the target side

src&tgtncBP : non-contiguous phrasal rules with gaps in either side

Page 33:

Maximal number of gaps

Table 4: Performance and rule size changing with different maximal number of gaps

Max gaps allowed (source, target)   Rule #       BLEU
0, 0                                1,661,045    25.92
1, 1                                +841,263     26.53
2, 2                                +447,161     26.55
3, 3                                +17,782      26.56
∞                                   +8,223       26.57
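The trade-off in Table 4 is easier to see as running totals. The sketch below accumulates the rule counts and computes the BLEU gain of each row over the previous one, assuming that each "+" entry is an increment relative to the row above it. The function name `cumulative` is hypothetical.

```python
# (max gaps, rules added, BLEU) as listed in Table 4.
ROWS = [
    ("0, 0", 1_661_045, 25.92),
    ("1, 1", 841_263, 26.53),
    ("2, 2", 447_161, 26.55),
    ("3, 3", 17_782, 26.56),
    ("∞", 8_223, 26.57),
]

def cumulative(rows):
    """Return (max gaps, total rules so far, BLEU gain over previous row)."""
    out, total, prev = [], 0, None
    for gaps, added, bleu in rows:
        total += added
        gain = 0.0 if prev is None else round(bleu - prev, 2)
        out.append((gaps, total, gain))
        prev = bleu
    return out

for gaps, total, gain in cumulative(ROWS):
    print(f"{gaps:>5}  {total:>9,}  +{gain:.2f}")
```

Under that reading, allowing one gap grows the rule set by about half and accounts for 0.61 of the total 0.65 BLEU gain, while all further gaps together add 473,166 rules for another 0.04.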

Page 34:

Sample translations

Output & References

Source: 才 /only 过 /pass 了 /null 五年 /five years , 两人 /two people 就 /null 对簿公堂 /confront at court

Reference: after only five years the two confronted each other at court

STSSG: only in the five years , the two candidates would 对簿公堂

SncTSSG: the two people can confront other countries at court leisurely manner only in the five years

Key rules: VV( 对簿公堂 ) → VB(confront) NP(JJ(other),NNS(countries)) IN(at) NN(court) *** JJ(leisurely) NN(manner)

Source: 欧元 /Euro 的 /’s 大幅 /substantial 升值 /appreciation 将 /will 在 /in 近期 /recent 的 /’s 调查 /survey 中 /middle 持续 /continue 对 /for 经济 /economy 信心 /confidence 产生 /produce 影响 /impact

Reference: substantial appreciation of the euro will continue to impact the economic confidence in the recent surveys

STSSG: substantial appreciation of the euro has continued to have an impact on confidence in the economy , in the recent surveys will

SncTSSG: substantial appreciation of the euro will continue in the recent surveys have an impact on economic confidence

Key rules:

AD( 将 /will) *** VV( 持续 /continue) → VP(MD(will),VB(continue))

P( 在 /in) *** LC( 中 /middle) → IN(in)

Page 35:

Conclusion

The non-contiguous tree sequence alignment model based on SncTSSG attains better modeling of non-contiguous phrases and of the reordering caused by non-contiguous constituents with large gaps.

Observations

In the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side.

Allowing only one gap is effective.

Future Work

Redundant non-contiguous rules

Optimization of the large rule set

Page 36:

The End