using parallel features in parsing of machine-translated

27
Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel {rosa,odusek,marecek,popel}@ufal.mff.cuni.cz Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors Charles University in Prague Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics SSST, Jeju, 12th July 2012

Upload: others

Post on 15-Feb-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Using Parallel Features in Parsing of Machine-Translated

Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel{rosa,odusek,marecek,popel}@ufal.mff.cuni.cz

Using Parallel Featuresin Parsing

of Machine-Translated Sentencesfor Correction of Grammatical Errors

Charles University in PragueFaculty of Mathematics and Physics

Institute of Formal and Applied Linguistics

SSST, Jeju, 12th July 2012

Page 2: Using Parallel Features in Parsing of Machine-Translated

Parsing of SMT Outputs

can be useful in many applications automatic classification of translation errors automatic correction of translation errors (Depfix) confidence estimation, multilingual question

answering...

✔ we have the source sentence available Can we use it to help parsing?

✗ SMT outputs noisy (errors in fluency, grammar...) parsers trained on gold standard treebanks Can we adapt parser to noisy sentences?

Page 3: Using Parallel Features in Parsing of Machine-Translated

MST Parser

Maximum Spanning Tree dependency parser by Ryan McDonald

Page 4: Using Parallel Features in Parsing of Machine-Translated

(1) Words and Tags

RudolphNNP

relaxesVBZ

abroadRB

#rootwords = nodes

Page 5: Using Parallel Features in Parsing of Machine-Translated

(2) (Nearly) Complete Graph

RudolphNNP

relaxesVBZ

abroadRB

#rootall possible edges =

directed edges

Page 6: Using Parallel Features in Parsing of Machine-Translated

(3) Assign Edge Weights

RudolphNNP

relaxesVBZ

abroadRB

#root

-7 -1659325

1490 1154

-263 24

185

368

Margin InfusedRelaxed Algorithm

(MIRA)

edge weight =

sum ofedge featuresweights

Page 7: Using Parallel Features in Parsing of Machine-Translated

(4) Maximum Spanning Tree

RudolphNNP

relaxesVBZ

abroadRB

#root

-7 -1659325

1490 1154

-263 24

185

368

non-projective trees:Chu-Liu-Edmonds algorithm

(projective trees:

Eisner algorithm)

Page 8: Using Parallel Features in Parsing of Machine-Translated

(5) Unlabeled Dependency Tree

RudolphNNP

relaxesVBZ

abroadRB

#rootdependency tree =

maximumspanning tree

Page 9: Using Parallel Features in Parsing of Machine-Translated

(6) Labeled Dependency Tree

RudolphNNP

relaxesVBZ

abroadRB

#root

Predicate

Subject Adverbial

labels asignedby a second stagelabeler

Page 10: Using Parallel Features in Parsing of Machine-Translated

RUR Parser

reimplementation of MST Parser (so far only) first-order, non-projective

adapted for SMT outputs parsing parallel features ”worsening” the training treebank

Page 11: Using Parallel Features in Parsing of Machine-Translated

English-to-Czech SMT

Czech language highly flective

4 genders, 2 numbers, 7 cases, 3 persons... Czech grammar requires agreement in related words

word order relatively free: word order errors not crucial

Phrase-Based SMT often makes inflection errors:➔ Rudolph's car is black.

✗ Rudolfova/fem auto/neut je černý/masc.

✔ Rudolfovo/neut auto/neut je černé/neut.

Page 12: Using Parallel Features in Parsing of Machine-Translated

Parser Training Data

Prague Czech-English Dependency Treebank parallel treebank 50k sentences, 1.2M words morphological tags, surface syntax, deep syntax word alignment

Page 13: Using Parallel Features in Parsing of Machine-Translated

Parallel Features

word alignment (using GIZA++) additional features (if aligned node exists):

aligned tag (NNS, VBD...) aligned dependency label (Subject, Attribute...) aligned edge existence (0/1)

Page 14: Using Parallel Features in Parsing of Machine-Translated

Parallel Features Example

RudolfNN M S 1

relaxujeVB S 3

vRR 6

zahraničíNN N S 6

RudolphNNP

relaxesVBZ

abroadRB

#root

#root

Pred

AuxP

AdvSubj

Subj

Pred

Adv

Page 15: Using Parallel Features in Parsing of Machine-Translated

Worsening the Treebank

treebank used for training contains correct sentences

SMT output is noisy grammatical errors incorrect word order missing/superfluous words …

let's introduce similar errors into the treebank! so far, we have only tried inflection errors

Page 16: Using Parallel Features in Parsing of Machine-Translated

Worsen (1): Apply SMT

translate English side of PCEDT to Czech by an SMT system (we used Moses)

now we have (e.g.): Gold English

Rudolph's car is black. Gold Czech

RudolfovoNEUT

autoNEUT

je černéNEUT

.

SMT Czech Rudolfova

FEM auto

NEUT je černý

MASC.

Page 17: Using Parallel Features in Parsing of Machine-Translated

Worsen (2): Align SMT to Gold

align SMT Czech to Gold Czech Monolingual Greedy Aligner

alignment link score = linear combination of: similarity of word forms (or lemmas) similarity of morphological tags (fine-grained) similarity of positions in the sentence indication whether preceding/following words aligned

repeat: align best scoring pair until below threshold no training: weights and threshold set manually

Page 18: Using Parallel Features in Parsing of Machine-Translated

Worsen (3): Create Error Model

for each tag: estimate probabilities of SMT system using an

incorrect tag instead of the correct tag(Maximum Likelihood Estimate)

Czech tagset: fine-grained morphological tags part-of-speech, gender, number, case, person,

tense, voice... 1500 different tags in training data

Page 19: Using Parallel Features in Parsing of Machine-Translated

Worsen (3): Error Model

Adjective, Masculine, Plural, Instrumental case(AAMP7), e.g. lingvistickými (linguistic)➔ 0.2 Adjective, Masculine, Singular, Nominative case

➔ e.g. lingvistický➔ 0.1 Adjective, Masculine, Plural, Nominative case

➔ e.g. lingvističtí➔ 0.1 Adjective, Neuter, Singular, Accusative case

➔ e.g. lingvistické

… altogether 2000 such change rules

Page 20: Using Parallel Features in Parsing of Machine-Translated

Worsen (4): Apply Error Model

take Gold Czech for each word:

assign a new tag randomly sampled according to Tag Error Model

generate a new word form rule-based generator, generates even unseen forms new_form = generate_form(lemma, tag) || old_form

→ get Worsened Czech use resulting Gold English-Worsened Czech

parallel treebank to train the parser

Page 21: Using Parallel Features in Parsing of Machine-Translated

Direct Evaluation by Inspection

manual inspection of several parse trees comparing baseline and adapted parser ouputs

examples of improvements: subject identification even if not in nominative case adjective-noun dependence identification even if

agreement violated (gender, number, case)

hard to do reliably trying to find a correct parse tree for an (often)

incorrect sentence – not well defined

Page 22: Using Parallel Features in Parsing of Machine-Translated

Indirect Evaluation: in Depfix

rule-based grammar correction of SMT outputs input = aligned, tagged and parsed sentences:

target (Czech) sentence – to be corrected source (English) sentence – additional information

applies 20 correction rules: noun – adjective agreement (gender, number, case) subject – predicate agreement (gender, number) preposition – noun agreement (case) …

Page 23: Using Parallel Features in Parsing of Machine-Translated

Depfix: Rudolph's Car

RudolphNNP

carNN

RudolfovaAA fem

autoNN neutAtr Atr

'sPOS

RudolfovoAA neut

autoNN neut

Atr

Atr

Adjective – Noun Agreement

Page 24: Using Parallel Features in Parsing of Machine-Translated

Indirect Evaluation Results

differences in Depfix corrections evaluated by humans: better / worse / indefinite

three different parsers RUR + parallel features + worsened treebank – original McDonald's MST Parser RUR – our baseline setup

RUR + parallel features + worsened treebankbetter worse indefinite51% 30% 18%

RUR 54% 28% 18%

Page 25: Using Parallel Features in Parsing of Machine-Translated

Conclusion

SMT outputs often hard to parse RUR parser – adapted to parsing SMT outputs

parallel features (tag, dep. label, edge existence) worsening the training treebank (tag error model)

outputs of English-to-Czech translation evaluated in Depfix

SMT errors correction system

Page 26: Using Parallel Features in Parsing of Machine-Translated

Future Work

more sophisticated parallel features more experiments on worsening more languages

parallel tagging

Page 27: Using Parallel Features in Parsing of Machine-Translated

Thank you for your attention

For this presentation and other information, visit:

http://ufal.mff.cuni.cz/~rosa/depfix/

Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel{rosa,odusek,marecek,popel}@ufal.mff.cuni.cz

Charles University in PragueFaculty of Mathematics and Physics

Institute of Formal and Applied Linguistics