cs395t: structured models for nlp administrivia lecture 12...
CS395T: Structured Models for NLP
Lecture 12: Machine Translation
Greg Durrett
Adapted from Dan Klein, UC Berkeley
Administrivia
Project 2 due one week from today!
P1 test set results: top 3
Yasumasa Onoe: 78.55 F1 (78.27 P / 78.83 R)
Prateek Shrishail Kolhar: 82.32 F1 (82.61 P / 82.07 R)
Conjunctions of words, POS, and shapes in neighborhood
Very fast vectorized implementation (15s per epoch)
Su Wang: 84.03 F1 (86.10 P / 82.05 R)
Larger window size and Wikipedia gazetteer
Used transition probabilities from HMM, character 5-grams and other feature tuning
Machine Translation
Machine Translation: Examples
Levels of Transfer
Word-Level MT: Examples
§ la politique de la haine. (Foreign Original)
§ politics of hate. (Reference Translation)
§ the policy of the hatred. (IBM4+N-grams+Stack)

§ nous avons signé le protocole. (Foreign Original)
§ we did sign the memorandum of agreement. (Reference Translation)
§ we have signed the protocol. (IBM4+N-grams+Stack)

§ où était le plan solide? (Foreign Original)
§ but where was the solid plan? (Reference Translation)
§ where was the economic base? (IBM4+N-grams+Stack)
Phrasal MT: Examples
Metrics
MT: Evaluation
§ Human evaluations: subject measures, fluency/adequacy
§ Automatic measures: n-gram match to references
  § NIST measure: n-gram recall (worked poorly)
  § BLEU: n-gram precision (no one really likes it, but everyone uses it)
  § Lots more: TER, HTER, METEOR, …
§ BLEU:
  § P1 = unigram precision
  § P2, P3, P4 = bi-, tri-, 4-gram precision
  § Weighted geometric mean of P1-4
  § Brevity penalty (why?)
  § Somewhat hard to game…
  § Magnitude only meaningful on same language, corpus, number of references, probably only within system types…
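The BLEU recipe above can be sketched in a few lines. This is a minimal sentence-level version, assuming uniform weights over P1-4, clipped n-gram counts, and the usual brevity penalty exp(1 - r/c) for candidates shorter than the closest reference; the function names `ngrams` and `bleu` are illustrative and not from the slides. Real BLEU is computed at the corpus level and often smoothed.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of clipped n-gram
    precisions (uniform weights, an assumption) times a brevity penalty."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        # Clip each candidate n-gram by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if matched == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision gives BLEU 0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: precision alone would reward very short outputs.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)
```

The brevity penalty answers the "why?" on the slide: without it, a two-word output with perfect precision would score as well as a full translation.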
Automatic Metrics Work (?)
Systems Overview
Corpus-Based MT
Modeling correspondences between languages
Sentence-aligned parallel corpus:
Yo lo haré mañana I will do it tomorrow
Hasta pronto See you soon
Hasta pronto See you around
Yo lo haré pronto Novel Sentence
I will do it soon
I will do it around
See you tomorrow
Machine translation system:
Model of translation
Phrase-Based System Overview
Sentence-aligned corpus
Word alignments
Phrase table (translation model):
cat ||| chat ||| 0.9
the cat ||| le chat ||| 0.8
dog ||| chien ||| 0.8
house ||| maison ||| 0.6
my house ||| ma maison ||| 0.9
language ||| langue ||| 0.9
…
Many slides and examples from Philipp Koehn or John DeNero
Phrase-Based System Overview
Unlabeled English data
Language model P(e)
Phrase table P(f|e)
$P(e \mid f) \propto P(f \mid e)\, P(e)$
Noisy channel model: combine scores from translation model + language model to translate foreign to English
“Translate faithfully but make fluent English”
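As a rough illustration of the noisy channel objective, the sketch below picks the English candidate e that maximizes log P(f|e) + log P(e). The candidate sentences are taken from the corpus-based MT example earlier in the lecture; all probabilities are made-up numbers, and `noisy_channel_best` is a hypothetical helper name.

```python
import math

def noisy_channel_best(candidates, tm_prob, lm_prob):
    """Return the candidate e maximizing log P(f|e) + log P(e)."""
    return max(candidates, key=lambda e: math.log(tm_prob[e]) + math.log(lm_prob[e]))

# Made-up scores for translating "Yo lo haré pronto" (illustration only).
tm_prob = {  # P(f | e): how faithful each candidate is to the source
    "I will do it soon": 0.4,
    "I will do it around": 0.4,
    "See you tomorrow": 0.001,
}
lm_prob = {  # P(e): how fluent each candidate is as English
    "I will do it soon": 0.01,
    "I will do it around": 0.0001,
    "See you tomorrow": 0.02,
}
best = noisy_channel_best(list(tm_prob), tm_prob, lm_prob)
print(best)
```

With these numbers the translation model cannot separate the first two candidates, and the language model breaks the tie in favor of the fluent one.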
Word Alignment
What is the anticipated cost of collecting fees under the new proposal?
En vertu des nouvelles propositions, quel est le coût prévu de perception des droits?
[Figure: alignment grid with the English words on one axis (What is the anticipated cost of collecting fees under the new proposal ?) and the French tokens on the other (En vertu de les nouvelles propositions , quel est le coût prévu de perception de les droits ?)]
Unsupervised Word Alignment
§ Input: a bitext: pairs of translated sentences
§ Output: alignments: pairs of translated words
§ Not always one-to-one!
nous acceptons votre opinion .
we accept your view .
1-to-Many Alignments
Evaluating Models
§ How do we measure quality of a word-to-word model?
§ Method 1: use in an end-to-end translation system
  § Slow development cycle
  § Misleading if your MT system was “tuned” for certain aspects of bad alignments
§ Method 2: measure quality of the alignments produced
  § Easy to measure
  § Hard to know what the gold alignments should be
  § Often does not correlate well with translation quality
Alignment Error Rate
§ Alignment Error Rate
[Figure legend: sure alignments, possible alignments, predicted alignments]
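The formula itself did not survive on this slide. The definition commonly used in the literature (Och and Ney) takes sure links S, possible links P (with S counted as possible), and predicted links A, and computes AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). A minimal sketch under that assumption, with links represented as (English index, foreign index) pairs:

```python
def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Links are (english_index, foreign_index) pairs. Sure links are treated
    as possible too. Assumes predicted and sure are not both empty.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# Toy example: one correct sure link, one link that is only "possible".
aer = alignment_error_rate(
    predicted={(0, 0), (1, 2)},
    sure={(0, 0), (1, 1)},
    possible={(1, 2)},
)
print(aer)
```

Predicting a possible link costs nothing in precision, but missing a sure link costs recall, so lower AER is better.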
IBM Model 1
IBM Model 1 (Brown 93)
§ Alignments: a hidden vector called an alignment specifies which English source (or a special null token) is responsible for each French target word.
E: Thank(1) you(2) ,(3) I(4) shall(5) do(6) so(7) gladly(8) .(9)
A: 1 3 7 6 8 8 8 8 9
F: Gracias , lo haré de muy buen grado .
Model Parameters
P( F1 = Gracias | A1 = 1) = P(Gracias | Thank) <- learn these translation probs
P(A1 = 1) = 1/10, nothing to learn
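Putting the two parameter types together, the Model 1 probability of a foreign sentence and its alignment can be written as below. This is the standard form with the sentence-length term left out (an assumption of this sketch); I is the number of English words and J the number of foreign words.

```latex
P(F, A \mid E) = \prod_{j=1}^{J} P(A_j)\, P(F_j \mid E_{A_j}),
\qquad
P(A_j = i) = \frac{1}{I + 1} \quad \text{for } i \in \{0, 1, \ldots, I\}
```

Here position 0 stands for the null token, so the 9-token English sentence in the example gives 1/(9 + 1) = 1/10, matching the slide.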
EM for Model 1
§ Model 1 Parameters: Translation probabilities P(f|e)
§ Start with P(f|e) uniform, including P(f|null)
§ For each sentence, for each foreign position j:
  § Calculate posterior over English positions:
    $P(a_j = i \mid f, e) = \dfrac{P(f_j \mid e_i)}{\sum_{i'} P(f_j \mid e_{i'})}$
  § Increment count of word f_j with word e_i by these amounts
§ Do for whole corpus, re-estimate P(f|e) with M-step
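The EM loop described above can be sketched as follows. This is a minimal version, assuming a tokenized bitext given as (foreign, English) pairs and a fixed number of iterations; `train_model1` and the `NULL` token string are names chosen for this example.

```python
from collections import defaultdict

NULL = "<null>"  # hypothetical name for the special null token

def train_model1(bitext, iterations=10):
    """EM for IBM Model 1.

    bitext: list of (foreign_tokens, english_tokens) pairs.
    Returns t with t[(f, e)] = P(f | e).
    """
    f_vocab = {f for fs, _ in bitext for f in fs}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)  # uniform start, including P(f | NULL)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned to
        for fs, es in bitext:
            es = [NULL] + es
            for f in fs:
                # E-step: posterior over English positions for this f.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # M-step: renormalize expected counts into P(f | e).
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

toy_bitext = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]
t = train_model1(toy_bitext)
```

On the toy bitext, "la" co-occurs with "the" in both sentences, so EM shifts the probability mass of "house" toward "maison".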
Problems with Model 1
§ There’s a reason they designed models 2-5!
§ Problems: alignments jump around, align everything to rare words
§ Experimental setup:
  § Training data: 1.1M sentences of French-English text, Canadian Hansards
  § Evaluation metric: alignment error rate (AER)
  § Evaluation data: 447 hand-aligned sentences
Intersected Model 1
§ Post-intersection: standard practice to train models in each direction then intersect their predictions [Och and Ney, 03]
§ Second model is basically a filter on the first
  § Precision jumps, recall drops
  § End up not guessing hard alignments

| Model | P/R | AER |
|---|---|---|
| Model 1 E→F | 82/58 | 30.6 |
| Model 1 F→E | 85/58 | 28.7 |
| Model 1 AND | 96/46 | 34.8 |
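Post-intersection is just a set intersection over predicted links. The sketch below assumes both directional models emit links as (English index, foreign index) pairs; the toy links and the helper names are made up for illustration.

```python
def intersect_alignments(e2f_links, f2e_links):
    """Keep only links predicted by both directional models."""
    return set(e2f_links) & set(f2e_links)

def precision_recall(predicted, gold):
    """Precision and recall of predicted links against gold links."""
    correct = len(set(predicted) & set(gold))
    return correct / len(predicted), correct / len(gold)

# Toy links (illustration only).
gold = {(0, 0), (1, 1), (2, 2), (3, 3)}
e2f = {(0, 0), (1, 1), (2, 3), (3, 3)}
f2e = {(0, 0), (1, 1), (2, 2), (3, 2)}
both = intersect_alignments(e2f, f2e)

print(precision_recall(e2f, gold))   # one direction alone
print(precision_recall(both, gold))  # after intersection
```

In this toy case the intersection keeps only the two links both models agree on, so precision rises while recall falls, the same pattern as in the table.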
HMM Model: Local Monotonicity
Monotonic Translation
Le Japon secoué par deux nouveaux séismes
Japan shaken by two new quakes
Local Order Change
Le Japon est au confluent de quatre plaques tectoniques
Japan is at the junction of four tectonic plates
The HMM Model
§ Want local monotonicity: most jumps are small
§ HMM model (Vogel 96)
§ Re-estimate using the forward-backward algorithm
[Figure: distribution over jump sizes, with axis ticks -2 -1 0 1 2 3]
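In the HMM alignment model, the transition probability depends on the jump a_j - a_{j-1} rather than on absolute positions. A minimal sketch of that parameterization follows; the particular `jump_weight` shape (peaked at a jump of +1) is a made-up stand-in for the weights that forward-backward would actually learn.

```python
def jump_weight(jump):
    """Hypothetical unnormalized weight for a jump size; peaks at +1."""
    return 0.5 ** abs(jump - 1)

def transition_probs(prev_pos, num_english, weight=jump_weight):
    """P(a_j = i | a_{j-1} = prev_pos), proportional to weight(i - prev_pos)
    and normalized over the English positions 0 .. num_english - 1."""
    weights = [weight(i - prev_pos) for i in range(num_english)]
    z = sum(weights)
    return [w / z for w in weights]

probs = transition_probs(prev_pos=2, num_english=6)
print(probs)
```

Because only the jump size is parameterized, the same small table of weights is shared across all sentence lengths and positions.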
HMM Examples
AER for HMMs

| Model | AER |
|---|---|
| Model 1 INT | 19.5 |
| HMM E→F | 11.4 |
| HMM F→E | 10.8 |
| HMM AND | 7.1 |
| HMM INT | 4.7 |
| GIZA M4 AND | 6.9 |

Language Modeling
N-gram Language Modeling
§ Could give several lectures on this!
§ Estimate $P(w_n \mid w_{n-k}, w_{n-k+1}, \ldots, w_{n-1})$
§ Generative model: read off counts and normalize
  § P(fox | the quick brown) = 0.9, etc.
§ Very complex distributions, need to smooth
  § Interpolate with lower-order models
  § Lots of complex techniques
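To make "read off counts and normalize" and "interpolate with lower-order models" concrete, here is a minimal bigram model linearly interpolated with a unigram model. The class name, the fixed interpolation weight `lam`, and the toy sentences are assumptions of this sketch; practical systems use higher orders and more careful smoothing.

```python
from collections import Counter

class InterpolatedBigramLM:
    """P(w | prev) = lam * P_bigram(w | prev) + (1 - lam) * P_unigram(w)."""

    def __init__(self, sentences, lam=0.7):
        self.lam = lam
        self.unigrams = Counter()
        self.bigrams = Counter()
        self.contexts = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence + ["</s>"]
            self.unigrams.update(tokens[1:])  # never predict <s>
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[(prev, word)] += 1
                self.contexts[prev] += 1
        self.total = sum(self.unigrams.values())

    def prob(self, word, prev):
        p_unigram = self.unigrams[word] / self.total
        seen = self.contexts[prev]
        p_bigram = self.bigrams[(prev, word)] / seen if seen else 0.0
        return self.lam * p_bigram + (1 - self.lam) * p_unigram

lm = InterpolatedBigramLM([
    ["the", "quick", "brown", "fox"],
    ["the", "quick", "brown", "dog"],
])
print(lm.prob("fox", "brown"), lm.prob("fox", "the"))
```

The unseen bigram "the fox" still gets nonzero probability from the unigram term, which is the point of smoothing.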
Phrase-Based MT
Phrase-Based System Overview
§ We have a phrase table now (ran aligner, extracted phrases and counted them to get scores). Phrase extraction and counting are tricky, but we’ll ignore this...
Phrase-Based Decoding
这 7人 中包括 来自 法国 和 俄罗斯 的 宇航 员 .
Decoder design is important: [Koehn et al. 03]
Monotonic Word Translation
§ Cost is LM * TM
§ It’s an HMM?
  § P(e | e-1, e-2)
  § P(f | e)
§ State includes
  § Exposed English
  § Position in foreign
§ Dynamic program loop?
TM: a <- to 0.8
TM: a <- by 0.1
LM: a slap to 0.02
LM: a slap by 0.01
[…. a slap, 5] 0.00001
[…. slap to, 6] 0.00000016
[…. slap by, 6] 0.00000001
for (fPosition in 1…|f|)
  for (eContext in allEContexts)
    for (eOption in translations[fPosition])
      score = scores[fPosition-1][eContext] * LM(eContext+eOption) * TM(eOption, fWord[fPosition])
      scores[fPosition][eContext[2]+eOption] =max score
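A runnable version of this dynamic program might look like the sketch below. It assumes a trigram language model passed as a function `lm(context, e)`, a translation model `tm(e, f)`, and at least one translation option per foreign word. The Spanish words and the fallback probability of 1.0 are illustration only; the two TM and two LM values are the ones from the slide.

```python
def monotonic_decode(f_words, translations, lm, tm):
    """Exact Viterbi for monotonic word-by-word translation.

    State = the last two English words (the exposed trigram context).
    Returns (best score, best English word list).
    """
    scores = {("<s>", "<s>"): (1.0, [])}  # state -> (score, English prefix)
    for f in f_words:
        new_scores = {}
        for context, (score, prefix) in scores.items():
            for e in translations[f]:
                new_score = score * lm(context, e) * tm(e, f)
                state = (context[1], e)
                # Keep only the best way of reaching each state.
                if state not in new_scores or new_score > new_scores[state][0]:
                    new_scores[state] = (new_score, prefix + [e])
        scores = new_scores
    return max(scores.values(), key=lambda item: item[0])

# Toy models: unlisted events get probability 1.0 (an assumption).
TM = {("to", "a"): 0.8, ("by", "a"): 0.1}
LM = {(("a", "slap"), "to"): 0.02, (("a", "slap"), "by"): 0.01}
tm = lambda e, f: TM.get((e, f), 1.0)
lm = lambda context, e: LM.get((context, e), 1.0)
translations = {"una": ["a"], "bofetada": ["slap"], "a": ["to", "by"]}

best_score, best_words = monotonic_decode(["una", "bofetada", "a"], translations, lm, tm)
print(best_score, best_words)
```

The number of states grows with the square of the English vocabulary for a trigram model, which is why this exact search does not scale.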
Beam Decoding
§ For real MT models, this kind of dynamic program is a disaster (why?)
§ Standard solution is beam search: for each position, keep track of only the best k hypotheses

for (fPosition in 1…|f|)
  for (eContext in bestEContexts[fPosition])
    for (eOption in translations[fPosition])
      score = scores[fPosition-1][eContext] * LM(eContext+eOption) * TM(eOption, fWord[fPosition])
      bestEContexts.maybeAdd(eContext[2]+eOption, score)
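The beam variant changes one thing: after each foreign position, only the k best hypotheses survive. A minimal sketch, assuming the same interfaces as the pseudocode (`lm(context, e)` for a trigram LM, `tm(e, f)` for the translation model); the toy models and the `beam_size` default are made up for illustration.

```python
import heapq

def beam_decode(f_words, translations, lm, tm, beam_size=5):
    """Monotonic word-by-word decoding with a beam of size beam_size.

    Each hypothesis is (score, state, English prefix), where state is the
    last two English words. Returns the best surviving hypothesis.
    """
    beam = [(1.0, ("<s>", "<s>"), [])]
    for f in f_words:
        candidates = {}
        for score, context, prefix in beam:
            for e in translations[f]:
                new_score = score * lm(context, e) * tm(e, f)
                state = (context[1], e)
                # Recombine hypotheses that share a state.
                if state not in candidates or new_score > candidates[state][0]:
                    candidates[state] = (new_score, state, prefix + [e])
        # Prune: keep only the best beam_size hypotheses for this position.
        beam = heapq.nlargest(beam_size, candidates.values(), key=lambda h: h[0])
    return max(beam, key=lambda h: h[0])

# Toy models: unlisted events get probability 1.0 (an assumption).
TM = {("to", "a"): 0.8, ("by", "a"): 0.1}
LM = {(("a", "slap"), "to"): 0.02, (("a", "slap"), "by"): 0.01}
tm = lambda e, f: TM.get((e, f), 1.0)
lm = lambda context, e: LM.get((context, e), 1.0)
translations = {"una": ["a"], "bofetada": ["slap"], "a": ["to", "by"]}

score, state, words = beam_decode(["una", "bofetada", "a"], translations, lm, tm, beam_size=1)
print(score, words)
```

Pruning makes the search approximate: a hypothesis dropped early can never be recovered, even if it would have led to the best full translation.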
Example from David Chiang
Phrase Translation
§ If monotonic, almost an HMM; technically a semi-HMM
§ If distortion… now what?

for (fPosition in 1…|f|)
  for (lastPosition < fPosition)
    for (eContext in eContexts)
      for (eOption in translations[fPosition])
        … combine hypothesis for (lastPosition ending in eContext) with eOption
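For the monotonic case, the semi-HMM structure can be sketched as a dynamic program over segmentation points. To stay short, this version drops the language model and distortion entirely (a deliberate simplification, not the full loop above) and scores a derivation as the product of phrase table scores. The table entries are the ones shown in the phrase table example; `max_len` is a hypothetical cap on phrase length.

```python
def monotonic_phrase_decode(f_words, phrase_table, max_len=3):
    """best[j] holds the best (score, English phrases) covering f_words[:j].

    phrase_table maps a foreign phrase string to a list of
    (english_phrase, score) options. Returns None if no derivation exists.
    """
    n = len(f_words)
    best = [None] * (n + 1)
    best[0] = (1.0, [])
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):  # last phrase spans i..j
            if best[i] is None:
                continue
            source = " ".join(f_words[i:j])
            for target, p in phrase_table.get(source, []):
                candidate = (best[i][0] * p, best[i][1] + [target])
                if best[j] is None or candidate[0] > best[j][0]:
                    best[j] = candidate
    return best[n]

phrase_table = {
    "chat": [("cat", 0.9)],
    "le chat": [("the cat", 0.8)],
    "maison": [("house", 0.6)],
    "ma maison": [("my house", 0.9)],
}
print(monotonic_phrase_decode(["ma", "maison"], phrase_table))
print(monotonic_phrase_decode(["le", "chat"], phrase_table))
```

Once distortion is allowed, the state must also record which foreign words are already covered, which is what makes the non-monotonic search so much harder.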
Non-Monotonic Phrasal MT
Pruning: Beams + Forward Costs
§ Problem: easy partial analyses are cheaper
§ Solution 1: use beams per foreign subset
§ Solution 2: estimate forward costs (A*-like)
The Pharaoh Decoder
Hypothesis Lattices
Syntactic Models
Syntactic Translation
§ Lots of complexity: large phrase tables, errors introduced by parsers, parses don’t agree, inference is harder, ...
§ Good for some languages (Japanese->English), but generally more trouble than it’s worth
§ Easier method: syntactic “pre-reordering”
MT: Takeaways
§ Word alignments: unsupervised process for finding word-level correspondences. Turn these into phrase-level correspondences -> phrase table
§ Language model: estimate n-gram model on a very large corpus
§ Translation process: use beam search to find the best translation $\arg\max_e P(f \mid e)\, P(e)$