TRANSCRIPT
Announcements
• Take a few minutes for the evaluation at the beginning of class
• Grades: expect to post final grades by 12/23
• Several funded GRA positions open to MS students next spring – if interested, email me.
• HW3 grades returned today
• Final exam will be in two locations. You will receive an email about where.
• DO NOT ANSWER POLL EVERYWHERE AHEAD OF TIME
HW4
• Instructions for submitting: The directory structure on the bitbucket repo should be exactly the same as the hw4.zip provided to you (with the exception of the data directory – do not upload it). To push the code to the remote repo, use the same instructions as given in HW0. Double check your remote repo for the correct directory structure. We won't consider any regrade requests based on the wrong-directory-structure penalty. Again, do not upload data to your bitbucket repo.
• HW4: Use log of weights in attention to get a slightly better visualization. See post by Dheeraj.
• Apoorv has office hours TODAY 4-6 for those with questions.
• Written part: Not just talking about small extensions to the neural net. Think big.
Projects in NLP
• Search and summarization over low-resource languages: How do we summarize in the source? How do we summarize when translation is bad? How do we summarize speech?
• Identifying aggression and loss in posts from gang-involved youth: Can we identify patterns in posts over time? Can we use the social network? Can we identify references to triggering events?
• Identifying hate speech: Bullying, threats against journalists, how does culture affect interpretation?
• Joint use of visual and textual cues: to identify sentiment towards targets, to identify events
Poetry Generation
• "Generating Topical Poetry" (Ghazvininejad et al., 2016) https://aclweb.org/anthology/D16-1126
• Hafez – generates sonnets on a user-provided topic: iambic pentameter; every other line rhymes
• Rough overview of methods plus output
System Overview
• Select large vocabulary and compute stress patterns
• Select words related to user-supplied topic
• Select pairs of rhyming words to end lines
• Build FSA with a path for every conceivable sequence of vocabulary words that obeys the formal rhythm constraints with rhyme words in place
• Select a fluent path through the FSA using an RNN for scoring
Vocabulary
• Iambic pentameter: ten syllables alternating between unstressed and stressed
• "Attending on his golden pilgrimage": 0101010101
• Use CMU pronunciation dictionary
• Remove words whose stress pattern does not match the iambic pattern (see the sketch below)
• Remove ambiguous words (record_N 10; record_V 01)
• Avoids "to", "it", "in", "is"
• Final vocabulary: 14,368 words (4,833 monosyllabic; 9,535 multisyllabic)
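A minimal sketch of this filtering step (not the authors' code), using NLTK's copy of the CMU pronunciation dictionary; the function names are mine:

from nltk.corpus import cmudict   # assumes nltk.download('cmudict') has been run

prons = cmudict.dict()
IAMBIC = "0101010101"             # ten-syllable template; any fragment of it is iambic

def stress_patterns(word):
    """All stress strings for a word, e.g. 'attending' -> {'010'} (secondary 2 -> 1)."""
    pats = set()
    for pron in prons.get(word, []):
        s = "".join(ch for phone in pron for ch in phone if ch.isdigit())
        pats.add(s.replace("2", "1"))
    return pats

def keep(word):
    pats = stress_patterns(word)
    if len(pats) != 1:            # unknown or ambiguous stress (record N/V): drop
        return False
    p = pats.pop()
    # the word must slot into an alternating 0101... line at some offset
    return bool(p) and (p in IAMBIC or p in "1" + IAMBIC)

vocab = [w for w in prons if keep(w)]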
Selecting topically related words
• User supplies a topic: colonel
• Output: colonel, lieutenant_colonel, brigadier_general, commander, army
• Use word2vec with a window size of 40
• Word embedding vector for the topic word or phrase
• Word embeddings for each vocabulary word
• How would similarity be computed? (see below)
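The answer the next slide relies on is cosine similarity between embedding vectors; a minimal numpy sketch (variable names are mine, and topic_vec / word_vecs are assumed to come from a trained word2vec model):

import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def related_words(topic_vec, word_vecs, k=5):
    """Rank vocabulary words by cosine similarity to the topic embedding."""
    scored = [(cosine(topic_vec, v), w) for w, v in word_vecs.items()]
    return [w for _, w in sorted(scored, reverse=True)[:k]]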
Rhyme words
• Shakespearean sonnet: ABAB CDCD EFEF GG
• Strict rhyme: sounds of two words must match from the last stressed vowel onwards
• Masculine rhyme: the last syllable is stressed
• Feminine rhyme: the penultimate syllable is stressed
• Pre-compute strict rhyme classes for words and hash vocabulary into those classes (CMU pronunciation dictionary)
• Slant rhymes: viking/fighting, snoopy/spooky, baby/crazy and comic/ironic
Rhyme word selection
• Hash all related words/phrases into rhyme classes
• Each collision generates a candidate rhyme pair (s1, s2)
• Score pair with max: cosine(s1, topic); cosine(s2, topic)
• Choose rhyme pairs randomly with probability proportional to their score (sketch below)
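A sketch of this selection step under my own simplifying assumptions: rhyme_class stands in for the pre-computed CMU-based strict-rhyme classes, topic_sim for the cosine score, and scores are assumed positive.

import random
from collections import defaultdict

def select_rhyme_pairs(related, topic_sim, rhyme_class, n_pairs=7):
    classes = defaultdict(list)
    for w in related:
        classes[rhyme_class(w)].append(w)           # hash into rhyme classes
    pairs, scores = [], []
    for words in classes.values():
        for i in range(len(words)):
            for j in range(i + 1, len(words)):      # each collision -> candidate pair
                s1, s2 = words[i], words[j]
                pairs.append((s1, s2))
                scores.append(max(topic_sim(s1), topic_sim(s2)))
    # sample without replacement, probability proportional to score
    chosen = []
    for _ in range(min(n_pairs, len(pairs))):
        k = random.choices(range(len(pairs)), weights=scores)[0]
        chosen.append(pairs.pop(k)); scores.pop(k)
    return chosen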
FSA Construction
• Create a large FSA that encodes all word sequences that use the selected rhyme pairs and obey the formal sonnet constraints
• Contains 14 lines
• Lines are in iambic pentameter with stress pattern (01)^5 or (01)^5 0 (feminine)
• Each line ends with a chosen rhyme word/phrase
• Each line is punctuated with a comma or period, except for the 4th, 8th, 12th and 14th, which have a period
FSA Output
• Topic: natural language
• Contains 10^229 paths
• Randomly selected path:
  Of pocket solace ammunition grammar.
  An tile pretenders spreading logical.
  An stories Jackie gallon posing banner.
  An corpses Kato biological …
Path extraction through FSA with RNN
• Need a scoring function and a search procedure
• RNN "generation model"
• Two-layer LSTM with beam search guided by the FSA (sketch below)
• Beam search state: (h, s, word, score)
• h: the hidden state of the LSTM at step t in the ith state
• s: the FSA state at step t in the ith state
• Generates one word at each step
• Trained using song lyrics -> repeating words ("never ever ever ever ever")
• Apply a penalty to words already generated
• A beam of 50 often results in not being able to generate the final rhyming word in the FSA
• Generate the whole sonnet in reverse
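A toy sketch of FSA-guided beam search (my own simplification: the FSA is a dict from state to (word, next_state) arcs, and lm_score stands in for the LSTM's log-probability of a word given its history):

def beam_search(fsa, start_state, n_steps, lm_score, beam_size=50):
    """Each hypothesis mirrors the slide's state: (score, fsa_state, words)."""
    beam = [(0.0, start_state, [])]
    for _ in range(n_steps):
        candidates = []
        for score, state, words in beam:
            for word, nxt in fsa.get(state, []):    # only FSA-legal continuations
                candidates.append((score + lm_score(word, words), nxt, words + [word]))
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]
    return max(beam, key=lambda c: c[0]) if beam else None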
Translation model
• Use an encoder-decoder LSTM
• Assemble rhyming words in reverse order (encoder side)
• Paired with the entire reversed lyric (decoder side)
• At generation time: put all selected rhyme words on the source side and let the model generate the poem conditioned on those rhyme words
• When generating the last line, it already knows all 14 rhyme words
Results
• Translation model better than generation model
• Encouraging topic words better than not
• Does the system plagiarize?
• Average copying of 1.2 5-grams per sonnet
• If we relax the repeated-word penalty -> 7.9 copied 5-grams
• If we relax iambic meter -> 10.6 copied 5-grams
Bipolar Disorder
• Existence enters your entire nation.
  A twisted mind reveals becoming manic,
  An endless modern ending medication,
  Another rotten soul becomes dynamic.
• Or under pressure on genetic tests.
  Surrounded by controlling my depression,
  And only human torture never rests,
  Or maybe you expect an easy lesson.
• Or something from the cancer heart disease,
  And I consider you a friend of mine.
  Without a little sign of judgement please,
  Deliver me across the borderline.
• An altered state of manic episodes,
  A journey through the long and winding roads.
Other topics
• Love at First Sight
  An early morning on a rainy night,
  Relax and make the other people happy,
  Or maybe get a little out of sight,
  And wander down the streets of Cincinnati.
• Girlfriend
  Another party started getting heavy.
  And never had a little bit of Bobby,
  Or something going by the name of Eddie,
  And got a finger on the trigger sloppy.
• Noodles
  The people wanna drink spaghetti alla,
  And maybe eat a lot of the other crackers,
  Or sit around and talk about the salsa,
  A little bit of nothing really matters.
Final Review
• Final exam will be in two locations. You will receive an email about where.
• Today: after-midterm material only. Look at the midterm review to review earlier topics.
• Final will be cumulative
• Some emphasis towards the last half of the class
• Some emphasis towards topics not tested in homeworks
• Anything covered in class is a potential topic for the exam
• Calculator allowed in final. No other electronics. No notes or books.
• Three review sessions will be offered by TAs:
  • Bhavana, December 13th, Neural Net basics, HW3
  • Elsbeth, December 18th, evening
  • Fei-Tzin, December 19th, evening
• All office hours will be held between now and the final
Abstract Meaning Representation
• Given a sentence, propose a representation
• Given a representation, provide the sentence
• Understand the parsing framework
• Discuss pros and cons
AMR characteristics
• Rooted, labeled graphs
• Abstract away from syntactic differences:
  • He described her as a genius
  • His description of her: genius
  • She was a genius, according to his description
• Use PropBank framesets: "bond investor" -> invest-01
• Heavily biased towards English
AMR relations
• ~100 relations
• Frame arguments: :arg0, :arg1, :arg2, :arg3, :arg4, :arg5 (PropBank)
• General semantic relations: :accompanier, :age, :beneficiary, :cause, :compared-to, :concession, :condition, :consist-of, :degree, :destination, :direction, :domain, :duration, :employed-by, :example, :extent, :frequency, :instrument, :li, :location, :manner, :medium, :mod, :mode, :name, :part, :path, :polarity, :poss, :purpose, :source, :subevent, :subset, :time, :topic, :value
• Relations for quantities: :quant, :unit, :scale
• Relations for date entities: :day, :month, :year, :weekday, :time, :timezone, :quarter, :dayperiod, :season, :year2, :decade, :century, :calendar, :era
• Relations for lists: :op1, :op2, …, :op10
• Plus inverses (e.g., :arg0-of, :location-of)
NOT NECESSARY TO MEMORIZE – WOULD BE PROVIDED
Framesets
• Examples of using framesets to abstract away from English syntax
• (d / describe-01
     :arg0 (m / man)
     :arg1 (m2 / mission)
     :arg2 (d2 / disaster))
• :arg0 the describer, :arg1 the thing described, :arg2 what it is described as
• The man described the mission as a disaster. / As the man described it, the mission was a disaster.
Questions
• amr-unknown to indicate wh-questions
• (f / find-01
     :arg0 (g / girl)
     :arg1 (a / amr-unknown))
• What did the girl find?
Compositionality
• The meaning of the whole is equal to the sum of the meanings of its parts
• How is AMR compositional?
  (d / describe-01
     :arg0 (m / man)
     :arg1 (m2 / mission)
     :arg2 (d2 / disaster))
• (s / spy :arg0-of (a / attract-01))
• What is the AMR for "the attractive spy described the mission as a disaster"? (a worked answer follows)
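A worked answer (my reconstruction, not from the slides), composing the two fragments above by substituting the spy fragment for the :arg0 of describe-01:

(d / describe-01
   :arg0 (s / spy
            :arg0-of (a / attract-01))
   :arg1 (m / mission)
   :arg2 (d2 / disaster))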
Learning to Search (L2S)
• Family of approaches that solve structured prediction problems
• Decomposes the production of the structured output in terms of an explicit search space
• Learns hypotheses that control a policy that takes actions in the search space
• AMR is a structured semantic representation
• Model learning of concepts and relations in a unified setting.
AMR parsing task decomposed
• Predicting concepts
• Predicting the root
• Predicting relations between predicted concepts
Search space
• States = {x1, x2, …, xn, y1, y2, …, y_{i-1}}, where the input {x1, x2, …, xn} is the n words of the sentence
• Concept prediction: labels y1, y2, …, y_{i-1} are the concepts predicted up to i-1. Next action: y_i is the concept for word x_i, drawn from a k-best list of concepts
• Relation prediction: labels are relations for predicted pairs of concepts
• Root prediction: a multi-task classifier selects the root concept from all predicted concepts
(A sketch of the concept-prediction loop follows.)
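A minimal sketch of the concept-prediction pass as the slide describes it (policy and kbest are placeholders for the learned hypothesis and the per-word candidate lists, both assumptions of mine):

def predict_concepts(words, kbest, policy):
    """Left-to-right concept prediction: the state at step i is the
    sentence plus the concepts chosen so far; the action is y_i."""
    concepts = []
    for x_i in words:
        state = (words, tuple(concepts))
        y_i = policy(state, kbest[x_i])   # choose y_i from the k-best list for x_i
        concepts.append(y_i)
    return concepts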
Topics to know
• How to do word sense disambiguation
• Distributed vs distributional representations
• How to compute text similarity
• What word embeddings capture
Main Idea of word2vec
• Predict between every word and its context
• Two algorithms:
  • Skip-gram (SG): predict context words given target (position independent)
  • Continuous Bag of Words (CBOW): predict target word from bag-of-words context
Slide adapted from Chris Manning
Training Methods
• Two (moderately efficient) training methods: hierarchical softmax, negative sampling
• Today: naïve softmax
Slide adapted from Chris Manning
[Figure: two example sentences with center word "bank" and a context window of 2 words on each side: "Instead, a bank can hold the investments in a custodial account" and "But as agriculture burgeons on the east bank, the river will shrink".]
Objective Function
• Maximize the probability of context words given the center word:
  J'(Θ) = ∏_{t=1..T} ∏_{-m≤j≤m, j≠0} P(w_{t+j} | w_t; Θ)
• Negative log likelihood:
  J(Θ) = -(1/T) Σ_{t=1..T} Σ_{-m≤j≤m, j≠0} log P(w_{t+j} | w_t)
  where Θ represents all variables to be optimized
Slide adapted from Chris Manning
Softmax
• Using word c to obtain the probability of word o
• Convert P(w_{t+j} | w_t):
  P(o | c) = exp(u_o^T v_c) / Σ_{w=1..V} exp(u_w^T v_c)
  (exponentiate to make positive, then normalize), where o is the outside (or output) word index, c is the center word index, and v_c and u_o are the center and outside vectors for indices c and o (numpy sketch below)
Slide adapted from Chris Manning
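A minimal numpy sketch of this softmax (U and V are the outside and center embedding matrices; the names and shapes are my assumptions):

import numpy as np

def p_outside_given_center(o, c, U, V):
    """P(o | c) = exp(u_o . v_c) / sum_w exp(u_w . v_c).
    U: |V| x d outside vectors; V: |V| x d center vectors."""
    scores = U @ V[c]                  # u_w . v_c for every word w
    scores -= scores.max()             # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs[o]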
Neural Nets
• Basic architecture of a feedforward neural network
• Loss and gradient descent
• Softmax
• Backpropagation
• Determining dimensions of parameters, input and output (basically, all HW3 questions)
• Recurrent Neural Network
• LSTM
RNN – "I had in mind your facts, buddy, not hers."
[Figure, repeated across four slides: an RNN unrolled over the input "I had in mind …". At each step t, the input x_t is multiplied by W_x, combined with the previous hidden state h_{t-1} through U (later relabeled W_h), and passed through σ to give h_t; the final hidden state feeds a sigmoid output y_3.]
• W_x are the weights: the word embedding matrix; multiplication with x_t yields the embedding for x_t. U is another weight matrix. h_0 is often not specified. h is the hidden layer.
• h_t = σ(U h_{t-1} + W_x x_t)
• y = positive? y = negative? The final embedding is run through the sigmoid function -> [0, 1], with 1 = positive and 0 = negative. Often the final h is used as an embedding for the whole sentence.
(A numpy sketch of the forward pass follows.)
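A minimal numpy sketch of this forward pass and final prediction (dimensions and initialization are illustrative, not from the slides):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, Wx, U, Wy, h0=None):
    """h_t = sigmoid(U h_{t-1} + Wx x_t); y = sigmoid(Wy h_T)."""
    h = np.zeros(U.shape[0]) if h0 is None else h0
    for x in xs:                       # xs: one-hot (or embedded) input vectors
        h = sigmoid(U @ h + Wx @ x)
    return sigmoid(Wy @ h)             # value in [0, 1]: 1 = positive, 0 = negative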
Questions
• How is h computed?
• What parameters are learned?
• How is y predicted?
• What are the problems with an RNN?
Updating Parameters of an RNN
[Figure: the same unrolled RNN ("I had in …") with weights W_x, W_h, W_y, a sigmoid output y_3, and a cost node.]
• Backpropagation through time
• Gold label = 0 (negative)
• Adjust weights using the gradient
• Repeat many times with all examples
Slide from Radev
Question
Question 30. Suppose you are given the following step function:

def step(x_t, h_tm1, c_tm1):
    u_t = T.nnet.sigmoid(T.dot(params["Wx"], x_t) + T.dot(params["Wh"], h_tm1))
    # Calculate the input gate
    i = T.nnet.sigmoid(T.dot(params["Wxi"], x_t) + T.dot(params["Whi"], h_tm1))
    # Calculate the forget gate
    f = T.nnet.sigmoid(T.dot(params["Wxf"], x_t) + T.dot(params["Whf"], h_tm1))
    # Calculate the output gate
    o = T.nnet.sigmoid(T.dot(params["Wxo"], x_t) + T.dot(params["Who"], h_tm1))
    # Find the memory cell value for the current time step
    c_t = f * c_tm1 + i * u_t
    # Find the hidden value for the current time step
    h_t = o * T.tanh(c_t)
    return h_t, c_t

• Assume that T.nnet.sigmoid applies the sigmoid function, T.tanh applies the tanh function, and T.dot(A, B) computes the dot product of A and B. The * operator performs elementwise multiplication when applied to two vectors. params is a dictionary of parameters that have already been initialized. Which deep learning architecture does this function belong to?
  a. Recursive Neural Network
  b. Gated Recurrent Unit
  c. Long Short Term Memory Network
  d. Convolutional Neural Network
Gated Architectures
• RNN: at each state of the architecture, the entire memory state (h) is read and written
• Gate = binary vector g ∈ {0,1}^n; controls access to the n-dimensional vector via x ⊙ g
• Consider s' <- g ⊙ x + (1 - g) ⊙ s:
  • Reads entries from x specified by g
  • Copies the remaining entries from s (or h, as we've been labeling the hidden state)
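A tiny numpy illustration of that gate (the values are made up):

import numpy as np

g = np.array([1, 0, 1, 0])            # binary gate
x = np.array([5., 6., 7., 8.])        # new values
s = np.array([1., 2., 3., 4.])        # old state
s_new = g * x + (1 - g) * s           # -> [5., 2., 7., 4.]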
LSTM Solution
§ Use a memory cell to store information at each time step.
§ Use "gates" to control the flow of information through the network.
  § Input gate: protect the current step from irrelevant inputs
  § Output gate: prevent the current step from passing irrelevant outputs to later steps
  § Forget gate: limit information passed from one cell to the next
[slides from Catherine Finegan-Dollak]
Transforming RNN to LSTM
• u_t = σ(W_h h_{t-1} + W_x x_t)
[Figure: the RNN update produces a candidate u_1 from x_1 and h_0 via W_x, W_h, and σ.]
[slides from Catherine Finegan-Dollak]
Transforming RNN to LSTM
• c_t = f_t ⊙ c_{t-1} + i_t ⊙ u_t
[Figure, repeated across three slides: the memory cell c_1 combines the previous cell c_0, gated by f_1, with the candidate u_1, gated by i_1.]
[slides from Catherine Finegan-Dollak]
Summarization
• Extractive vs abstractive summarization
• Indicative vs informative summary
• Single-document vs multi-document
• Generic vs user-focused
Topic Signature Words
• Uses the log-likelihood ratio test to find words that are highly descriptive of the input
• The log-likelihood ratio test provides a way of setting a threshold to divide all the words in the input into descriptive or not, by comparing two hypotheses:
  • the probability of a word in the input is the same as in the background
  • the word has a different, higher probability in the input than in the background
• The binomial distribution is used to compute the ratio of the two likelihoods
• The sentences containing the highest proportion of topic signatures are extracted.
Log likelihood ratio
• In the standard binomial formulation, with b(k; n, p) = C(n, k) p^k (1 - p)^{n-k} the probability of w occurring k times in n Bernoulli trials:
  λ = [b(k_i; n_i, p) · b(k_B; n_B, p)] / [b(k_i; n_i, p_i) · b(k_B; n_B, p_B)]
• The counts with subscript i are from the input corpus and those with subscript B are from the background corpus
• The statistic -2 log λ has a known statistical distribution: chi-squared
(See the sketch below.)
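A sketch of this computation under the binomial model above (my implementation, working in log space to avoid underflow; the probability clamping is my own assumption; the binomial coefficients cancel in the ratio and are omitted):

import math

def log_b(k, n, p):
    """log of p^k (1-p)^(n-k); the C(n, k) terms cancel in the ratio."""
    p = min(max(p, 1e-12), 1 - 1e-12)        # guard against log(0)
    return k * math.log(p) + (n - k) * math.log(1 - p)

def llr(k_i, n_i, k_b, n_b):
    """-2 log lambda for one word: input counts (i) vs background (B)."""
    p = (k_i + k_b) / (n_i + n_b)            # H1: same probability everywhere
    p_i, p_b = k_i / n_i, k_b / n_b          # H2: separate probabilities
    log_lambda = (log_b(k_i, n_i, p) + log_b(k_b, n_b, p)
                  - log_b(k_i, n_i, p_i) - log_b(k_b, n_b, p_b))
    return -2 * log_lambda

# a word is a topic signature if llr(...) exceeds a chi-squared threshold,
# e.g. 10.83 for p < 0.001 with one degree of freedom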
Graph-based methods
• Sentence similarity is measured as a function of word overlap:
  • Frequently occurring words link many sentences
  • Similar sentences give support for each other's importance
• Input represented as a highly connected graph:
  • Vertices represent sentences
  • Edges between sentences are weighted by the similarity of the two sentences
  • Cosine similarity with TF*IDF weights for words
Sentence Selection
• Vertex importance (centrality) computed using graph algorithms
• Edge weights normalized to form a probability distribution -> Markov chain
• Compute the probability of being in each vertex of the graph at time t while making consecutive transitions from one vertex to the next
• As more transitions are made, the probability of each vertex converges -> stationary distribution (power-iteration sketch below)
• Vertices with higher probability = more important sentences
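A minimal power-iteration sketch of that convergence (W is the row-normalized similarity matrix; this omits the damping factor that LexRank-style systems usually add):

import numpy as np

def stationary_distribution(W, iters=100):
    """W: n x n row-stochastic transition matrix over sentences."""
    n = W.shape[0]
    p = np.full(n, 1.0 / n)            # start in every vertex with equal probability
    for _ in range(iters):
        p = p @ W                      # one more transition of the Markov chain
    return p                           # higher p[i] = more important sentence

# usage: rank = np.argsort(-stationary_distribution(W))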
Abstractive summarization
• What is compression?
• What is fusion?
• What traditional method might I use for a supervised compression system?
Dataset for compression (~3,000 sentence pairs)
Clarke & Lapata (2008)
Input:
• Italian air force fighters scrambled to intercept a Libyan airliner flying towards Europe yesterday as the United Nations imposed sanctions on Libya for the first time in Col Muammar Gaddafi's turbulent 22 years in power.
Compression:
• Italian air force fighters scrambled to intercept a Libyan airliner as the United Nations imposed sanctions on Libya.
Text to Text Generation
• Model text transformation as a structured prediction problem
  • Input: one or more sentences with parses
  • Output: single sentence + parse
• Joint inference over:
  • word choice
  • n-gram ordering
  • dependency structure
Thadani & McKeown, CoNLL 2013
Compression
• Input: single sentence
• Output: sentence with salient information
• Dataset + baseline from Clarke & Lapata (2008)
Neural Summarization Architecture
• Hierarchical document reader
  • Derive a meaning representation of the document from its constituent sentences
• Attention-based hierarchical content extractor
• Encoder-decoder architecture
Document Reader
• CNN sentence encoder
  • Useful for sentence classification
  • Easy to train
• LSTM document encoder
  • Avoids vanishing gradients
Types of summarization evaluation
• Automated: ROUGE scores
• Manual: Pyramid scores. What are they?
• Task-based evaluation: Does a summary help you to perform a research task better?
Machine Translation
• Challenges for multilingual translation
• What is the MT pyramid?
• What are the different trained models used in the IBM models?
• What is phrase-based MT?
IBM’sEMtrainedmodels(1-5)• Wordtransla3on• Localalignment• Fer3li3es• Class-basedalignment• Re-orderingAllareseparatemodelstotrain!Model1: ∏
=+==
m
jajm jefp
nceafpeapeafp
1
)|()1(
),|(*)|()|,(
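A direct transcription of the Model 1 score into Python (t is a nested dict of word-translation probabilities; all names here are illustrative):

def model1_score(f_words, e_words, alignment, t, epsilon=1.0):
    """p(f, a | e) = epsilon / (l+1)^m * prod_j t(f_j | e_{a_j}).
    alignment[j] is the index (0 = NULL) of the English word generating f_j."""
    l, m = len(e_words), len(f_words)
    e_with_null = ["NULL"] + e_words
    prob = epsilon / (l + 1) ** m
    for j, f in enumerate(f_words):
        prob *= t[e_with_null[alignment[j]]][f]
    return prob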
Phrase-Based Statistical MT
• Foreign input segmented into phrases
  – a "phrase" is any sequence of words
• Each phrase is probabilistically translated into English
  – P(to the conference | zur Konferenz)
  – P(into the meeting | zur Konferenz)
• Phrases are probabilistically re-ordered
See [Koehn et al., 2003] for an intro. This was state-of-the-art before neural MT.

Morgen fliege ich nach Kanada zur Konferenz
Tomorrow I will fly to the conference in Canada

Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt
Word Alignment Induced Phrases
• Mary did not slap the green witch
• Maria no dió una bofetada a la bruja verde
• (Maria, Mary) (no, did not) (slap, dió una bofetada) (la, the) (bruja, witch) (verde, green) (a la, the) (dió una bofetada a, slap the) (Maria no, Mary did not) (no dió una bofetada, did not slap), (dió una bofetada a la, slap the) (bruja verde, green witch) (Maria no dió una bofetada, Mary did not slap) (a la bruja verde, the green witch) … (Maria no dió una bofetada a la bruja verde, Mary did not slap the green witch)
Slide courtesy of Kevin Knight http://www.sims.berkeley.edu/courses/is290-2/f04/lectures/mt-lecture.ppt
(A phrase-extraction sketch follows.)
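A simplified sketch of the standard consistency-based phrase extraction behind lists like the one above (it omits the usual extension over unaligned boundary words, so it is illustrative rather than complete):

def extract_phrases(e_words, f_words, alignment, max_len=7):
    """Phrase pairs consistent with a word alignment.
    alignment: set of (e_index, f_index) links."""
    phrases = set()
    for e1 in range(len(e_words)):
        for e2 in range(e1, min(e1 + max_len, len(e_words))):
            # foreign positions linked to the English span [e1, e2]
            fs = [f for e, f in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            # consistency: nothing inside [f1, f2] may link outside [e1, e2]
            if any(f1 <= f <= f2 and not (e1 <= e <= e2) for e, f in alignment):
                continue
            phrases.add((" ".join(e_words[e1:e2 + 1]),
                         " ".join(f_words[f1:f2 + 1])))
    return phrases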
How is MT evaluation done?
• Automated metrics: BLEU, METEOR
• Human judgments:
  • Adequacy (accuracy)
  • Fluency
• How were human judgments done in WMT 2017?
• What were some approaches to quality control for crowdsourcing?
When did Neural MT surpass statistical methods (phrase-based and syntax)?
• WMT 2016
• When did companies first release NMT systems?
• 2016
Neural MT
• Encoder-decoder approach
• What is the problem with a basic RNN?
• How is attention used?
• How else has the RNN memory problem been addressed?
What other approaches?
• Train stacked RNNs using multiple layers
• Use a bidirectional encoder
  • This can help in remembering the early part of the source input sentence
• Train the input sequence in reverse order
• Deeper networks: decoder depth of 8
• Data: parallel, back-translated, duplicated monolingual