Discriminative Modeling of Extraction Sets for Machine Translation
Authors: John DeNero and Dan Klein, UC Berkeley
Presenter: Justin Chiu
Contribution
Extraction set
◦ Nested collection of all the overlapping phrase pairs consistent with an underlying word alignment
Advantages over word-factored alignment models
◦ Can incorporate features on phrase pairs, not just on word links
◦ Optimizes an extraction-based loss function that relates directly to generating translations
Performs better than both supervised and unsupervised baselines
Progress of Statistical MT
Generate translated sentences word by word
Use whole fragments of training examples to build translation rules
◦ Aligned at the word level
◦ Extract fragment-level rules from word-aligned sentence pairs
Tree-to-string translation
Extraction Set Models
◦ Set of all overlapping phrasal translation rules + alignment
Outline
Extraction Set Models
Model Estimation
Model Inference
Experiments
EXTRACTION SET MODELS
Extraction Set Models
Input
◦ Unaligned sentence pair
Output
◦ Extraction set of phrasal translation rules
◦ Word alignment
Extraction Sets from Word Alignments
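The idea of reading an extraction set off a word alignment can be sketched as follows. This is a minimal illustration that assumes the standard phrase-extraction consistency rule (a phrase pair must contain at least one link, and no link may connect a word inside it to a word outside it). The function name `extract_phrase_pairs`, the `max_len` cap, and the toy alignment are hypothetical, and the handling of possible and null links used in the paper is not modeled here.

```python
def extract_phrase_pairs(alignment, src_len, tgt_len, max_len=3):
    """Return all bispans (i1, i2, j1, j2), inclusive, consistent with `alignment`.

    `alignment` is a set of (source_index, target_index) links.
    Assumed consistency rule: the bispan holds at least one link and
    no link has exactly one endpoint inside the bispan.
    """
    pairs = set()
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            for j1 in range(tgt_len):
                for j2 in range(j1, min(j1 + max_len, tgt_len)):
                    # Links with both endpoints inside the bispan.
                    inside = any(i1 <= i <= i2 and j1 <= j <= j2
                                 for i, j in alignment)
                    # Links with exactly one endpoint inside the bispan.
                    crossing = any((i1 <= i <= i2) != (j1 <= j <= j2)
                                   for i, j in alignment)
                    if inside and not crossing:
                        pairs.add((i1, i2, j1, j2))
    return pairs


# Toy alignment: word 0 maps to word 0, words 1 and 2 swap order.
toy_alignment = {(0, 0), (1, 2), (2, 1)}
toy_pairs = extract_phrase_pairs(toy_alignment, src_len=3, tgt_len=3)
```

On the toy alignment the result contains the three single-word pairs, the two-word pair covering the swapped words, and the full sentence pair, which shows how the extracted phrase pairs nest and overlap.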
Possible and Null Alignment Links
Possible links come in two types
◦ Function words that are unique to their language
◦ Short phrases that have no lexical equivalent
Null alignment
◦ Expresses content that is absent in the translation
Interpreting Possible and Null Alignment Links
Linear Model for Extraction Sets
Scoring Extraction Sets
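A linear model over extraction sets can be sketched as a dot product between a weight vector and features summed over word links and phrase pairs (bispans). The feature templates below (`link_features`, `bispan_features`) and all feature names are hypothetical placeholders chosen for illustration, not the feature set used by the authors.

```python
from collections import Counter


def link_features(src, tgt, i, j):
    # Hypothetical word-link features: a lexical indicator and a bias.
    return Counter({f"link:{src[i]}|{tgt[j]}": 1.0, "link:bias": 1.0})


def bispan_features(src, tgt, bispan):
    # Hypothetical phrase-pair features: a bias and a length-difference indicator.
    i1, i2, j1, j2 = bispan
    len_diff = abs((i2 - i1) - (j2 - j1))
    return Counter({"bispan:bias": 1.0, f"bispan:len_diff={len_diff}": 1.0})


def feature_vector(src, tgt, links, bispans):
    """Sum features over every link and every bispan in the extraction set."""
    total = Counter()
    for i, j in links:
        total.update(link_features(src, tgt, i, j))
    for bispan in bispans:
        total.update(bispan_features(src, tgt, bispan))
    return total


def score(weights, features):
    """Linear score: dot product of the weights with the feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Toy usage with made-up words and weights.
src, tgt = ["a", "b"], ["x", "y"]
links = {(0, 0), (1, 1)}
bispans = {(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)}
weights = {"link:bias": 0.5, "bispan:bias": 1.0}
toy_score = score(weights, feature_vector(src, tgt, links, bispans))
```

Because the score is a sum of per-link and per-bispan terms, features on phrase pairs enter the model on the same footing as features on word links.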
MODEL ESTIMATION
MIRA (Margin-Infused Relaxed Algorithm)
Extraction Set Loss Function
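A single online MIRA step with an extraction-based loss might look like the sketch below. It assumes the common single-constraint form of MIRA with a step-size cap `C`, and it assumes a loss that simply counts phrase pairs missing from the guess plus spurious ones; the exact loss and weighting used in the paper may differ. All function names are hypothetical.

```python
def dot(weights, features):
    return sum(weights.get(k, 0.0) * v for k, v in features.items())


def extraction_loss(gold_bispans, guess_bispans):
    # Assumed loss: missing phrase pairs plus spurious phrase pairs.
    return len(gold_bispans - guess_bispans) + len(guess_bispans - gold_bispans)


def mira_update(weights, gold_feats, guess_feats, loss, C=1.0):
    """Return updated weights after one single-constraint MIRA step.

    Makes the smallest change to the weights (step size capped at C) so that
    the gold structure outscores the guess by a margin of at least `loss`.
    """
    keys = set(gold_feats) | set(guess_feats)
    diff = {k: gold_feats.get(k, 0.0) - guess_feats.get(k, 0.0) for k in keys}
    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(weights)  # identical features: nothing to update
    # Margin violation: loss minus (score(gold) - score(guess)).
    violation = loss - dot(weights, diff)
    tau = max(0.0, min(C, violation / norm_sq))
    updated = dict(weights)
    for k, v in diff.items():
        updated[k] = updated.get(k, 0.0) + tau * v
    return updated


# Toy usage with made-up features: the guess extracts the wrong phrase pair.
gold_bispans = {(0, 0, 0, 0)}
guess_bispans = {(0, 0, 1, 1)}
toy_loss = extraction_loss(gold_bispans, guess_bispans)
new_weights = mira_update({}, {"a": 1.0}, {"b": 1.0}, toy_loss, C=1.0)
```

After the update in the toy example, the gold feature vector outscores the guess by exactly the loss, which is the margin the update aims for when the cap `C` is not binding.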
MODEL INFERENCE
Possible Decompositions
DP for Extraction Sets
Finding Pseudo-Gold ITG Alignment
EXPERIMENTS
Five systems for comparison
Unsupervised baselines
◦ GIZA++
◦ Joint HMM
Supervised baseline
◦ Block ITG
Extraction Set Coarse Pass
◦ Does not score bispans that cross the bracketing of ITG derivations
Full Extraction Set Model
Data
Discriminative training and alignment evaluation
◦ Trained baseline HMM on 11.3 million words of FBIS newswire data
◦ Hand-aligned portion of the NIST MT02 test set: 150 training and 191 test sentences
End-to-end translation experiments
◦ Trained on a 22.1 million word parallel corpus of newswire data from the GALE program, consisting of sentences up to length 40
◦ NIST MT04/MT05 test sets
Results
Discussion
Syntax labels vs. words
Word alignment to rule, rule to word alignment
Information from two directions
65% of type 1 error