Language Model Methods and Metrics

Gary Luu, Ryan Fortune


Post on 31-Dec-2015


Page 1: Language Model Methods and Metrics

Language Model Methods and Metrics

Gary Luu, Ryan Fortune

Page 2: Language Model Methods and Metrics

Skip N-grams

• Interpolated with bigram
• Get influence of words further away without increasing dimensionality
• Learning curve
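A minimal sketch of the idea on this slide, assuming a skip distance of one word (the context is the word two positions back) and unsmoothed maximum-likelihood estimates. The names `train_skip_model` and `interpolated_prob` and the weight `lam` are hypothetical and do not come from the slides.

```python
from collections import Counter

def train_skip_model(sentences):
    """Count bigrams (w_{i-1}, w_i) and skip-bigrams (w_{i-2}, w_i)."""
    big, big_ctx, skip, skip_ctx = Counter(), Counter(), Counter(), Counter()
    for sent in sentences:
        toks = ["<s>", "<s>"] + list(sent)  # pad so every word has two predecessors
        for i in range(2, len(toks)):
            w, prev, prev2 = toks[i], toks[i - 1], toks[i - 2]
            big[(prev, w)] += 1
            big_ctx[prev] += 1
            skip[(prev2, w)] += 1
            skip_ctx[prev2] += 1
    return big, big_ctx, skip, skip_ctx

def _mle(pair_counts, ctx_counts, ctx, w):
    """Unsmoothed relative frequency; 0.0 for an unseen context."""
    total = ctx_counts[ctx]
    return pair_counts[(ctx, w)] / total if total else 0.0

def interpolated_prob(model, prev2, prev, w, lam=0.7):
    """lam * P_bigram(w | prev) + (1 - lam) * P_skip(w | prev2)."""
    big, big_ctx, skip, skip_ctx = model
    return (lam * _mle(big, big_ctx, prev, w)
            + (1 - lam) * _mle(skip, skip_ctx, prev2, w))

model = train_skip_model([["the", "cat", "sat"], ["the", "dog", "sat"]])
p = interpolated_prob(model, "the", "cat", "sat")
```

Both components condition on a single word, so the count tables stay bigram-sized even though the skip component looks two positions back. In practice `lam` would be tuned on held-out data and the estimates smoothed.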

Page 3: Language Model Methods and Metrics

Skip N-gram Learning Curve

Page 4: Language Model Methods and Metrics

Content Word Language Model

• Help predict the next word using the last uncommon word; try to capture context
• Found list of 250 most common words
• Tried different sizes for the common-word list
• Interpolated with language models, since this wouldn't maintain grammar
• P(w|C)
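One way to sketch P(w|C), assuming C is the most recent word in the history that is not on the common-word list, and that the list is taken from corpus frequencies. The function names, the base-model probability `p_base`, and the weight `lam` are hypothetical and do not come from the slides.

```python
from collections import Counter

def most_common_words(sentences, n=250):
    """Return the n most frequent tokens as the 'common word' set."""
    counts = Counter(tok for sent in sentences for tok in sent)
    return {tok for tok, _ in counts.most_common(n)}

def last_content_word(history, common):
    """Most recent token in history that is not a common word, else None."""
    for tok in reversed(history):
        if tok not in common:
            return tok
    return None

def train_content_model(sentences, common):
    """Count (content word, next word) pairs for P(w | C)."""
    pair, ctx = Counter(), Counter()
    for sent in sentences:
        for i, w in enumerate(sent):
            c = last_content_word(sent[:i], common)
            pair[(c, w)] += 1
            ctx[c] += 1
    return pair, ctx

def content_prob(model, history, w, common):
    """Unsmoothed estimate of P(w | C); 0.0 if C was never seen."""
    pair, ctx = model
    c = last_content_word(history, common)
    return pair[(c, w)] / ctx[c] if ctx[c] else 0.0

def interpolate(p_base, p_content, lam=0.8):
    """Mix a grammar-aware base model with the content-word model."""
    return lam * p_base + (1 - lam) * p_content
```

The interpolation step reflects the slide's point that the content-word model alone does not maintain grammar: the base model (for example a bigram or trigram) supplies local word order, and P(w|C) adds topical context.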

Page 5: Language Model Methods and Metrics

Content Word Model

Page 6: Language Model Methods and Metrics

Bag Generation Metrics

• Bag Generation – NP-Hard
• Random Restart Greedy Hill-Climbing
• Stability Metric
  • Given the correct sentence, does the model maintain it as an optimum?
  • Percentage of sentences that remain stable
• Reconstruction Metric
  • Needs to be compared against lucky/random
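The two metrics can be sketched as follows, assuming the neighbourhood is "swap any two positions", that an improving swap is accepted as soon as it is found, and that `score` is any function mapping a word sequence to a number (for example a language-model log-probability). All function names and parameters here are hypothetical; the slides do not specify the search details.

```python
import random
from itertools import combinations

def hill_climb(words, score):
    """Accept improving pairwise swaps until no swap raises the score."""
    current = list(words)
    best = score(current)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(current)), 2):
            current[i], current[j] = current[j], current[i]
            s = score(current)
            if s > best:
                best, improved = s, True
            else:
                current[i], current[j] = current[j], current[i]  # undo the swap
    return current, best

def reconstruct(bag, score, restarts=10, seed=0):
    """Random-restart search: best local optimum over shuffled starts."""
    rng = random.Random(seed)
    best_order, best_score = None, float("-inf")
    for _ in range(restarts):
        start = list(bag)
        rng.shuffle(start)
        order, s = hill_climb(start, score)
        if s > best_score:
            best_order, best_score = order, s
    return best_order

def stability(sentences, score):
    """Fraction of correct sentences the search leaves unchanged."""
    stable = sum(hill_climb(s, score)[0] == list(s) for s in sentences)
    return stable / len(sentences)

def reconstruction_rate(sentences, score, restarts=10, seed=0):
    """Fraction of sentences recovered exactly from their shuffled bag."""
    hits = sum(reconstruct(s, score, restarts, seed) == list(s) for s in sentences)
    return hits / len(sentences)
```

For the comparison against lucky or random reconstruction, a uniformly random ordering of n distinct words matches the original with probability 1/n!, so short sentences can be recovered by chance and the rate is easier to interpret alongside that baseline.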

Page 7: Language Model Methods and Metrics

Bag Generation Metrics

Page 8: Language Model Methods and Metrics

Clustering - IBMFullPredict

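• Clustering overview
• Perplexity down to 107 with a million-sentence corpus
• IBMFullPredict, where W_i denotes the cluster of word w_i:

$$
P_{\text{ibmfullpredict}}(w_i \mid w_{i-2} w_{i-1}) = \left[\lambda P(W_i \mid w_{i-2} w_{i-1}) + (1-\lambda) P(W_i \mid W_{i-2} W_{i-1})\right] \times \left[\mu P(w_i \mid w_{i-2} w_{i-1} W_i) + (1-\mu) P(w_i \mid W_{i-2} W_{i-1} W_i)\right]
$$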
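The IBMFullPredict formula above combines two interpolations: one that predicts the cluster of the next word, and one that predicts the word given that cluster. A minimal sketch, assuming the four component probabilities are supplied as callables and that `cluster_of` maps each word to its cluster; every name and default weight here is hypothetical.

```python
def ibm_full_predict(w, w2, w1, cluster_of,
                     p_cluster_given_words, p_cluster_given_clusters,
                     p_word_given_words, p_word_given_clusters,
                     lam=0.5, mu=0.5):
    """Probability of w given history (w2, w1), with w2 the older word.

    Computes [lam*P(W|w2 w1) + (1-lam)*P(W|W2 W1)]
           * [mu*P(w|w2 w1 W) + (1-mu)*P(w|W2 W1 W)].
    """
    W, W2, W1 = cluster_of[w], cluster_of[w2], cluster_of[w1]
    # First factor: predict the cluster of the next word.
    cluster_term = (lam * p_cluster_given_words(W, w2, w1)
                    + (1 - lam) * p_cluster_given_clusters(W, W2, W1))
    # Second factor: predict the word itself, given its cluster.
    word_term = (mu * p_word_given_words(w, w2, w1, W)
                 + (1 - mu) * p_word_given_clusters(w, W2, W1, W))
    return cluster_term * word_term
```

Each factor backs off from word histories to cluster histories, which is where the model gains robustness when word trigrams are sparse. The weights `lam` and `mu` would normally be tuned on held-out data.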

Page 9: Language Model Methods and Metrics

Learning Curve for IBMFullPredict