probabilistic cky
DESCRIPTION
Probabilistic CKY. Roger Levy [thanks to Jason Eisner]. Managing Ambiguity. John saw Mary Typhoid Mary Phillips screwdriver Mary note how rare rules interact I see a bird is this 4 nouns – parsed like “city park scavenger bird”? - PowerPoint PPT PresentationTRANSCRIPT
Probabilistic CKY
Roger Levy
[thanks to Jason Eisner]
Managing Ambiguity
• John saw Mary • Typhoid Mary• Phillips screwdriver Mary
note how rare rules interact
• I see a bird • is this 4 nouns – parsed like “city park scavenger bird”?
rare parts of speech, plus systematic ambiguity in noun sequences
• Time flies like an arrow• Fruit flies like a banana• Time reactions like this one• Time reactions like a chemist• or is it just an NP?
Our bane: Ambiguity
• John saw Mary • Typhoid Mary• Phillips screwdriver Mary
note how rare rules interact
• I see a bird • is this 4 nouns – parsed like “city park scavenger bird”?
rare parts of speech, plus systematic ambiguity in noun sequences
• Time | flies like an arrow NP VP• Fruit flies | like a banana NP VP
• Time | reactions like this one V[stem] NP
• Time reactions | like a chemist S PP
• or is it just an NP?
How to solve this combinatorial explosion of ambiguity?
1. First try parsing without any weird rules, throwing them in only if needed.
2. Better: every rule has a weight. A tree’s weight is total weight of all its rules. Pick the overall lightest parse of sentence.
3. Can we pick the weights automatically?We’ll get to this later …
The plan for the rest of parsing
• There’s probability (inference) and then there’s statistics (model estimation)
• These two problems are logically separate, yet interrelated in practice
• We’ll start with the problem of efficient inference (probabilistic bottom-up parsing)
• Then we’ll move to the estimation of good (generative) models that permit efficient bottom-up parsing
• Finally, we’ll mention state-of-the-art work that doesn’t permit exact inference (“reranking” approaches)
The critical recursion
• We’ll do bottom-up parsing (weighted CKY)• When combining constituents into a larger constituent:
• The weight of the new constituent is the sum of the weights of the combined subconstituents…
• …plus the weight of the rule used to combine the subconstituents
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
3 Det 1
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
PP 12
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
NP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
NP 18S 21
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
SFollow backpointers …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
S
NP VP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
S
NP VP
VP PP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
S
NP VP
VP PP
P NP
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
S
NP VP
VP PP
P NP
Det N
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Which entries do we need?
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Which entries do we need?
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Not worth keeping …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
… since it just breeds worse options
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Keep only best-in-class!
“inferior stock”
time 1 flies 2 like 3 an 4 arrow 5
NP 3Vst 3
NP 10S 8
NP 24S 22
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Keep only best-in-class!(and backpointers so you can recover parse)
Probabilistic Trees
• Instead of lightest weight tree, take highest probability tree• Given any tree, your assignment 1 generator would have some
probability of producing it! • Just like using n-grams to choose among strings …• What is the probability of this tree?
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
Probabilistic Trees
• Instead of lightest weight tree, take highest probability tree
• Given any tree, your assignment 1 generator would have some probability of producing it!
• Just like using n-grams to choose among strings …• What is the probability of this tree?
• You rolled a lot of independent dice …
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
p( | S)
Chain rule: One word at a time
p(time flies like an arrow) = p(time)
* p(flies | time)* p(like | time flies)* p(an | time flies like)* p(arrow | time flies like an)
Chain rule + backoff (to get trigram model)
p(time flies like an arrow) = p(time)
* p(flies | time)* p(like | time flies)* p(an | time flies like)* p(arrow | time flies like an)
Chain rule – written differently
p(time flies like an arrow) = p(time)
* p(time flies | time)* p(time flies like | time flies)* p(time flies like an | time flies like)* p(time flies like an arrow | time flies like an)
Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)
Chain rule + backoff
p(time flies like an arrow) = p(time)
* p(time flies | time)* p(time flies like | time flies)* p(time flies like an | time flies like)* p(time flies like an arrow | time flies like an)
Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)
Chain rule: One node at a time
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
p( | S) = p(S
NP VP| S) * p(
S
NPtime
VP|
S
NP VP)
* p(S
NPtime
VP
VP PP
|S
NPtime
VP)
* p(S
NPtime
VP
VPflies
PP
|S
NPtime
VP) * …
VP PP
Chain rule + backoff
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
p( | S) = p(S
NP VP| S) * p(
S
NPtime
VP|
S
NP VP)
* p(S
NPtime
VP
VP PP
|S
NPtime
VP)
* p(S
NPtime
VP
VPflies
PP
|S
NPtime
VP) * …
VP PP
Simplified notation
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
p( | S) = p(S NP VP | S) * p(NP flies | NP)
* p(VP VP NP | VP)
* p(VP flies | VP) * …
Already have a CKY alg for weights …
S
NPtime
VP
VPflies
PP
Plike
NP
Detan
N arrow
w( | S) = w(S NP VP) + w(NP flies | NP)
+ w(VP VP NP)
+ w(VP flies) + …
Just let w(X Y Z) = -log p(X Y Z | X)Then lightest tree has highest prob
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8 S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
multiply to get 2-22
2-8
2-12
2-2
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8 S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
multiply to get 2-22
2-8
2-12
2-2
2-13
Need only best-in-class to get best parse
Why probabilities not weights?• We just saw probabilities are really just a special case
of weights …• … but we can estimate them from training data by
counting and smoothing!• Warning: What kind of training data do we need for
this type of estimation?
A slightly different task
• One task: What is probability of generating a given tree with a PCFG generator? • To pick tree with highest prob: useful in parsing.
• But could also ask: What is probability of generating a given string with the generator?
• To pick string with highest prob: useful in speech recognition, as substitute for an n-gram model. • (“Put the file in the folder” vs. “Put the file and the
folder”)• To get prob of generating string, must add up
probabilities of all trees for the string …
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S 8 S 13
NP 24S 22S 27NP 24S 27S 22S 27
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 12VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Could just add up the parse probabilities
2-22
2-27
2-27
2-22
2-27
oops, back to finding
exponentially many parses
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10SS 2-13
NP 24S 22S 27NP 24S 27SS
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 2-12
VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2-2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Any more efficient way?
2-8
2-22
2-27
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S
NP 24S 22S 27NP 24S 27S
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 2-12
VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2-2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Add as we go … (the “inside algorithm”)
2-8+2-13
2-22
+2-27
time 1 flies 2 like 3 an 4 arrow 5
0
NP 3Vst 3
NP 10S
NP
S
1
NP 4VP 4
NP 18S 21VP 18
2
P 2V 5
PP 2-12
VP 16
3 Det 1 NP 10
4 N 8
1 S NP VP6 S Vst NP2-2 S S PP
1 VP V NP2 VP VP PP
1 NP Det N2 NP NP PP3 NP NP NP
0 PP P NP
Add as we go … (the “inside algorithm”)
2-8+2-13
+2-22
+2-27
2-22
+2-27 2-22
+2-27 +2-27
PCFGs and HMMs
• There is a natural connection between:• the inside algorithm for filling out a probabilistic CKY
chart • and the forward algorithm for filling out an HMM trellis
• Do you see the relationship?
• However, bottom-up for PCFGs and left-to-right for HMMs are not the only ways to go for DPs• You can go outside-in for PCFGs• We’ll see left-to-right (the Earley algorithm) for PCFGs • And you can go right-to-left for HMMs
Charts and lattices
• You can equivalently represent a parse chart as a lattice constructed over some initial arcs
• This will also set the stage for Earley parsing later
salt flies scratch
NP
N
NP
S
S
VP
V
N
NP
S
VP
NP
N
NP
V
VP salt
N NP V VP
flies scratch
N NP V VPN NP
NP S NP VP S
S
S NP VPVP V NPVPVNP NNP N N
(Speech) Lattices
• There was nothing magical about words spanning exactly one position.
• When working with speech, we generally don’t know how many words there are, or where they break.
• We can represent the possibilities as a lattice and parse these just as easily.
I
aweof
van
eyes
saw
a
‘ve
an
Ivan