Natural Language Processing SoSe 2016 (hpi.de)

Part-of-Speech Tagging

Dr. Mariana Neves, May 9th, 2016
Outline
● Part-of-Speech tags
● Part-of-Speech tagging
– Rule-Based Tagging
– HMM Tagging
– Transformation-Based Tagging
● Evaluation
Part-of-Speech (POS) Tags
● Also known as:
– Part-of-speech tags, lexical categories, word classes, morphological classes, lexical tags
Plays[VERB] well[ADVERB] with[PREPOSITION] others[NOUN]
Plays[VBZ] well[RB] with[IN] others[NNS]
Examples of POS tags
● Noun: book/books, nature, Germany, Sony
● Verb: eat, wrote
● Auxiliary: can, should, have
● Adjective: new, newer, newest
● Adverb: well, urgently
● Number: 872, two, first
● Article/Determiner: the, some
● Conjunction: and, or
● Pronoun: he, my
● Preposition: to, in
● Particle: off, up
● Interjection: Ow, Eh
Motivation: Speech Synthesis
● The word „content“ is pronounced differently depending on its POS:

– „Eggs have a high protein content.“ (noun, stress on the first syllable)

– „She was content to step down after four years as chief executive.“ (adjective, stress on the second syllable)
(http://www.thefreedictionary.com/content)
Motivation: Machine Translation
● e.g., translation from English to German:
– „I like ...“
● „Ich mag ...“ (verb)
● „Ich wie ...“ (preposition)
Motivation: Syntactic parsing
(http://nlp.stanford.edu:8080/parser/index.jsp)
Motivation: Information extraction
● Named-entity recognition (usually nouns)
(http://www.nactem.ac.uk/tsujii/GENIA/tagger/)
Motivation: Information extraction
● Relation extraction (usually verbs)
(http://www.nactem.ac.uk/tsujii/GENIA/tagger/)
Open vs. Closed Classes
● Closed
– a limited number of words; these classes do not usually grow

– e.g., Auxiliary, Article, Determiner, Conjunction, Pronoun, Preposition, Particle, Interjection
● Open
– unlimited number of words
– e.g., Noun, Verb, Adverb, Adjective
POS Tagsets
● There are many part-of-speech tagsets
● Tag types
– Coarse-grained
● Noun, verb, adjective, ...

– Fine-grained

● noun-proper-singular, noun-proper-plural, noun-common-mass, ...
● verb-past, verb-present-3rd, verb-base, ...
● adjective-simple, adjective-comparative, ...
POS Tagsets
● Brown tagset (87 tags)
– Brown corpus
● C5 tagset (61 tags)
● C7 tagset (146 tags!)
● Penn TreeBank (45 tags)
– The tagset of a large annotated corpus of English
Penn Treebank Tagset
(https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html)
POS Tagging
● The process of assigning a part of speech to each word in a text
● Challenge: words often have more than one POS
– On my back[NN] (noun)
– The back[JJ] door (adjective)
– Win the voters back[RB] (adverb)
– Promised to back[VB] the bill (verb)
Ambiguity in POS tags
● Brown corpus, 45-tag tagset (word types)
– Unambiguous (1 tag): 38,857
– Ambiguous: 8,844
● 2 tags: 6,731
● 3 tags: 1,621
● 4 tags: 357
● 5 tags: 90
● 6 tags: 32
● 7 tags: 6 (well, set, round, open, fit, down)
● 8 tags: 4 ('s, half, back, a)
● 9 tags: 3 (that, more, in)
Baseline method
1. Tagging unambiguous words with the correct label
2. Tagging ambiguous words with their most frequent label
3. Tagging unknown words as a noun
● This method achieves around 90% precision
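The three steps above can be sketched in a few lines (a toy illustration; the corpus and the example words are invented for the demonstration):

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Count how often each word receives each tag in a tagged corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return counts

def baseline_tag(words, counts, unknown_tag="NN"):
    """Steps 1-3: known words get their most frequent tag, unknown words NN."""
    return [(w, counts[w].most_common(1)[0][0]) if w in counts
            else (w, unknown_tag)
            for w in words]

# toy corpus (invented for the example)
corpus = [[("the", "DT"), ("back", "NN")],
          [("the", "DT"), ("back", "JJ"), ("door", "NN")],
          [("the", "DT"), ("back", "NN"), ("hurts", "VBZ")]]
counts = train_baseline(corpus)
print(baseline_tag(["the", "back", "room"], counts))
# [('the', 'DT'), ('back', 'NN'), ('room', 'NN')] -- "room" is unseen, so NN
```

Unambiguous words fall out of the same rule: their single observed tag is trivially the most frequent one.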
POS Tagging
● Plays well with others
– Plays (NNS/VBZ)
– well (UH/JJ/NN/RB)
– with (IN)
– others (NNS)
– Plays[VBZ] well[RB] with[IN] others[NNS]
Rule-Based Tagging
● Standard approach (two steps):
1. Dictionaries to assign a list of potential tags
● Plays (NNS/VBZ)
● well (UH/JJ/NN/RB)
● with (IN)
● others (NNS)
2. Hand-written rules to restrict to a POS tag
● Plays (VBZ)
● well (RB)
● with (IN)
● others (NNS)
Rule-Based Tagging
● Some approaches rely on morphological parsing
– e.g., EngCG Tagger
Sequential modeling
● Many NLP techniques must deal with data represented as a sequence of items
– Characters, Words, Phrases, Lines, …
● Part-of-speech tagging
– I[PRP] saw[VBP] the[DT] man[NN] on[IN] the[DT] roof[NN] .
● Named-entity recognition
– Steven[PER] Paul[PER] Jobs[PER] ,[O] co-founder[O] of[O] Apple[ORG] Inc[ORG] ,[O] was[O] born[O] in[O] California[LOC].
Sequential modeling
● Making a decision based on:
– Current Observation:
● Word (W0): „35-years-old“
● Prefix, Suffix: „computation“ → „comp“, „ation“
● Lowercased word: „New“ → „new“
● Word shape: „35-years-old“ → „d-a-a“
– Surrounding observations
● Words (W+1, W−1)
– Previous decisions
● POS tags (T−1, T−2)
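These observations can be collected into a feature function; the sketch below uses an illustrative feature set (the exact choice of features is an assumption, not taken from any particular tagger):

```python
def word_shape(word):
    """Collapse character runs: letters -> 'a', digits -> 'd' ('35-years-old' -> 'd-a-a')."""
    shape = []
    for ch in word:
        c = "a" if ch.isalpha() else "d" if ch.isdigit() else ch
        if not shape or shape[-1] != c:
            shape.append(c)
    return "".join(shape)

def features(words, i, prev_tags):
    """Feature dict for position i, combining the current observation,
    surrounding observations, and previous decisions."""
    w = words[i]
    return {
        "w0": w,                                              # current word
        "lower": w.lower(),                                   # lowercased word
        "prefix": w[:4], "suffix": w[-5:],                    # affixes
        "shape": word_shape(w),                               # word shape
        "w-1": words[i - 1] if i > 0 else "<s>",              # surrounding words
        "w+1": words[i + 1] if i + 1 < len(words) else "</s>",
        "t-1": prev_tags[-1] if prev_tags else "<s>",         # previous decisions
        "t-2": prev_tags[-2] if len(prev_tags) > 1 else "<s>",
    }

f = features(["I", "saw", "the", "35-years-old"], 3, ["PRP", "VBD", "DT"])
print(f["shape"], f["w-1"], f["t-1"])  # d-a-a the DT
```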
Learning Model (classification)
(Figure: supervised classification setup. Training: items T1…Tn with features F1…Fn and known classes C1…Cn are used to learn a Model(F,C). Testing: the features Fn+1 of a new item Tn+1 are fed to the model, which predicts its class Cn+1.)
Sequential modeling
● Greedy inference
– Start at the beginning of the sequence
– Assign a label to each item using the classifier
– Using previous decisions as well as the observed data
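Greedy inference is a single left-to-right pass; the sketch below uses a hand-made stand-in classifier (the rules are invented for the demonstration) just to show how previous decisions feed into later ones:

```python
def greedy_tag(words, classify):
    """One left-to-right pass; each decision sees the observed words and all
    previously assigned tags."""
    tags = []
    for i in range(len(words)):
        tags.append(classify(words, i, tags))
    return tags

def toy_classify(words, i, prev_tags):
    """Stand-in classifier (hypothetical rules, just for the demonstration)."""
    w = words[i].lower()
    if w in {"the", "a"}:
        return "DT"
    if w == "saw" and prev_tags and prev_tags[-1] == "DT":
        return "NN"   # "the saw" -> noun reading, thanks to the previous decision
    if w == "saw":
        return "VBD"  # otherwise verb reading
    return "NN"

print(greedy_tag(["The", "saw", "cut"], toy_classify))  # ['DT', 'NN', 'NN']
print(greedy_tag(["He", "saw", "it"], toy_classify))    # ['NN', 'VBD', 'NN']
```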
Sequential modeling
● Beam inference
– Keeping the top k labels in each position
– Extending each sequence in each local way
– Finding the best k labels for the next position
Hidden Markov Model (HMM)
● Finding the best sequence of tags (t1 ...tn ) that corresponds to the sequence of observations (w1 ...wn )
● Probabilistic View
– Considering all possible sequences of tags
– Choosing the tag sequence from this universe of sequences, which is most probable given the observation sequence
$\hat{t}_1^n = \operatorname{argmax}_{t_1^n} P(t_1^n \mid w_1^n)$
Using the Bayes Rule
$\hat{t}_1^n = \operatorname{argmax}_{t_1^n} P(t_1^n \mid w_1^n)$

$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$

$P(t_1^n \mid w_1^n) = \frac{P(w_1^n \mid t_1^n) \cdot P(t_1^n)}{P(w_1^n)}$

$\hat{t}_1^n = \operatorname{argmax}_{t_1^n} \underbrace{P(w_1^n \mid t_1^n)}_{\text{likelihood}} \cdot \underbrace{P(t_1^n)}_{\text{prior probability}}$
Using Markov Assumption
$\hat{t}_1^n = \operatorname{argmax}_{t_1^n} P(w_1^n \mid t_1^n) \cdot P(t_1^n)$

$P(w_1^n \mid t_1^n) \simeq \prod_{i=1}^{n} P(w_i \mid t_i)$ (a word depends only on its own POS tag and is independent of the other words)

$P(t_1^n) \simeq \prod_{i=1}^{n} P(t_i \mid t_{i-1})$ (a tag depends only on the previous POS tag, thus a bigram)

$\hat{t}_1^n = \operatorname{argmax}_{t_1^n} \prod_{i=1}^{n} P(w_i \mid t_i) \cdot P(t_i \mid t_{i-1})$
Two Probabilities
● The tag transition probabilities: P(ti|ti−1)
– The likelihood of a tag given the preceding tag

– Similar to the normal bigram model

$P(t_i \mid t_{i-1}) = \frac{C(t_{i-1}, t_i)}{C(t_{i-1})}$
Two Probabilities
● The word likelihood probabilities: P(wi|ti)
– The likelihood of a word appearing given a tag
$P(w_i \mid t_i) = \frac{C(t_i, w_i)}{C(t_i)}$
Two Probabilities
I[PRP] saw[VBP] the[DT] man[NN?] on[] the[] roof[] .
$P([NN] \mid [DT]) = \frac{C([DT], [NN])}{C([DT])}$

$P(man \mid [NN]) = \frac{C([NN], man)}{C([NN])}$
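Both maximum-likelihood estimates can be read directly off a tagged corpus by counting; a sketch with a three-sentence toy corpus (invented for the example):

```python
from collections import Counter

def estimate_hmm(tagged_sentences):
    """MLE: P(t|t_prev) = C(t_prev, t) / C(t_prev) and P(w|t) = C(t, w) / C(t)."""
    bigram, unigram, emission = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        prev = "<s>"
        unigram[prev] += 1
        for word, tag in sent:
            bigram[(prev, tag)] += 1    # tag transition counts
            emission[(tag, word)] += 1  # word likelihood counts
            unigram[tag] += 1
            prev = tag
    p_trans = lambda t, prev: bigram[(prev, t)] / unigram[prev]
    p_emit = lambda w, t: emission[(t, w)] / unigram[t]
    return p_trans, p_emit

# toy corpus (invented for the example)
corpus = [[("the", "DT"), ("man", "NN")],
          [("the", "DT"), ("roof", "NN")],
          [("a", "DT"), ("man", "NN"), ("ran", "VBD")]]
p_trans, p_emit = estimate_hmm(corpus)
print(p_trans("NN", "DT"))  # C(DT,NN)/C(DT) = 3/3 = 1.0
print(p_emit("man", "NN"))  # C(NN,man)/C(NN) = 2/3
```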
Ambiguity
Secretariat[NNP] is[VBZ] expected[VBN] to[TO] race[VB] tomorrow[NR] .
People[NNS] inquire[VB] the[DT] reason[NN] for[IN] the[DT] race[NN] .
Ambiguity
Secretariat[NNP] is[VBZ] expected[VBN] to[TO] race[VB] tomorrow[NR] .
NNP VBZ VBN TO VB NR
Secretariat is expected to race tomorrow
NNP VBZ VBN TO NN NR
Secretariat is expected to race tomorrow
Ambiguity
Secretariat[NNP] is[VBZ] expected[VBN] to[TO] race[VB] tomorrow[NR] .
P(VB|TO) = 0.83
P(race|VB) = 0.00012
P(NR|VB) = 0.0027
P(VB|TO) · P(NR|VB) · P(race|VB) = 0.00000027
NNP VBZ VBN TO VB NR
Secretariat is expected to race tomorrow
Ambiguity
Secretariat[NNP] is[VBZ] expected[VBN] to[TO] race[VB] tomorrow[NR] .
P(NN|TO) = 0.00047
P(race|NN) = 0.00057
P(NR|NN) = 0.0012
P(NN|TO) · P(NR|NN) · P(race|NN) = 0.00000000032
NNP VBZ VBN TO NN NR
Secretariat is expected to race tomorrow
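Multiplying out the local probabilities from the two readings above confirms the comparison, using the slides' numbers:

```python
# Scores for "... to/TO race/?? tomorrow/NR" with the slides' probabilities
vb_score = 0.83 * 0.0027 * 0.00012     # P(VB|TO) * P(NR|VB) * P(race|VB)
nn_score = 0.00047 * 0.0012 * 0.00057  # P(NN|TO) * P(NR|NN) * P(race|NN)
print(f"{vb_score:.2e}")  # 2.69e-07
print(f"{nn_score:.2e}")  # 3.21e-10
# the verb reading is roughly 800 times more probable than the noun reading
```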
Formalization of Hidden Markov Model (HMM)
● Extension of Finite-State Automata
● Finite-State Automata
– set of states
– set of transitions between states
Formalization of Hidden Markov Model (HMM)
● Weighted finite-state automaton
– Each arc is associated with a probability
– The probabilities on the arcs leaving any state must sum to one
(Figure: a weighted finite-state automaton with states Start0, VB1, TO2, NN3 and End4; every pair of states is connected by an arc with a transition probability aij.)
Hidden Markov Model (HMM)
● POS tagging
– Ambiguous
– We observe the words, not the POS tags
● HMM
– Observed events: words
– Hidden events: POS tags
Hidden Markov Model (HMM)
● Transition probabilities:
$P(t_i \mid t_{i-1})$

(Figure: the same automaton with states Start0, VB1, TO2, NN3, End4; each arc aij from state qi to state qj carries a tag transition probability.)
Hidden Markov Model (HMM)
● Word likelihood probabilities:
$P(w_i \mid t_i)$

(Figure: each tag state stores a vector of word likelihoods, e.g. B1 = [P(“aardvark”|VB), …, P(“race”|VB), …, P(“the”|VB), …, P(“to”|VB), …, P(“zebra”|VB)] for VB, and analogously B2 for TO and B3 for NN.)
Viterbi algorithm
● Decoding algorithm for HMM
– Determine the best sequence of POS tags
● Probability matrix
– Columns corresponding to inputs (words)
– Rows corresponding to possible states (POS tags)
Viterbi algorithm
1. Move through the matrix in one pass filling the columns left to right using the transition probabilities and observation probabilities
2. Store the max probability path to each cell (not all paths) using dynamic programming
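A compact version of this decoding pass, assuming the start, transition, and observation probabilities are given as functions (the toy model and its numbers are invented for the example, not taken from the lecture):

```python
def viterbi(words, tags, p_start, p_trans, p_emit):
    """Fill the probability matrix column by column (one column per word,
    one row per tag), storing the best path probability and a backpointer."""
    V = [{t: (p_start(t) * p_emit(words[0], t), None) for t in tags}]
    for i in range(1, len(words)):
        col = {}
        for t in tags:
            # best previous state on the way into tag t at position i
            prev = max(tags, key=lambda s: V[i - 1][s][0] * p_trans(t, s))
            score = V[i - 1][prev][0] * p_trans(t, prev) * p_emit(words[i], t)
            col[t] = (score, prev)
        V.append(col)
    # follow the backpointers from the best final state
    best = max(tags, key=lambda t: V[-1][t][0])
    path = [best]
    for i in range(len(words) - 1, 0, -1):
        path.append(V[i][path[-1]][1])
    return list(reversed(path))

# hypothetical toy model: D = determiner, N = noun, V = verb
start = {"D": 0.8, "N": 0.1, "V": 0.1}
trans = {("D", "N"): 0.9, ("N", "V"): 0.8, ("N", "N"): 0.1}
emit = {("D", "the"): 0.6, ("N", "dog"): 0.3, ("V", "barks"): 0.2}
tags = ["D", "N", "V"]
path = viterbi(["the", "dog", "barks"], tags,
               lambda t: start.get(t, 0.0),
               lambda t, s: trans.get((s, t), 0.0),
               lambda w, t: emit.get((t, w), 0.0))
print(path)  # ['D', 'N', 'V']
```

Because only the best path into each cell is kept, the cost is linear in the sentence length rather than exponential in the number of tag sequences.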
(Figure: Viterbi trellis for the observation sequence “i want to race”: one column Q1–Q4 per word, and one row per state: q0 = start, q1 = PPSS, q2 = VB, q3 = TO, q4 = NN, qend = end.)
(Figure: the trellis is initialized with v0(0) = 1.0 at the start state; vt−1 denotes the previous Viterbi path probability, i.e. from the previous time step.)
(Figure: scoring the transitions out of the start state, with v0(0) = 1.0.)

P(PPSS|start) · P(start) = .067 · 1.0 = 0.067
P(VB|start) · P(start) = .019 · 1.0 = 0.019
P(TO|start) · P(start) = .0043 · 1.0 = 0.0043
P(NN|start) · P(start) = .0041 · 1.0 = 0.0041

aij: transition probability (from previous state qi to current state qj)
bj(ot): state observation likelihood (observation ot given the current state j)
(Figure: filling the first trellis column, for the word “i”.)

v1(1) = P(PPSS|start) · P(start) · P(I|PPSS) = 0.067 · 0.37 = 0.025
v1(2) = P(VB|start) · P(start) · P(I|VB) = 0.019 · 0 = 0
v1(3) = P(TO|start) · P(start) · P(I|TO) = 0.0043 · 0 = 0
v1(4) = P(NN|start) · P(start) · P(I|NN) = 0.0041 · 0 = 0

$v_t(j) = \max_{i=1}^{N} v_{t-1}(i) \cdot a_{ij} \cdot b_j(o_t)$
(Figure: filling the second trellis column, for the word “want”; the four paths into the VB state are scored.)

v1(1) · P(VB|PPSS) = .025 · .23 = 0.0055
v1(2) · P(VB|VB) = 0 · .0038 = 0
v1(3) · P(VB|TO) = 0 · .83 = 0
v1(4) · P(VB|NN) = 0 · .0040 = 0
(Figure: the maximum incoming score into the VB cell is multiplied by the observation likelihood of “want” given VB.)

v2(2) = max(0, 0, 0, .0055) · .0093 = .000051
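The trellis values above can be checked by plugging in the slides' numbers:

```python
# first column, for the word "i" (slide numbers)
v1 = {
    "PPSS": 0.067 * 0.37,  # P(PPSS|start)*P(start) * P(I|PPSS) -> 0.025
    "VB":   0.019 * 0.0,   # P(I|VB) = 0, so this path dies
    "TO":   0.0043 * 0.0,
    "NN":   0.0041 * 0.0,
}
# second column, VB cell: best incoming score times the observation likelihood
v2_vb = max(0.0055, 0.0, 0.0, 0.0) * 0.0093  # 0.0055 = v1(PPSS) * P(VB|PPSS)
print(round(v1["PPSS"], 3))  # 0.025
print(round(v2_vb, 6))       # 5.1e-05, i.e. the slide's 0.000051
```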
(Figure: the trellis expanded column by column over all four words of “i want to race”.)
Transformation-Based Tagging
● Also called Brill Tagging
● Uses Transformation-Based Learning (TBL)
– Rules are automatically induced from training data
Transformation-Based Tagging
● Brill's Tagger
1. Assign every word its most likely tag:

● Secretariat/NNP is/VBZ expected/VBN to/TO race/NN tomorrow/NN
● The/DT race/NN for/IN outer/JJ space/NN

2. Apply transformation rules:

● e.g., change „NN“ to „VB“ when the previous tag is „TO“
– Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN
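A transformation of this kind is easy to apply programmatically (a sketch of rule application only; Brill's tagger additionally learns which rules to use and in what order):

```python
def apply_rule(tagged, from_tag, to_tag, prev_tag):
    """Transformation: change from_tag to to_tag when the previous tag is prev_tag."""
    out = list(tagged)
    for i in range(1, len(out)):
        word, tag = out[i]
        if tag == from_tag and out[i - 1][1] == prev_tag:
            out[i] = (word, to_tag)
    return out

initial = [("to", "TO"), ("race", "NN"), ("tomorrow", "NN")]
print(apply_rule(initial, "NN", "VB", "TO"))
# [('to', 'TO'), ('race', 'VB'), ('tomorrow', 'NN')]
```

Note that “tomorrow” keeps its NN tag: after “race” is changed to VB, its previous tag is no longer TO, so the rule does not fire.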
Transformation-Based Tagging
● TBL learning process
– Based on a set of templates (abstracted transformations)
● „the word two before (after) is tagged z“
● „the preceding word is tagged z and the following word is tagged w“
● etc.
Evaluation
● Corpus
– Training and test, and optionally also development set
– Training (cross-validation) and test set
● Evaluation
– Comparison of gold standard (GS) and predicted tags
– Evaluation in terms of Precision, Recall and F-Measure
Precision and Recall
● Precision:

– The proportion of labeled items that are correct

● Recall:

– The proportion of correct items that have been labeled
$\text{Precision} = \frac{tp}{tp + fp}$

$\text{Recall} = \frac{tp}{tp + fn}$
F-Measure
● There is a strong anti-correlation between precision and recall
● There is a trade-off between these two metrics

● F-measure considers both metrics together

● F-measure is a weighted harmonic mean of precision and recall
$F = \frac{(\beta^2 + 1) \cdot P \cdot R}{\beta^2 \cdot P + R}$
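Plugged into code (the counts tp, fp, fn below are invented for the example):

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Precision, recall, and the weighted harmonic mean
    F = (beta^2 + 1) * P * R / (beta^2 * P + R)."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f = (beta**2 + 1) * p * r / (beta**2 * p + r)
    return p, r, f

p, r, f = precision_recall_f(tp=90, fp=10, fn=30)
print(p, r, round(f, 3))  # 0.9 0.75 0.818
```

With beta = 1 the two metrics are weighted equally; beta > 1 favors recall, beta < 1 favors precision.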
Error Analysis
● Confusion matrix or contingency table
– Percentage of overall tagging error
|     | IN  | JJ  | NN  | NNP | RB  | VBD | VBN |
|-----|-----|-----|-----|-----|-----|-----|-----|
| IN  | -   | .2  |     |     | .7  |     |     |
| JJ  | .2  | -   | 3.3 | 2.1 | 1.7 | .2  | 2.7 |
| NN  |     | 8.7 | -   |     |     |     | .2  |
| NNP | .2  | 3.3 | 4.1 | -   | .2  |     |     |
| RB  | 2.2 | 2.0 | .5  |     | -   |     |     |
| VBD |     | .3  | .5  |     |     | -   | 4.4 |
| VBN |     | 2.8 |     |     |     | 2.6 | -   |
Further reading and tools
● Jurafsky & Martin, Speech and Language Processing
– Chapter 5
● Tools
– Stanford POS Tagger
– OpenNLP
– TreeTagger
– and many others...
Project update
● Integration of POS tagging
– Presentation in the next lecture (May 23rd, 2016)