Post on 20-Dec-2015
Part-of-Speech Tagging
Torbjörn Lager, Department of Linguistics, Stockholm University
NLP1 - Torbjörn Lager 2
Part-of-Speech Tagging: Definition
From Jurafsky & Martin 2000:
Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in a corpus.
The input to a tagging algorithm is a string of words and a specified tagset. The output is a single best tag for each word.
A bit too narrow for my taste...
Part-of-Speech Tagging: Example 1
Input
He can can a can
Output
He/pron can/aux can/vb a/det can/n
Another possible output
He/{pron} can/{aux,n} can/{vb} a/{det} can/{n,vb}
Tag Sets
The Penn Treebank tag set (see appendix in handout)
Why Part-of-Speech Tagging?
A first step towards parsing
A first step towards word sense disambiguation
Provide clues to pronunciation: "object" -> OBject or obJECT (but note: BAnan vs baNAN)
Research in Corpus Linguistics
Part-of-Speech Tagging: Example 2
I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.
Relevant Information
Lexical information
Local contextual information
Part-of-Speech Tagging: Example 2
I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.
I/PRP can/MD light/VB a/DT fire/NN and/CC you/PRP can/MD open/VB a/DT can/NN of/IN beans/NNS ./. Now/RB the/DT can/NN is/VBZ open/JJ and/CC we/PRP can/MD eat/VB in/IN the/DT light/NN of/IN the/DT fire/NN ./.
Part-of-Speech Tagging
[Diagram: Text -> Processor (drawing on Knowledge) -> POS-tagged text]
Needed:
- Some strategy for representing the knowledge
- Some method for acquiring the knowledge
- Some method of applying the knowledge
Approaches to PoS Tagging
The bold approach: 'Use all the information you have and guess!'
The whimsical approach: 'Guess first, then change your mind if necessary!'
The cautious approach: 'Don't guess, just eliminate the impossible!'
Some POS-Tagging Issues
Accuracy
Speed
Space requirements
Learning
Intelligibility
[Diagram: Text -> Processor (drawing on Knowledge) -> POS-tagged text]
Cutting the Cake
Tagging methods: rule-based, statistical, mixed, other methods
Learning methods: supervised learning, unsupervised learning
HMM Tagging
The bold approach: 'Use all the information you have and guess!'
Statistical method
Supervised (or unsupervised) learning
The Naive Approach and its Problem
Traverse all the paths compatible with the input and then pick the most probable one
Problem: There are 27 paths in the HMM for S = "he can can a can". Doubling the length of S (with a conjunction in between) -> 729 paths. Doubling S again -> 531441 paths! Exponential time complexity!
Solution
Use the Viterbi algorithm
Tagging can be done in time proportional to the length of input.
How and Why does the Viterbi algorithm work? We save this for later...
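As a preview of the dynamic-programming idea, here is a minimal Viterbi sketch in Python (the course demos are in Prolog; this is not the lecture's code). The toy transition and emission probabilities are invented for the "he can can a can" example, and missing entries fall back to a small floor value:

```python
import math

# Invented toy HMM for illustration only
states = ["pron", "aux", "vb", "det", "n"]
trans = {("<s>", "pron"): 0.9, ("pron", "aux"): 0.8, ("aux", "vb"): 0.7,
         ("vb", "det"): 0.6, ("det", "n"): 0.9}
emit = {("he", "pron"): 1.0, ("can", "aux"): 0.3, ("can", "vb"): 0.3,
        ("can", "n"): 0.4, ("a", "det"): 1.0}
FLOOR = 1e-4  # stand-in for smoothing of unseen events

def viterbi(words):
    # best[t] = (log prob of the best path ending in tag t, that path)
    best = {t: (math.log(trans.get(("<s>", t), FLOOR))
                + math.log(emit.get((words[0], t), FLOOR)), [t])
            for t in states}
    for w in words[1:]:
        new = {}
        for t in states:
            # keep only the single best predecessor for each tag
            prev, (score, path) = max(
                ((p, best[p]) for p in states),
                key=lambda it: it[1][0] + math.log(trans.get((it[0], t), FLOOR)))
            new[t] = (score + math.log(trans.get((prev, t), FLOOR))
                      + math.log(emit.get((w, t), FLOOR)), path + [t])
        best = new
    return max(best.values(), key=lambda sp: sp[0])[1]

print(viterbi(["he", "can", "can", "a", "can"]))
# ['pron', 'aux', 'vb', 'det', 'n']
```

Because only the best path into each tag is kept at every position, the work grows linearly with the input instead of exponentially.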
Training an HMM
Estimate probabilities from relative frequencies.
Output probabilities P(w|t): the number of occurrences of w tagged as t, divided by the number of occurrences of t.
Transition probabilities P(t2|t1): the number of occurrences of t1 followed by t2, divided by the number of occurrences of t1.
Use smoothing to overcome the sparse data problem (unknown words, uncommon words, uncommon contexts).
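The relative-frequency estimates above can be sketched in a few lines of Python. The mini-corpus and its tags are invented for illustration, and real training would also need the smoothing just mentioned:

```python
from collections import Counter

# Hypothetical mini-corpus of (word, tag) pairs; tags are illustrative
corpus = [("he", "pron"), ("can", "aux"), ("can", "vb"),
          ("a", "det"), ("can", "n"),
          ("she", "pron"), ("can", "aux"), ("open", "vb"),
          ("a", "det"), ("can", "n")]

tag_counts = Counter(t for _, t in corpus)
emit_counts = Counter(corpus)  # counts (word, tag) pairs
# note: this sketch also counts one bigram across the sentence boundary
bigram_counts = Counter(
    (corpus[i][1], corpus[i + 1][1]) for i in range(len(corpus) - 1))

def p_word_given_tag(w, t):
    # P(w|t): occurrences of w tagged as t / occurrences of t
    return emit_counts[(w, t)] / tag_counts[t]

def p_next_tag(t, prev):
    # transition: occurrences of prev followed by t / occurrences of prev
    return bigram_counts[(prev, t)] / tag_counts[prev]

print(p_word_given_tag("open", "vb"))  # 1 of 2 vb tokens is "open" -> 0.5
print(p_next_tag("n", "det"))          # det is always followed by n -> 1.0
```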
Transformation-Based Learning
The whimsical approach: 'Guess first, then change your mind if necessary!'
Rule-based tagging, statistical learning
Supervised learning
Method due to Eric Brill (1995)
A Small PoS Tagging Example
Rules:
tag:NN>VB <- tag:TO@[-1] o
tag:VB>NN <- tag:DT@[-1] o
...

Input:
She decided to table her data

Lexicon:
data: NN
decided: VB
her: PN
she: PN
table: NN VB
to: TO

Initial tagging (first tag from the lexicon):
PN VB TO NN PN NN
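The whole cycle on this example (initial lexical guess, then corrective rules) can be sketched in Python. The rule encoding below is my simplification of the tag:NN>VB <- tag:TO@[-1] notation, not the lecture's implementation:

```python
# First tag from the lexicon, used as the initial guess
lexicon_first = {"she": "PN", "decided": "VB", "to": "TO",
                 "table": "NN", "her": "PN", "data": "NN"}

# (old_tag, new_tag, required_tag_at_position_-1)
rules = [("NN", "VB", "TO"),   # tag:NN>VB <- tag:TO@[-1]
         ("VB", "NN", "DT")]   # tag:VB>NN <- tag:DT@[-1]

def tbl_tag(words):
    tags = [lexicon_first[w] for w in words]   # guess first...
    for old, new, left in rules:               # ...then change your mind
        for i in range(1, len(tags)):
            if tags[i] == old and tags[i - 1] == left:
                tags[i] = new
    return tags

print(tbl_tag(["she", "decided", "to", "table", "her", "data"]))
# ['PN', 'VB', 'TO', 'VB', 'PN', 'NN']: "table" becomes VB after TO
```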
Lexicon for Brill Tagging
I: PRP
Now: RB
a: DT
and: CC
beans: NNS
can: MD
eat: VB
fire: NN VB
in: IN
is: VBZ
light: NN JJ VB
of: IN
open: JJ VB
the: DT
we: PRP
you: PRP
.: .
A Rule Sequence
tag:'NN'>'VB' <- tag:'TO'@[-1] o
tag:'VBP'>'VB' <- tag:'MD'@[-1,-2,-3] o
tag:'NN'>'VB' <- tag:'MD'@[-1,-2] o
tag:'VB'>'NN' <- tag:'DT'@[-1,-2] o
tag:'VBD'>'VBN' <- tag:'VBZ'@[-1,-2,-3] o
tag:'VBN'>'VBD' <- tag:'PRP'@[-1] o
tag:'POS'>'VBZ' <- tag:'PRP'@[-1] o
tag:'VB'>'VBP' <- tag:'NNS'@[-1] o
tag:'IN'>'RB' <- wd:as@[0] & wd:as@[2] o
tag:'IN'>'WDT' <- tag:'VB'@[1,2] o
tag:'VB'>'VBP' <- tag:'PRP'@[-1] o
tag:'IN'>'WDT' <- tag:'VBZ'@[1] o
...
Transformation-Based Painting
[Figure: a grid of coloured squares is repainted region by region, an analogy for transformation-based learning: start with a rough first painting, then apply successive corrective repainting rules (K. Samuel 1998)]
Transformation-Based Learning
[Diagram: an Initial Corpus, rule Templates, and a Hand-Coded Corpus feed the Learner, which outputs an ordered sequence of Rules and a Tagged Corpus]
Transformation-Based Learning
see appendix in handout
Constraint-Grammar Tagging
Due to Fred Karlsson et al.
The cautious approach: 'Don't guess, just eliminate the impossible!'
Rule based
No learning ('learning by injection')
Constraint Grammar Example
I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.
I/{PRP} can/{MD,NN} light/{JJ,NN,VB} a/{DT} fire/{NN} and/{CC} you/{PRP} can/{MD,NN} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ,VB} and/{CC} we/{PRP} can/{MD,NN} eat/{VB} in/{IN} the/{DT} light/{JJ,NN,VB} of/{IN} the/{DT} fire/{NN} ./{.}
Constraint Grammar Example
tag:red 'RP' <- wd:in@[0] & tag:'NN'@[-1] o
tag:red 'RB' <- wd:in@[0] & tag:'NN'@[-1] o
tag:red 'VB' <- tag:'DT'@[-1] o
tag:red 'NP' <- wd:'The'@[0] o
tag:red 'VBN' <- wd:said@[0] o
tag:red 'VBP' <- tag:'TO'@[-1,-2] o
tag:red 'VBP' <- tag:'MD'@[-1,-2,-3] o
tag:red 'VBZ' <- wd:'\'s'@[0] & tag:'NN'@[1] o
tag:red 'RP' <- wd:in@[0] & tag:'NNS'@[-1] o
tag:red 'RB' <- wd:in@[0] & tag:'NNS'@[-1] o
...
Constraint Grammar Example
I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.
I/{PP} can/{MD} light/{JJ,VB} a/{DT} fire/{NN} and/{CC} you/{PP} can/{MD} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ} and/{CC} we/{PP} can/{MD} eat/{VB} in/{IN} the/{DT} light/{NN} of/{IN} the/{DT} fire/{NN} ./{.}
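The eliminate-the-impossible idea can be sketched in Python. The lexicon fragment and the single removal rule below are invented for illustration (modeled on the tag:red ... <- tag:...@[-1] rule shape), not taken from an actual Constraint Grammar:

```python
# Each word starts with all its lexicon tags; rules only remove
# ("red" = reduce) impossible readings, they never guess.
lexicon = {"the": {"DT"}, "can": {"MD", "NN"}, "is": {"VBZ"}}

# (tag_to_remove, tag_required_immediately_to_the_left) -- hypothetical rule:
# a modal cannot follow a determiner
remove_rules = [("MD", "DT")]

def cg_tag(words):
    readings = [set(lexicon[w]) for w in words]
    for tag, left in remove_rules:
        for i in range(1, len(readings)):
            # apply only when the left neighbour is unambiguous, and never
            # remove a word's last remaining reading
            if readings[i - 1] == {left} and tag in readings[i] \
                    and len(readings[i]) > 1:
                readings[i].discard(tag)
    return readings

print(cg_tag(["the", "can", "is"]))  # [{'DT'}, {'NN'}, {'VBZ'}]
```

Note that the output may still contain ambiguous sets, as in the slide above: a cautious tagger leaves what it cannot safely eliminate.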
Evaluation
Two reasons for evaluating:
- Compare with other people's methods/systems
- Compare with earlier versions of your own system
Accuracy (recall and precision)
Baseline
Ceiling
N-fold cross-validation methodology => good use of the data + more statistically reliable results
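A minimal Python sketch of two of these ingredients, per-token accuracy and n-fold splitting of the data (the helper names are my own):

```python
def accuracy(predicted, gold):
    # proportion of tokens that received the correct tag
    assert len(predicted) == len(gold)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

def n_folds(data, n):
    # partition the data into n disjoint held-out portions; each fold
    # serves once as test data while the rest is used for training
    return [data[i::n] for i in range(n)]

print(accuracy(["DT", "NN", "VB"], ["DT", "NN", "NN"]))  # 2 of 3 correct
print(n_folds(list(range(10)), 5))
# [[0, 5], [1, 6], [2, 7], [3, 8], [4, 9]]
```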
Assessing the Taggers
Accuracy
Speed
Space requirements
Learning
Intelligibility
Demo Taggers
Transformation-Based Tagger: www.ling.gu.se/~lager/Home/brilltagger_ui.html
Constraint-Grammar Tagger: www.ling.gu.se/~lager/Home/cgtagger_ui.html
Featuring tracing facilities! Try it yourself!
Parsing
Torbjörn Lager, Department of Linguistics, Stockholm University
Parsing
Parsing with a phrase structure grammar
Shallow parsing
A Simple Phrase Structure Grammar
Fragment
lisa springer ('Lisa runs')
lisa skjuter en älg ('Lisa shoots an elk')
Grammar
s --> np, vp.
np --> pn.
np --> det, n.
vp --> v.
vp --> v, np.
pn --> [kalle].
pn --> [lisa].
det --> [en].
n --> [älg].
v --> [springer].
v --> [skjuter].
Recognition and Parsing
Recognition
?- s([lisa,springer],[]).
yes
?- s([springer,lisa],[]).
no
Parsing
?- s(Tree,[lisa,springer],[]).
Tree = s(np(pn(lisa)),vp(v(springer)))
A Top-Down Parser in Prolog
parse(A,P0,P,A/Trees) :-            % nonterminal: pick a rule A --> B
    (A --> B),
    parse(B,P0,P,Trees).
parse((B,Bs),P0,P,(Tree,Trees)) :-  % a sequence of daughters
    parse(B,P0,P1,Tree),
    parse(Bs,P1,P,Trees).
parse([Word],[Word|P],P,Word).      % terminal: consume one word
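For comparison, here is a Python analogue of this top-down, backtracking parser; a sketch rather than a translation. Prolog's backtracking becomes a generator that yields every analysis of a prefix of the input:

```python
# Grammar as a dict: category -> list of alternative daughter sequences.
# Terminals are symbols that do not appear as keys.
grammar = {
    "s":  [["np", "vp"]],
    "np": [["pn"], ["det", "n"]],
    "vp": [["v"], ["v", "np"]],
    "pn": [["kalle"], ["lisa"]],
    "det": [["en"]],
    "n":  [["älg"]],
    "v":  [["springer"], ["skjuter"]],
}

def parse(cat, words):
    # yield (tree, remaining_words) for every way cat can cover a prefix
    if cat not in grammar:                 # terminal symbol
        if words and words[0] == cat:
            yield cat, words[1:]
        return
    for daughters in grammar[cat]:
        for trees, rest in parse_seq(daughters, words):
            yield (cat, trees), rest

def parse_seq(cats, words):
    if not cats:
        yield [], words
        return
    for tree, rest in parse(cats[0], words):
        for trees, rest2 in parse_seq(cats[1:], rest):
            yield [tree] + trees, rest2

def recognize(sentence):
    return any(rest == [] for _, rest in parse("s", sentence))

print(recognize(["lisa", "springer"]))          # True
print(recognize(["springer", "lisa"]))          # False
print(recognize(["lisa", "skjuter", "en", "älg"]))  # True
```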
Trying It Out
s --> np, vp.
np --> pn.
np --> det, n.
vp --> v, np.
pn --> [lisa].
det --> [en].
n --> [älg].
v --> [skjuter].
?- parse(s,[lisa,skjuter,en,älg],[],Tree).
Tree = s/(np/pn/lisa,vp/(v/skjuter,np/(det/en,n/älg)))
The Resulting Tree
Tree = s/
         np/
           pn/lisa,
         vp/
           v/skjuter,
           np/
             det/en,
             n/älg
Syntactic Ambiguity
Den gamla damen träffade killen med handväskan ('The old lady met/hit the guy with the handbag')
John saw a man in the park with a telescope
Råttan åt upp osten och hunden och katten jagade råttan ('The rat ate the cheese and the dog, and the cat chased the rat')
Local Ambiguity
The old man the boats
The horse raced past the barn fell
Indeterminism and Search
A depth-first, top-down, left-to-right, backtracking parser can handle both forms of ambiguity.
Parsing as a form of search
A Problem
Left-recursive rules
np --> np, pp.
np --> np, conj, np.
Indirect left-recursion
A --> B, C.
B --> A, D.
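The standard cure for direct left recursion is to rewrite A --> A x | b as A --> b A' with A' --> x A' | empty. A small Python sketch of that transformation (the encoding of rules as lists of daughter names is my own; it assumes at least one non-recursive alternative):

```python
def remove_left_recursion(cat, rules):
    # split the alternatives for cat into left-recursive tails and bases
    recursive = [r[1:] for r in rules if r and r[0] == cat]
    base = [r for r in rules if not r or r[0] != cat]
    if not recursive:
        return {cat: rules}          # nothing to do
    new = cat + "'"                  # fresh auxiliary category
    return {cat: [b + [new] for b in base],
            new: [r + [new] for r in recursive] + [[]]}  # [] = empty

# np --> np, pp.   np --> pn.
print(remove_left_recursion("np", [["np", "pp"], ["pn"]]))
# {'np': [['pn', "np'"]], "np'": [['pp', "np'"], []]}
```

The transformed grammar accepts the same strings but can be parsed top-down without looping.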
Another Problem
s --> np, vp.
vp --> v, np.
vp --> v, np, pp.
vp --> v, np, vp.
...
Ex: John saw the man talk with the actress
Parsing is exponential in the worst case!
Solution
Use a table (chart) in which parsed constituents are stored. A constituent that is already in the chart is never added again.
Parsing can be done in O(n³) time (where n is the length of the input).
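The tabulation idea can be sketched in Python by memoizing, for each (category, start position), the set of end positions it can reach, so that no constituent is parsed twice (a sketch with a toy grammar, not a full chart parser):

```python
from functools import lru_cache

grammar = {
    "s":  [["np", "vp"]],
    "np": [["pn"]],
    "vp": [["v"], ["v", "np"]],
    "pn": [["lisa"]],
    "v":  [["springer"]],
}

def make_recognizer(words):
    @lru_cache(maxsize=None)       # the cache is the chart
    def spans(cat, start):
        # end positions reachable by parsing cat from start; thanks to
        # the cache each (cat, start) pair is computed only once
        if cat not in grammar:     # terminal symbol
            ok = start < len(words) and words[start] == cat
            return frozenset({start + 1}) if ok else frozenset()
        ends = set()
        for daughters in grammar[cat]:
            frontier = {start}
            for d in daughters:
                frontier = {e for f in frontier for e in spans(d, f)}
            ends |= frontier
        return frozenset(ends)
    return spans

words = ["lisa", "springer"]
spans = make_recognizer(words)
print(len(words) in spans("s", 0))   # True: an s covers the whole input
```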
Some Parsing Issues
Accuracy
Speed
Space requirements
Robustness
Learning
[Diagram: Text -> Processor (drawing on Knowledge) -> parsed text]
Problems with Traditional Parsers
Bad coverage
Brittleness
Slowness
Too many trees!
Problems with Traditional Parsers
Correct low-level parses are often rejected because they do not fit into a global parse -> brittleness
Ambiguity -> indeterminism -> search -> slow parsers
Ambiguity -> sometimes hundreds of thousands of parse trees, and what can we do with these?
Another strategy (Abney)
Start with the simplest constructions (’easy-first parsing’) and be as careful as possible when parsing them -> ’islands of certainty’
’islands of certainty’ -> do not reject these parses even if they do not fit into a global parse -> robustness
When you are almost sure of how to resolve an ambiguity, do it! -> determinism
When you are uncertain of how to resolve an ambiguity, don’t even try! -> ’containment of ambiguity’ -> determinism
determinism -> no search -> speed
Shallow Parsers
Works on part-of-speech tagged data
Analyses are less complete than conventional parser output:
- either identifies some phrasal constituents (e.g. NPs), without indicating their internal structure or their function in the sentence,
- or identifies the functional role of some of the words, such as the main verb and its direct arguments.
Deterministic bottom-up parsing
Adapted from Karttunen 1996:
define NP [(d) a* n+] ;
regex NP @-> "[NP" ... "]"
  .o. v "[NP" NP "]" @-> "[VP" ... "]" ;
apply down dannvaan
[NP dann][VP v [NP aan]]
Note the use of the longest-match operator!
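The same longest-match chunking can be imitated in Python with greedy regular expressions over a tag string (d = det, a = adj, n = noun, v = verb, following the example above; this is a sketch, not Karttunen's calculus):

```python
import re

def chunk(tags):
    # bracket maximal NPs: optional det, any adjectives, one or more nouns;
    # Python's * and + are greedy, mimicking longest match (@->)
    out = re.sub(r"(d?a*n+)", r"[NP \1]", tags)
    # then bracket a verb plus a following NP as a VP (composition, .o.)
    out = re.sub(r"v(\[NP [^]]*\])", r"[VP v \1]", out)
    return out

print(chunk("dannvaan"))  # [NP dann][VP v [NP aan]]
```

The greedy n+ is what makes "dann" come out as one NP rather than as [NP dan][NP n], which is exactly the point of the longest-match operator.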