Programming for Linguistics: Evaluating parser output

Programming for Linguistics December 16, 2013

Evaluating parser output

Outline of topics

1. Preprocessing (10 slides)
2. Processing (6 slides)
3. Evaluation (1 slide)
4. Results (1 slide)
5. Improvements (1 slide)


1 Preprocessing

Penn Treebank trees as NLTK Tree objects
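A minimal sketch of the preprocessing step, assuming that "complex labels" means function-tagged labels such as NP-SBJ-1 and that empty categories are the -NONE- nodes. It uses plain (label, children) tuples as a stand-in for NLTK Tree objects so that it runs without NLTK; with NLTK, `Tree.fromstring` reads the same bracketed format. The names `read_tree`, `simplify_label` and `clean` are hypothetical, not taken from the slides.

```python
import re

def read_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Assumes every bracket has a label, so a tree wrapped in an extra
    unlabeled pair of brackets must be unwrapped first.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())      # a subtree
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                              # skip the closing bracket
        return (label, children)

    return parse()

def simplify_label(label):
    """NP-SBJ-1 -> NP. Labels starting with '-' (e.g. -NONE-) are kept as they are."""
    if label.startswith("-"):
        return label
    return re.split(r"[-=]", label)[0]

def clean(tree):
    """Copy the tree without empty categories and with simple labels.

    Returns None when nothing is left below a node.
    """
    label, children = tree
    if label == "-NONE-":
        return None
    kept = []
    for child in children:
        if isinstance(child, str):
            kept.append(child)
        else:
            cleaned = clean(child)
            if cleaned is not None:
                kept.append(cleaned)
    if not kept:
        return None
    return (simplify_label(label), kept)

# Illustrative sentence, not taken from the slides.
raw = ("(S (NP-SBJ-1 (NNP John)) (VP (VBZ wants) "
       "(S (NP-SBJ (-NONE- *-1)) (VP (TO to) (VP (VB sleep))))))")
processed = clean(read_tree(raw))
print(processed)
```

The inner NP-SBJ dominates only a trace, so it disappears together with the trace; otherwise it would be left as a node without words.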




2 Processing

D. Klein and C. Manning 2002. A generative constituent-context model for improved grammar induction. In Proceedings of the ACL.



Does this work for original trees and processed trees?



Spans are represented as label-span tuples: ('label', start, end)
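The tuples can be collected with one recursive pass over a tree. This is a sketch under assumptions: trees are plain (label, children) tuples with words as strings, start and end are word positions, and end is exclusive as in Python slices. The function name `label_spans` is hypothetical.

```python
def label_spans(tree, start=0):
    """Return (spans, end) for a (label, children) tree.

    Each span is a (label, start, end) tuple; a node's own span
    comes before the spans of its children.
    """
    label, children = tree
    spans = []
    end = start
    for child in children:
        if isinstance(child, str):
            end += 1                      # a word covers one position
        else:
            child_spans, end = label_spans(child, end)
            spans.extend(child_spans)
    spans.insert(0, (label, start, end))
    return spans, end

# Illustrative tree, not taken from the slides.
tree = ("S", [("NP", [("DT", ["the"]), ("NN", ["dog"])]),
              ("VP", [("VBZ", ["barks"])])])
spans, length = label_spans(tree)
for span in spans:
    print(span)
```

This version also returns spans for part-of-speech nodes such as ('DT', 0, 1). Bracket-based evaluation usually leaves those out, so they may need to be filtered before scoring.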



Now what can we do with this information?
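One possible answer, sketched under assumptions: compare the gold spans with the automatic spans for the same sentence and count how many tuples they share. The spans are treated as multisets because a unary chain can produce the same tuple twice. The function name `count_matches` and the example spans are hypothetical.

```python
from collections import Counter

def count_matches(gold_spans, auto_spans):
    """Return (GL, AL, M): gold count, auto count, matching tuples."""
    gold = Counter(gold_spans)
    auto = Counter(auto_spans)
    # The intersection keeps each tuple min(gold count, auto count) times.
    matches = sum((gold & auto).values())
    return len(gold_spans), len(auto_spans), matches

# Illustrative spans for a five-word sentence, not taken from the slides.
gold_spans = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
auto_spans = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
gl, al, m = count_matches(gold_spans, auto_spans)
print(gl, al, m)
```

Here the last span covers the same words in both trees but carries a different label, so it does not count as a match.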


3 Evaluation


4 Results

Evaluation after removing all complex labels and empty categories.

GL = gold labels
AL = auto labels
M = matches
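A common way to turn these three counts into scores is labeled precision (M / AL), recall (M / GL) and their harmonic mean. The sketch below assumes that this is the intended metric; the function name `precision_recall_f1` is hypothetical and the counts are made up for illustration, not the results from the slides.

```python
def precision_recall_f1(gl, al, m):
    """Scores from gold label count, auto label count and matches."""
    precision = m / al if al else 0.0
    recall = m / gl if gl else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up counts: 4 gold labels, 5 auto labels, 3 matches.
p, r, f = precision_recall_f1(gl=4, al=5, m=3)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```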


5 Improvements

● Cross brackets
● Remove only labels/categories
● Consider context
● Constituent types
● Depends on task…
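The first point could be sketched as follows: an automatic span crosses a gold span when the two overlap but neither contains the other. The sketch assumes (label, start, end) tuples with an exclusive end and ignores the labels; the names `crosses` and `crossing_brackets` are hypothetical.

```python
def crosses(a, b):
    """True if spans a and b, given as (start, end), overlap only partly."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1

def crossing_brackets(gold_spans, auto_spans):
    """Number of auto spans that cross at least one gold span."""
    gold = {(s, e) for _, s, e in gold_spans}
    return sum(
        1 for _, s, e in auto_spans
        if any(crosses((s, e), g) for g in gold)
    )

# Illustrative spans for a four-word sentence, not taken from the slides.
gold_spans = [("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4)]
auto_spans = [("S", 0, 4), ("NP", 0, 1), ("VP", 1, 3)]
print(crossing_brackets(gold_spans, auto_spans))
```

Only the automatic VP over positions 1 to 3 crosses a gold bracket here, since it overlaps both the gold NP and the gold VP without containing or being contained in either.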

Thanks!
