autoeval and missplel: two generic tools for automatic evaluation johnny bigert, linus ericson,...

AutoEval and Missplel: Two Generic Tools for Automatic Evaluation

Johnny Bigert, Linus Ericson, Anton Solis

Nada, KTH, Stockholm, Sweden
Contact: johnny@kth.se

www.nada.kth.se/theory/humanlang/tools.html

Manual evaluation
  Time-consuming, tedious, error-prone
  Computers are good at repetitive tasks, humans are not
  Unavoidable in some situations

Automatic evaluation
  Cheap, fast, accurate, easily reproducible
  Incorporated in the development of most NLP systems

Automatic evaluation
  AutoEval: simplifies the construction of (NLP system) evaluation
  Missplel: introduces human-like errors into text

AutoEval
  "I write evaluation code myself in all our NLP projects"
  "Why would I need AutoEval?"

AutoEval
  Our point exactly
  Repetition of:
    Input and output file handling
    XML parsing and XML output
    Error handling, malformed input
    Data storage, management and processing

AutoEval
  Features (avoid repetition):
    Handles input (XML/structured plain-text) and generates output (XML)
    Handles data storage and processing
  ...and also:
    Generic and extendible script language
    Efficient

AutoEval

  Script language:
    Simple C-like syntax
    Powerful
    Modules and macros in repository files
    Extendible, add your own functions

AutoEval

Example of configuration and script language:

<root>
  <files>
    <file format="plain" type="in" name="datafile">TnT.wt</file>
    <file format="xml" type="out" name="outfile">out.xml</file>
  </files>
  <process>
    field(file("datafile"), "\t", "\n", var("word"), var("tag"));
    inc(cnt("tot"));
    inc(cnt(lookup("tag")));
  </process>
  <processonce>
    outputintcon(out("outfile"), cntmap("global"), "global");
  </processonce>
</root>
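For readers unfamiliar with the script language, the AutoEval example reads word/tag pairs from a tab-separated file and keeps one total counter plus one counter per tag. The following Python sketch shows the same bookkeeping as I read it from the slide. It is not AutoEval code; the function name `count_tags` and the sample data are assumptions made for illustration.

```python
from collections import Counter
import io

def count_tags(lines, field_sep="\t"):
    """Count tokens in total and per tag from word<TAB>tag lines.

    Hypothetical helper: mirrors the inc(cnt("tot")) and
    inc(cnt(lookup("tag"))) calls in the slide's script.
    """
    total = 0
    per_tag = Counter()
    for line in lines:
        parts = line.rstrip("\r\n").split(field_sep)
        if len(parts) < 2 or not parts[0]:
            continue  # skip empty or malformed lines
        tag = parts[1]
        total += 1
        per_tag[tag] += 1
    return total, per_tag

# Sample input in the word<TAB>tag layout assumed for TnT.wt.
sample = io.StringIO("Letters\tNN2\nwould\tVM0\nbe\tVBI\nwelcome\tAJ0-NN1\n")
total, per_tag = count_tags(sample)
```

Writing this by hand for every project is the repetition the slides argue AutoEval removes: file handling, malformed-input checks and counter storage are all explicit here.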

AutoEval

The result:

Missplel
  Missplel is a highly configurable tool to introduce human-like spelling errors
  Language, PoS tag set, character set and keyboard layout independent
  All you need is a word/tag/lemma dictionary

Missplel

Performance errors – Damerau:
  Keyboard mistypes (Damerau, 1964):
    Insertion, deletion, substitution, transposition of letters
    wellcvome, wellcme, wellcpme, wellcmoe
  Result:
    a new existing/non-existing word
    word class (PoS tag) change or not
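The four Damerau operations can be sketched in a few lines of Python. This is an illustration only, not Missplel's implementation: it tries every letter of a plain a-z alphabet with equal weight, whereas Missplel is configured with a confusion matrix, and the names `damerau_edits` and `toy_dictionary` are assumptions.

```python
def damerau_edits(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Return all single-edit variants of `word`, grouped by operation."""
    edits = {"insertion": set(), "deletion": set(),
             "substitution": set(), "transposition": set()}
    for i in range(len(word) + 1):
        for c in alphabet:
            edits["insertion"].add(word[:i] + c + word[i:])
    for i in range(len(word)):
        edits["deletion"].add(word[:i] + word[i + 1:])
        for c in alphabet:
            if c != word[i]:
                edits["substitution"].add(word[:i] + c + word[i + 1:])
    for i in range(len(word) - 1):
        if word[i] != word[i + 1]:  # swapping equal letters changes nothing
            edits["transposition"].add(
                word[:i] + word[i + 1] + word[i] + word[i + 2:])
    return edits

edits = damerau_edits("welcome")

# A dictionary lookup separates existing from non-existing results.
toy_dictionary = {"letters", "litters", "welcome"}
existing = damerau_edits("letters")["substitution"] & toy_dictionary
```

Here `edits["transposition"]` contains "welcmoe", a non-existing word, while `existing` contains "litters", an existing word produced by a single substitution.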

Missplel

Competence errors – split compounds:
  May alter the semantics of a sentence
    Kycklinglever – chicken liver
    Kyckling lever – chicken is alive
  Settings of split compound elements:
    Minimum length? Allowed PoS tag? Found in dictionary? Word class change? etc.
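A minimal sketch of a split-compound generator, assuming only two of the settings listed above: a minimum length for each part and a dictionary check on both parts. Missplel's actual scoring of candidate splits is not described on the slides, so the function `split_candidates`, its parameters and the toy dictionary are hypothetical.

```python
def split_candidates(word, dictionary, min_word_length=6, min_part_length=3):
    """Return (first, second) splits where both parts are dictionary words.

    Assumed rules: words shorter than min_word_length are never split,
    and each part must have at least min_part_length letters.
    """
    word = word.lower()
    if len(word) < min_word_length:
        return []
    candidates = []
    for i in range(min_part_length, len(word) - min_part_length + 1):
        first, second = word[:i], word[i:]
        if first in dictionary and second in dictionary:
            candidates.append((first, second))
    return candidates

toy_dictionary = {"kyckling", "lever", "kycklinglever"}
splits = split_candidates("Kycklinglever", toy_dictionary)
```

With this toy dictionary the only candidate is ("kyckling", "lever"), the split shown on the slide.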

Missplel

Competence errors – sound errors:
  Letter level, e.g. sound-alike errors
  Regular expression rules:
    (.+)ei(.+)    @1ie@2
    receive -> recieve
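The rule above pairs a regular expression with a replacement in which `@1` and `@2` refer to the captured groups. The Python sketch below applies such a rule with the standard `re` module. It assumes that the pattern must match the whole word and that `@` followed by a digit is the only escape syntax; neither detail is confirmed by the slides, and `apply_rule` is a hypothetical name.

```python
import re

def apply_rule(pattern, replacement, word, escape_char="@"):
    """Apply one sound-error rule; return the new word or None if no match."""
    # Translate @1, @2, ... into Python's \g<1>, \g<2>, ... group references.
    py_replacement = re.sub(re.escape(escape_char) + r"(\d)",
                            r"\\g<\1>", replacement)
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    return match.expand(py_replacement)

misspelled = apply_rule(r"(.+)ei(.+)", "@1ie@2", "receive")
```

For "receive" the groups are "rec" and "ve", so the rule yields "recieve"; a word without "ei" returns None.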

Missplel

Competence errors – syntax errors:
  Word/letter level
  Form new words from PoS tags, missing/doubled words etc.
  Regular expression rules:

</rule>

Missplel

Letters NN2
would VM0
be VBI
welcome AJ0-NN1

Litters NN2 damerau/wordexist-notagchange
would VM0 ok
bee NN1 sound/wordexist-tagchange
welcmoe ERR damerau/nowordexist-tagchange
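Because every output token carries a report of what was done to it, the introduced errors can serve as a ready-made answer key when evaluating a spell checker or tagger. The sketch below only tallies the error types in output of the shape shown above. It assumes three whitespace-separated fields per line and a report of the form type/details, which is how the example reads; the function names are hypothetical.

```python
from collections import Counter

def parse_annotated(lines):
    """Yield (word, tag, report) from 'word tag report' lines."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not fit the assumed layout
        yield tuple(parts)

def error_type_counts(lines):
    """Count tokens per error type; 'ok' marks an unchanged token."""
    counts = Counter()
    for _word, _tag, report in parse_annotated(lines):
        counts[report.split("/", 1)[0]] += 1
    return counts

sample_output = [
    "Litters NN2 damerau/wordexist-notagchange",
    "would VM0 ok",
    "bee NN1 sound/wordexist-tagchange",
    "welcmoe ERR damerau/nowordexist-tagchange",
]
counts = error_type_counts(sample_output)
```

On the four sample lines this gives two damerau errors, one sound error and one unchanged token.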

Missplel

<input>
  <filename>TnT.wt</filename>
  <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
</input>
<output>
  <filename>output.wte</filename>
  <format>%1% %2% %5%</format>
  <description>
    <noError>ok</noError>
    <existingWord>exist</existingWord>
    <nonExistingWord>noexist</nonExistingWord>
    <wordChange>-wordch</wordChange>
    <noWordChange>-nowordch</noWordChange>
    <tagChange>-tagch</tagChange>
    <noTagChange>-notagch</noTagChange>
  </description>
</output>
...

Missplel

...
<options>
  <unknownTag>unknown</unknownTag>
  <unknownLemma>unknownLemma</unknownLemma>
  <escapeChar>@</escapeChar>
  <spaceChar> </spaceChar>
  <wordChar>'</wordChar>
  <sentenceSeparatorTag>mad</sentenceSeparatorTag>
  <maxErrorsInSentence>30</maxErrorsInSentence>
  <configDir>felstava/conf/</configDir>
</options>
<wordlist>
  <create>
    <filename>Swedish.cwtl</filename>
    <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
  </create>
  <wordfile>outfile.gz</wordfile>
  <tagfile>tagfile</tagfile>
</wordlist>
...

Missplel

...
<damerau>
  <reportName>damerau</reportName>
  <active>yes</active>
  <probability>10.0</probability>
  <confusionMatrix>confusionfile</confusionMatrix>
  <subst>1</subst>
  <ins>1</ins>
  <del>1</del>
  <transp>1</transp>
  <allowExistingWords>no</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</damerau>
...

Missplel

...
<splitCompound>
  <reportName>split</reportName>
  <active>no</active>
  <probability>99.0</probability>
  <splitUnknownWords>yes</splitUnknownWords>
  <splitThreshold>50</splitThreshold>
  <minWordLength>6</minWordLength>
  <minSplitWordLength>3</minSplitWordLength>
  <factors>
    <wordLength>1</wordLength>
    <inDictionaryFirst>10</inDictionaryFirst>
    <inDictionarySecond>10</inDictionarySecond>
    <tagAllowed>10</tagAllowed>
    <tagMatchFirst>0</tagMatchFirst>
    <tagMatchSecond>15</tagMatchSecond>
  </factors>
</splitCompound>
...

Missplel

...
<soundError>
  <reportName>sound</reportName>
  <active>no</active>
  <filename>sound.test</filename>
  <probability>100.0</probability>
  <expression>(.+)\t(.+)\t(.+)</expression>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</soundError>
...

Missplel

...
<syntaxError>
  <reportName>introduced</reportName>
  <active>no</active>
  <filename>error.rules</filename>
  <probability>100.0</probability>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</syntaxError>

Applications
  AutoEval has been used to evaluate
    Parsers
    PoS taggers
    PoS majority/ensemble tagging
  Missplel has been used to evaluate
    Spell checkers
    Grammar checkers
    Robustness of parsers and taggers

Licence
  AutoEval and Missplel are open source under the GNU General Public Licence
  Source code available at www.nada.kth.se/theory/humanlang/tools.html
