![Page 1: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/1.jpg)
AutoEval and Missplel: Two Generic Tools for Automatic Evaluation
Johnny Bigert, Linus Ericson, Anton Solis
Nada, KTH, Stockholm, Sweden
Contact: [email protected]
www.nada.kth.se/theory/humanlang/tools.html
![Page 2: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/2.jpg)
Manual evaluation
- Time-consuming, tedious, error-prone
- Computers are good at repetitive tasks, humans are not
- Unavoidable in some situations
![Page 3: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/3.jpg)
Automatic evaluation
- Cheap, fast, accurate, easily reproducible
- Incorporated in the development of most NLP systems
![Page 4: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/4.jpg)
Automatic evaluation
- AutoEval: simplifies the construction of (NLP system) evaluation
- Missplel: introduces human-like errors into text
![Page 5: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/5.jpg)
AutoEval
- "I write evaluation code myself in all our NLP projects"
- "Why would I need AutoEval?"
![Page 6: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/6.jpg)
AutoEval
Our point exactly
Repetition of:
- Input and output file handling
- XML parsing and XML output
- Error handling, malformed input
- Data storage, management and processing
![Page 7: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/7.jpg)
AutoEval
Features (avoids repetition):
- Handles input (XML/structured plain text) and generates output (XML)
- Handles data storage and processing
...and also:
- Generic and extendible script language
- Efficient
![Page 8: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/8.jpg)
AutoEval
Script language:
- Simple C-like syntax
- Powerful
- Modules and macros in repository files
- Extendible, add your own functions
![Page 9: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/9.jpg)
AutoEval
Example of configuration and script language:

    <root>
      <files>
        <file format="plain" type="in" name="datafile">TnT.wt</file>
        <file format="xml" type="out" name="outfile">out.xml</file>
      </files>
      <process>
        field(file("datafile"), "\t", "\n", var("word"), var("tag"));
        inc(cnt("tot"));
        inc(cnt(lookup("tag")));
      </process>
      <processonce>
        outputintcon(out("outfile"), cntmap("global"), "global");
      </processonce>
    </root>
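For readers unfamiliar with the script language, the following Python sketch shows one plausible reading of what the example script computes: each line of the input file is split into a word and a tag, a total counter is incremented, and a counter named after the tag is incremented. This is an interpretation of the slide, not AutoEval's implementation; the function name `count_tags` and the sample data are assumptions made for illustration.

```python
from collections import Counter


def count_tags(lines):
    """Count tokens and PoS tags in tab-separated word/tag lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, tag = line.split("\t", 1)
        counts["tot"] += 1   # mirrors inc(cnt("tot"))
        counts[tag] += 1     # mirrors inc(cnt(lookup("tag")))
    return counts


# Hypothetical sample data in word<TAB>tag format.
sample = ["Letters\tNN2\n", "would\tVM0\n", "be\tVBI\n", "letters\tNN2\n"]
counts = count_tags(sample)
for name in sorted(counts):
    print(name, counts[name])
```

The point of the comparison is the boilerplate: file handling, parsing and output formatting are what AutoEval is meant to take over, leaving only the counting logic to the script.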
![Page 10: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/10.jpg)
AutoEval
The result:

    <evaloutput date="Mon May 26 12:37:39 2003">
    <global>
      <var name="tot">14119</var>
      <var name="ab">714</var>
      <var name="ab.kom">44</var>
      <var name="ab.pos">149</var>
      <var name="ab.suv">24</var>
      ...
      <var name="vb.sup.akt">117</var>
      <var name="vb.sup.sfo">35</var>
    </global>
![Page 11: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/11.jpg)
Missplel
- Missplel is a highly configurable tool for introducing human-like spelling errors
- Independent of language, PoS tag set, character set and keyboard layout
- All you need is a word/tag/lemma dictionary
![Page 12: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/12.jpg)
Missplel
Performance errors – Damerau:
- Keyboard mistypes (Damerau, 1964): insertion, deletion, substitution, transposition of letters
- wellcvome, wellcme, wellcpme, wellcmoe
Result:
- a new existing/non-existing word
- word class (PoS tag) change or not
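The four Damerau operations can be sketched in a few lines of Python. This is a minimal illustration, not Missplel's code: the function names are hypothetical, and positions and letters are chosen uniformly at random here, whereas Missplel's configuration refers to a confusion matrix for more realistic mistypes.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed character set for the sketch


def insertion(word, i, ch):
    """Insert ch before position i."""
    return word[:i] + ch + word[i:]


def deletion(word, i):
    """Delete the letter at position i."""
    return word[:i] + word[i + 1:]


def substitution(word, i, ch):
    """Replace the letter at position i with ch."""
    return word[:i] + ch + word[i + 1:]


def transposition(word, i):
    """Swap the letters at positions i and i + 1."""
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def random_error(word, rng):
    """Apply one randomly chosen Damerau operation (uniform choice)."""
    if len(word) < 2:
        return word  # too short to transpose safely
    op = rng.choice(["ins", "del", "sub", "transp"])
    if op == "ins":
        return insertion(word, rng.randrange(len(word) + 1), rng.choice(ALPHABET))
    if op == "del":
        return deletion(word, rng.randrange(len(word)))
    if op == "sub":
        return substitution(word, rng.randrange(len(word)), rng.choice(ALPHABET))
    return transposition(word, rng.randrange(len(word) - 1))


print(insertion("wellcome", 5, "v"))     # wellcvome
print(deletion("wellcome", 5))           # wellcme
print(substitution("wellcome", 5, "p"))  # wellcpme
print(transposition("wellcome", 5))      # wellcmoe
```

Whether the resulting string is an existing word, and whether its PoS tag changes, would then be decided by a dictionary lookup, which this sketch leaves out.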
![Page 13: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/13.jpg)
Missplel
Competence errors – split compounds:
- May alter the semantics of a sentence
- Kycklinglever – chicken liver
- Kyckling lever – chicken is alive
Settings of split compound elements:
- Minimum length?
- Allowed PoS tag?
- Found in dictionary?
- Word class change?
- etc.
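As a rough illustration of two of these settings (minimum element length and dictionary membership), the sketch below enumerates the possible split points of a compound. It is a simplification under stated assumptions: the function name `split_candidates`, the parameter `min_part_len` and the toy dictionary are hypothetical, and PoS tag constraints are ignored.

```python
def split_candidates(word, dictionary, min_part_len=3):
    """Return all (first, second) splits where both parts are long enough
    and both are found in the dictionary."""
    candidates = []
    for i in range(min_part_len, len(word) - min_part_len + 1):
        first, second = word[:i], word[i:]
        if first in dictionary and second in dictionary:
            candidates.append((first, second))
    return candidates


# Toy dictionary for illustration only.
dictionary = {"kyckling", "lever"}
print(split_candidates("kycklinglever", dictionary))
```

A fuller version would also check the tags of the two elements, which is what decides whether the split changes the word class.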
![Page 14: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/14.jpg)
Missplel
Competence errors – sound errors:
- Letter level, e.g. sound-alike errors
- Regular expression rules: (.+)ei(.+) -> @1ie@2
- receive -> recieve
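The rule above can be tried out with Python's `re` module. The sketch assumes that a rule must match the whole word and that `@N` in the replacement refers to the N-th captured group; the helper name `apply_rule` is hypothetical and this is not Missplel's own rule engine.

```python
import re


def apply_rule(word, pattern, replacement):
    """Apply one sound-error rule; return None if the rule does not match."""
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    # Substitute each @N with the text captured by group N.
    return re.sub(r"@(\d+)",
                  lambda ref: match.group(int(ref.group(1))),
                  replacement)


print(apply_rule("receive", r"(.+)ei(.+)", "@1ie@2"))  # recieve
print(apply_rule("welcome", r"(.+)ei(.+)", "@1ie@2"))  # None
```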
![Page 15: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/15.jpg)
Missplel
Competence errors – syntax errors:
- Word/letter level
- Form new words from PoS tags, missing/doubled words etc.
Regular expression rules:

    <rule ex="slutat skrika - slutat skrikit">
      <match>vb\.sup\.akt(.*) vb\.inf.*</match>
      <to>vb.sup.akt@1 vb.sup.akt</to>
    </rule>
![Page 16: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/16.jpg)
Missplel
Input:

    Letters  NN2
    would    VM0
    be       VBI
    welcome  AJ0-NN1

Output:

    Litters  NN2  damerau/wordexist-notagchange
    would    VM0  ok
    bee      NN1  sound/wordexist-tagchange
    welcmoe  ERR  damerau/nowordexist-tagchange
![Page 17: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/17.jpg)
Missplel

    <input>
      <filename>TnT.wt</filename>
      <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
    </input>
    <output>
      <filename>output.wte</filename>
      <!-- %1% Word, %2% Tag, %3% Lemma, %4% Rest of line, %5% Error descr -->
      <format>%1% %2% %5%</format>
      <description>
        <noError>ok</noError>
        <existingWord>exist</existingWord>
        <nonExistingWord>noexist</nonExistingWord>
        <wordChange>-wordch</wordChange>
        <noWordChange>-nowordch</noWordChange>
        <tagChange>-tagch</tagChange>
        <noTagChange>-notagch</noTagChange>
      </description>
    </output>
    ...
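To make the interplay of the input expression and the output format concrete, here is a small Python sketch. It assumes the first two groups of the expression capture word and tag, and that the `%N%` placeholders are filled from the field numbers listed in the comment; the helper `format_output` is hypothetical, and how Missplel actually maps groups to fields is not stated on the slide.

```python
import re

# The input expression from the configuration above.
LINE_RE = re.compile(r"([^\t]+)\t([^\t]+)([^\r\n]*).*")


def format_output(template, fields):
    """Fill %N% placeholders from a dict of field number -> text.
    Missing fields become empty strings (an assumption of this sketch)."""
    return re.sub(r"%(\d)%",
                  lambda m: fields.get(int(m.group(1)), ""),
                  template)


# One input line without its trailing newline (word<TAB>tag).
match = LINE_RE.match("welcome\tAJ0-NN1")
word, tag = match.group(1), match.group(2)

# Field 5 is the error description; "ok" is the configured noError text.
fields = {1: word, 2: tag, 5: "ok"}
print(format_output("%1% %2% %5%", fields))  # welcome AJ0-NN1 ok
```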
![Page 18: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/18.jpg)
Missplel

    ...
    <options>
      <unknownTag>unknown</unknownTag>
      <unknownLemma>unknownLemma</unknownLemma>
      <escapeChar>@</escapeChar>
      <spaceChar> </spaceChar>
      <wordChar>'</wordChar>
      <sentenceSeparatorTag>mad</sentenceSeparatorTag>
      <maxErrorsInSentence>30</maxErrorsInSentence>
      <configDir>felstava/conf/</configDir>
    </options>
    <wordlist>
      <create>
        <filename>Swedish.cwtl</filename>
        <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
      </create>
      <wordfile>outfile.gz</wordfile>
      <tagfile>tagfile</tagfile>
    </wordlist>
    ...
![Page 19: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/19.jpg)
Missplel

    ...
    <damerau>
      <reportName>damerau</reportName>
      <active>yes</active>
      <probability>10.0</probability>
      <confusionMatrix>confusionfile</confusionMatrix>
      <subst>1</subst>
      <ins>1</ins>
      <del>1</del>
      <transp>1</transp>
      <allowExistingWords>no</allowExistingWords>
      <forceAllowWords>no</forceAllowWords>
      <allowTagChange>yes</allowTagChange>
      <forceAllowTag>no</forceAllowTag>
    </damerau>
    ...
![Page 20: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/20.jpg)
Missplel

    ...
    <splitCompound>
      <reportName>split</reportName>
      <active>no</active>
      <probability>99.0</probability>
      <splitUnknownWords>yes</splitUnknownWords>
      <splitThreshold>50</splitThreshold>
      <minWordLength>6</minWordLength>
      <minSplitWordLength>3</minSplitWordLength>
      <factors>
        <wordLength>1</wordLength>
        <inDictionaryFirst>10</inDictionaryFirst>
        <inDictionarySecond>10</inDictionarySecond>
        <tagAllowed>10</tagAllowed>
        <tagMatchFirst>0</tagMatchFirst>
        <tagMatchSecond>15</tagMatchSecond>
      </factors>
    </splitCompound>
    ...
![Page 21: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/21.jpg)
Missplel

    ...
    <soundError>
      <reportName>sound</reportName>
      <active>no</active>
      <filename>sound.test</filename>
      <probability>100.0</probability>
      <expression>(.+)\t(.+)\t(.+)</expression>
      <allowExistingWords>yes</allowExistingWords>
      <forceAllowWords>no</forceAllowWords>
      <allowTagChange>yes</allowTagChange>
      <forceAllowTag>no</forceAllowTag>
    </soundError>
    ...
![Page 22: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/22.jpg)
Missplel

    ...
    <syntaxError>
      <reportName>introduced</reportName>
      <active>no</active>
      <filename>error.rules</filename>
      <probability>100.0</probability>
      <allowExistingWords>yes</allowExistingWords>
      <forceAllowWords>no</forceAllowWords>
      <allowTagChange>yes</allowTagChange>
      <forceAllowTag>no</forceAllowTag>
    </syntaxError>
![Page 23: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/23.jpg)
Applications
AutoEval has been used to evaluate:
- Parsers
- PoS taggers
- PoS majority/ensemble tagging
Missplel has been used to evaluate:
- Spell checkers
- Grammar checkers
- Robustness of parsers and taggers
![Page 24: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation](https://reader035.vdocuments.us/reader035/viewer/2022070420/56815de7550346895dcc0fdb/html5/thumbnails/24.jpg)
Licence
- AutoEval and Missplel are open source under the GNU General Public Licence
- Source code available at www.nada.kth.se/theory/humanlang/tools.html