
Post on 28-Dec-2015


AutoEval and Missplel: Two Generic Tools for Automatic Evaluation

Johnny Bigert, Linus Ericson, Anton Solis

Nada, KTH, Stockholm, Sweden
Contact: johnny@kth.se

www.nada.kth.se/theory/humanlang/tools.html

Manual evaluation
  Time-consuming, tedious, error-prone
  Computers are good at repetitive tasks, humans are not
  Unavoidable in some situations

Automatic evaluation
  Cheap, fast, accurate, easily reproducible
  Incorporated in the development of most NLP systems

Automatic evaluation
  AutoEval: simplifies the construction of (NLP system) evaluation
  Missplel: introduces human-like errors into text

AutoEval
  "I write evaluation code myself in all our NLP projects"
  "Why would I need AutoEval?"

AutoEval
  Our point exactly
  Repetition of:
    Input and output file handling
    XML parsing and XML output
    Error handling, malformed input
    Data storage, management and processing

AutoEval
  Features that avoid repetition:
    Handles input (XML/structured plain-text) and generates output (XML)
    Handles data storage and processing
  ...and also:
    Generic and extendible script language
    Efficient

AutoEval
  Script language:
    Simple C-like syntax
    Powerful
    Modules and macros in repository files
    Extendible, add your own functions

AutoEval
  Example of configuration and script language:

<root>
  <files>
    <file format="plain" type="in" name="datafile">TnT.wt</file>
    <file format="xml" type="out" name="outfile">out.xml</file>
  </files>
  <process>
    field(file("datafile"), "\t", "\n", var("word"), var("tag"));
    inc(cnt("tot"));
    inc(cnt(lookup("tag")));
  </process>
  <processonce>
    outputintcon(out("outfile"), cntmap("global"), "global");
  </processonce>
</root>

AutoEval
  The result:

<evaloutput date="Mon May 26 12:37:39 2003">
  <global>
    <var name="tot">14119</var>
    <var name="ab">714</var>
    <var name="ab.kom">44</var>
    <var name="ab.pos">149</var>
    <var name="ab.suv">24</var>
    ...
    <var name="vb.sup.akt">117</var>
    <var name="vb.sup.sfo">35</var>
  </global>
</evaloutput>
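For readers unfamiliar with the script language, the counting performed by the example script can be approximated in plain Python. This is only a sketch of the equivalent logic, not AutoEval itself: the function name `count_tags` and the sample data are invented for illustration, and the input is assumed to hold one tab-separated word/tag pair per line, as the `field(...)` call suggests.

```python
from collections import Counter
from typing import Iterable


def count_tags(lines: Iterable[str]) -> Counter:
    """Count all tokens ("tot") and the occurrences of each tag."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # assumption: lines without a tag are skipped
        tag = parts[1]
        counts["tot"] += 1  # mirrors inc(cnt("tot"))
        counts[tag] += 1    # mirrors inc(cnt(lookup("tag")))
    return counts


# Invented sample data; only the tag names are borrowed from the result above.
sample = ["w1\tab\n", "w2\tab.pos\n", "w3\tab\n", "w4\tvb.sup.akt\n"]
counts = count_tags(sample)
print(counts["tot"], counts["ab"], counts["ab.pos"])
```

The point of AutoEval is that this loop, the file handling and the XML output do not have to be rewritten for each new evaluation.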

Missplel
  Missplel is a highly configurable tool to introduce human-like spelling errors
  Independent of language, PoS tag set, character set and keyboard layout
  All you need is a word/tag/lemma dictionary

Missplel
  Performance errors – Damerau:
    Keyboard mistypes (Damerau, 1964): insertion, deletion, substitution, transposition of letters
    wellcvome, wellcme, wellcpme, wellcmoe
    Result: a new existing or non-existing word, with or without a word class (PoS tag) change
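The four Damerau operations are simple string edits. The sketch below is not Missplel's implementation; the function names and the tiny dictionary are hypothetical. It reproduces the four example strings above from the base form "wellcome" and labels each result as an existing or non-existing word.

```python
def insert(word: str, i: int, ch: str) -> str:
    """Insert character ch before position i."""
    return word[:i] + ch + word[i:]


def delete(word: str, i: int) -> str:
    """Delete the character at position i."""
    return word[:i] + word[i + 1:]


def substitute(word: str, i: int, ch: str) -> str:
    """Replace the character at position i with ch."""
    return word[:i] + ch + word[i + 1:]


def transpose(word: str, i: int) -> str:
    """Swap the characters at positions i and i + 1."""
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def classify(word: str, dictionary: set) -> str:
    """Label a result as an existing or a non-existing word."""
    return "wordexist" if word in dictionary else "nowordexist"


base = "wellcome"
dictionary = {"welcome"}  # hypothetical dictionary, for illustration only
for result in (insert(base, 5, "v"), delete(base, 5),
               substitute(base, 5, "p"), transpose(base, 5)):
    print(result, classify(result, dictionary))
```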

Missplel
  Competence errors – split compounds:
    May alter the semantics of a sentence
    Kycklinglever – chicken liver
    Kyckling lever – chicken is alive
    Settings of split compound elements: minimum length? allowed PoS tag? found in dictionary? word class change? etc.
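As a rough illustration of two of these settings (minimum element length and dictionary membership), the following sketch lists the split points of a compound at which both parts are known words. It is a simplification under stated assumptions, not Missplel's algorithm: `split_candidates`, its parameters and the two-word dictionary are hypothetical, and PoS tag constraints are ignored.

```python
from typing import List, Set, Tuple


def split_candidates(word: str, dictionary: Set[str],
                     min_part_len: int = 3) -> List[Tuple[str, str]]:
    """Return every split of word into two dictionary words,
    each at least min_part_len characters long."""
    candidates = []
    for i in range(min_part_len, len(word) - min_part_len + 1):
        first, second = word[:i], word[i:]
        if first in dictionary and second in dictionary:
            candidates.append((first, second))
    return candidates


dictionary = {"kyckling", "lever"}  # hypothetical mini-dictionary
for first, second in split_candidates("kycklinglever", dictionary):
    print(first, second)
```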

Missplel
  Competence errors – sound errors:
    Letter level, e.g. sound-alike errors
    Regular expression rules:
      (.+)ei(.+)  ->  @1ie@2    (receive -> recieve)
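The rule above maps directly onto an ordinary regular-expression rewrite. The sketch below shows one way to apply it with Python's `re` module, assuming that `@1` and `@2` refer to the first and second capture groups; the helper name `apply_rule` is hypothetical and this is not how Missplel itself is implemented.

```python
import re
from typing import Optional


def apply_rule(word: str, pattern: str, template: str,
               escape: str = "@") -> Optional[str]:
    """Rewrite word if the whole word matches pattern, else return None.

    Group references such as @1 in the template are translated to
    Python's \\g<1> syntax before expansion.
    """
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    py_template = re.sub(re.escape(escape) + r"(\d+)", r"\\g<\1>", template)
    return match.expand(py_template)


print(apply_rule("receive", r"(.+)ei(.+)", "@1ie@2"))  # recieve
print(apply_rule("welcome", r"(.+)ei(.+)", "@1ie@2"))  # None
```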

Missplel
  Competence errors – syntax errors:
    Word/letter level
    Form new words from PoS tags, missing/doubled words etc.
    Regular expression rules:

<rule ex="slutat skrika - slutat skrikit">
  <match>vb\.sup\.akt(.*) vb\.inf.*</match>
  <to>vb.sup.akt@1 vb.sup.akt</to>
</rule>
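One plausible reading of this rule is that it is matched against the sequence of PoS tags, and that the rewritten tags are then turned back into word forms via the word/tag/lemma dictionary. The sketch below follows that reading and should be taken as an assumption rather than a description of Missplel's internals: the tag `vb.inf.akt`, the `FORMS` table and both helper functions are hypothetical.

```python
import re
from typing import List, Optional, Tuple

MATCH = r"vb\.sup\.akt(.*) vb\.inf.*"
TO = "vb.sup.akt@1 vb.sup.akt"

# Hypothetical (lemma, tag) -> word form dictionary.
FORMS = {("skrika", "vb.sup.akt"): "skrikit"}


def rewrite_tags(tags: List[str]) -> Optional[List[str]]:
    """Apply the rule to a space-joined tag sequence."""
    match = re.fullmatch(MATCH, " ".join(tags))
    if match is None:
        return None
    template = re.sub(r"@(\d+)", r"\\g<\1>", TO)  # @1 -> \g<1>
    return match.expand(template).split(" ")


def apply_rule(tokens: List[Tuple[str, str, str]]) -> Optional[List[str]]:
    """tokens are (word, lemma, tag) triples; returns the new word forms."""
    new_tags = rewrite_tags([tag for _, _, tag in tokens])
    if new_tags is None:
        return None
    words = []
    for (word, lemma, tag), new_tag in zip(tokens, new_tags):
        # Keep the word when its tag is unchanged; otherwise look up the
        # form for the new tag, falling back to the original word.
        words.append(word if new_tag == tag else FORMS.get((lemma, new_tag), word))
    return words


tokens = [("slutat", "sluta", "vb.sup.akt"), ("skrika", "skrika", "vb.inf.akt")]
print(apply_rule(tokens))
```

Under these assumptions the example reproduces the rewrite given in the rule's `ex` attribute.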

Missplel

Letters   NN2
would     VM0
be        VBI
welcome   AJ0-NN1

Litters   NN2   damerau/wordexist-notagchange
would     VM0   ok
bee       NN1   sound/wordexist-tagchange
welcmoe   ERR   damerau/nowordexist-tagchange
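Because every output token carries an error description, the output is easy to post-process in an evaluation. As a small sketch, the following counts how many tokens were affected by each error type. It assumes whitespace-separated fields with the description in the third column, and the function name `tally_errors` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def tally_errors(lines: Iterable[str]) -> Counter:
    """Count tokens per error type (the part before '/'), including 'ok'."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # no error description on this line
        error_type = fields[2].split("/", 1)[0]
        counts[error_type] += 1
    return counts


report = [
    "Litters NN2 damerau/wordexist-notagchange",
    "would VM0 ok",
    "bee NN1 sound/wordexist-tagchange",
    "welcmoe ERR damerau/nowordexist-tagchange",
]
counts = tally_errors(report)
print(counts["damerau"], counts["sound"], counts["ok"])
```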

Missplel

<input>
  <filename>TnT.wt</filename>
  <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
</input>

<output>
  <filename>output.wte</filename>
  <!-- %1% Word, %2% Tag, %3% Lemma, %4% Rest of line, %5% Error descr -->
  <format>%1% %2% %5%</format>
  <description>
    <noError>ok</noError>
    <existingWord>exist</existingWord>
    <nonExistingWord>noexist</nonExistingWord>
    <wordChange>-wordch</wordChange>
    <noWordChange>-nowordch</noWordChange>
    <tagChange>-tagch</tagChange>
    <noTagChange>-notagch</noTagChange>
  </description>
</output>
...

Missplel

...
<options>
  <unknownTag>unknown</unknownTag>
  <unknownLemma>unknownLemma</unknownLemma>
  <escapeChar>@</escapeChar>
  <spaceChar> </spaceChar>
  <wordChar>'</wordChar>
  <sentenceSeparatorTag>mad</sentenceSeparatorTag>
  <maxErrorsInSentence>30</maxErrorsInSentence>
  <configDir>felstava/conf/</configDir>
</options>

<wordlist>
  <create>
    <filename>Swedish.cwtl</filename>
    <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
  </create>
  <wordfile>outfile.gz</wordfile>
  <tagfile>tagfile</tagfile>
</wordlist>
...

Missplel

...
<damerau>
  <reportName>damerau</reportName>
  <active>yes</active>
  <probability>10.0</probability>
  <confusionMatrix>confusionfile</confusionMatrix>
  <subst>1</subst>
  <ins>1</ins>
  <del>1</del>
  <transp>1</transp>
  <allowExistingWords>no</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</damerau>
...

Missplel

...
<splitCompound>
  <reportName>split</reportName>
  <active>no</active>
  <probability>99.0</probability>
  <splitUnknownWords>yes</splitUnknownWords>
  <splitThreshold>50</splitThreshold>
  <minWordLength>6</minWordLength>
  <minSplitWordLength>3</minSplitWordLength>
  <factors>
    <wordLength>1</wordLength>
    <inDictionaryFirst>10</inDictionaryFirst>
    <inDictionarySecond>10</inDictionarySecond>
    <tagAllowed>10</tagAllowed>
    <tagMatchFirst>0</tagMatchFirst>
    <tagMatchSecond>15</tagMatchSecond>
  </factors>
</splitCompound>
...

Missplel

...
<soundError>
  <reportName>sound</reportName>
  <active>no</active>
  <filename>sound.test</filename>
  <probability>100.0</probability>
  <expression>(.+)\t(.+)\t(.+)</expression>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</soundError>
...

Missplel

...
<syntaxError>
  <reportName>introduced</reportName>
  <active>no</active>
  <filename>error.rules</filename>
  <probability>100.0</probability>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</syntaxError>

Applications
  AutoEval has been used to evaluate:
    Parsers
    PoS taggers
    PoS majority/ensemble tagging
  Missplel has been used to evaluate:
    Spell checkers
    Grammar checkers
    Robustness of parsers and taggers

Licence
  AutoEval and Missplel are open source under the GNU General Public Licence
  Source code available at www.nada.kth.se/theory/humanlang/tools.html
