-
10/20/14 Building the Multilingual Web of Data – ISWC tutorial
Integrating NLP with Linked Data and RDF: the NIF format (hands on)
Ciro Baron Neto, Ph.D. student at the University of Leipzig
-
Overview
• GitHub NLP2RDF web page overview and NIF online demos (Dashboard, Combinator...)
• Examples
  – Example 1: How to annotate a string using the Snowball Stemmer and OpenNLP
  – Example 2: Query generated NIF data and query the Brown Corpus
-
NLP2RDF GitHub Website
• https://github.com/NLP2RDF/
• /home/ciro/websites/github/github.com/NLP2RDF/index.html
-
dashboard.nlp2rdf.aksw.org
-
nlp2rdf.aksw.org
-
Example 1: Snowball Stemmer Wrapper
-
Snowball Stemmer Wrapper
• A stemming algorithm removes suffixes from words, so that related forms reduce to a common stem.
  – CONNECT
    • CONNECTED
    • CONNECTION
    • CONNECTING
    • CONNECTIONS
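The suffix-removal idea can be sketched in a few lines of Python. This is a toy illustration only: the suffix list and the `toy_stem` function are assumptions made for this example, and the real Snowball stemmer applies a much richer set of rules.

```python
# Toy suffix stripper for illustration only; this is NOT the Snowball algorithm.
# SUFFIXES is a hypothetical list, ordered so that longer suffixes are tried first.
SUFFIXES = ("IONS", "ION", "ING", "ED", "S")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    upper = word.upper()
    for suffix in SUFFIXES:
        if upper.endswith(suffix) and len(upper) - len(suffix) >= 3:
            return upper[: -len(suffix)]
    return upper


if __name__ == "__main__":
    for w in ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]:
        print(w, "->", toy_stem(w))
```

All five words on the slide reduce to CONNECT with this sketch, which is the behaviour a stemmer is expected to show for this word family.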
-
Snowball Stemmer Wrapper
1. Open the USB stick folder
2. Go to the “NIF_tutorial_hands_on_jars” folder
3. Open the “instructions.txt” file in a text editor
4. Open a terminal
5. Go to the “jar” folder
-
Snowball Stemmer Wrapper
• Copy the second command from instructions.txt:
java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.'
  – -f defines the format
  – -i defines the input
• Paste it into the terminal
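The single quotes around the sentence matter: they make the shell pass the whole sentence to -i as one argument. A minimal Python sketch of that splitting, using the standard `shlex` module to mimic POSIX shell rules (the `split_command` helper is a name invented for this example, and the sketch does not run the jar):

```python
import shlex

# The command from the slide, with plain single quotes around the input text.
COMMAND = "java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.'"


def split_command(command: str) -> list:
    """Split a command line the way a POSIX shell would, without running it."""
    return shlex.split(command)


if __name__ == "__main__":
    for position, argument in enumerate(split_command(COMMAND)):
        print(position, repr(argument))
```

If the quotes were curly (as often happens when copying from slides), the shell would not treat them as quoting characters, so typing the command by hand or copying it from instructions.txt is safer.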
-
Snowball Stemmer Wrapper
-
Snowball Stemmer Wrapper
NIF Standard Annotations
NIF Offset
-
Snowball Stemmer Wrapper
NIF Standard Annotations
Snowball Stem
NIF Offset
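The "NIF Offset" label refers to the character offsets that identify each annotated substring. A minimal Python sketch of how such offsets and offset-based URIs can be derived for the example sentence is given here; the base URI `http://example.org/doc`, the regular expression used for tokenization, and both function names are assumptions for illustration, not the wrapper's actual implementation. The `#char=begin,end` fragment follows the RFC 5147 style used by NIF 2.0.

```python
import re

BASE = "http://example.org/doc"  # hypothetical document URI, not produced by the wrapper


def token_offsets(text: str) -> list:
    """Return (begin, end, anchor) for each word or punctuation mark.

    Offsets are zero-based and end-exclusive, so text[begin:end] == anchor.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def offset_uri(begin: int, end: int, base: str = BASE) -> str:
    """Build an offset-based URI with an RFC 5147 style fragment."""
    return f"{base}#char={begin},{end}"


if __name__ == "__main__":
    sentence = "My favorite actress is Natalie Portman."
    for begin, end, anchor in token_offsets(sentence):
        print(offset_uri(begin, end), repr(anchor))
```

In the generated triples, the begin and end values correspond to the nif:beginIndex and nif:endIndex properties, and the covered substring to nif:anchorOf.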
-
OpenNLP Wrapper
• Go back to the terminal and use the first command from instructions.txt:
java -jar opennlp.jar -f text -i 'My favorite actress is Natalie Portman.' -modelFolder ../model/
• The -modelFolder parameter sets the folder that contains the trained OpenNLP models for tokenization and POS tagging.
• You can add the parameter “--outfile myAnnotatedFile.ttl” to store the triples in a file.
-
Example 2: Query Brown Corpus
-
Querying with Twinkle
• Open the “/twinkle/example” folder
• Open the NIF_query_example file in a text editor and copy the query
• Open the “/twinkle” folder and run the command:
java -jar twinkle.jar
-
Querying Brown Corpus
-
Exercise 3: Querying your own NIF annotated string
-
Querying your own NIF annotated string
1. Annotate your string using one of the wrappers
2. Save your annotated sentence to a file (using “--outfile”)
3. Open Twinkle
4. Query your string using Twinkle
-
• Query your annotated string:
  – nif:Context
  – nif:Sentence
  – nif:anchorOf
  – nif:oliaCategory
  – nif:oliaLink
… or practice with the Brown Corpus!
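To make concrete what querying for one of these properties means, here is a minimal Python sketch of a triple pattern match, the basic operation behind a SPARQL query. The triples are hand-written for illustration and are not real wrapper output; the `http://example.org/doc` URIs and the `match` helper are assumptions, while the `nif:` namespace is the NIF core ontology namespace.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DOC = "http://example.org/doc"  # hypothetical document URI

# Hand-written (subject, predicate, object) triples; an assumption, not wrapper output.
TRIPLES = [
    (f"{DOC}#char=0,39", RDF_TYPE, NIF + "Context"),
    (f"{DOC}#char=23,30", RDF_TYPE, NIF + "Word"),
    (f"{DOC}#char=23,30", NIF + "anchorOf", "Natalie"),
    (f"{DOC}#char=31,38", RDF_TYPE, NIF + "Word"),
    (f"{DOC}#char=31,38", NIF + "anchorOf", "Portman"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # Roughly what a pattern such as "?s nif:anchorOf ?text" asks for.
    for subject, _, text in match(TRIPLES, p=NIF + "anchorOf"):
        print(subject, text)
```

In Twinkle the same question would be written as a SPARQL query over the file saved with “--outfile”; the sketch only shows the matching logic on a small in-memory list.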
-
Thank you!
http://site.nlp2rdf.org/
NLP2RDF Google+ Community