NIF 2.0 hands-on tutorial

10/20/14 Building the Multilingual Web of Data – ISWC tutorial

Integrating NLP with Linked Data and RDF: the NIF format (hands on)

Ciro Baron Neto, Ph.D. student at University of Leipzig

Overview

• GitHub NLP2RDF web page overview and NIF online demos (Dashboard, Combinator...)
• Examples
– Example 1: How to annotate a string using the Snowball Stemmer and OpenNLP
– Example 2: Querying generated NIF data and querying the Brown Corpus

NLP2RDF GitHub Website

• https://github.com/NLP2RDF/

• /home/ciro/websites/github/github.com/NLP2RDF/index.html

dashboard.nlp2rdf.aksw.org

nlp2rdf.aksw.org

Example 1: Snowball Stemmer Wrapper

Snowball Stemmer Wrapper

• A stemming algorithm removes suffixes from words, so that related word forms map to a common stem.
– CONNECT
• CONNECTED
• CONNECTION
• CONNECTING
• CONNECTIONS
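The idea of suffix removal can be sketched in a few lines of Python. This is only a toy illustration and not the Snowball algorithm itself: the suffix list and the minimum stem length are assumptions chosen to cover the CONNECT example, and the function name `toy_stem` is hypothetical.

```python
# Toy suffix stripper: NOT the Snowball algorithm, only an illustration
# of "removing suffixes from words".
# Assumed suffix list, ordered longest first so "ions" wins over "s".
SUFFIXES = ("ions", "ion", "ing", "ed", "s")


def toy_stem(word: str, min_stem: int = 3) -> str:
    """Lowercase the word and strip the first matching suffix.

    The suffix is only removed if at least `min_stem` characters remain,
    so short words such as "is" are left alone.
    """
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= min_stem:
            return lower[: -len(suffix)]
    return lower


words = ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]
stems = {word: toy_stem(word) for word in words}
for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

All five forms end up with the same stem here. A real stemmer applies many more rules and conditions, which is why the tutorial uses the Snowball wrapper instead.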

Snowball Stemmer Wrapper

1. Open the USB stick folder
2. Go to the “NIF_tutorial_hands_on_jars” folder
3. Open the “instructions.txt” file in a text editor
4. Open a terminal
5. Go to the “jar” folder

Snowball Stemmer Wrapper

• Copy the second command from instructions.txt:

java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.'

– -f defines the format
– -i defines the input

• Paste it into the terminal

Snowball Stemmer Wrapper

[Figure callouts: NIF Standard Annotations · NIF Offset]

Snowball Stemmer Wrapper

[Figure callouts: NIF Standard Annotations · Snowball Stem · NIF Offset]
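The NIF offset can be illustrated with a small Python sketch. NIF 2.0 identifies a substring by a URI fragment of the form #char=begin,end, with zero-based offsets into the context string. The base URI http://example.org/doc, the regex tokenizer, and the function names are assumptions for illustration only; the wrappers use their own tokenization and produce RDF, not Python dictionaries.

```python
import re


def offset_uri(base: str, begin: int, end: int) -> str:
    """Build an offset-based URI: base + '#char=begin,end'."""
    return f"{base}#char={begin},{end}"


def word_annotations(text: str, base: str):
    """Return the context URI and one record per word.

    A simple regex stands in for a real tokenizer (assumption).
    Offsets are zero-based; `end` points just past the last character.
    """
    context = offset_uri(base, 0, len(text))
    rows = []
    for match in re.finditer(r"\w+", text):
        rows.append({
            "uri": offset_uri(base, match.start(), match.end()),
            "nif:beginIndex": match.start(),
            "nif:endIndex": match.end(),
            "nif:anchorOf": match.group(),
            "nif:referenceContext": context,
        })
    return context, rows


text = "My favorite actress is Natalie Portman."
# Hypothetical base URI, not taken from the tutorial.
context, rows = word_annotations(text, "http://example.org/doc")
print(context)
for row in rows:
    print(row["uri"], row["nif:anchorOf"])
```

Slicing the original string with the two indices gives back the annotated substring, which is what makes the offsets usable as stable identifiers.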

OpenNLP Wrapper

• Go back to the terminal and use the first command from instructions.txt:

java -jar opennlp.jar -f text -i 'My favorite actress is Natalie Portman.' -modelFolder ../model/

• The -modelFolder parameter sets the folder that contains the trained OpenNLP models for tokenization and POS tagging.
• You can add the parameter “--outfile myAnnotatedFile.ttl” to store the triples in a file.
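For repeated runs, the two wrapper commands can be assembled from a script. The following Python sketch only builds and prints the command lines; it uses the flags shown on the slides (-f, -i, -modelFolder, --outfile) and nothing else. The helper name `wrapper_command` is hypothetical, and the sketch assumes the jars are in the current folder.

```python
import shlex
from typing import List, Optional


def wrapper_command(jar: str, text: str, fmt: str = "text",
                    model_folder: Optional[str] = None,
                    outfile: Optional[str] = None) -> List[str]:
    """Assemble the argument list for one of the tutorial wrappers."""
    cmd = ["java", "-jar", jar, "-f", fmt, "-i", text]
    if model_folder is not None:
        # Only the OpenNLP wrapper is shown with this flag on the slides.
        cmd += ["-modelFolder", model_folder]
    if outfile is not None:
        cmd += ["--outfile", outfile]
    return cmd


sentence = "My favorite actress is Natalie Portman."
snowball = wrapper_command("snowball.jar", sentence)
opennlp = wrapper_command("opennlp.jar", sentence,
                          model_folder="../model/",
                          outfile="myAnnotatedFile.ttl")

# shlex.join quotes the sentence so the line can be pasted into a shell.
print(shlex.join(snowball))
print(shlex.join(opennlp))
```

To actually run a command, pass the list to subprocess.run(cmd, check=True) from the “jar” folder.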

Example 2: Query Brown Corpus

Querying with Twinkle

• Open the “/twinkle/example” folder
• Open the NIF_query_example file in a text editor and copy the query
• Open the “/twinkle” folder and run the command:

java -jar twinkle.jar

Querying Brown Corpus

Exercise 3: Querying your own NIF annotated string

Querying your own NIF annotated string

1. Annotate your string using one of the wrappers
2. Save your annotated sentence to a file (using “--outfile”)
3. Open Twinkle
4. Query your string using Twinkle

• Query your annotated string:
– nif:Context
– nif:Sentence
– nif:anchorOf
– nif:oliaCategory
– nif:oliaLink

… or practice with the Brown Corpus!
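As a starting point for a query over these properties, the Python sketch below builds a SPARQL query string that can be pasted into Twinkle. It is not the NIF_query_example file from the USB stick, whose content is not shown on the slides. The variable names, the OPTIONAL clause, and the LIMIT are assumptions, and the prefix IRI is the NIF 2.0 core namespace as far as known; check it against the prefix declared in your annotated file.

```python
# NIF 2.0 core namespace (verify against the @prefix line in your .ttl file).
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


def build_query(limit: int = 20) -> str:
    """Return a SPARQL query listing annotated strings and their OLiA category.

    nif:oliaCategory is wrapped in OPTIONAL because not every annotated
    string (for example the whole context) carries a category.
    """
    return f"""PREFIX nif: <{NIF}>
SELECT ?s ?anchor ?category
WHERE {{
  ?s nif:anchorOf ?anchor .
  OPTIONAL {{ ?s nif:oliaCategory ?category . }}
}}
LIMIT {limit}"""


query = build_query(10)
print(query)
```

Swapping nif:oliaCategory for nif:oliaLink, or adding a pattern such as "?s a nif:Sentence", gives variations for the other items in the list.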

Thank you!

http://site.nlp2rdf.org/
NLP2RDF Google+ Community