drupal and the semantic web - semtechbiz 2012

Post on 29-Aug-2014


Leveraging the Semantic Web with Drupal 7

Stéphane Corlosquet, Paolo Ciccarese

MIND Informatics

SemTechBiz San Francisco 2012

June 4th, 2012

About the speakers

● Stéphane Corlosquet

● 6 years with Drupal

● Drupal core maintainer (RDF)

● Drupal Security Team member

● Co-authored the Definitive Guide to Drupal 7

● Co-maintain RDF Extensions, SPARQL, schema.org

● Member of the RDFa WG

About the speakers

● Paolo Ciccarese, PhD

● Assistant in Neurology at Mass General Hospital

● Research faculty at Harvard Medical School

● Author of 30+ scientific publications

● Senior software and knowledge engineer

● Member of W3C HCLS Interest Group

● Co-chair of the W3C Open Annotation Community Group

Tutorial outline

● Introduction to Drupal

● What is it good for

● Installation / Hosted Drupal

● Semantic Web and Drupal

● Technology stack

● Use cases, hands on session

● Domeo & Drupal

Drupal

● Dries Buytaert - small news site in 2000

● Open Source - 2001

● Content Management System

● LAMP stack

● Non-developers can build sites and publish content

● Control panels instead of code

http://www.flickr.com/photos/funkyah/2400889778/

Drupal

● Open & modular architecture

● Extensible by modules

● Standards-based

● Low resource hosting

● Scalable

Building a Drupal site

http://www.flickr.com/photos/toomuchdew/3792159077/

Building a Drupal site

● Create the content types you need

Blog, article, wiki, forum, polls, image, video, podcast, e-commerce... (be creative)

http://www.flickr.com/photos/georgivar/4795856532/

Building a Drupal site

● Enable the features you want

Comments, tags, voting/rating, location, translations, revisions, search...

http://www.flickr.com/photos/skip/42288941/

Building a Drupal site

Set how your content is displayed

Building a Drupal site

Thousands of free contributed modules

● Google Analytics

● Wysiwyg

● Captcha

● Calendar

● XML sitemap

● Five stars

● Twitter

● ...

http://www.flickr.com/photos/kaptainkobold/1422600992/

The Drupal Community

http://www.flickr.com/photos/x-foto/4923221504/

The Drupal Community

http://webchick.net/node/80

“It’s really the Drupal community and not so much the software that makes the Drupal project what it is. So fostering the Drupal community is actually more important than just managing the code base.” - Dries Buytaert

Who uses Drupal?

http://buytaert.net/tag/drupal-sites

Try Drupal 7

● Download and Install Drupal 7

● Grab latest release http://drupal.org/project/drupal

● LAMP stack:

– Mac OS: http://www.mamp.info/

– Acquia Stack http://acquia.com/downloads

● Drupal Gardens: free Drupal 7 site http://www.drupalgardens.com/

Rich Snippets

Google

Yahoo!

Bing

Why Structured Data in HTML

● Help machines extract relevant data from HTML

● Can make use of this data in amazing ways (e.g. enhanced search results)

Structured Data in HTML

● Add or alter HTML attributes

● Syntaxes

– Microformats (@class, @rel)

– RDFa (@property, @about, @typeof, …)

– Microdata (@itemscope, @itemtype, @itemprop, …)

– RDFa 1.1 & RDFa Lite
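To make the difference between the syntaxes concrete, here is a minimal sketch that marks up the same schema.org Event once with RDFa Lite attributes and once with Microdata attributes. The function names and the sample values are assumptions made for illustration; this is not the markup that Drupal itself emits.

```python
from html import escape


def event_rdfa_lite(name: str, start: str) -> str:
    """Mark up an event with RDFa Lite attributes (vocab, typeof, property)."""
    return (
        '<div vocab="http://schema.org/" typeof="Event">'
        f'<span property="name">{escape(name)}</span> '
        f'<time property="startDate" datetime="{escape(start)}">{escape(start)}</time>'
        "</div>"
    )


def event_microdata(name: str, start: str) -> str:
    """Mark up the same event with Microdata attributes (itemscope, itemtype, itemprop)."""
    return (
        '<div itemscope itemtype="http://schema.org/Event">'
        f'<span itemprop="name">{escape(name)}</span> '
        f'<time itemprop="startDate" datetime="{escape(start)}">{escape(start)}</time>'
        "</div>"
    )


# Sample values chosen for illustration only.
print(event_rdfa_lite("SemTechBiz tutorial", "2012-06-04"))
print(event_microdata("SemTechBiz tutorial", "2012-06-04"))
```

Both outputs carry the same data; only the attribute names differ, which is why a mapping UI can target either syntax.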

Structured Data in HTML

● Evolution and cross-syntax influence

Schema.org

Schema.org

● Describe the type of your content (Person, Event, Recipe, Product, Book, Movie, etc.)

– 290 types and counting

● Each type has a set of properties

– Common properties: name, description, image, url

– Specific properties depending on the type (see type page on schema.org)

– 240 properties and counting

Credits: Dan Brickley - link.

Schema.org

Schema.org module for Drupal

● UI instead of code

● Map your content types and fields to the schema.org terms

http://drupal.org/project/schemaorg
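The idea of mapping content types and fields to schema.org terms can be sketched as a plain lookup table. The dictionary layout, the field names (field_date, field_venue) and the helper below are hypothetical; the module stores its mappings in Drupal and is configured through the UI, not through code like this.

```python
SCHEMA_ORG = "http://schema.org/"

# Hypothetical mapping for an "event" content type; field names are made up.
EVENT_MAPPING = {
    "type": "Event",
    "fields": {
        "title": "name",
        "body": "description",
        "field_date": "startDate",
        "field_venue": "location",
    },
}


def to_schema_org(node: dict, mapping: dict) -> dict:
    """Re-key a node's mapped fields with full schema.org property URLs."""
    item = {"@type": SCHEMA_ORG + mapping["type"]}
    for field, term in mapping["fields"].items():
        if field in node:  # unmapped or missing fields are simply skipped
            item[SCHEMA_ORG + term] = node[field]
    return item


# Sample node with made-up values.
node = {"title": "DrupalCon Munich", "field_date": "2012-08-20", "author": "someone"}
print(to_schema_org(node, EVENT_MAPPING))
```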

Example: Event

Rich Snippet testing tool

● http://www.google.com/webmasters/tools/richsnippets

Examples in the wild

● Events

– “force11 events”: http://goo.gl/VVhNM

– DrupalCon Munich: http://goo.gl/jgMvw

● Recipes

– “delicious lemon coconut squares”: http://goo.gl/ORdl1

– Apple pie with ingredients: http://goo.gl/wCO1w

Examples in the wild

● University of Waterloo

– School of Public Health and Health Systems launch: http://goo.gl/Df9hp

● Curling tournament calendar

– European Curling Championships 2012: http://goo.gl/YXgXl

– World Women’s Curling Championships 2013: http://goo.gl/BDNZW

Schema.org module

● http://drupal.org/project/schemaorg

– Download module (beta)

– Documentation on drupal.org

– Screencast + examples

Schema.org module

Play time!

http://www.google.com/webmasters/tools/richsnippets

Drupal 7 and RDF

History of RDF in Drupal

● rdf.php (2000, Dries)

● FOAF, vCard (2004, walkah)

● Relationship (2005, dman)

● Semantic Search (2006, hendler)

● RDF (2007, Arto)

● OpenCalais (2008, febbraro)

● RDF CCK (2008, scor)

Drupal 7 and RDF

● Drupal 7 core is RDFa enabled

● RDFa output by default on blogs, forums, comments, etc. using FOAF, SIOC, DC, SKOS

http://en.wikipedia.org/wiki/File:Oriente_Station_Lisboa_roof.jpg

Architecture

● User driven data model

● Content type => RDF class

● Field => RDF property

● Node => RDF resource
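The three mappings above can be sketched as a small function that turns a node into N-Triples lines. The node layout, the /node/NID URL pattern used for the subject, and the choice of sioc:Item and dc:title are assumptions for illustration; the literal escaping via json.dumps only approximates the N-Triples rules.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIOC = "http://rdfs.org/sioc/ns#"
DC = "http://purl.org/dc/terms/"


def node_to_ntriples(base_url: str, node: dict, rdf_class: str, field_map: dict) -> list:
    """Node => resource, content type => class, field => property."""
    subject = f"<{base_url}/node/{node['nid']}>"
    lines = [f"{subject} <{RDF_TYPE}> <{rdf_class}> ."]
    for field, prop in field_map.items():
        if field in node:
            # json.dumps gives a quoted, escaped string (approximate N-Triples literal).
            literal = json.dumps(str(node[field]))
            lines.append(f"{subject} <{prop}> {literal} .")
    return lines


# Made-up node used only to show the output shape.
for line in node_to_ntriples(
    "http://example.org", {"nid": 1, "title": "Hello"}, SIOC + "Item", {"title": DC + "title"}
):
    print(line)
```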

Content types and Fields

Content types and Fields

Node

Drupal 7 and RDF

Drupal 7 and RDF

● Contributed module for more features

● RDF Extensions

– Serialization formats: RDF/XML, Turtle, N-Triples

● SPARQL

– Expose Drupal RDF data in a SPARQL Endpoint

● SPARQL Views

– Display remote RDF data in Drupal using SPARQL

● JSON-LD

– Expose Drupal RDF data as JSON-LD (CORS-enabled)

● Features and packaging

– Build distributions / deployment workflow

SPARQL Endpoint

http://drupal.org/project/sparql

● Indexing

SPARQL Endpoint

● Public endpoint available at /sparql

● http://prefix.cc/sioc,rnews.sparql
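A client could address the endpoint at /sparql by passing the query text in a "query" URL parameter, as the SPARQL protocol defines for GET requests. The sketch below only builds that URL and makes no network call; the site URL and the function name are placeholders, not part of the SPARQL module.

```python
from urllib.parse import urlencode


def sparql_get_url(site_url: str, query: str) -> str:
    """Build a SPARQL-protocol GET URL for a site's /sparql endpoint."""
    return site_url.rstrip("/") + "/sparql?" + urlencode({"query": query})


# Placeholder site URL; any SPARQL query text can be passed in.
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
print(sparql_get_url("http://example.org/", QUERY))
```

A client would typically also send an Accept header such as application/sparql-results+json to choose the result format.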

JSON-LD in Drupal

● Client side as well as server side friendly

● Browser Scripting:

– Native javascript format

– RDFa API in the DOM

● Data can be fetched from anywhere:

– Cross-Origin Resource Sharing (CORS) enabled

● Client can mash data

● http://drupal.org/project/jsonld
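As a rough sketch of why JSON-LD is easy to consume, the snippet below parses a small document with the standard json module and resolves its short keys to full property URLs through @context. The sample document is made up, and the helper handles only the simplest case (terms mapped directly to URL strings); a real JSON-LD processor does considerably more.

```python
import json

# Made-up document in the general shape of JSON-LD; not actual module output.
SAMPLE = """{
  "@context": {"title": "http://purl.org/dc/terms/title"},
  "@id": "http://example.org/node/1",
  "title": "Hello"
}"""


def expand_keys(doc: dict) -> dict:
    """Replace short keys with the URLs given in @context (simple terms only)."""
    context = doc.get("@context", {})
    return {context.get(key, key): value for key, value in doc.items() if key != "@context"}


print(expand_keys(json.loads(SAMPLE)))
```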

JSON-LD plug

RDFa 1.1

● RDFa Lite

● RDFa 1.1 Full

● http://rdfa.info/play/

Demos

rNews / SPARQL

PREFIX dc: <http://purl.org/dc/terms/>

PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>

SELECT * WHERE { ?s a rnews:Article; dc:title ?title.}
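If the endpoint returns the standard SPARQL JSON results format, the ?s and ?title bindings of the query above can be read as follows. The sample response is invented to show the shape of the format and is not output from the demo site.

```python
# Invented response in the W3C SPARQL JSON results layout.
SAMPLE_RESULTS = {
    "head": {"vars": ["s", "title"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/node/1"},
                "title": {"type": "literal", "value": "Example article"},
            }
        ]
    },
}


def titles_by_subject(results: dict) -> dict:
    """Map each article URI (?s) to its title (?title)."""
    return {
        row["s"]["value"]: row["title"]["value"]
        for row in results["results"]["bindings"]
    }


print(titles_by_subject(SAMPLE_RESULTS))
```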

Demos

● Occupy Directory

– http://directory.occupy.net/occupations

– JSON-LD: http://directory.occupy.net/node/19652.jsonld

● Federated General Assembly

– Drupal distribution for occupy movement

– http://wiki.occupy.net/wiki/Federated_General_Assembly

DOMEO: a web-based tool for semantic annotation of online documents

As (biomedical) scientists…

• We deal with an increasing amount of digital resources (documents, images, videos, datasets, databases… )

• We commonly use annotation but…

–are we really efficient?

–can we leverage machine computation?

–can we share it easily with our colleagues?

–can we capitalize on the work of colleagues?

–can we integrate it with other resources?

Annotation Framework (Components)

• Annotation Ontology (AO): OWL vocabulary for representing and sharing annotation of digital resources and their fragments

– Website http://purl.org/ao/home

– Paper http://www.jbiomedsem.com/content/2/S2/S4

• DOMEO client: web application for producing and sharing manual, semi-automatic and automatic annotation

– Website http://annotationframework.org

– Paper http://www.jbiomedsem.com/content/3/S1/S1

Annotation of digital resources

http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850

antibodyregistry.org

Visually and effectively annotate - better semantically annotate - any digital resource and resource fragment, while performing our regular browsing/reading activities

Leverage text mining and community curation

Run text mining and entities recognition algorithms on scientific documents and persist the results in a standard format

Benefit from crowdsourcing by supporting curation of manual and automatic annotation

… and more

• Efficiently search and reuse the annotation

– Semantic inference

• Subscribe to feeds related to topics of interest

–Proteins, Cells, Authors, Papers…

• Retrieve additional content (mashups)

–Entrez gene, UniProt, …

Semantic tagging through ontologies

Semantic Tag

http://purl.obolibrary.org/obo/PR_000004168

Label ‘amyloid beta A4 protein’

Exact synonyms ‘APP’, ‘amyloidogenic glycoprotein’, …

Related synonyms ‘A4’, ‘ABPP’,

Is a http://purl.obolibrary.org/obo/PR_000000001

Label ‘protein’

Definition ‘An amino acid chain that… ’

Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO

APPs for the Semantic Resources Project, May 2010

Zooming in

APPs for the Semantic Resources Project, May 2010

Annotation Ontology (AO)

OWL vocabulary for representing and sharing annotation of digital resources and their fragments

Not only for biomedicine!

–Website http://purl.org/ao/home

–Paper http://www.jbiomedsem.com/content/2/S2/S4

A simplified view of AO

AO allows annotating:

Resources: Documents (HTML, PDF, Word, Excel), Images, Databases, Web Services... (and their fragments)

Specifying (or not) an Annotation Type: through one of the already available types (errata, highlight, qualifiers...) or the ones the users will define.

With (or without) a Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies…

Tracing Provenance: who created what, when, with which software, with what expectations…
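The simplified view of AO can be pictured as a small record with a required resource and provenance, plus an optional fragment, annotation type and topic. The class and field names below are hypothetical and are not the AO vocabulary terms; the creator, date and document URL are made-up sample values, while the topic URI is the Protein Ontology term for amyloid beta A4 protein shown earlier in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provenance:
    created_by: str                  # who created the annotation
    created_on: str                  # when (ISO date string)
    software: Optional[str] = None   # with which software


@dataclass
class Annotation:
    resource: str                          # annotated document, image, database...
    provenance: Provenance                 # tracing information
    fragment: Optional[str] = None         # part of the resource, if any
    annotation_type: Optional[str] = None  # e.g. errata, highlight, qualifier
    topic: Optional[str] = None            # free text, URI, ontology term...


# Made-up sample values, except for the Protein Ontology URI.
tag = Annotation(
    resource="http://example.org/paper.html",
    fragment="APP",
    annotation_type="qualifier",
    topic="http://purl.obolibrary.org/obo/PR_000004168",
    provenance=Provenance(created_by="curator", created_on="2012-06-04", software="Domeo"),
)
print(tag)
```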

Annotating a document

AlzSWAN: http://tinyurl.com/18r

Annotating a document fragment

Protein Ontology – PRO: http://purl.org/obo/owl/PRO

SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/

HyQue triples

Workflows

Experiments

Annotation Ontology Network

The Living Document Project

Biotea

Open Annotation Community Group

Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group

–Website http://www.w3.org/community/openannotation/

–Core Model http://www.openannotation.org/spec/core/

–Extensions http://www.openannotation.org/spec/extension/

DOMEO: Document Metadata Organizer

Semantic Tags or Qualifiers [1]

Semantic Tags or Qualifiers [2]

Semantic Tags or Qualifiers [3]

Domeo and the NCBO Annotator

http://www.bioontology.org/annotator-service

Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by the BioPortal

Running NCBO Annotator

Additional text mining services will be listed here

NCBO Annotator Results in Domeo

List of recognized entities

Results Curation

Customizable

Cumulative Results Curation

One item only

All instances with the same text match

All instances independently from the text match

Serialization in AO/RDF (Share)

UIMA, Clerezza and AO

AO RDF

http://www.slideshare.net/paolociccarese/domeo-and-text-mining

AO RDF

Applications

Publishing

Text Mining Results

Curated Text Mining Results

Evaluating Performance

Comparing Algorithms

Learning…

Combining Disparate Sources of Data

http://annotationframework.org/

Demos

● Domeo + Drupal

– Data mash up from independent, but related sources

Thanks!

● Stéphane Corlosquet: scorlosquet@gmail.com

– @scorlosquet

– http://openspring.net/

● Paolo Ciccarese: paolo.ciccarese@gmail.com
