Hacking Linked Data A hands-on-exploration of an often nebulous concept for librarians! Reinhard Engels, ABCD Library, October 2013


Post on 26-Dec-2015


Hacking Linked Data

A hands-on-exploration of an often nebulous concept

for librarians!

Reinhard Engels, ABCD Library, October 2013


Disclaimers

1. I am not an expert!
2. Apparently it takes more than a week to become one
3. Your brain may hurt


Goals

1. Convey some actual knowledge about LD
2. Let you pass a polygraph
3. Reassure that it’s OK to be confused
4. Lower the bar for asking “stupid questions”


Agenda for next ~80 minutes

1. Quick review of what Linked Data (LD) is
2. Look at some real LD (DBpedia, NY Times)
3. Make some simple LD (RDF “N-Triples”)
4. Query remote LD source (SPARQL on DBpedia)
5. How to embed LD in HTML (RDFa et al.)
6. Ponder things that are kinda sorta like LD
7. Recover!


Linked Data: What?

• “a set of best practices for publishing and connecting structured data on the web”

• Conceived by the guy who invented the WWW
• Web of Data
• Turns the web into a giant database
• With a single, consistent API
• Simple, elegant, familiar mechanism: URIs and “typed links”


Linked Data: Why?

• For users: enables meaningful queries instead of just text string searches; research applications, consumer applications

• For creators: efficiency of not having to redundantly create and maintain data.

• One API for all data: this is a thing of beauty in itself.


Linked Data: How? (Mug version)


Linked Data: How? (Principles)

1. Use URIs as names for things.
2. Use HTTP URIs, so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).

4. Include links to other URIs, so that they can discover more things.
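Principles 2 and 3 boil down to an ordinary HTTP lookup that asks for RDF instead of HTML. Here is a minimal sketch of that request in Python. The helper name `rdf_request` is made up for illustration, `text/turtle` is just one RDF media type a server might support, and the request is only built, never sent.

```python
from urllib.request import Request

def rdf_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request that asks for an RDF representation."""
    # The Accept header tells the server which format we would like back.
    # Whether a given server honors it is an assumption, not a guarantee.
    return Request(uri, headers={"Accept": media_type})

req = rdf_request("http://dbpedia.org/resource/Cambridge,_Massachusetts")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the sketch runs offline.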


Linked Data: How? (RE’s formulation)

1. Describe things using RDF triples
2. Identify things using HTTP URIs
3. Those URIs should link to more LD (that other people have already created, whenever possible)


Linked Data: How? (RE’s even shorter reformulation)

1. Describe with RDF
2. Identify with HTTP URIs
3. Link to more LD


1. Describe with RDF

• RDF = Resource Description Framework
• F stands for framework, not file type!
• It’s a conceptual model
• “content agnostic” (can describe anything)
• Describe things using 3 terms (“RDF triples”)

1. Subject    2. Predicate     3. Object
Fred          Likes            Wilma
Fred          Date of Birth    October 2, 1973
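To make the "conceptual model" point concrete, the two triples above can be held in any data structure with three slots. This is a rough sketch using a plain Python named tuple; the `Triple` class and `describe` helper are invented for illustration and are not part of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str

# The two example triples from the slide, still using plain names (no URIs yet).
triples = [
    Triple("Fred", "Likes", "Wilma"),
    Triple("Fred", "Date of Birth", "October 2, 1973"),
]

def describe(subject, triples):
    """Collect everything said about one subject as a predicate -> object mapping."""
    return {t.predicate: t.obj for t in triples if t.subject == subject}

print(describe("Fred", triples))
```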


2. Identify with HTTP URIs

1. Subject and Predicate MUST be URIs
2. Object may be URI or raw value (number, text, date, etc.)

1. Subject           2. Predicate          3. Object
Fred                 Likes                 Wilma
http://s.org/fred    http://p.org/likes    “Wilma”
http://s.org/fred    http://p.org/likes    http://o.org/wilma
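The URI-versus-raw-value distinction shows up directly when the triples are written out as N-Triples: URIs go in angle brackets, raw values in double quotes. A minimal sketch follows, assuming anything that starts with `http://` or `https://` is meant as a URI (a simplification: a raw text value that happens to look like a URL would be misclassified). The function names are made up.

```python
def term(value):
    """Format one term: URIs get angle brackets, everything else becomes a quoted literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    # Escape backslashes and double quotes so the literal stays well formed.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriple(subject, predicate, obj):
    """Render one triple as a single N-Triples line, ending with ' .'"""
    return f"{term(subject)} {term(predicate)} {term(obj)} ."

print(to_ntriple("http://s.org/fred", "http://p.org/likes", "Wilma"))
print(to_ntriple("http://s.org/fred", "http://p.org/likes", "http://o.org/wilma"))
```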


3. Link to more LD

• What format should the referenced LD be in?
• If I go to http://o.org/wilma, what should I see there?
• Are predicates in RDF too? http://p.org/likes


It’s SO EASY

(Why are you even here?)


And it will save the world!

(Why aren’t you making Linked Data NOW?)


Montage of leading LD sites

“At first glance, the principles of Linked Data seem simple enough. However experienced Web developers, designers and architects who attempt to put these ideas into practice often find themselves having to digest and understand debates about Web architecture, the semantic web, artificial intelligence and the philosophical nature of identity.” – Ed Summers & Dorothea Salo


“LD makes my brain hurt”

• It’s OK!
• Though core concepts are very simple
• It quickly gets confusing – it’s not just you
• Accidental: partially overlapping concepts.
• Intrinsic: simple parts make complex whole
• Danger: Is it too simple? (ambiguous)

“Make things as simple as possible, but not simpler.” – Einstein (paraphrased)


“External” Overlapping Concepts

• There are a lot of things that are kinda sorta like LD!

• Semantic Web (1994)
• Web APIs (10,214 and counting)
• Facebook Open Graph?
• Schema.org and microdata? (Google, Yahoo, Microsoft)
• microformats


The Semantic Web

• Semantic Web: 1994
• “The vision of the Semantic Web is to extend principles of the Web from documents to data” – W3C

• “This simple idea [the Semantic Web]… remains largely unrealized.” – Tim Berners-Lee et al., 2006


LD and the Semantic Web

Is Linked Data (2006) a:
• Special case: narrowing and focusing?
• Redo: “The semantic web done right?”
• Addition: Semantic web + links?
• Rebranding of a troubled project?


“Internal” forms of confusion

• URI, URL, URN, IRI, CURIE
• RDF “Serializations”: RDF/XML, RDFa, N-Triples, Turtle, JSON-LD
• Ontologies vs. ontology languages vs. “schema languages” vs. plain old RDF: RDFS, OWL, FOAF
• SPARQL


Let’s make some LD!

• 5 star LD!
• That means we need to link to other LD
• So we need to identify some existing LD to link to…


DBpedia

• Linked Data version of Wikipedia
• Take any Wikipedia URL
• Replace “en.wikipedia.org/wiki”
• With “dbpedia.org/page”
• And you have the LD expression of that concept.


Example: our fair city

• http://en.wikipedia.org/wiki/Cambridge,_Massachusetts
• http://dbpedia.org/page/Cambridge,_Massachusetts
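The Wikipedia-to-DBpedia swap is a plain string replacement, so it is easy to script. A small sketch, with a made-up function name, that only handles English Wikipedia URLs as described on the slide:

```python
WIKIPEDIA_PART = "en.wikipedia.org/wiki/"
DBPEDIA_PART = "dbpedia.org/page/"

def wikipedia_to_dbpedia(url):
    """Turn an English Wikipedia article URL into the matching DBpedia page URL."""
    if WIKIPEDIA_PART not in url:
        raise ValueError("not an English Wikipedia article URL: " + url)
    # Replace only the first occurrence so the article title is left untouched.
    return url.replace(WIKIPEDIA_PART, DBPEDIA_PART, 1)

print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Cambridge,_Massachusetts"))
```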


An RDF triple (in HTML)


RDF Graphs

• A set of RDF triples is called a “graph”
• Graph in this sense is a math/comp sci data structure
• Not a visual plot
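To see why a pile of triples counts as a graph in the data-structure sense, treat each subject and object as a node and each predicate as a labeled edge. A rough sketch using the Fred and Wilma URIs from earlier; the `build_graph` helper is invented for illustration, and the second triple (Wilma liking Fred back) is a made-up addition to give the graph a second edge.

```python
from collections import defaultdict

def build_graph(triples):
    """Index triples as node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

triples = [
    ("http://s.org/fred", "http://p.org/likes", "http://o.org/wilma"),
    # Hypothetical extra statement, not from the slides.
    ("http://o.org/wilma", "http://p.org/likes", "http://s.org/fred"),
]

graph = build_graph(triples)
for predicate, target in graph["http://s.org/fred"]:
    print("fred --", predicate, "-->", target)
```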


“provide useful information using the standards…”


Cambridge in CSV (in Excel)

2409 RDF Triples about Cambridge


Let’s make an LD “comment” about Cambridge!

1. Open the “ntriples” dbpedia file and find the existing English language comment
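If eyeballing the file gets tedious, the same hunt can be scripted: N-Triples is one statement per line, so a few string operations are enough. A minimal sketch, assuming simple well-formed lines; this is not a full N-Triples parser. The sample lines are invented placeholders, not DBpedia's real comment text.

```python
RDFS_COMMENT = "<http://www.w3.org/2000/01/rdf-schema#comment>"

def english_comments(lines):
    """Return (subject, literal) pairs for rdfs:comment statements tagged @en."""
    found = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(" ", 2)  # subject, predicate, then the rest of the line
        if len(parts) != 3:
            continue
        subject, predicate, rest = parts
        if rest.endswith("."):
            rest = rest[:-1].rstrip()  # drop the terminating dot
        if predicate == RDFS_COMMENT and rest.endswith("@en"):
            found.append((subject, rest))
    return found

CAMBRIDGE = "<http://dbpedia.org/resource/Cambridge,_Massachusetts>"
# Placeholder data standing in for the downloaded file.
sample = [
    CAMBRIDGE + " " + RDFS_COMMENT + ' "Placeholder English comment"@en .',
    CAMBRIDGE + " " + RDFS_COMMENT + ' "Platzhalter"@de .',
    CAMBRIDGE + " <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/x> .",
]
print(english_comments(sample))
```

With a real download, passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.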


Using own id with sameAs

Triple 1
  Subject:   <http://mylinkeddata.org/resource/123>
  Predicate: <http://www.w3.org/2002/07/owl#sameAs>
  Object:    <http://dbpedia.org/resource/Cambridge,_Massachusetts>

Triple 2
  Subject:   <http://mylinkeddata.org/resource/123>
  Predicate: <http://www.w3.org/2000/01/rdf-schema#comment>
  Object:    "Cambridge is a pretty cool town"@en


5 Star LD!

Stick this data:

At this URL: http://mylinkeddata.org/resource/123
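What does the sameAs link buy you? A consumer that understands owl:sameAs can treat statements about your id as statements about the DBpedia resource. Here is a rough sketch of that idea with the two triples from the sameAs slide. It follows sameAs links one hop only, and the helper names are made up.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
MINE = "http://mylinkeddata.org/resource/123"
DBPEDIA = "http://dbpedia.org/resource/Cambridge,_Massachusetts"

triples = [
    (MINE, SAME_AS, DBPEDIA),
    (MINE, COMMENT, '"Cambridge is a pretty cool town"@en'),
]

def aliases(uri, triples):
    """All names for the same thing, following sameAs one hop in either direction."""
    names = {uri}
    for subject, predicate, obj in triples:
        if predicate == SAME_AS and uri in (subject, obj):
            names.update((subject, obj))
    return names

def statements_about(uri, triples):
    """Statements about the thing under any of its names, excluding the sameAs links."""
    names = aliases(uri, triples)
    return [(p, o) for s, p, o in triples if s in names and p != SAME_AS]

# Asking about the DBpedia URI also surfaces the comment attached to our own id.
print(statements_about(DBPEDIA, triples))
```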


Summary of LD creation

• Creating RDF triples is easy
• Figuring out the right HTTP URIs to use is hard
• Figuring out how to respond to any HTTP URI requests you receive is also harder than I would like


Querying LD with SPARQL

• SPARQL: Recursive acronym for SPARQL Protocol and RDF Query Language

• RQL is the part we’re interested in
• LD’s answer to SQL
• Instead of querying tables in a db
• You query a graph of RDF triples
• Using “triple patterns” (and some other stuff)
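A triple pattern is just a triple with some slots replaced by variables (names starting with "?"). The toy matcher below is a sketch of the idea in plain Python, not how a real SPARQL engine works; it reuses the Fred triples from earlier, and the function names are invented.

```python
def match(pattern, triple, bindings):
    """Try to fit one pattern to one triple; return extended bindings or None."""
    new = dict(bindings)
    for want, got in zip(pattern, triple):
        if want.startswith("?"):
            # A variable binds to whatever is there, but must stay consistent.
            if new.setdefault(want, got) != got:
                return None
        elif want != got:
            return None
    return new

def query(patterns, triples):
    """Find every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

triples = [
    ("Fred", "Likes", "Wilma"),
    ("Fred", "Date of Birth", "October 2, 1973"),
]
# "Who likes Wilma, and when were they born?"
patterns = [("?who", "Likes", "Wilma"), ("?who", "Date of Birth", "?when")]
print(query(patterns, triples))
```

Sharing `?who` across both patterns is what joins them, much like repeating `?person` in a SPARQL WHERE clause.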


Let’s query the DBpedia SPARQL endpoint!

Note: You want to point your browser to “snorql” (not sparql!):

http://dbpedia.org/snorql


Not the most user friendly site…


A Query (in English)

• Show me name and dates of birth and death for people whose “main interests” are theology and nihilism


Same Query (in SPARQL)

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX : <http://dbpedia.org/resource/>

SELECT ?name ?birth ?death ?person WHERE {
  ?person dbo:mainInterest :Nihilism .
  ?person dbo:mainInterest :Theology .
  ?person dbo:birthDate ?birth .
  ?person foaf:name ?name .
  ?person dbo:deathDate ?death .
}
ORDER BY ?name
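The same query can be sent from a script rather than pasted into a web form. This sketch only builds the request URL and does not contact the server. It assumes the raw endpoint lives at http://dbpedia.org/sparql and accepts the query in a `query` URL parameter, as the SPARQL protocol describes; the function name is made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX : <http://dbpedia.org/resource/>

SELECT ?name ?birth ?death ?person WHERE {
  ?person dbo:mainInterest :Nihilism .
  ?person dbo:mainInterest :Theology .
  ?person dbo:birthDate ?birth .
  ?person foaf:name ?name .
  ?person dbo:deathDate ?death .
}
ORDER BY ?name
"""

def build_query_url(endpoint, query):
    """Percent-encode the query text and attach it as the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Fetching `url` with `urllib.request.urlopen` would run the query; what format comes back depends on the server and on the Accept header you send.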


http://bit.ly/1ip6leF


More sample queries

http://wiki.dbpedia.org/OnlineAccess#h28-5

Play around with them. Swap out some parameters. Stare at your favorite DBpedia records you found by modifying Wikipedia URLs to get ideas for other triple patterns.


If you want to run SPARQL against your own RDF data…

• Install Apache Jena (Java framework)
• Use the command line ARQ tool
• Warning: probably too geeky for most folks in this room.
• But if you’re serious about going deeper, probably unavoidable


SPARQL Summary

• SPARQL syntax harder than RDF
• But again, the hardest part seems to be figuring out what URIs to plug in
• Existing tools not very user friendly
• Promise of querying the entire Web of Data still a way off


RDFa

• Regular LD is sort of a parallel web of data
• RDFa and related technologies embed the web of data within the web of documents
• The “a” stands for “attributes”
• Metatags on steroids
• But good, W3C doctor approved steroids!
• Sounds like an afterthought, but probably far more widely used than any other form of LD.
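To get a feel for "attributes", here is a tiny HTML fragment with RDFa-style `vocab`, `typeof` and `property` attributes, plus a rough Python sketch that pulls out the property/value pairs. The fragment is invented for illustration, and the collector is a deliberately naive reader that ignores most of the real RDFa processing rules (nesting, `resource`, prefixes and so on).

```python
from html.parser import HTMLParser

HTML = """
<div vocab="http://schema.org/" typeof="Place">
  <span property="name">Cambridge</span>
  <meta property="description" content="Cambridge is a pretty cool town">
</div>
"""

class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from 'property' attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "property" in attrs:
            if "content" in attrs:
                # Value given directly in the 'content' attribute.
                self.pairs.append((attrs["property"], attrs["content"]))
            else:
                self._current = attrs["property"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))
            self._current = None

    def handle_endtag(self, tag):
        self._current = None

def extract_properties(html):
    collector = PropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs

print(extract_properties(HTML))
```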


$$$ Rich Snippets $$$


Facebook’s Open Graph API

• Graph? Sounds like LD!
• And indeed, uses RDFa
• But not “pure RDFa”
• And only for ingest
• http://graph.facebook.com/reinhard.engels
• http://graph.facebook.com/harvard
• http://graph.facebook.com/zuck


Summary of my LD experience

• Frustrated by ambiguities and the many competing ways of doing more or less the same thing

• Frustrated by disconnect between grand vision of one API for the Web of Data and the sorry little SPARQL queries I was able to run

• Not overjoyed that SEO spamming seems the one area in which LD is really succeeding


But…

• “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” – Amara’s Law