Time travelling through DBpedia Miel Vander Sande

Upload: miel-vander-sande

Post on 12-Apr-2017


TRANSCRIPT

Page 1: Time travelling through DBpedia

Time travelling through DBpedia
Miel Vander Sande

Page 2: Time travelling through DBpedia

There is a huge amount of interesting information in DBpedia’s history.

What could we learn if we could easily query it?

Page 3: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 4: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 5: Time travelling through DBpedia

Linked Data Fragments: hunting trade-offs between client & server.

[Figure: spectrum of interfaces offered by the server, from data dump to SPARQL endpoint, with DBpedia pages in between]

data dump: low server cost, high client cost, high availability, high bandwidth, out-of-date data
SPARQL endpoint: high server cost, low client cost, low availability, low bandwidth, live data

Page 6: Time travelling through DBpedia

A Triple Pattern Fragments interface is low-cost and enables clients to query.

[Figure: interfaces on the client/server trade-off axis: data dump, DBpedia pages, triple pattern fragments, SPARQL query results; labels: low server cost, high availability, live data]

Page 7: Time travelling through DBpedia

A Triple Pattern Fragments interface acts as a gateway to an RDF source.

Clients can only ask for ?s ?p ?o triple patterns.

Complex SPARQL queries are decomposed on the client side.

Low server cost and highly cacheable, but higher bandwidth and query time.
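The division of labour described on this slide can be illustrated with a minimal sketch. It is not the actual TPF server or client code: the in-memory `TRIPLES` list, the `ex:` identifiers, and the function names `fragment` and `solve` are all hypothetical. The "server" only answers single triple patterns, and the "client" evaluates a multi-pattern query by issuing one fragment request per intermediate binding.

```python
from typing import Dict, Iterator, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]

# Toy dataset standing in for the RDF source behind the interface (illustrative only).
TRIPLES: List[Triple] = [
    ("ex:Picasso", "ex:movement", "ex:Cubism"),
    ("ex:Braque", "ex:movement", "ex:Cubism"),
    ("ex:Guernica", "ex:creator", "ex:Picasso"),
    ("ex:Monet", "ex:movement", "ex:Impressionism"),
]


def fragment(s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> List[Triple]:
    """Server side: return every triple matching one pattern (None is a wildcard)."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def is_var(term: str) -> bool:
    return term.startswith("?")


def solve(patterns: Sequence[Triple],
          binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Client side: evaluate a list of triple patterns with nested fragment requests."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    # Substitute variables that earlier patterns have already bound.
    terms = [binding.get(t, t) if is_var(t) else t for t in first]
    # Variables that are still unbound become wildcards in the request.
    request = [None if is_var(t) else t for t in terms]
    for triple in fragment(*request):
        new_binding = dict(binding)
        consistent = True
        for term, value in zip(terms, triple):
            if is_var(term):
                # The same variable may occur twice in one pattern.
                if new_binding.get(term, value) != value:
                    consistent = False
                    break
                new_binding[term] = value
        if consistent:
            yield from solve(rest, new_binding)


query = [("?artist", "ex:movement", "ex:Cubism"),
         ("?work", "ex:creator", "?artist")]
for solution in solve(query):
    print(solution)
```

The sketch evaluates patterns in the order given. Each extra binding costs another request, which is where the higher bandwidth and query time mentioned above come from.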

Page 8: Time travelling through DBpedia
Page 9: Time travelling through DBpedia
Page 10: Time travelling through DBpedia

Usage has been steadily increasing since the release in October 2014.

[Chart: # Requests, February 2015 to September 2016; labelled values: 19.239.907 and 4.500.000]

Page 11: Time travelling through DBpedia

Still, the API has maintained 99.99% availability to date.

Page 12: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 13: Time travelling through DBpedia

The Memento Framework lets you negotiate Web resources over time.
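As a rough illustration of datetime negotiation, the sketch below builds (but does not send) a GET request carrying the Memento `Accept-Datetime` header. The helper name `memento_request` is hypothetical, and whether a given server honours the header is not something this sketch checks.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def memento_request(url: str, when: datetime) -> Request:
    """Build a GET request that asks for `url` as it was at `when`."""
    # Accept-Datetime takes an HTTP date, which is always expressed in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"Accept-Datetime": stamp})


req = memento_request(
    "http://dbpedia.org/page/Joachim_Lambek",
    datetime(2009, 9, 24, tzinfo=timezone.utc),
)
# urllib normalises header names to "Accept-datetime" internally.
print(req.get_header("Accept-datetime"))  # Thu, 24 Sep 2009 00:00:00 GMT
```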

Page 14: Time travelling through DBpedia

DBpedia pages have been available through Memento since 2010 (v1.0).

Page 15: Time travelling through DBpedia

Any client can transparently navigate to a prior version.

http://dbpedia.org/page/Joachim_Lambek

Page 16: Time travelling through DBpedia

Any client can transparently navigate to a prior version.

http://dbpedia.mementodepot.org/memento/20090924000000/http://dbpedia.org/page/Joachim_Lambek
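The archived URL above appears to embed a 14-digit timestamp followed by the original URL. A small sketch can split such a URL into its parts. It assumes the `/memento/<YYYYMMDDhhmmss>/<original URL>` layout seen in this one example and assumes the timestamp is in GMT. The function name `split_memento_url` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Tuple


def split_memento_url(url: str, marker: str = "/memento/") -> Tuple[datetime, str]:
    """Split an archived URL into (version datetime, original URL)."""
    _, found, tail = url.partition(marker)
    if not found:
        raise ValueError(f"no {marker!r} segment in {url!r}")
    stamp, _, original = tail.partition("/")
    # Assumption: the timestamp is expressed in GMT.
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return when, original


when, original = split_memento_url(
    "http://dbpedia.mementodepot.org/memento/20090924000000/"
    "http://dbpedia.org/page/Joachim_Lambek"
)
print(when.isoformat(), original)
```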

Page 17: Time travelling through DBpedia

No updates since version 3.9 (2013) because of scalability problems.

                1.0
Indexing        Custom
Indexing time   ~ 24 hours per version
Storage         MongoDB
Space           383 GB
# Versions      10 versions: 2.0 through 3.9
# Triples       ~ 3 billion

Page 18: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 19: Time travelling through DBpedia

The Triple Pattern Fragments trade-off also pays off for archives.

[Figure: data dump, DBpedia pages, triple pattern fragments, SPARQL query results; labels: directly compatible with Memento, queryable for the consumer, sustainable for the publisher]

Page 20: Time travelling through DBpedia

Different HDT snapshots are exposed through an LDF server with Memento (v2.0).

http://fragments.dbpedia.org
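One plausible way for a server to map a requested datetime onto a snapshot is to pick the most recent snapshot that is not newer than the request. The sketch below shows that policy only. The snapshot dates are placeholders, not the archive's real release dates, and the deployed server may choose differently.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from typing import List, Tuple

# Placeholder dates (hypothetical); only the version names come from the slides.
SNAPSHOTS: List[Tuple[datetime, str]] = [
    (datetime(2007, 1, 1, tzinfo=timezone.utc), "DBpedia 2.0"),
    (datetime(2013, 1, 1, tzinfo=timezone.utc), "DBpedia 3.9"),
    (datetime(2015, 1, 1, tzinfo=timezone.utc), "DBpedia 2015"),
]


def select_snapshot(when: datetime) -> str:
    """Return the latest snapshot dated at or before `when`."""
    dates = [date for date, _ in SNAPSHOTS]
    index = bisect_right(dates, when)
    # Requests older than the first snapshot fall back to the earliest one.
    return SNAPSHOTS[max(index - 1, 0)][1]


print(select_snapshot(datetime(2014, 6, 1, tzinfo=timezone.utc)))
```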

Page 21: Time travelling through DBpedia

DBpedia pages are now available through a proxy.

http://dbpedia.org/resource/…

Page 22: Time travelling through DBpedia

Space and time-to-publish significantly decreased.

                1.0                            2.0
Indexing        Custom                         HDT-CPP
Indexing time   ~ 24 hours per version         ~ 4 hours per version
Storage         MongoDB                        HDT binary files
Space           383 GB                         70 GB
# Versions      10 versions: 2.0 through 3.9   12 versions: 2.0 through 2015
# Triples       ~ 3 billion                    ~ 5 billion
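A quick back-of-the-envelope calculation puts the comparison in relative terms. It uses only the approximate figures from the table and assumes that "Gb" means gigabytes of 10^9 bytes, so the results are rough estimates.

```python
# Approximate figures from the comparison table.
V1 = {"hours_per_version": 24, "space_gb": 383, "triples": 3e9}
V2 = {"hours_per_version": 4, "space_gb": 70, "triples": 5e9}


def bytes_per_triple(version: dict) -> float:
    """Average storage per triple, assuming 1 GB = 10**9 bytes."""
    return version["space_gb"] * 1e9 / version["triples"]


indexing_speedup = V1["hours_per_version"] / V2["hours_per_version"]
space_saving = 1 - V2["space_gb"] / V1["space_gb"]

print(f"indexing speed-up: {indexing_speedup:.0f}x")
print(f"space saving: {space_saving:.0%}")
print(f"bytes per triple: {bytes_per_triple(V1):.0f} -> {bytes_per_triple(V2):.0f}")
```

Under these assumptions, indexing is about six times faster and storage drops by roughly 82%, from about 128 to about 14 bytes per triple, even though the newer setup holds more versions and more triples.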

Page 23: Time travelling through DBpedia

Preparing the TPF client was simply adding an HTTP header.

[Figure: client and server. Client layers: Query Engine (SPARQL processing), Hypermedia Layer (fragments interaction), HTTP Layer (resource access). Request: GET with Accept-Datetime. Responses: 303 with Location; 200 with Content-Location (CORS). Server versions shown: DBpedia 3.9, DBpedia 2015.]
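The two response styles named on this slide can be handled by one small function. This is only a sketch of how a client's HTTP layer might find the selected version. The function name `memento_location` and the example.org URLs are hypothetical, and no request is actually sent.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def memento_location(request_url: str, status: int,
                     headers: Mapping[str, str]) -> Optional[str]:
    """Return the URL of the selected version, or None if the response names none."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 303:
        # Redirect style: the version is named in Location.
        target = lowered.get("location")
    elif status == 200:
        # Direct style: the body is the version, named in Content-Location.
        target = lowered.get("content-location")
    else:
        target = None
    # Header values may be relative, so resolve them against the request URL.
    return urljoin(request_url, target) if target else None


print(memento_location("http://example.org/dbpedia", 303,
                       {"Location": "http://example.org/dbpedia-3.9"}))
print(memento_location("http://example.org/dbpedia", 200,
                       {"Content-Location": "/dbpedia-2015"}))
```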

Page 24: Time travelling through DBpedia

A self-descriptive interface results in a single datetime negotiation.

[Figure: client and server. Client layers: Query Engine (SPARQL processing), Hypermedia Layer (fragments interaction), HTTP Layer (resource access). Request: GET. Response: 200. Server versions shown: DBpedia 3.9, DBpedia 2015.]
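After the first negotiation has landed on a versioned fragment, follow-up requests can be built from that fragment's own URL, so no further `Accept-Datetime` round trips are needed. The sketch below assumes the query parameters are called `subject`, `predicate` and `object`. A real client would read the URL template from the hypermedia controls in each response rather than hard-coding it. The base URL is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def fragment_url(base: str, subject: Optional[str] = None,
                 predicate: Optional[str] = None,
                 obj: Optional[str] = None) -> str:
    """Build the URL of one triple pattern fragment within a fixed version."""
    pairs = (("subject", subject), ("predicate", predicate), ("object", obj))
    params = {name: value for name, value in pairs if value is not None}
    return f"{base}?{urlencode(params)}" if params else base


# Hypothetical versioned base URL, as obtained from the first negotiation.
versioned_base = "http://example.org/dbpedia-3.9"
print(fragment_url(versioned_base, predicate="ex:p", obj="ex:o"))
```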

Page 25: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 26: Time travelling through DBpedia

Querying history and the evolution of facts.

When did a researcher named Hans Fichtner, born in Leipzig, die?

Try it yourself: bit.ly/hansfichtner
bit.ly/hansfichtner-2012

Page 27: Time travelling through DBpedia

Analyze and profile changes in DBpedia.

What predicates were added between 2009 and 2014 to describe a person?

Try it yourself: bit.ly/personpredicates-2009
bit.ly/personpredicates-2014
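Once the same query has been run against two versions, profiling the change comes down to a set difference. The sketch below uses made-up triples with `ex:` identifiers rather than real DBpedia data, and the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def predicates(triples: Iterable[Triple]) -> Set[str]:
    """Collect the distinct predicates used in a set of triples."""
    return {predicate for _, predicate, _ in triples}


def added_predicates(old: Iterable[Triple], new: Iterable[Triple]) -> Set[str]:
    """Predicates present in the newer version but absent from the older one."""
    return predicates(new) - predicates(old)


# Illustrative results of the same person query against two versions.
persons_2009 = [("ex:Ada", "ex:name", "Ada"), ("ex:Ada", "ex:birthPlace", "ex:London")]
persons_2014 = persons_2009 + [("ex:Ada", "ex:deathDate", "1852-11-27"),
                               ("ex:Ada", "ex:almaMater", "ex:None")]

print(sorted(added_predicates(persons_2009, persons_2014)))
```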

Page 28: Time travelling through DBpedia

Resolve out-of-sync issues between federated sources.

What works by cubists were known to DBpedia and VIAF in 2009?

Try it yourself: bit.ly/workscubists-2009
bit.ly/workscubists

Page 29: Time travelling through DBpedia

Sustainable querying on fragments.dbpedia.org

Uniform access to DBpedia versions

Rewriting history: applying Memento to Triple Pattern Fragments

Time travelling through DBpedia

Use cases and opportunities

Page 30: Time travelling through DBpedia

Start digging into DBpedia’s history or host your own Linked Data archive!

Software: github.com/LinkedDataFragments, bit.ly/configuring-memento

Documentation and specification: linkeddatafragments.org, mementoweb.org

Use the archive on: fragments.mementodepot.org, client.linkeddatafragments.org

Page 31: Time travelling through DBpedia

Time travelling through DBpedia
@Miel_vds
Herbert Van de Sompel, Harihar Shankar, Lyudmila Balakireva, Ruben Verborgh