Time travelling through DBpedia
Miel Vander Sande
There is a huge amount of interesting information in DBpedia’s history.
What could we learn if we could easily query it?
Sustainable querying on fragments.dbpedia.org
Uniform access to DBpedia versions
Rewriting history: applying Memento to Triple Pattern Fragments
Time travelling through DBpedia
Use cases and opportunities
Linked Data Fragments: hunting trade-offs between client & server.
[Figure: the interfaces a server can offer, on an axis from data dump through DBpedia pages and triple pattern fragments to SPARQL endpoint (SPARQL query results). The two ends of the axis are labelled low vs. high server cost, high vs. low client cost, high vs. low availability, high vs. low bandwidth, and out-of-date vs. live data. In the second build, the labels low server cost, high availability and live data appear alongside triple pattern fragments.]
A Triple Pattern Fragments interface is low-cost and enables clients to query.
A Triple Pattern Fragments interface acts as a gateway to an RDF source.
Clients can only ask single triple patterns (?s ?p ?o).
Complex SPARQL queries are decomposed on the client side.
Low server cost and highly cacheable, but higher bandwidth and query time.
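The client-side decomposition can be sketched in Python as follows. This is only an illustration under stated assumptions: `ask_server` stands in for one HTTP request to a Triple Pattern Fragments interface and simply filters an in-memory list, and every `ex:` term is made-up data, not DBpedia content.

```python
def is_var(term):
    """Variables are written SPARQL-style, e.g. '?s'."""
    return term.startswith("?")

def ask_server(triples, pattern):
    """Stand-in for one TPF request: the server only matches a single triple pattern."""
    return [t for t in triples
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]

def evaluate_bgp(triples, patterns, binding=None):
    """Client-side join: solve the first pattern, substitute its bindings
    into the remaining patterns, and recurse."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    bound = tuple(binding.get(term, term) for term in first)
    results = []
    for triple in ask_server(triples, bound):
        new_binding = dict(binding)
        consistent = True
        for term, value in zip(bound, triple):
            if is_var(term):
                if new_binding.get(term, value) != value:
                    consistent = False
                    break
                new_binding[term] = value
        if consistent:
            results.extend(evaluate_bgp(triples, rest, new_binding))
    return results

# Hypothetical toy data (assumption, not taken from DBpedia).
DATA = [
    ("ex:person1", "ex:birthPlace", "ex:Leipzig"),
    ("ex:person1", "ex:deathYear", "1950"),
    ("ex:person2", "ex:birthPlace", "ex:Ghent"),
]
QUERY = [("?p", "ex:birthPlace", "ex:Leipzig"), ("?p", "ex:deathYear", "?y")]
solutions = evaluate_bgp(DATA, QUERY)
print(solutions)
```

Each call to `ask_server` corresponds to one cheap, cacheable request, which is where the higher bandwidth and query time on the client come from.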
Usage has been steadily increasing since the release in October 2014.
[Chart: # requests, February 2015 to September 2016; labelled values: 4.500.000 and 19.239.907]
And still, the API has had 99.99% availability to this day.
The Memento Framework lets you negotiate Web resources over time.
DBpedia pages have been available through Memento since 2010 (v1.0).
Any client can transparently navigate to a prior version.
http://dbpedia.org/page/Joachim_Lambek
http://dbpedia.mementodepot.org/memento/20090924000000/http://dbpedia.org/page/Joachim_Lambek
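The two URLs above suggest how a prior version is addressed. The sketch below rebuilds the second URL from the first; the URI layout (prefix, 14-digit timestamp, original URI) is inferred from that single example and is an assumption, and a Memento client would normally arrive at this URI through datetime negotiation rather than by building it by hand.

```python
from datetime import datetime

# Prefix copied from the example URL above; the layout is inferred from it (assumption).
MEMENTO_PREFIX = "http://dbpedia.mementodepot.org/memento/"

def memento_uri(original_uri, moment):
    """Build an archive URI of the form <prefix><YYYYMMDDhhmmss>/<original URI>."""
    return MEMENTO_PREFIX + moment.strftime("%Y%m%d%H%M%S") + "/" + original_uri

uri = memento_uri("http://dbpedia.org/page/Joachim_Lambek", datetime(2009, 9, 24))
print(uri)
```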
No updates since version 3.9 (2013) because of scalability problems.
                  1.0
Indexing          Custom
Indexing time     ~ 24 hours per version
Storage           MongoDB
Space             383 GB
# Versions        10 versions: 2.0 through 3.9
# Triples         ~ 3 billion
[Figure: the interface axis (data dump, DBpedia pages, triple pattern fragments, SPARQL query results), annotated: directly compatible with Memento; queryable for the consumer; sustainable for the publisher.]
The Triple Pattern Fragments trade-off also pays off for archives.
Different HDT snapshots are exposed through an LDF server with Memento (v2.0).
http://fragments.dbpedia.org
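One way to picture a server that exposes several snapshots is as a lookup from a requested datetime to a snapshot. The sketch below assumes the rule "latest snapshot not after the requested datetime, falling back to the oldest one"; that rule, the snapshot names and their dates are all hypothetical and are not taken from the actual LDF server.

```python
from bisect import bisect_right
from datetime import datetime

def select_snapshot(snapshots, requested):
    """snapshots: list of (datetime, name) pairs sorted by datetime.
    Returns the name of the latest snapshot not after `requested`,
    or the oldest snapshot if the request predates all of them."""
    dates = [date for date, _ in snapshots]
    index = bisect_right(dates, requested)
    return snapshots[max(index - 1, 0)][1]

# Hypothetical snapshot dates and names (assumption).
SNAPSHOTS = [
    (datetime(2010, 1, 1), "snapshot-a"),
    (datetime(2013, 1, 1), "snapshot-b"),
    (datetime(2015, 1, 1), "snapshot-c"),
]
chosen = select_snapshot(SNAPSHOTS, datetime(2012, 6, 1))
print(chosen)
```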
DBpedia pages are now available through a proxy.
http://dbpedia.org/resource/…
Space and time-to-publish significantly decreased.
                  1.0                            2.0
Indexing          Custom                         HDT-CPP
Indexing time     ~ 24 hours per version         ~ 4 hours per version
Storage           MongoDB                        HDT binary files
Space             383 GB                         70 GB
# Versions        10 versions: 2.0 through 3.9   12 versions: 2.0 through 2015
# Triples         ~ 3 billion                    ~ 5 billion
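The size of the improvement can be worked out from the figures above. The short calculation below treats the approximate values as exact and assumes the space figures are gigabytes of 10^9 bytes, so the results are rough indications only.

```python
# Figures copied from the comparison above; approximate values treated as exact (assumption).
v1 = {"hours": 24, "gigabytes": 383, "triples": 3e9}
v2 = {"hours": 4, "gigabytes": 70, "triples": 5e9}

speedup = v1["hours"] / v2["hours"]              # indexing time ratio per version
space_ratio = v1["gigabytes"] / v2["gigabytes"]  # total storage ratio

# Bytes per triple, assuming 1 GB = 10**9 bytes.
bytes_per_triple_v1 = v1["gigabytes"] * 1e9 / v1["triples"]
bytes_per_triple_v2 = v2["gigabytes"] * 1e9 / v2["triples"]

print(f"indexing speed-up: {speedup:.1f}x")
print(f"storage reduction: {space_ratio:.2f}x")
print(f"bytes per triple: {bytes_per_triple_v1:.1f} -> {bytes_per_triple_v2:.1f}")
```

Under these assumptions, indexing is about 6 times faster per version, and the archive needs roughly 5.5 times less space while holding more versions and more triples.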
Preparing the TPF client was simply adding an HTTP header.
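[Figure: the client stack (Query Engine: SPARQL processing; Hypermedia Layer: fragments interaction; HTTP Layer: resource access) next to a server holding DBpedia 3.9 and DBpedia 2015. The client sends GET with Accept-Datetime; the responses are labelled 303 Location and 200 Content-Location (CORS).]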
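The header in question can be sketched with the Python standard library. The example only builds the request and interprets a response; nothing is sent. Reading "303 Location" as a redirect to follow and "200 Content-Location" as the address of the version actually served is an interpretation of the slide, and the helper names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

def time_travel_request(url, moment):
    """Build (but do not send) a GET request carrying an Accept-Datetime header.
    `moment` must be a timezone-aware datetime."""
    header_value = format_datetime(moment.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"Accept-Datetime": header_value})

def next_url(status, headers, requested_url):
    """Hypothetical helper: decide which URL identifies the selected version."""
    if status == 303:
        return headers["Location"]  # redirect to the selected version
    if status == 200:
        # Version served directly; Content-Location names it if present.
        return headers.get("Content-Location", requested_url)
    raise ValueError(f"unexpected status {status}")

request = time_travel_request(
    "http://fragments.dbpedia.org",
    datetime(2009, 9, 24, tzinfo=timezone.utc),
)
# urllib normalises header names, hence the lower-case "d" in the lookup.
print(request.get_header("Accept-datetime"))
```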
A self-descriptive interface results in a single datetime negotiation.
[Figure: the same client stack and server (DBpedia 3.9, DBpedia 2015); the client sends a plain GET and receives 200.]
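The idea of negotiating once and then staying inside the selected version can be sketched as follows. This is a simplification: a real client follows the hypermedia controls found in each response, whereas here the versioned URL is cached and the `predicate` query parameter, the `negotiate` callable and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlencode

class TimeTravelSession:
    """Negotiates the datetime once, then reuses the versioned interface."""

    def __init__(self, entry_url, accept_datetime, negotiate):
        self.entry_url = entry_url
        self.accept_datetime = accept_datetime
        self._negotiate = negotiate  # callable performing the datetime negotiation
        self.versioned_url = None
        self.negotiations = 0

    def fragment_url(self, **pattern):
        if self.versioned_url is None:
            # Only the first request carries the datetime.
            self.versioned_url = self._negotiate(self.entry_url, self.accept_datetime)
            self.negotiations += 1
        # Later requests are plain GETs against the versioned interface.
        return self.versioned_url + "?" + urlencode(pattern)

def fake_negotiate(entry_url, accept_datetime):
    """Stand-in for the HTTP round trip; returns a hypothetical versioned URL."""
    return "http://example.org/archive/2009"

session = TimeTravelSession(
    "http://example.org/archive", "Thu, 24 Sep 2009 00:00:00 GMT", fake_negotiate
)
first = session.fragment_url(predicate="ex:name")
second = session.fragment_url(predicate="ex:birthPlace")
print(session.negotiations)
```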
Querying history and the evolution of facts.
When did a researcher named Hans Fichtner, born in Leipzig, die?
Try it yourself: bit.ly/hansfichtner
bit.ly/hansfichtner-2012
Analyze and profile changes in DBpedia.
What predicates were added between 2009 and 2014 to describe a person?
Try it yourself: bit.ly/personpredicates-2009
bit.ly/personpredicates-2014
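Once the same query has been run against two versions, the comparison itself is a set difference. The sketch below assumes the two result lists have already been retrieved; the predicate names are invented placeholders, not actual query results.

```python
def added_predicates(before, after):
    """Predicates present in the later version but not in the earlier one."""
    return sorted(set(after) - set(before))

# Hypothetical results of the same query against two versions (assumption).
predicates_2009 = ["ex:name", "ex:birthDate"]
predicates_2014 = ["ex:name", "ex:birthDate", "ex:externalId"]

added = added_predicates(predicates_2009, predicates_2014)
print(added)
```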
Resolve out-of-sync issues between federated sources.
What works by cubists were known to DBpedia and VIAF in 2009?
Try it yourself: bit.ly/workscubists-2009
bit.ly/workscubists
Start digging into DBpedia’s history or host your own Linked Data archive!
Software: github.com/LinkedDataFragments, bit.ly/configuring-memento
Documentation and specification: linkeddatafragments.org, mementoweb.org
Use the archive on: fragments.mementodepot.org, client.linkeddatafragments.org
Time travelling through DBpedia
@Miel_vds
Herbert Van de Sompel, Harihar Shankar, Lyudmila Balakireva, Ruben Verborgh