An RDF and XML Database
John Snelson, Lead Engineer
23rd October 2013
Slide 2 Copyright © 2013 MarkLogic® Corporation. All rights reserved.
MarkLogic
SEARCH
DATABASE
APPLICATION SERVICES
Slide 3
Data ≠ Information
Slide 4
Data + Context = Information
Slide 5
Dynamic Semantic Publishing: BBC Sports
The Challenge / Goals
Size and complexity:
• # of athletes
• # of teams
• # of assets (match reports, statistics, etc.)
• # of relations (facts)
Rich user experience:
• See information in context
• Personalize content
• Easy navigation
• Intelligently serve ads (outside of UK)
Manageable:
• Static pages? Too many, changing too fast
• Limited number of journalists
• Automate as much as possible
Slide 6
Dynamic Semantic Publishing: A Solution
XML Database:
• Store, manage documents: stories, blogs, feeds, profiles
• Store, manage values: statistics
• Full-text search
• Performance, scalability
• Robustness
Triple Store:
• Metadata about documents: tagged by journalists, added (semi-)automatically, inferred
• Facts reported by journalists
• Linked Open Data for real-world facts
Slide 7
Dynamic Semantic Publishing: Understanding Data
[Figure: graph of entities connected by the relations "played in", "plays in", and "plays for"]
Slide 8
Dynamic Semantic Publishing: Scaling Up
Slide 9
What is RDF?
[Figure: RDF graph linking :person4, :person5, :person20, and :place5 through :has-child, :has-parent, :spouse, and :birth-place edges, with :person4 :first-name "John"]
Slide 10
What is RDF?
• Schema-less
• Triple granularity
• Open world assumption
• Joins - the cost of granularity
Slide 11
What is Semantics?
Data stored in Triples
Expressed as Subject : Predicate : Object
Example:
"John Smith" : livesIn : "London"
"London" : isIn : "England"
Slide 12
What is Semantics?
Rules tell us something about the triples
Example:
If (A livesIn X) AND (X isIn Y) then (A livesIn Y)
Inference: "John Smith" : livesIn : "England"
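The livesIn / isIn rule can be applied mechanically. The following is a minimal Python sketch, assuming triples are plain tuples and the single rule is hard-coded. It illustrates forward chaining in general and is not a description of how MarkLogic evaluates rules; the function name `infer_lives_in` is made up for this example.

```python
def infer_lives_in(triples):
    """Apply 'if (A livesIn X) and (X isIn Y) then (A livesIn Y)' until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the two relations so the set can grow while we loop.
        lives = [(s, o) for (s, p, o) in facts if p == "livesIn"]
        is_in = [(s, o) for (s, p, o) in facts if p == "isIn"]
        for a, x in lives:
            for x2, y in is_in:
                if x == x2 and (a, "livesIn", y) not in facts:
                    facts.add((a, "livesIn", y))
                    changed = True
    return facts


triples = [
    ("John Smith", "livesIn", "London"),
    ("London", "isIn", "England"),
]
inferred = infer_lives_in(triples) - set(triples)
print(inferred)  # {('John Smith', 'livesIn', 'England')}
```

The loop repeats until a pass adds nothing, so chains such as London isIn England isIn UK are followed as well.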
Slide 13
What is Semantics?
[Figure: "John Smith" livesIn "London"; "London" isIn "England"; inferred edge "John Smith" livesIn "England"]
Slide 15
Semantics Architecture
[Figure: architecture diagram; labels: TRIPLE; XQY, XSLT, SQL, SPARQL; GRAPH, SPARQL]
Slide 16
Triple Index
• 3 triple orders
• Cached for performance
• Works seamlessly with other indexes
• Security
• 150 bytes per triple on disk
• Billions of triples per host
• Scaling out horizontally
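Why keep three triple orders? With several sort orders, any pattern with bound terms can be answered by a range scan over whichever order puts those terms first. The Python sketch below assumes the orders are SPO, POS, and OSP; the slide does not say which three are used, and the class `TripleIndex` is hypothetical, not MarkLogic's implementation.

```python
from bisect import bisect_left


class TripleIndex:
    """Toy index storing every triple in three sort orders (assumed: SPO, POS, OSP)."""

    ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}

    def __init__(self, triples):
        self.rows_by_order = {
            name: sorted(tuple(t[i] for i in perm) for t in triples)
            for name, perm in self.ORDERS.items()
        }

    def _prefix_scan(self, name, prefix):
        """Yield triples (as s, p, o) whose permuted key starts with prefix."""
        rows = self.rows_by_order[name]
        perm = self.ORDERS[name]
        start = bisect_left(rows, prefix)
        for row in rows[start:]:
            if row[:len(prefix)] != prefix:
                break
            triple = [None] * 3
            for pos, i in enumerate(perm):
                triple[i] = row[pos]
            yield tuple(triple)

    def match(self, s=None, p=None, o=None):
        """Pick the order whose leading columns are bound, then range-scan it."""
        if s is not None and o is None:
            prefix, name = ((s,) if p is None else (s, p)), "spo"
        elif p is not None and s is None:
            prefix, name = ((p,) if o is None else (p, o)), "pos"
        elif o is not None and p is None:
            prefix, name = ((o,) if s is None else (o, s)), "osp"
        elif s is not None:  # s, p and o are all bound
            prefix, name = (s, p, o), "spo"
        else:  # nothing bound: scan everything
            prefix, name = (), "spo"
        return list(self._prefix_scan(name, prefix))


# Sample data in the style of the deck's examples.
idx = TripleIndex([
    (":person4", ":first-name", "John"),
    (":person4", ":birth-place", ":place5"),
    (":person22", ":birth-place", ":place13"),
])
print(idx.match(p=":birth-place"))
```

On sizing: at the quoted 150 bytes per triple, one billion triples comes to roughly 150 GB on disk.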
Slide 17
RDF Loading
RDF
Slide 18
Triples Embedded in Documents
…
<sem:triple>
  <sem:subject>http://example.org/kennedy/person12</sem:subject>
  <sem:predicate>http://example.org/kennedy/last-name</sem:predicate>
  <sem:object datatype="http://www.w3.org/2001/XMLSchema#string">Lawford</sem:object>
</sem:triple>
…
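To see what "embedded" means in practice, here is a small Python sketch that pulls sem:triple elements out of a surrounding XML document. Two assumptions: the `sem` prefix is bound to `http://marklogic.com/semantics` (the slide elides the namespace declaration), and the `<article>` wrapper is a hypothetical stand-in for the elided document.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace bound to the sem prefix (not shown on the slide).
SEM_NS = "http://marklogic.com/semantics"

# Hypothetical wrapper document around the triple from the slide.
doc = """
<article xmlns:sem="http://marklogic.com/semantics">
  <title>Hypothetical wrapper document</title>
  <sem:triple>
    <sem:subject>http://example.org/kennedy/person12</sem:subject>
    <sem:predicate>http://example.org/kennedy/last-name</sem:predicate>
    <sem:object datatype="http://www.w3.org/2001/XMLSchema#string">Lawford</sem:object>
  </sem:triple>
</article>
"""


def extract_triples(xml_text):
    """Return (subject, predicate, object, datatype) for each sem:triple found."""
    ns = {"sem": SEM_NS}
    root = ET.fromstring(xml_text)
    found = []
    for t in root.iter(f"{{{SEM_NS}}}triple"):
        obj = t.find("sem:object", ns)
        found.append((
            t.findtext("sem:subject", namespaces=ns).strip(),
            t.findtext("sem:predicate", namespaces=ns).strip(),
            obj.text.strip(),
            obj.get("datatype"),
        ))
    return found


triples = extract_triples(doc)
print(triples)
```

The triple travels with the document that asserts it, so the same document can serve both document queries and triple queries.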
Slide 19
Content, Data, and Semantics
<SAR>
  <title>Suspicious vehicle near airport</title>
  <date>2012-11-12Z</date>
  <type>observation/surveillance</type>
  <threat>
    <type>suspicious activity</type>
    <category>suspicious vehicle</category>
  </threat>
  <location>
    <lat>37.497075</lat>
    <long>-122.363319</long>
  </location>
  <triple>
    <subject>IRIID</subject>
    <predicate>isa</predicate>
    <object>license-plate</object>
  </triple>
  <triple>
    <subject>IRIID</subject>
    <predicate>value</predicate>
    <object>ABC 123</object>
  </triple>
  <description>A blue van with license plate ABC 123 was observed parked behind the airport sign…</description>
</SAR>
Slide 20
Content, Data, and Semantics
[Figure: the <SAR> document from the previous slide, annotated with three callouts: Semantic (RDF) Triples, Unstructured full-text, and Geospatial Data]
Slide 21
RDF Values
<http://example.org/kennedy/person4>
"string value"^^xs:string
"987"^^xs:double
"2013-04-09"^^xs:date
"bonjour"@fr
_:blank1
"simple"
Slide 22
Datatype Mapping
Datatype                  SPARQL                  XQuery
Typed Literal             "2013-04-09"^^xs:date   xs:date("2013-04-09")
IRI                       <http://example.com>    sem:iri("http://example.com")
Blank Node                _:blank1                sem:blank("…")
Simple Literal            "simple"                xs:string("simple")
Language Tagged Literal   "bonjour"@fr            rdf:langString("bonjour", "fr")
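The five kinds of RDF value can be told apart from their SPARQL lexical form alone. A rough Python sketch, assuming simplified patterns (no escape sequences, no single-quoted or long literals); `classify` and `PATTERNS` are names invented for this illustration:

```python
import re

# Simplified patterns for the five lexical forms in the mapping table.
PATTERNS = [
    ("IRI", re.compile(r'^<([^<>\s]+)>$')),
    ("Blank Node", re.compile(r'^_:(\w+)$')),
    ("Typed Literal", re.compile(r'^"(.*)"\^\^(\S+)$')),
    ("Language Tagged Literal", re.compile(r'^"(.*)"@([A-Za-z]+(?:-[A-Za-z0-9]+)*)$')),
    ("Simple Literal", re.compile(r'^"(.*)"$')),
]


def classify(term):
    """Return (kind, captured parts...) for a term written in SPARQL syntax."""
    for kind, pattern in PATTERNS:
        m = pattern.match(term)
        if m:
            return (kind,) + m.groups()
    raise ValueError(f"unrecognised RDF term: {term!r}")


for term in ['"2013-04-09"^^xs:date', '<http://example.com>', '_:blank1',
             '"simple"', '"bonjour"@fr']:
    print(term, "->", classify(term))
```

Each kind then corresponds to one constructor in the XQuery column of the mapping.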
Slide 23
SPARQL
• Executed using the triple index
• SPARQL 1.0 + much of SPARQL 1.1
• Cost-based optimization
• Join ordering and algorithms

select * where {
  ?person :birth-place ?place;
          :first-name "John"
}
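The query on this slide is two triple patterns joined on ?person. The Python sketch below evaluates such a pattern list with a fixed nested-loop join over an in-memory list. It is only an illustration of what a join over triples involves: a cost-based optimizer would choose the join order and algorithm rather than hard-coding them, and the sample data and function names are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None on a conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable either gets bound now or must agree with its earlier value.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns, triples):
    """Nested-loop join: thread every partial solution through each pattern in turn."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


# Hypothetical sample data.
triples = [
    (":person4", ":first-name", "John"),
    (":person4", ":birth-place", ":place5"),
    (":person22", ":first-name", "John"),
    (":person22", ":birth-place", ":place13"),
    (":person5", ":first-name", "Robert"),
    (":person5", ":birth-place", ":place5"),
]
query = [
    ("?person", ":birth-place", "?place"),
    ("?person", ":first-name", "John"),
]
solutions = evaluate(query, triples)
for row in solutions:
    print(row)
```

Every extra pattern is another pass over the data, which is the "cost of granularity" that join ordering tries to keep down.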
Slide 24
Executing SPARQL
sem:sparql("
  prefix : <http://example.org/kennedy/>
  select * {
    ?person :first-name ?first;
            :last-name ?last;
            :alma-mater [:ivy-league :true]
  }",
  map:entry("first", "John"),
  (),
  cts:collection-query("mycollection"))
Slide 25
Returning Binding Solutions
select * where {
  ?person :birth-place :place5
}

select * where {
  ?person :birth-place ?place;
          :first-name "John"
}
Slide 26
Solution Results
person place
:person22 :place13
:person4 :place5
map:map
Slide 27
SPARQL Query Results XML Format
sem:query-result-serialize(
  sem:sparql("select * { … }"),
  "xml")
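For orientation, the W3C SPARQL Query Results XML Format wraps each solution in a `<result>` element with one `<binding>` per bound variable. The Python sketch below builds that shape by hand. The helper `to_sparql_xml` and its input representation are assumptions made for this illustration and say nothing about how the MarkLogic function is implemented.

```python
import xml.etree.ElementTree as ET

SRX_NS = "http://www.w3.org/2005/sparql-results#"


def to_sparql_xml(variables, solutions):
    """Serialize solutions given as dicts of variable -> ("uri" | "literal", value)."""
    ET.register_namespace("", SRX_NS)  # emit a default namespace, not an ns0: prefix
    root = ET.Element(f"{{{SRX_NS}}}sparql")
    head = ET.SubElement(root, f"{{{SRX_NS}}}head")
    for name in variables:
        ET.SubElement(head, f"{{{SRX_NS}}}variable", {"name": name})
    results = ET.SubElement(root, f"{{{SRX_NS}}}results")
    for solution in solutions:
        result = ET.SubElement(results, f"{{{SRX_NS}}}result")
        for name in variables:
            if name not in solution:
                continue  # unbound variables are simply omitted
            kind, value = solution[name]
            binding = ET.SubElement(result, f"{{{SRX_NS}}}binding", {"name": name})
            ET.SubElement(binding, f"{{{SRX_NS}}}{kind}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_sparql_xml(
    ["person", "place"],
    [{
        "person": ("uri", "http://example.org/kennedy/person4"),
        "place": ("uri", "http://example.org/kennedy/place5"),
    }],
)
print(xml_text)
```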
Slide 28
Returning Triples
describe :person4

construct {
  ?bp :uses-name ?fn
} where {
  ?person :birth-place ?bp;
          :first-name ?fn
}
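A CONSTRUCT query matches its where clause and then instantiates the template once per solution, returning triples instead of bindings. A minimal Python sketch of this particular query follows, with hypothetical person identifiers and a hand-written join rather than a general SPARQL engine:

```python
from collections import defaultdict

# Hypothetical sample data; the person identifiers are made up.
triples = [
    (":person1", ":birth-place", ":place0"), (":person1", ":first-name", "Ethel"),
    (":person2", ":birth-place", ":place0"), (":person2", ":first-name", "Kara"),
    (":person3", ":birth-place", ":place1"), (":person3", ":first-name", "James"),
]


def construct_uses_name(triples):
    """For each ?person with :birth-place ?bp and :first-name ?fn, emit (?bp, :uses-name, ?fn)."""
    birth_places = defaultdict(list)
    first_names = defaultdict(list)
    for s, p, o in triples:
        if p == ":birth-place":
            birth_places[s].append(o)
        elif p == ":first-name":
            first_names[s].append(o)
    # A set, because an RDF graph holds each triple at most once.
    return {
        (bp, ":uses-name", fn)
        for person, places in birth_places.items()
        for bp in places
        for fn in first_names.get(person, [])
    }


constructed = construct_uses_name(triples)
for triple in sorted(constructed):
    print(triple)
```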
Slide 29
Triple Results
:place0 :uses-name "Ethel", "Jeffrey", "Kara" .
:place1 :uses-name "Edward", "James" .
:place10 :uses-name "Robert", "Sheila", "Stephen" .
Callout labels: sem:triple, sem:iri
Slide 30
Querying Named Graphs
select *
from <http://my_graph>
where { ?s ?p ?o }

select * where {
  graph <http://my_graph> { ?s ?p ?o }
}
Callout label: collection
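Both queries above restrict the pattern ?s ?p ?o to a single named graph. One way to picture this, sketched in Python under the assumption that each triple is stored with the name of its graph (a "quad"); the data and the `select_all` helper are hypothetical:

```python
def select_all(quads, graph):
    """Return (s, p, o) for every quad stored in the given named graph."""
    return [(s, p, o) for (s, p, o, g) in quads if g == graph]


# Hypothetical sample data: three triples spread over two graphs.
quads = [
    (":person4", ":first-name", "John", "http://my_graph"),
    (":person4", ":birth-place", ":place5", "http://my_graph"),
    (":person5", ":first-name", "Jane", "http://other_graph"),
]
rows = select_all(quads, "http://my_graph")
print(rows)
```

The "collection" callout suggests the graph name plays the role of a collection in the database, which would make this filter a collection lookup.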
Slide 31
Restricting The Datasets
let $options := "properties"
let $query := cts:and-query((
  cts:directory-query("/triples/"),
  cts:element-range-query(xs:QName("date"), ">", $date)
))
return sem:sparql("…", (), (), $options, $query)
Slide 32
Creating Triples
Returning sem:triple values:
• sem:triple()
• sem:rdf-parse()
• sem:rdf-get()
• sem:rdf-builder()
Inserting to a database:
• sem:rdf-load()
• sem:rdf-insert()
Slide 33
Graph Store API
declare function graph-insert(
$graphname as sem:iri,
$triples as sem:triple*,
[$permissions as element(sec:permission)*,
$collections as xs:string*,
$quality as xs:int?,
$forest-ids as xs:unsignedLong*]
) as xs:string*;
declare function graph-delete(
$graphname as sem:iri
) as empty-sequence();
Slide 34
Conclusion
• Semantics can enhance your data-oriented and search applications.
• XQuery and SPARQL work well together.
• A combined RDF and XML database simplifies working with the technologies together.
• Try MarkLogic 7: http://www.marklogic.com/early-access/
Slide 35
Any Questions?