oracle spatial and graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• cia world fact...
TRANSCRIPT
Oracle Spatial and Graph: RDF Semantic Graph Feature
1 Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
RDF Semantic Graph Feature
"THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT
DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND
MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A
COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY,
AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS.
THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR
FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT
THE SOLE DISCRETION OF ORACLE."
Program Agenda
• Part 1: Overview of Graph
• Part 2: SPARQL and GeoSPARQL
• Part 3: Semantics (RDF, RDFS, OWL)
• Part 4: RDF View on Relational Data (RDB2RDF)
• Part 5: Key Features of Oracle Spatial and Graph
• Part 6: Performance and Scalability
• Part 7: Tools Demonstration
• Part 8: SQL-based Graph Analytics
• Summary
Part 1: Overview of Graph
Overview of Graph
• What is a graph?
– A set of vertexes and edges (and optionally attributes)
– A graph is simply linked data
• Why do we care?
– Graphs are everywhere
• Road networks, power grids, biological networks
• Social networks/Social Web (Facebook, LinkedIn, Twitter, Baidu, Google+, …)
• Knowledge graphs (RDF, OWL)
– Graphs are intuitive and flexible
• Easy to navigate, easy to form a path, natural to visualize
Various Kinds of Graphs
• There are many kinds of graphs
– Simple graph, Weighted graph, Vertex-labeled graph, Edge-labeled graph,
Directed graph (digraph), Undirected graph, Hypergraph, …
• Different application scenarios
– Link-node graphs representing physical/logical networks used in transportation,
utilities and telco (Oracle Spatial and Graph Network Data Model (NDM))
– RDF Semantic Graphs modeling data as triples for social network, linked data
and other semantic applications (Oracle Spatial and Graph)
– Property Graphs allowing the association of K/V pairs (attributes) with
vertexes/edges for social network analytics (investigating)
RDF Semantic Graph
• Resource Description Framework
– URIs are used to identify
• Resources, entities, relationships, concepts
• Creates Subject-Property-Object “triples”
• Data identification is a must for integration
• URIs are globally unique
• Properties of subjects are triples
• RDF Graph defines semantics
• Standards defined by W3C & OGC
– RDF, RDFS, OWL, SKOS
– SPARQL, RDFa, RDB2RDF, GeoSPARQL
• Implementations
– Oracle, IBM, Cray, Bigdata®
– Franz, Ontotext, Openlink, Jena, Sesame, …
Property Graph
• A set of vertices (or nodes)
– each vertex has a unique identifier (not globally unique).
– each vertex has a set of in/out edges.
– each vertex has a collection of key-value properties.
• A set of edges
– each edge has a unique identifier (not globally unique).
– each edge has a head/tail vertex.
– each edge has a label denoting type of relationship between two vertices.
– each edge has a collection of key-value properties.
• Blueprints Java APIs - open source, no standards
• Implementations
– Neo4j, InfiniteGraph, Dex, Sail, MongoDB, …
• A property graph can be modeled as an RDF Graph
(Oracle Spatial and Graph NDM is an example of a property graph optimized for consistent/homogeneous properties for edges)
https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
Introduction to RDF Semantic Graph, a feature of Oracle Spatial and Graph
Semantic Technology Stack
• Core Technologies
– URI: Uniform Resource Identifier
– RDF: Resource Description Framework
– RDFS: RDF Schema
– OWL: Web Ontology Language
Source: http://www.w3.org/2007/03/layerCake.svg
What is RDF
• A graph data model for web resources and their relationships
• The graph can be serialized into
- RDF/XML, N3, N-TRIPLE, …
• Construction unit: Triple (or assertion, or fact)
<http://foobar> <:produces> <:mp3>
Subject (vertex/node), Predicate (property/edge), Object (vertex/node)
• Quads (named graphs) add context, provenance, identification, etc. to assertions
<http://foobar> <:produces> <:mp3> <:ProductGraph>
[Figure: example graph with nodes http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com and http://www.oracle.com/products/RDF, and edges http://…/locatedIn, http://…/produce, http://…/customerOf and http://…/uses]
Basic Elements of RDF
• Instances
E.g. :John, :MovieXYZ, :PurchaseOrder432
• Classes
• Class represents a group/category/categorization of instances
E.g. :John rdf:type :Student
• Properties
• Linking data together
E.g. :John :brother :Mary,
:John :hasAge “33”^^xsd:integer.
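The three kinds of elements can be pictured with ordinary data structures. The following is a minimal sketch, assuming plain Python tuples stand in for RDF triples; the helper names `instances_of` and `properties_of` are illustrative and not part of any RDF library or Oracle API.

```python
# Each triple is a (subject, predicate, object) tuple. Names are illustrative only.
RDF_TYPE = "rdf:type"

triples = {
    (":John", RDF_TYPE, ":Student"),               # instance -> class membership
    (":John", ":brother", ":Mary"),                # property linking two instances
    (":John", ":hasAge", ("33", "xsd:integer")),   # typed literal: (lexical form, datatype)
}

def instances_of(graph, cls):
    """Return every subject asserted to be an rdf:type of cls."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}

def properties_of(graph, subject):
    """Return the outgoing (predicate, object) edges of one vertex."""
    return {(p, o) for (s, p, o) in graph if s == subject}

students = instances_of(triples, ":Student")
john_edges = properties_of(triples, ":John")
```

Adding a new kind of edge is just adding another tuple to the set, which is the flexibility the following slides refer to.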
An RDF Graph Linking Several Data Sources
Triples Are Easy. But Why?
• Graph modeling is flexible
• Adding/removing a new edge or vertex is simple
• Adding an edge is like adding a new column to a table but easier to do
• Standards-based graph representation
• RDF was defined by W3C. Allows interoperability
• Computers can understand the semantics of RDF graphs (triples)
• Same URI means same resource
[Figure: example graphs with nodes http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com and http://www.oracle.com/products/RDF, and edges :locatedIn, :produce and :customerOf]
Triples Are Easy. But Why? (3)
• Computers can understand the semantics of RDF graphs (triples)
• Discover hidden relationships, or detect inconsistency via logical inference
:customerOf rdfs:range :ServiceProvider
[Figure: the example graph (http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF; edges :locatedIn, :produce, :customerOf) with an rdf:type edge inferred from the rdfs:range statement]
Triples Are Easy. But Why? (4)
• Computers can understand the semantics of RDF graphs (triples)
• Discover hidden relationships, or detect inconsistency via logical inference
:hasOracleCSI rdf:type owl:FunctionalProperty
[Figure: the example graph (http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com; edges :locatedIn, :produce, :customerOf, rdf:type) with two :hasOracleCSI edges, to “CID1234” and “CID987”, flagged (!) as an inconsistency]
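Both kinds of inference shown on these slides can be sketched in a few lines. This is a simplified illustration, not how an OWL reasoner is implemented: it applies only the rdfs:range rule and a literal-value check for functional properties, and it assumes (the slides do not say so explicitly) that http://www.foobar.com is both the subject of :customerOf and the resource carrying the two :hasOracleCSI values.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
OWL_FUNCTIONAL = "owl:FunctionalProperty"

def infer_range_types(graph):
    """rdfs:range rule: if (p rdfs:range C) and (s p o), then (o rdf:type C)."""
    ranges = {(s, o) for (s, p, o) in graph if p == RDFS_RANGE}
    return {(o, RDF_TYPE, cls)
            for (prop, cls) in ranges
            for (s, p, o) in graph if p == prop}

def functional_conflicts(graph):
    """Report (subject, property) pairs with more than one value for a functional property."""
    functional = {s for (s, p, o) in graph if p == RDF_TYPE and o == OWL_FUNCTIONAL}
    values = {}
    for (s, p, o) in graph:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    return {key: vals for key, vals in values.items() if len(vals) > 1}

# Assumed example data, loosely following the slide figures.
graph = {
    ("http://www.foobar.com", ":customerOf", "http://www.oracle.com"),
    (":customerOf", RDFS_RANGE, ":ServiceProvider"),
    (":hasOracleCSI", RDF_TYPE, OWL_FUNCTIONAL),
    ("http://www.foobar.com", ":hasOracleCSI", "CID1234"),
    ("http://www.foobar.com", ":hasOracleCSI", "CID987"),
}

inferred = infer_range_types(graph)        # the "hidden" rdf:type edge
conflicts = functional_conflicts(graph)    # the inconsistency marked "!"
```

The conflict check treats two distinct literal values as an inconsistency; for IRI-valued functional properties an OWL reasoner would instead conclude that the two values denote the same resource.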
Many Graphs and Vocabularies on the Web
• DBPedia
• SIOC
• NCI
• SNOMED
• FOAF
• Geonames
• Wordnet
• Drug Bank
• ACM
• Daily Med
• Linked CT
• Eurostat
• Semantic XBRL
• US Census
• YAGO
• Cyc/Open Cyc
• PubMed
• Freebase
• CIA World Fact Book
• DBLP
• UniProt
• UniParc
• CiteSeer
• KEGG
• Data.gov.uk
• Music Brainz Data
• Semantic Tweet
• CO2 Emission
• Gene Ontology
• UniRef
• Smart Link
• Reactome
• Diseasome
And so much more!
RDF Graph Can Enrich Your Business Applications
• Flexible graph modeling adds agility
• Resource identification adds precision
• Integrate full breadth of enterprise content (structured, spatial, email, documents, web services)
• Reconcile differences in data semantics so that they can all “talk” and interoperate
• Resolve semantic discrepancies across databases, applications
• Create consolidated “single” views across business applications
• Model and implement common Business Processes
Graph Technology As an Evolution
Oracle Spatial and Graph works well with these important enterprise technologies
• Relational, XML, Spatial, Text, Security, Clustering, Compression, Data Guard, …
– Oracle’s RDF/OWL support is native to the Database
• Web Services, SOA, BPMN, Hadoop (Map Reduce), …
– Support of popular Java APIs and standards-compliant Web Service endpoint
• Advanced Analytics
– Support integration with OBIEE, Oracle Data Mining, Oracle R Enterprise
• A rich set of third party tools including
– Ontology editing, knowledge management, Complete DL reasoners
– Graph/network visualization
– NLP, text processing
– …
RDF Graph Use Cases
Semantic Metadata Layer
• Unified content metadata for federated resources
• Validate semantic and structural consistency
Text Mining & Entity Analytics
• Find related content & relations by navigating connected entities
• “Reason” across entities
Social Media Analysis
• Analyze social relations using curated metadata
- Blogs, wikis, tweets, video
- Calendars, IM, voice
Industries Have Already Adopted the Concept
Sample of Oracle Spatial and Graph customers
Industries
• Life Sciences
• Finance
• Media
• Networks & Communications
• Defense & Intelligence
• Public Sector
Thomson Reuters
Hutchinson 3G Austria
Allied Nation Intelligence Service
Oracle Spatial and Graph: Social Analysis
Objectives
• Profile suspects through telephone, email and social network communications
• Produce “data products” for analysts
Solution
• RDF Graph modeling of the social network: people, groups and places of interest
• Inferencing & graph analytics discover relationships among individuals & meaning of pseudonyms, aliases, codes, terminology
• Find & label “same-as” relationships
Benefits
• Standards-based tools: W3C RDF & SPARQL
• Semantic tagging for 600 TB / 10b triples graph
• Top-secret, compartmented security for data
• New discovery on ~100 million triples / month
Cisco WebEx Social
Oracle Spatial and Graph for Enterprise Collaboration
Objectives
• Social connectivity and collaboration through semantic enablement
• Connect knowledge silos
Solution
• Persistent unified graph metadata model
• Concepts tagged with unique meaning
• Find related content & groups by navigating connected entities, recommendations
Benefits
• Unifies metadata model - forum, blog, wiki, etc.
• Tagging media documents, pictures, blogs, etc. to user-defined and/or enterprise vocabularies.
• Validates tag semantic/structural consistency
Eli Lilly and Company
Oracle Spatial and Graph: RDF Graph Metadata Repository
Objectives
• Unified vocabulary for scientific investigation
• Easier, more complete investigations
Solution
• Integrate patient records, chemical structures, biological sequences & pathways, images, scientific papers, …
• View related data as a graph
• Traverse graphs to discover relationships, search for a term, or browse ontologies
“[This technology…] provides improved insight into our business by bringing together related information from diverse data sources,”
J. Phil Brooks, Information Consultant, Eli Lilly and Company
Oracle Spatial and Graph Partners: Integrated Tools and Solution Providers
Ontology Engineering & Visualization
Open Source Frameworks Standards
Reasoners NLP Entity Extractors
Applications & Tools SI / Consulting
Sesame, Joseki
Part 2: SPARQL and GeoSPARQL
What is SPARQL?
• SPARQL Protocol and RDF Query Language
– W3C standard for querying and manipulating RDF content
– Queries/updates and corresponding results are communicated via HTTP with a SPARQL endpoint
– A SPARQL endpoint implements the SPARQL protocol and serves RDF data from an RDF triplestore or RDF view
What is SPARQL?
Components of SPARQL 1.1
• Query Language
– A comprehensive query language for RDF
– Many useful constructs: optional patterns, aggregates, subqueries, negation, property paths, extensive function library, etc.
• Update
– A comprehensive language for manipulating RDF graphs
– Allows you to create, update and remove RDF graphs
• Protocol
– Defines a protocol for sending queries or updates to a SPARQL endpoint and returning the results via HTTP
• Service Description
– Defines a mechanism and RDF vocabulary for describing the features supported by a SPARQL endpoint
• Query Results JSON Format
• Query Results CSV and TSV Format
• Query Results XML Format
– Alternative formats used to serialize and exchange answers to SPARQL queries
• Federated Query
– SPARQL extension for executing queries distributed over different SPARQL endpoints
• Entailment Regimes
– Extends SPARQL so that logically entailed RDF triples (hidden edges in RDF Graphs) are matched in addition to directly asserted RDF triples
• Graph Store HTTP Protocol
– Simple alternative to SPARQL 1.1 Update that describes HTTP operations for managing a collection of RDF graphs outside of a SPARQL 1.1 graph store
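To make the Protocol component concrete, here is a minimal sketch of how a client might build a "query via GET" request URL. The `query` and `default-graph-uri` parameter names come from the SPARQL 1.1 Protocol; the endpoint URL and the helper name `build_query_url` are hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode

def build_query_url(endpoint, query, default_graphs=()):
    """Build a SPARQL Protocol 'query via GET' URL (no request is made)."""
    params = [("query", query)]
    # Each default-graph-uri parameter adds one graph to the default graph.
    params += [("default-graph-uri", g) for g in default_graphs]
    return endpoint + "?" + urlencode(params)

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint, for illustration only
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"

url = build_query_url(ENDPOINT, QUERY, ["urn:g1"])
```

A real client would send this URL with an HTTP GET and an Accept header naming one of the result formats listed above, for example application/sparql-results+json.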
SPARQL 1.1 Features by Example
SPARQL Graph Pattern
Basic unit of SPARQL queries
Data graph:
govtrack:A000041 rdf:type govtrack:Politician
govtrack:A000041 foaf:name “John Adams”
govtrack:A000041 foaf:gender “male”
govtrack:A000041 vcard:BDAY “1767-07-11”^^xsd:date
govtrack:A000045 rdf:type govtrack:Politician
govtrack:A000045 foaf:name “Samuel Adams”
govtrack:A000045 foaf:gender “male”
govtrack:A000045 vcard:BDAY “1722-09-27”^^xsd:date
Graph pattern:
?p rdf:type ?t
?p foaf:name ?n
?p foaf:gender ?g
?p vcard:BDAY ?b
Result 1: {?t=govtrack:Politician, ?p=govtrack:A000041, ?n=“John Adams”, ?g=“male”, ?b=“1767-07-11”^^xsd:date}
Result 2: {?t=govtrack:Politician, ?p=govtrack:A000045, ?n=“Samuel Adams”, ?g=“male”, ?b=“1722-09-27”^^xsd:date}
SPARQL Graph Pattern
Basic unit of SPARQL queries
How do we express this with SPARQL?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT ?t ?n ?b ?g
WHERE
{ ?p rdf:type ?t .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  ?p foaf:gender ?g }
The group of triple patterns inside WHERE { … } is the Basic Graph Pattern (BGP)
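What matching a basic graph pattern means can be sketched as a naive join over triples. This is an illustration of the semantics only, under the assumption that all terms are plain strings and variables start with "?"; real triplestores use indexes and query optimization rather than nested loops, and the function names are made up for this example.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend one binding so that a triple pattern matches one triple."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must be identical to the triple's term.
            return None
    return new

def match_bgp(patterns, graph):
    """Return every binding that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in graph:
                new = match_pattern(pattern, triple, binding)
                if new is not None:
                    extended.append(new)
        solutions = extended
    return solutions

graph = [
    ("govtrack:A000041", "rdf:type", "govtrack:Politician"),
    ("govtrack:A000041", "foaf:name", "John Adams"),
    ("govtrack:A000041", "foaf:gender", "male"),
    ("govtrack:A000041", "vcard:BDAY", "1767-07-11"),
    ("govtrack:A000045", "rdf:type", "govtrack:Politician"),
    ("govtrack:A000045", "foaf:name", "Samuel Adams"),
    ("govtrack:A000045", "foaf:gender", "male"),
    ("govtrack:A000045", "vcard:BDAY", "1722-09-27"),
]
patterns = [
    ("?p", "rdf:type", "?t"),
    ("?p", "foaf:name", "?n"),
    ("?p", "vcard:BDAY", "?b"),
    ("?p", "foaf:gender", "?g"),
]
results = match_bgp(patterns, graph)
```

Because ?p appears in every pattern, each solution describes a single politician, which is why the slide's two results never mix John's name with Samuel's birthday.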
SPARQL SELECT Modifiers
Find all distinct family names for senators
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?f
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family ?f .
  ?p foaf:title "Sen." .
}
SPARQL FILTER: Restricting Solutions
Find all people with family name Adams born before Independence Day
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?n ?b ?g
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family ?f .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  ?p foaf:gender ?g
  FILTER ( ?f = "Adams" &&
           ?b < "1776-07-04"^^xsd:date )
}
SPARQL 1.1 Built-in Functions
Extensive library of functions to use
• Basic: arithmetic, comparisons, boolean connectors
• RDF-related: isLiteral(), isURI(), isBlank(), datatype(), lang(), BOUND(), …
• String Functions: SUBSTR(), STRSTARTS(), STRENDS(), REGEX(), …
• Numerics: abs(), floor(), ceil(), …
• Dates and Times: now(), year(), month(), day(), …
• Miscellaneous: IN(), NOT IN(), IF(), COALESCE(), …
• Constructors: xsd:int(), xsd:decimal(), xsd:dateTime(), …
• … plus user-defined
SPARQL UNION: Disjunction
Find vcard given name or foaf name for all Clintons
SELECT *
WHERE
{ ?p vcard:N ?vn .
  ?p vcard:BDAY ?b .
  ?vn vcard:Family ?f .
  { ?p foaf:name ?n }
  UNION
  { ?vn vcard:Given ?n }
  FILTER ( ?f = "Clinton" ) }
SPARQL OPTIONAL: Best Effort Match
Find all people with family name Smith and optionally their title and homepage
SELECT ?t ?n ?b ?h
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family ?f .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  OPTIONAL { ?p foaf:title ?t }
  OPTIONAL { ?p foaf:homepage ?h }
  FILTER ( ?f = "Smith" )
}
SPARQL OPTIONAL: Best Effort Match
Find all representatives and optionally their homepage if it is not a standard www.house.gov address
SELECT ?n ?b ?h
WHERE
{ ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  ?p foaf:title "Rep."
  OPTIONAL { ?p foaf:homepage ?h
             FILTER (!STRSTARTS(STR(?h),"http://www.house.gov"))
  }
}
SPARQL 1.1 Negation: MINUS
Find all Kennedys that do not have a homepage
SELECT ?n ?b
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family "Kennedy" .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  MINUS { ?p foaf:homepage ?h }
}
SPARQL 1.1 Negation: NOT EXISTS / EXISTS
Find all Kennedys that do not have a homepage
SELECT ?n ?b
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family "Kennedy" .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  FILTER ( NOT EXISTS { ?p foaf:homepage ?h } )
}
SPARQL Solution Modifiers: ORDER BY
Order all representatives by ascending family name and descending homepage
SELECT ?gn ?fn ?h
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title "Rep." .
  ?p foaf:homepage ?h
}
ORDER BY ASC(?fn) DESC(STR(?h))
SPARQL Solution Modifiers: LIMIT / OFFSET
Get information about representatives 11 through 20 in alphabetical order
SELECT ?gn ?fn ?h
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title "Rep." .
  ?p foaf:homepage ?h
}
ORDER BY ASC(?fn)
LIMIT 10
OFFSET 10
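The interplay of ORDER BY, OFFSET and LIMIT is the same as sorting a list and taking a slice. A minimal sketch, assuming solutions are Python dictionaries and using made-up family names; `page` is an illustrative helper, not a library function.

```python
def page(rows, key, limit, offset):
    """Sort first, then skip `offset` rows and keep at most `limit` rows."""
    ordered = sorted(rows, key=key)
    return ordered[offset:offset + limit]

# Thirty placeholder solutions with family names Name01 .. Name30.
rows = [{"fn": f"Name{i:02d}"} for i in range(1, 31)]

# ORDER BY ASC(?fn) LIMIT 10 OFFSET 10 -> rows 11 through 20.
result = page(rows, key=lambda r: r["fn"], limit=10, offset=10)
```

Without the ORDER BY step the slice would still have ten rows, but which ten is not guaranteed, so paging queries normally fix an order first.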
SPARQL 1.1 SELECT Expressions
Generate a full title “<title> <FI> <family name>” for each member of congress
SELECT (CONCAT(?t," ",SUBSTR(?gn,1,1),". ",?fn) AS ?fullTitle)
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title ?t .
}
SPARQL 1.1 Grouping and Aggregation
Find all distinct pairs of family name and title
SELECT ?fn ?t
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Family ?fn .
  ?p foaf:title ?t .
}
GROUP BY ?fn ?t
SPARQL 1.1 Grouping and Aggregation
Find the top 10 bill sponsors
SELECT ?n (COUNT(*) AS ?cnt)
WHERE
{ ?s foaf:name ?n .
  ?b bill:sponsor ?s .
}
GROUP BY ?n
ORDER BY DESC(?cnt)
LIMIT 10
Available Aggregates: COUNT(), SUM(), MIN(), MAX(), AVG(), GROUP_CONCAT(), SAMPLE()
SPARQL 1.1 Grouping and Aggregation
Find members of congress who have sponsored very few bills
SELECT ?n (COUNT(*) AS ?cnt)
WHERE
{ ?s foaf:name ?n .
  ?b bill:sponsor ?s .
}
GROUP BY ?n
HAVING (COUNT(?b) < 5)
ORDER BY ?cnt ?n
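The grouping queries above follow a familiar pattern: group, count, filter the groups, then sort. A rough equivalent in plain Python, using invented (bill, sponsor name) pairs and illustrative function names:

```python
from collections import Counter

# Hypothetical (bill, sponsor name) pairs, standing in for the joined solutions.
sponsorships = [
    ("bill1", "Ann"), ("bill2", "Ann"), ("bill3", "Ann"),
    ("bill4", "Bob"),
    ("bill5", "Cy"), ("bill6", "Cy"),
]

def sponsor_counts(pairs):
    """GROUP BY ?n with COUNT(*): number of bills per sponsor name."""
    return Counter(name for _bill, name in pairs)

def top_sponsors(pairs, n):
    """ORDER BY DESC(?cnt) LIMIT n."""
    counts = sponsor_counts(pairs)
    return sorted(counts.items(), key=lambda kv: -kv[1])[:n]

def rare_sponsors(pairs, threshold):
    """HAVING (COUNT(?b) < threshold) ORDER BY ?cnt ?n."""
    counts = sponsor_counts(pairs)
    kept = [(name, cnt) for name, cnt in counts.items() if cnt < threshold]
    return sorted(kept, key=lambda kv: (kv[1], kv[0]))

top = top_sponsors(sponsorships, 2)
rare = rare_sponsors(sponsorships, 3)
```

As in the SPARQL version, a person with no sponsored bills never appears at all, because the join with bill:sponsor produces no row to count.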
SPARQL 1.1 Subqueries
Find information about politicians who sponsored more than 150 bills
SELECT DISTINCT ?n ?o ?p ?st ?cnt
WHERE
{ ?s foaf:name ?n . ?s pol:hasRole ?r .
  ?r pol:party ?p . ?r pol:forOffice ?o .
  ?o pol:represents ?st
  { SELECT ?s (COUNT(?b) AS ?cnt)
    WHERE
    { ?b bill:sponsor ?s }
    GROUP BY ?s
    HAVING (COUNT(?b) > 150)
  }
}
ORDER BY DESC(?cnt) ASC(?n)
SPARQL 1.1 Value Assignment: BIND
Find information about political offices of John McCain
SELECT ?n ?o ?p ?st
WHERE
{ BIND (CONCAT("John"," ","McCain") AS ?n)
  ?s foaf:name ?n .
  ?s pol:hasRole ?r .
  ?r pol:party ?p .
  ?r pol:forOffice ?o .
  ?o pol:represents ?st
}
SPARQL 1.1 Inline Data: VALUES
Find information for all Browns that are Republican and Smiths that are Democrat
SELECT ?gn ?fn ?o ?p ?st
WHERE
{ ?s vcard:N ?n .
  ?n vcard:Family ?fn .
  ?n vcard:Given ?gn .
  ?s pol:hasRole ?r .
  ?r pol:party ?p .
  ?r pol:forOffice ?o .
  ?o pol:represents ?st
  VALUES (?fn ?p) { ("Brown" "Republican")
                    ("Smith" "Democrat") }
}
SPARQL 1.1 Federated Query
What information about NH senators can be found from DBPedia
SELECT ?n ?p ?o
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole ?role .
  ?role pol:forOffice ?office .
  ?office pol:represents geo:nh
  SERVICE <http://dbpedia.org/sparql>
  { ?x foaf:name ?n .
    ?x ?p ?o }
}
SPARQL ASK Queries
Is there a Republican Kennedy?
ASK
WHERE
{ ?person vcard:N ?n .
  ?n vcard:Family "Kennedy" .
  ?person pol:hasRole ?role .
  ?role pol:party "Republican" .
}
SPARQL Construct Queries
Build a graph of names and political parties for all senators
CONSTRUCT { ?person foaf:name ?n .
            ?person pol:memberOf ?party }
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole ?role .
  ?role pol:party ?party .
  ?person foaf:title "Sen." .
}
SPARQL Describe Queries
Describe all politicians with family name “Paul”
DESCRIBE ?person
WHERE
{ ?person vcard:N ?n .
  ?n vcard:Family "Paul" .
}
SPARQL 1.1 Property Paths
Enhanced path searching in SPARQL
• Uses regular expression style syntax to express path patterns over RDF properties
• Allows syntactic shortcuts for fixed length paths
• Allows searching arbitrary length paths
– Uses connectivity semantics instead of path counting semantics
Property Path Constructs
Syntax Form                              Matches
iri                                      An IRI (path of length 1)
^elt                                     Reverse path (object to subject)
elt1 / elt2                              Sequence path of elt1 followed by elt2
elt1 | elt2                              Alternative path of elt1 or elt2
elt*                                     Path composed of zero or more repetitions of elt
elt+                                     Path composed of one or more repetitions of elt
elt?                                     Path composed of zero or one repetition of elt
!iri or !(iri_1|…|iri_n)                 A path of length 1 that is not one of iri_i
!^iri or !(^iri_1|…|^iri_n)              A path of length 1 that is not one of iri_i as reverse paths
!(iri_1|…|iri_j|^iri_j+1|…|^iri_n)       A path of length 1 that is not one of iri_i in the indicated direction
(elt)                                    Grouping used to control precedence
iri is an IRI
elt is a path element, which may itself be composed of other path constructs
SPARQL 1.1 Property Path
Find each politician’s classification and the district he or she represents
SELECT ?n ?dist ?sc
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole/pol:forOffice/pol:represents ?dist .
  ?person rdf:type ?t .
  ?t rdfs:subClassOf+ ?sc .
}
LIMIT 100
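The elt+ construct with connectivity semantics behaves like graph reachability: a node is reported once if any path of length one or more leads to it, however many such paths exist. A minimal breadth-first sketch follows; the class names in the sample graph are invented for illustration and `one_or_more` is not a library function.

```python
from collections import deque

def one_or_more(graph, start, predicate):
    """Nodes reachable from start via one or more `predicate` edges (elt+)."""
    reached = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for (s, p, o) in graph:
            # Follow matching edges; skip nodes already reached so that
            # cycles terminate and each node is reported only once.
            if s == node and p == predicate and o not in reached:
                reached.add(o)
                queue.append(o)
    return reached

# Hypothetical class hierarchy; :Politician is reachable by two different paths.
graph = {
    (":Senator", "rdfs:subClassOf", ":Legislator"),
    (":Legislator", "rdfs:subClassOf", ":Politician"),
    (":Senator", "rdfs:subClassOf", ":Politician"),
    (":Politician", "rdfs:subClassOf", ":Person"),
    (":Senator", "rdfs:label", "Senator"),
}

superclasses = one_or_more(graph, ":Senator", "rdfs:subClassOf")
```

The function returns only the set of reachable nodes, not the paths or their lengths, which mirrors the limitations listed on the next slide.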
Property Path Limitations
• For arbitrary length paths
– Cannot return actual path itself
– Cannot get length of found path
– Difficult to place conditions on nodes along the path
• Length-limited path search is hard to express for longer lengths
• No shortest path function
SPARQL Named Graphs
The concept of an RDF Dataset
• An RDF Dataset is a collection of RDF graphs
– Contains one default graph, which does not have a name
– Contains zero or more named graphs, where each graph is identified by an IRI
• A SPARQL query is executed against an RDF Dataset
• FROM and FROM NAMED keywords are used to construct the RDF Dataset for a query
• The GRAPH keyword is used to control the active graph for different parts of a query
Constructing the RDF Dataset
Contents of RDF Triplestore
Graph Name    Triples
--            {t1,t2,t3}
<urn:g1>      {t4,t5}
<urn:g2>      {t6,t7}
<urn:g3>      {t8,t9}
<urn:g4>      {t10,t11}
SPARQL query with RDF Dataset specification
SELECT *
FROM <urn:g1>
FROM <urn:g3>
FROM NAMED <urn:g2>
FROM NAMED <urn:g3>
FROM NAMED <urn:g4>
WHERE { … }
Default Graph
{ t4, t5, t8, t9 }
Named Graphs
{ (<urn:g2>, { t6, t7 }),
(<urn:g3>, { t8, t9 }),
(<urn:g4>, { t10, t11 }) }
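The dataset construction on this slide can be mimicked with sets and dictionaries. In this sketch the triplestore is a dictionary from graph name to a set of triple labels (with `None` for the unnamed graph); that representation and the `build_dataset` helper are assumptions made for illustration.

```python
# Contents of the triplestore from the slide; None marks the unnamed graph.
store = {
    None: {"t1", "t2", "t3"},
    "urn:g1": {"t4", "t5"},
    "urn:g2": {"t6", "t7"},
    "urn:g3": {"t8", "t9"},
    "urn:g4": {"t10", "t11"},
}

def build_dataset(store, from_graphs, from_named):
    """FROM graphs are merged into the default graph; FROM NAMED graphs stay separate."""
    default = set()
    for name in from_graphs:
        default |= store.get(name, set())
    named = {name: set(store.get(name, set())) for name in from_named}
    return default, named

default_graph, named_graphs = build_dataset(
    store,
    from_graphs=["urn:g1", "urn:g3"],
    from_named=["urn:g2", "urn:g3", "urn:g4"],
)
```

Note that t1, t2 and t3 are not part of the resulting dataset: once a query names its graphs with FROM, only those graphs contribute to the default graph.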
Using the GRAPH Keyword
SPARQL query with RDF Dataset specification
SELECT *
FROM <urn:g1>
FROM <urn:g3>
FROM NAMED <urn:g2>
FROM NAMED <urn:g3>
FROM NAMED <urn:g4>
WHERE
{ BGP1
  GRAPH ?g { BGP2 }
  GRAPH <urn:g4> { BGP3 }
  GRAPH <urn:g1> { BGP4 }
}
Active Graph (BGP1)
{ <urn:g1> UNION <urn:g3> }
Active Graph (BGP2)
{ <urn:g2>, <urn:g3>, <urn:g4> }
Active Graph (BGP3)
{ <urn:g4> }
Active Graph (BGP4)
{ }
Within a GRAPH clause:
- BGP is executed against each active graph separately (e.g. BGP2 against g2, g3, g4).
- Subgraph match must occur within a single graph.
SPARQL Named Graph Query
Find the number of bills sponsored by each politician in the 110th and 111th congress
SELECT ?n ?g (count(?b) as ?bcnt)
FROM usgov:people
FROM NAMED usgov:bills_110
FROM NAMED usgov:bills_111
WHERE
{ ?s foaf:name ?n
  GRAPH ?g { ?b bill:sponsor ?s }
}
GROUP BY ?n ?g
ORDER BY ?n ?g
SPARQL 1.1 Update
Capabilities of SPARQL Update
• Insert triples into an RDF Graph
• Delete triples from an RDF Graph
• Load an RDF Graph
• Clear an RDF Graph
• Create a new RDF Graph
• Drop an RDF Graph
• Copy, move or add the content of one RDF Graph to another
• Perform a group of update operations as a single action
SPARQL 1.1 Update: INSERT DATA
Insert simple triple data
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT DATA
{ my:person1 foaf:name "John Smith" .
  my:person1 foaf:knows my:person1 }
Insert simple named graph data
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT DATA
{ GRAPH my:g1 { my:person1 foaf:name "John Smith" .
                my:person1 foaf:knows my:person1 } }
SPARQL 1.1 Update: DELETE DATA
Delete triple data
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE DATA
{ my:person1 foaf:name "John Smith" .
  my:person1 foaf:knows my:person1 }
Delete named graph data
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE DATA
{ GRAPH my:g1 { my:person1 foaf:name "John Smith" .
                my:person1 foaf:knows my:person1 } }
SPARQL 1.1 Update: DELETE INSERT
Delete all foaf:worksFor "Oracle USA" triples
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE { ?p foaf:worksFor "Oracle USA, Inc." }
WHERE { ?p foaf:worksFor "Oracle USA, Inc." }
Insert foaf:worksFor "Oracle America" triples for all "Oracle USA" employees
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT { ?p foaf:worksFor "Oracle America" }
WHERE { ?p foaf:worksFor "Oracle USA" }
Replace all foaf:worksFor "Oracle USA" triples with foaf:worksFor "Oracle America" triples
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE { ?p foaf:worksFor "Oracle USA" }
INSERT { ?p foaf:worksFor "Oracle America" }
WHERE { ?p foaf:worksFor "Oracle USA" }
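The combined DELETE/INSERT form can be read as three steps: evaluate the WHERE pattern once, instantiate both templates for every solution, then apply deletions before insertions. The Python below is a rough sketch of that reading over a set of tuples; `replace_employer` and the sample triples are hypothetical and stand in for a real update engine.

```python
WORKS_FOR = "foaf:worksFor"

def replace_employer(graph, old, new):
    """Sketch of: DELETE { ?p worksFor old } INSERT { ?p worksFor new }
    WHERE { ?p worksFor old }."""
    # 1. Evaluate the WHERE pattern once, against the graph before the update.
    bindings = [s for (s, p, o) in graph if p == WORKS_FOR and o == old]
    # 2. Instantiate the DELETE and INSERT templates for every binding of ?p.
    to_delete = {(s, WORKS_FOR, old) for s in bindings}
    to_insert = {(s, WORKS_FOR, new) for s in bindings}
    # 3. Apply the deletions first, then the insertions.
    return (graph - to_delete) | to_insert

# Hypothetical sample data.
before = {
    ("my:person1", "foaf:name", "John Smith"),
    ("my:person1", WORKS_FOR, "Oracle USA"),
    ("my:person2", WORKS_FOR, "Some Other Co"),
}
after = replace_employer(before, "Oracle USA", "Oracle America")
```

Triples that the WHERE pattern does not match, such as the one for `my:person2`, are left untouched.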
SPARQL 1.1 Update: LOAD
Load graph1 into my:g1
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
LOAD <http://www.graphs.com/graph1> INTO GRAPH my:g1
Load graph1 into default graph
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
LOAD <http://www.graphs.com/graph1>
SPARQL 1.1 Update: CLEAR
Clear my:g1
PREFIX my: <http://www.mydomain.com/>
CLEAR GRAPH my:g1
Clear all named graphs
PREFIX my: <http://www.mydomain.com/>
CLEAR NAMED
Clear default graph
PREFIX my: <http://www.mydomain.com/>
CLEAR DEFAULT
Clear all named graphs and default graph
PREFIX my: <http://www.mydomain.com/>
CLEAR ALL
SPARQL 1.1 Update: CREATE
Create graph my:g1
PREFIX my: <http://www.mydomain.com/>
CREATE GRAPH my:g1
SPARQL 1.1 Update: DROP
Drop my:g1
PREFIX my: <http://www.mydomain.com/>
DROP GRAPH my:g1
Drop all named graphs
PREFIX my: <http://www.mydomain.com/>
DROP NAMED
Drop default graph
PREFIX my: <http://www.mydomain.com/>
DROP DEFAULT
Drop all named graphs and default graph
PREFIX my: <http://www.mydomain.com/>
DROP ALL
SPARQL 1.1 Update: COPY
Replace contents of graph g2 with contents of graph g1
PREFIX my: <http://www.mydomain.com/>
COPY GRAPH my:g1 TO GRAPH my:g2
Replace contents of graph g2 with contents of default graph
PREFIX my: <http://www.mydomain.com/>
COPY DEFAULT TO GRAPH my:g2
SPARQL 1.1 Update: MOVE
Replace contents of graph g2 with contents of graph g1 and drop graph g1
PREFIX my: <http://www.mydomain.com/>
MOVE GRAPH my:g1 TO GRAPH my:g2
Replace contents of default graph with contents of graph g2 and drop graph g2
PREFIX my: <http://www.mydomain.com/>
MOVE my:g2 TO DEFAULT
SPARQL 1.1 Update: ADD
Append contents of graph g1 to graph g2
PREFIX my: <http://www.mydomain.com/>
ADD GRAPH my:g1 TO GRAPH my:g2
Append contents of graph g2 to default graph
PREFIX my: <http://www.mydomain.com/>
ADD my:g2 TO DEFAULT
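The difference between COPY, MOVE and ADD is easy to lose in the syntax, so here is a small Python sketch of their effect on a store modelled as a dictionary of sets. The function names, the `"DEFAULT"` key, and the sample triples are all assumptions made for illustration.

```python
DEFAULT = "DEFAULT"  # hypothetical key standing for the default graph

def add_graph(store, src, dst):
    """ADD: append the source triples to the destination; source is unchanged."""
    store[dst] = store.get(dst, set()) | store.get(src, set())

def copy_graph(store, src, dst):
    """COPY: replace the destination contents with the source contents."""
    if src != dst:
        store[dst] = set(store.get(src, set()))

def move_graph(store, src, dst):
    """MOVE: like COPY, then remove the source graph."""
    if src != dst:
        copy_graph(store, src, dst)
        if src == DEFAULT:
            store[src] = set()   # the default graph is emptied, not removed
        else:
            store.pop(src, None)

def fresh_store():
    return {DEFAULT: {"t0"}, "my:g1": {"t1"}, "my:g2": {"t2"}}

added = fresh_store();  add_graph(added, "my:g1", "my:g2")
copied = fresh_store(); copy_graph(copied, "my:g1", "my:g2")
moved = fresh_store();  move_graph(moved, "my:g2", DEFAULT)
```

After these calls, `added["my:g2"]` holds both t1 and t2, `copied["my:g2"]` holds only t1, and `moved` has lost `my:g2` while its default graph holds only t2.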
Summary
Components of SPARQL 1.1
• Query Language
• Update
• Protocol
• Service Description
• Query Results JSON Format
• Query Results CSV and TSV Format
• Query Results XML Format
• Federated Query
• Entailment Regimes
• Graph Store HTTP Protocol
Open Geospatial Consortium (OGC) GeoSPARQL Specification
Linked Geo Data
• Many Linked Open Data (LOD) datasets have geospatial components
• Barriers to integration
– Vendor-specific geometry support
– Different vocabularies
  • W3C Basic Geo, GML XMLLiteral, Vendor-specific
– Different spatial reference systems
  • WGS84 Lat-Long, British National Grid
Semantic GIS
• GIS applications with semantically complex thematic aspects
– Logical reasoning to classify features
  • Land cover type, suitable farm land, etc.
– Complex Geometries
  • Polygons and Multi-Polygons with 1000's of points
– Complex Spatial Operations
  • Union, Intersection, Buffers, etc.
Example: Find parcels with an area of at least 3 sq. miles that touch a local feeder road and are inside an area of suitable farm land.
Requirements for GeoSPARQL
• Provide a common target for implementers & users
– Representation and query
• Work within SPARQL's extensibility framework
• Simple enough for general users
– Keep the common case simple (WGS 84 point data)
• Capable enough for GIS professionals
– Multiple SRS's, complex geometries, complex operators
• Don't re-invent the wheel!
– ISO 19107 – Spatial Schema
– ISO 13249 – SQL/MM
– Simple Features
– Well Known Text (WKT)
– GML
– KML
– GeoJSON
Standardization Timeline
1. Form SWG (June 2010)
2. Release candidate standard (May 2011)
3. OAB vote on candidate standard (June 2011)
4. 30-day public comment period (July 2011)
5. Process comments and update document (Feb. 2012)
6. TC/PC vote (March 2012)
7. Standard Published (June 2012)
From SPARQL to GeoSPARQL
SPARQL Query
RDF Data
:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res2 rdf:type :Condo .
:res2 :baths "2"^^xsd:decimal .
:res2 :bedrooms "2"^^xsd:decimal .
:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
SPARQL Query
SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br }
Result Bindings
?r    | ?ba   | ?br
===================
:res1 | "2.5" | "3"
:res3 | "1.5" | "3"
SPARQL Query
RDF Data
:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res2 rdf:type :Condo .
:res2 :baths "2"^^xsd:decimal .
:res2 :bedrooms "2"^^xsd:decimal .
:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
SPARQL Query
SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br
        FILTER (?ba > 2) }
Result Bindings
?r    | ?ba   | ?br
===================
:res1 | "2.5" | "3"
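To make the matching step concrete, the sketch below evaluates the same three-pattern query over the slide's data in plain Python, with the FILTER as an optional threshold. It is a hand-written illustration, not a SPARQL engine; `select_houses` is a hypothetical helper and the typed literals are modelled as `Decimal` values.

```python
from decimal import Decimal

# The slide's RDF data as (subject, predicate, object) tuples.
DATA = [
    (":res1", "rdf:type", ":House"),
    (":res1", ":baths", Decimal("2.5")),
    (":res1", ":bedrooms", Decimal("3")),
    (":res2", "rdf:type", ":Condo"),
    (":res2", ":baths", Decimal("2")),
    (":res2", ":bedrooms", Decimal("2")),
    (":res3", "rdf:type", ":House"),
    (":res3", ":baths", Decimal("1.5")),
    (":res3", ":bedrooms", Decimal("3")),
]

def select_houses(data, min_baths=None):
    """Evaluate { ?r rdf:type :House . ?r :baths ?ba . ?r :bedrooms ?br },
    optionally followed by FILTER (?ba > min_baths)."""
    rows = []
    for (r, p, o) in data:
        if p == "rdf:type" and o == ":House":
            # Join on the shared variable ?r.
            baths = [o2 for (s, p2, o2) in data if s == r and p2 == ":baths"]
            beds = [o2 for (s, p2, o2) in data if s == r and p2 == ":bedrooms"]
            for ba in baths:
                for br in beds:
                    if min_baths is None or ba > min_baths:
                        rows.append((r, ba, br))
    return rows

all_houses = select_houses(DATA)
big_houses = select_houses(DATA, min_baths=Decimal("2"))
```

The FILTER does not change how patterns are matched; it only discards solutions whose `?ba` binding fails the test.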
Spatial SPARQL Query
Spatial RDF Data (this is what GeoSPARQL standardizes: Vocabulary & Datatypes)
:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res1 ogc:hasGeometry :geom1 .
:geom1 ogc:asWKT "POINT(-122.25 37.46)"^^ogc:wktLiteral .
:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
:res3 ogc:hasGeometry :geom3 .
:geom3 ogc:asWKT "POINT(-122.24 37.47)"^^ogc:wktLiteral .
GeoSPARQL Query: Find houses within a search polygon (this is what GeoSPARQL standardizes: Extension Functions)
SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House . ?r :baths ?ba . ?r :bedrooms ?br .
        ?r ogc:hasGeometry ?g . ?g ogc:asWKT ?wkt
        FILTER(ogcf:sfWithin(?wkt, "POLYGON(…)"^^ogc:wktLiteral)) }
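For intuition about what a function like ogcf:sfWithin has to compute for point data, here is a rough Python sketch using a ray-casting point-in-polygon test. Everything in it is an assumption for illustration: the slide elides the search polygon, so `SEARCH_RING` is a made-up rectangle; coordinates are treated as planar; boundary cases are not handled carefully; and a real GeoSPARQL implementation would use a proper geometry engine.

```python
import re

def parse_point(wkt):
    """Parse 'POINT(x y)' into an (x, y) tuple of floats."""
    m = re.fullmatch(r"\s*POINT\s*\(\s*([-+0-9.eE]+)\s+([-+0-9.eE]+)\s*\)\s*",
                     wkt, re.IGNORECASE)
    if m is None:
        raise ValueError("not a WKT point: %r" % wkt)
    return float(m.group(1)), float(m.group(2))

def point_in_ring(pt, ring):
    """Ray casting: count crossings of a ray going right from pt.
    `ring` is a closed list of (x, y) vertices (first == last)."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):                      # edge straddles the ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical search polygon (not from the slides).
SEARCH_RING = [(-122.30, 37.40), (-122.245, 37.40),
               (-122.245, 37.50), (-122.30, 37.50), (-122.30, 37.40)]

HOUSES = {":res1": "POINT(-122.25 37.46)", ":res3": "POINT(-122.24 37.47)"}
within = sorted(r for r, wkt in HOUSES.items()
                if point_in_ring(parse_point(wkt), SEARCH_RING))
```

With this particular rectangle only `:res1` falls inside; a different polygon would of course give a different answer.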
GeoSPARQL Vocabulary: Basic Classes and Relations
ogc:SpatialObject
– ogc:Feature (same as ISO GFI_Feature)
– ogc:Geometry (same as ISO GM_Object)
Relations from ogc:Feature to ogc:Geometry
– ogc:hasGeometry (0 .. *)
– ogc:hasDefaultGeometry (0 .. 1)
ogc:Geometry metadata
– ogc:dimension : xsd:int
– ogc:coordinateDimension : xsd:int
– ogc:spatialDimension : xsd:int
– ogc:isEmpty : xsd:boolean
– ogc:isSimple : xsd:boolean
ogc:Geometry serializations (geometry encoded as a Literal)
– ogc:asWKT : ogc:wktLiteral
– ogc:asGML : ogc:gmlLiteral
– …
Details of ogc:wktLiteral
All RDFS Literals of type ogc:wktLiteral shall consist of an optional IRI identifying the spatial reference system followed by Simple Features Well Known Text (WKT) describing a geometric value [ISO 19125-1].
"<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT(-122.4192 37.7793)"^^ogc:wktLiteral
European Petroleum Survey Group (EPSG) maintains a set of CRS identifiers.
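Splitting the lexical form into its two parts is a small parsing task. The Python sketch below is one possible way to do it, assuming the optional IRI is written in angle brackets at the start of the string as in the example above; `split_wkt_literal` is a hypothetical helper, and it falls back to the CRS84 IRI when no IRI is given, in line with the slide's statement that WGS84 longitude-latitude is the default.

```python
import re

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

def split_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (srs_iri, wkt_text)."""
    # Optional leading <IRI>, then the WKT text.
    m = re.match(r"\s*<([^>]*)>\s*(.*)\Z", lexical, re.DOTALL)
    if m:
        return m.group(1), m.group(2).strip()
    # No IRI present: assume the default CRS.
    return CRS84, lexical.strip()

explicit = split_wkt_literal(
    "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT(-122.4192 37.7793)")
implicit = split_wkt_literal("POINT(-122.4192 37.7793)")
```

Both calls yield the same pair, since the first literal simply spells out the default reference system.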
WGS84 longitude – latitude is the default CRS
"POINT(-122.4192 37.7793)"^^ogc:wktLiteral
Topological Relations between ogc:SpatialObject
[Figure: one diagram of regions A and B per relation]
– ogc:sfEquals
– ogc:sfTouches
– ogc:sfOverlaps
– ogc:sfContains
– ogc:sfWithin
– ogc:sfDisjoint
– ogc:sfIntersects
– ogc:sfCrosses
• Assumes Simple Features Relation Family
• Also supports Egenhofer and RCC8
Example Data
Meta Information
:City rdfs:subClassOf ogc:Feature .
:Park rdfs:subClassOf ogc:Feature .
:exactGeometry rdfs:subPropertyOf ogc:hasGeometry .
Non-spatial Properties
:SanFrancisco rdf:type :City .
:UnionSquarePark rdf:type :Park .
:UnionSquarePark :commissioned "1847-01-01"^^xsd:date .
Spatial Properties
:UnionSquarePark :exactGeometry :geo1 .
:geo1 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .
:SanFrancisco :exactGeometry :geo2 .
:geo2 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .
:UnionSquarePark ogc:sfWithin :SanFrancisco .
GeoSPARQL Query Functions
– ogcf:distance(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral, units: xsd:anyURI): xsd:double
– ogcf:buffer(geom: ogc:wktLiteral, radius: xsd:double, units: xsd:anyURI): ogc:wktLiteral
– ogcf:convexHull(geom: ogc:wktLiteral): ogc:wktLiteral
– ogcf:intersection(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral
– ogcf:union(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral
– ogcf:difference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral
– ogcf:symDifference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral
– ogcf:envelope(geom: ogc:wktLiteral): ogc:wktLiteral
– ogcf:boundary(geom1: ogc:wktLiteral): ogc:wktLiteral
– ogcf:getSRID(geom: ogc:wktLiteral): xsd:anyURI
GeoSPARQL Topological Query Functions
– ogcf:sfEquals(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfDisjoint(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfIntersects(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfTouches(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfCrosses(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfWithin(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfContains(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfOverlaps(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
Assumes Simple Features Relation Family
Example Query
Find all land parcels that are within the intersection of :City1 and :District1
PREFIX : <http://my.com/appSchema#>
PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
PREFIX ogcf: <http://www.opengis.net/def/geosparql/functions/>
PREFIX epsg: <http://www.opengis.net/def/crs/EPSG/0/>
SELECT ?parcel
WHERE { ?parcel rdf:type :Residential .
        ?parcel :exactGeometry ?pGeo .
        ?pGeo ogc:asWKT ?pWKT .
        :District1 :exactGeometry ?dGeo .
        ?dGeo ogc:asWKT ?dWKT .
        :City1 :extent ?cGeo .
        ?cGeo ogc:asWKT ?cWKT .
        FILTER(ogcf:sfWithin(?pWKT, ogcf:intersection(?dWKT, ?cWKT)))
}
Summary
• GeoSPARQL Defines:
– Basic vocabulary, Query functions, Entailment component
• Based on existing OGC/ISO standards
– WKT, GML, Simple Features, ISO 19107
• Uses SPARQL's built-in extensibility framework
• Modular specification
– Allows flexibility in implementations
– Easy to extend
Part 3: Semantics
RDF Schema (RDFS)
• Core language constructs (RDFS derives implicit relationships using inference)
• rdfs:subClassOf
  :A rdfs:subClassOf :B ⇒ instance of A is also instance of B
• rdfs:subPropertyOf (property transfer)
  :p1 rdfs:subPropertyOf :p2, :a :p1 :b ⇒ :a :p2 :b
  :firstAuthor rdfs:subPropertyOf :Author
  skos:prefLabel rdfs:subPropertyOf rdfs:label
• rdfs:domain and rdfs:range (specify how a property can be used)
  :p1 rdfs:domain :D, :a :p1 :b ⇒ :a rdf:type :D
  :p2 rdfs:range :R, :a :p2 :b ⇒ :b rdf:type :R
  E.g. :performSurgeryOn rdfs:domain :Surgeon
       :performSurgeryOn rdfs:range :Patient
• rdfs:label, seeAlso, isDefinedBy, …
  :Jack rdfs:seeAlso http://…/Jack_Blog
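The four entailment patterns above can be tried out with a naive forward-chaining loop. The Python below is a sketch under stated assumptions: it covers only these four rules, triples are plain tuples of strings, and the individuals `:alice`, `:bob`, `:c1` and the class `:Doctor` are invented for the example.

```python
def rdfs_closure(triples):
    """Apply subClassOf, subPropertyOf, domain and range rules to a fixpoint."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                if p2 == "rdfs:subClassOf" and p == "rdf:type" and o == s2:
                    new.add((s, "rdf:type", o2))    # instance of A is instance of B
                elif p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))             # :a :p1 :b  =>  :a :p2 :b
                elif p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))    # subject gets the domain class
                elif p2 == "rdfs:range" and p == s2:
                    new.add((o, "rdf:type", o2))    # object gets the range class
        if new <= inferred:                         # nothing new: fixpoint reached
            return inferred
        inferred |= new

facts = {
    (":performSurgeryOn", "rdfs:domain", ":Surgeon"),
    (":performSurgeryOn", "rdfs:range", ":Patient"),
    (":Surgeon", "rdfs:subClassOf", ":Doctor"),      # hypothetical class
    ("skos:prefLabel", "rdfs:subPropertyOf", "rdfs:label"),
    (":alice", ":performSurgeryOn", ":bob"),         # hypothetical individuals
    (":c1", "skos:prefLabel", "Concept one"),
}
closure = rdfs_closure(facts)
```

The loop is quadratic per pass and is meant only to show the rules; production reasoners index triples and avoid re-deriving known facts.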
Web Ontology Language (OWL)
• More expressive compared to RDFS (derives implicit relationships using inference)
• Property related constructs
• owl:inverseOf
  E.g. :write owl:inverseOf :authoredBy
• owl:SymmetricProperty
  :relatedTo rdf:type owl:SymmetricProperty
  foaf:knows is not defined as a symmetric property!
• owl:TransitiveProperty
  :partOf rdf:type owl:TransitiveProperty .
  skos:broader rdf:type owl:TransitiveProperty
• owl:equivalentProperty
• owl:FunctionalProperty
  :hasBirthMother rdf:type owl:FunctionalProperty
• owl:InverseFunctionalProperty
  foaf:mbox rdf:type owl:InverseFunctionalProperty
• Instances (owl:sameAs, owl:differentFrom)
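As a rough illustration of what three of these property constructs entail, the Python sketch below forward-chains owl:inverseOf, owl:SymmetricProperty and owl:TransitiveProperty over tuples. It is not a complete OWL reasoner, and the individuals (`:jack`, `:book1`, `:a`, `:b`, `:c`, `:x`, `:y`) are hypothetical.

```python
def owl_property_closure(triples):
    """Forward-chain inverseOf, SymmetricProperty and TransitiveProperty."""
    inferred = set(triples)
    while True:
        types = {(s, o) for (s, p, o) in inferred if p == "rdf:type"}
        new = set()
        for (s, p, o) in inferred:
            if (p, "owl:SymmetricProperty") in types:
                new.add((o, p, s))                      # x p y  =>  y p x
            if (p, "owl:TransitiveProperty") in types:
                # x p y, y p z  =>  x p z
                new |= {(s, p, o2) for (s2, p2, o2) in inferred
                        if p2 == p and s2 == o}
            if p == "owl:inverseOf":
                # s inverseOf o:  x s y => y o x   and   x o y => y s x
                new |= {(y, o, x) for (x, q, y) in inferred if q == s}
                new |= {(y, s, x) for (x, q, y) in inferred if q == o}
        if new <= inferred:
            return inferred
        inferred |= new

facts = {
    (":write", "owl:inverseOf", ":authoredBy"),
    (":jack", ":write", ":book1"),
    (":relatedTo", "rdf:type", "owl:SymmetricProperty"),
    (":x", ":relatedTo", ":y"),
    (":partOf", "rdf:type", "owl:TransitiveProperty"),
    (":a", ":partOf", ":b"),
    (":b", ":partOf", ":c"),
}
closure = owl_property_closure(facts)
```

Because foaf:knows is not declared symmetric, a triple using it would pass through this function without gaining a reverse edge.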
Web Ontology Language (OWL)
• Class related constructs
• owl:equivalentClass
• owl:disjointWith
  :Boys owl:disjointWith :Girls
• owl:complementOf
  :Boys owl:complementOf :Non_Boys
• owl:unionOf, owl:intersectionOf, owl:oneOf
• owl:Restriction is used to define a class whose members have certain restrictions w.r.t. a property
  • owl:someValuesFrom
  • owl:allValuesFrom
  • owl:hasValue
Web Ontology Language (OWL)
• Class related constructs: owl:someValuesFrom
:ApprovedPurchaseOrder owl:equivalentClass
  [ a owl:Restriction ;
    owl:onProperty :approvedBy ;
    owl:someValuesFrom :Manager ]
:PO1 :approvedBy :managerXyz
:managerXyz rdf:type :Manager
⇒ :PO1 rdf:type :ApprovedPurchaseOrder
Web Ontology Language (OWL)
• Class related constructs: owl:allValuesFrom
:Vegetarian rdfs:subClassOf
  [ a owl:Restriction ;
    owl:onProperty :eats ;
    owl:allValuesFrom :VegetarianFood ]
:Jen rdf:type :Vegetarian .
:Jen :eats :Marzipan .
⇒ :Marzipan rdf:type :VegetarianFood .
We should NOT use :eats rdfs:range :VegetarianFood instead: a range applies to every use of :eats, so anything eaten by anyone would be classified as :VegetarianFood.
Web Ontology Language (OWL)
• Class related constructs: owl:hasValue
:HighPriorityItem owl:equivalentClass
  [ a owl:Restriction ;
    owl:onProperty :hasPriority ;
    owl:hasValue :High ]
:Item1 rdf:type :HighPriorityItem .
⇒ :Item1 :hasPriority :High
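The three restriction examples from the preceding slides can be replayed with a small rule loop. This Python sketch encodes each restriction as a tuple (a made-up encoding, not OWL syntax) and implements only the inference directions the slides show, plus the reverse direction of hasValue for an equivalent class; it is an illustration, not a description of any reasoner's internals.

```python
def apply_restrictions(triples, restrictions):
    """restrictions: (named_class, is_equivalent, kind, property, filler)."""
    inferred = set(triples)
    while True:
        new = set()
        for (cls, equivalent, kind, prop, filler) in restrictions:
            members = {s for (s, p, o) in inferred if p == "rdf:type" and o == cls}
            links = {(s, o) for (s, p, o) in inferred if p == prop}
            if kind == "someValuesFrom" and equivalent:
                # x prop y, y a filler  =>  x a cls
                fillers = {s for (s, p, o) in inferred
                           if p == "rdf:type" and o == filler}
                new |= {(x, "rdf:type", cls) for (x, y) in links if y in fillers}
            elif kind == "allValuesFrom":
                # x a cls, x prop y  =>  y a filler
                new |= {(y, "rdf:type", filler) for (x, y) in links if x in members}
            elif kind == "hasValue":
                # x a cls  =>  x prop filler
                new |= {(x, prop, filler) for x in members}
                if equivalent:
                    # x prop filler  =>  x a cls
                    new |= {(x, "rdf:type", cls) for (x, y) in links if y == filler}
        if new <= inferred:
            return inferred
        inferred |= new

RESTRICTIONS = [
    (":ApprovedPurchaseOrder", True, "someValuesFrom", ":approvedBy", ":Manager"),
    (":Vegetarian", False, "allValuesFrom", ":eats", ":VegetarianFood"),
    (":HighPriorityItem", True, "hasValue", ":hasPriority", ":High"),
]
FACTS = {
    (":PO1", ":approvedBy", ":managerXyz"),
    (":managerXyz", "rdf:type", ":Manager"),
    (":Jen", "rdf:type", ":Vegetarian"),
    (":Jen", ":eats", ":Marzipan"),
    (":Item1", "rdf:type", ":HighPriorityItem"),
}
result = apply_restrictions(FACTS, RESTRICTIONS)
```

Note how the allValuesFrom rule only fires for members of `:Vegetarian`, which is exactly why it is safer than a global rdfs:range on `:eats`.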
Web Ontology Language (OWL)
• Class related constructs
• Cardinality restrictions constrain the number of distinct individuals that can associate with a class instance via a particular property
  • owl:minCardinality
  • owl:maxCardinality
  • owl:cardinality
E.g. To express that a basketball game has at least 2 players
:BasketBallGame rdfs:subClassOf
  [ a owl:Restriction ;
    owl:onProperty :hasPlayer ;
    owl:minCardinality 2 ]
• Others
  • DatatypeProperty, AnnotationProperty, OntologyProperty, …
OWL 2
• More language constructs, better expressivity
  • Property Chains, Keys, Punning, etc.
  • http://www.w3.org/TR/owl2-new-features/
• OWL 2 Profiles
  • OWL 2 RL
  • OWL 2 EL
  • OWL 2 QL
OWL 2 RL
• Syntactic subset of OWL 2
– W3C standard profile
– Inspired by DLP, pD*
– Has more than 70 entailment rules
– Reasoning/conjunctive query answering is PTIME w.r.t. data/taxonomy complexity
– Defines a standard set of rules for implementation
• Oracle Spatial and Graph provides full support for OWL 2 RL/RDF ruleset
Example OWL2 RL Entailment Rules
• OWL2RL has 70+ entailment rules.
– E.g. rule:
  T(?p, owl:propertyChainAxiom, ?x)
  LIST[?x, ?p1, ..., ?pn]
  T(?u1, ?p1, ?u2)
  T(?u2, ?p2, ?u3)
  ...
  T(?un, ?pn, ?un+1) ⇒ T(?u1, ?p, ?un+1)

  T(?p, rdf:type, owl:FunctionalProperty)
  T(?x, ?p, ?y1)
  T(?x, ?p, ?y2) ⇒ T(?y1, owl:sameAs, ?y2)
• These rules have efficient implementations in Oracle Spatial and Graph
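Both rules can be written down directly as set comprehensions. The Python below is a sketch of the rule logic only and makes no claim about how Oracle Spatial and Graph implements them; the property names `:hasParent`, `:hasBrother`, `:hasUncle` and the individuals are invented, and the functional-property helper skips the trivial case where ?y1 and ?y2 are the same term.

```python
def property_chain(triples, p, chain):
    """If u1 p1 u2, u2 p2 u3, ..., un pn un+1 hold, derive (u1, p, un+1)."""
    # paths holds (start, end) pairs reachable by the chain prefix so far.
    paths = {(s, o) for (s, q, o) in triples if q == chain[0]}
    for step in chain[1:]:
        paths = {(start, o) for (start, mid) in paths
                 for (s, q, o) in triples if q == step and s == mid}
    return {(start, p, end) for (start, end) in paths}

def functional_same_as(triples, p):
    """If p is functional and x p y1, x p y2 hold, derive y1 owl:sameAs y2."""
    return {(y1, "owl:sameAs", y2)
            for (x1, q1, y1) in triples if q1 == p
            for (x2, q2, y2) in triples
            if q2 == p and x2 == x1 and y1 != y2}

# Hypothetical data for the chain rule: hasUncle = hasParent o hasBrother.
family = {(":ann", ":hasParent", ":bob"), (":bob", ":hasBrother", ":carl")}
uncles = property_chain(family, ":hasUncle", [":hasParent", ":hasBrother"])

# Hypothetical data for the functional-property rule.
births = {(":kid", ":hasBirthMother", ":m1"), (":kid", ":hasBirthMother", ":m2")}
same = functional_same_as(births, ":hasBirthMother")
```

The second rule shows that a functional property does not reject a second value; under OWL semantics it concludes that the two values denote the same individual.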
OWL 2 EL
• SNOMED-CT is a major application of OWL 2 EL
– Suitable for applications employing ontologies that define very large numbers of classes and/or properties
– One of the largest commercial biomedical ontologies
• Example rule for EL+ inference
  ?A rdfs:subClassOf ?A1
  …
  ?A rdfs:subClassOf ?An
  T(?C, owl:intersectionOf, ?x)
  LIST[?x, ?A1, ..., ?An] ⇒ ?A rdfs:subClassOf ?C
• Oracle Spatial and Graph provides full support for OWL 2 EL
Part 4: RDF View on Relational Data (RDB2RDF)
A Look at Some System Architectures
Typical Traditional Application Architecture
• Mid-tier server and database server
• Applications communicate via SQL/JDBC with RDBMS backend
• Multiple traditional relational schemas
• Issues
– Inflexible schema
– Limited semantics
– Limited interoperability
[Diagram: Application 1, Application 2 and Application 3 on the Mid-Tier Server issue SQL to the Database Server, which holds the HR Database (HR Schema), Inventory Database (Inventory Schema) and Sales Database (Sales Schema)]
A Look at Some System Architectures
Typical Semantic Application Architecture
• Common ontologies used to integrate datasets
• Applications communicate via SPARQL / HTTP with native triplestore backend
• Issues
– Radical change for customer
– Need to ETL to RDF
– RDF/OWL may not be necessary for the entire data
[Diagram: Application 1, Application 2 and Application 3 on the Mid-Tier Server issue SPARQL to the Triplestore, which holds the HR Graph, Sales Graph, Inventory Graph and Shared Ontologies]
Relational Data to RDF (RDB2RDF)
W3C Specification
The mission of the RDB2RDF Working Group, part of the Semantic Web Activity, is to standardize languages for mapping relational data and relational database schemas into RDF and OWL*.
The two languages are:
• Direct Mapping
– Automatically generates a mapping based on an input relational schema
• R2RML
– Language for expressing customized mappings
* http://www.w3.org/2001/sw/rdb2rdf/
A Look at Some System Architectures
How about RDB2RDF?
• Use virtual RDF data
• Benefits
– Existing relational data stays in place and corresponding applications do not need to change
– Use of virtual mapping eliminates synchronization issues
– Common vocabulary helps with data integration issues
[Diagram: Application 1, Application 2 and Application 3 on the Mid-Tier Server use SQL and SPARQL against the Database Server; RDB2RDF exposes the HR, Inventory and Sales Databases (HR Schema, Inventory Schema, Sales Schema) as an Inventory Graph and a Sales Graph alongside Shared Ontologies]
Using R2RML: Overall Flow
[Diagram components: Source Relational Database; R2RML Map Author; R2RML Document (Map: Classes and Predicates → DB Objects); R2RML Processor; Schema: Classes and Predicates; Query Writer; SPARQL QUERY; SPARQL to SQL Translator]
R2RML Basics: Mapping
• TriplesMap: Row of a Logical Table (Table / View / SQL query) → triples
• SubjectMap: Primary key value of a Row → subject (+ static class)
• PredicateMap: Names of columns and constraints → predicates (incl. rdf:type)
• ObjectMap: Values in columns or foreign keys → objects (+ dynamic class)
[Diagram: Logical Table A and Logical Table B, each annotated with subject, class and predicates, with logical constraints]
Vocabulary: R2RML Classes and Relations
[Diagram annotated with relation names: rr:logicalTable, rr:subjectMap (rr:subject), rr:predicateObjectMap, rr:predicateMap (rr:predicate), rr:objectMap (rr:object), rr:graphMap (rr:graph), rr:parentTriplesMap, rr:joinCondition]
Source: R2RML: RDB to RDF Mapping Language, W3C Candidate Recommendation 23 February 2012
http://www.w3.org/TR/2012/CR-r2rml-20120223/
Mapping EMP and DEPT Tables to RDF
EMP (pkey: ENO; DNO references DEPT)
ENO   ENAME   EXPERTISE   DNO
1     John    DB          100
DEPT (pkey: DNO)
DNO   DNAME   LOC
100   Sales   NYC
Subject: http://x.com/Dept/{DNO}
class: x:Department
Generated RDF (subject, then predicate-object pairs):
<http://x.com/Dept/100>
  rdf:type x:Department ;
  <../Dept/Deptno> 100 ;
  <../Dept/DeptName> "Sales" ;
  <../Dept/Location> "NYC"
.
Mapping EMP and DEPT Tables to RDF
EMP (pkey: ENO; DNO references DEPT)
ENO   ENAME   EXPERTISE   DNO
1     John    DB          100
DEPT (pkey: DNO)
DNO   DNAME   LOC
100   Sales   NYC
Subject: http://x.com/Emp/{ENO}
class: x:Employee
Generated RDF (subject, then predicate-object pairs):
<http://x.com/Emp/1>
  rdf:type x:Employee ;
  <../Emp/Empno> 1 ;
  <../Emp/EmpName> "John" ;
  <../Emp/Expertise> "DB" ;
  <../Emp/DeptNum> 100 ;
  <../Emp/Department> <http://x.com/Dept/100>
.
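The row-to-triples idea behind these two slides can be sketched in Python as follows. This is an illustration of the mapping pattern, not an R2RML processor: `map_rows` is a hypothetical helper, rows are plain dictionaries, the abbreviated predicate IRIs are copied from the slides as opaque strings, and NULL column values (modelled as `None`) are assumed to produce no triple.

```python
DEPT = [{"DNO": 100, "DNAME": "Sales", "LOC": "NYC"}]
EMP = [{"ENO": 1, "ENAME": "John", "EXPERTISE": "DB", "DNO": 100}]

def map_rows(rows, subject_template, rdf_class, columns, references=()):
    """Turn each row into triples: one subject, a class, column values,
    and IRI-valued objects for foreign-key references."""
    triples = []
    for row in rows:
        # Subject IRI comes from a template over the primary key column.
        subject = subject_template.format(**row)
        triples.append((subject, "rdf:type", rdf_class))
        for column, predicate in columns:
            if row[column] is not None:
                triples.append((subject, predicate, row[column]))
        for column, predicate, parent_template in references:
            if row[column] is not None:
                # The object is the parent row's subject IRI, not the raw key.
                triples.append((subject, predicate, parent_template.format(**row)))
    return triples

dept_triples = map_rows(
    DEPT, "http://x.com/Dept/{DNO}", "x:Department",
    [("DNO", "<../Dept/Deptno>"), ("DNAME", "<../Dept/DeptName>"),
     ("LOC", "<../Dept/Location>")])

emp_triples = map_rows(
    EMP, "http://x.com/Emp/{ENO}", "x:Employee",
    [("ENO", "<../Emp/Empno>"), ("ENAME", "<../Emp/EmpName>"),
     ("EXPERTISE", "<../Emp/Expertise>"), ("DNO", "<../Emp/DeptNum>")],
    references=[("DNO", "<../Emp/Department>", "http://x.com/Dept/{DNO}")])
```

The foreign key yields two triples for the employee: one with the raw department number and one linking to the department's subject IRI, matching the slide's output.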
Schema for Generated RDF data
• Classes
– x:Department
– x:Employee
• Properties
– <../Dept/Deptno>
– <../Dept/DeptName>
– <../Dept/Location>
– <../Emp/Empno>
– <../Emp/EmpName>
– <../Emp/Expertise>
– <../Emp/DeptNum>
– <../Emp/Department>
R2RML Mapping for EMP (w/ multi-ObjectMap)
EMP columns: Empno NUMBER, Ename Varchar, W_phone NUMBER, C_phone Varchar, H_phone Varchar, W_addr Varchar, H_addr Varchar, DeptNo NUMBER
Tmap → <#EmpTM>
  rr:logicalTable [
    rr:tableName "EMP"
  ].
Smap → []
  rr:template "http://ex.org/E/{EMPNO}" ;
  rr:class ex:Employee .
POmap → []
  rr:predicate em:phone ;
  rr:objectMap
    [ rr:column "W_PHONE" ]
  , [ rr:column "C_PHONE" ]
  , [ rr:column "H_PHONE" ] .
POmap → [] (RefObjectMap)
  rr:predicate em:dept ;
  rr:objectMap [ rr:parentTriplesMap <#DeptTM> ;
                 rr:joinCondition [ rr:child "DEPTNO" ; rr:parent "DEPTNO" ]].
R2RML Mapping: using "R2RML Views"
<#DeptTableView> rr:sqlQuery """
  SELECT DEPTNO, DNAME, LOC
       , (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
  FROM DEPT """.
<#TriplesMap1>
  rr:logicalTable <#DeptTableView>;
  rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}";
                  rr:class ex:Department; ];
  rr:predicateObjectMap [ rr:predicate ex:name; rr:objectMap [ rr:column "DNAME" ]; ];
  rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:column "LOC" ]; ];
  rr:predicateObjectMap [ rr:predicate ex:staff; rr:objectMap [ rr:column "STAFF" ]; ].
R2RML Mapping: Translating type codes to IRIs
R2RML Mapping
<#TriplesMap1>
  rr:logicalTable [ rr:sqlQuery """
    SELECT *, (CASE JOB WHEN 'CLERK' THEN 'general-office'
                        WHEN 'NIGHTGUARD' THEN 'security'
                        WHEN 'ENGINEER' THEN 'engineering'
               END) ROLE
    FROM EMP
  """ ];
  rr:subjectMap [ rr:template "http://data.example.com/employee/{EMPNO}"; ];
  rr:predicateObjectMap [
    rr:predicate ex:role;
    rr:objectMap [ rr:template "http://data.example.com/roles/{ROLE}" ]; ].
Generated RDF triple
<http://data.example.com/employee/7369> ex:role <http://data.example.com/roles/general-office> .
Part 5: Key Features of RDF Semantic Graph, a feature of Oracle Spatial and Graph
Oracle Spatial and Graph Option
RDF Semantic Graph Feature:
• Native RDF Database
• Named graph support
• Supports SPARQL 1.1, SPARQL/SQL, GeoSPARQL
• Jena, Sesame, & Joseki Web Services
• W3C standards: RDFS, OWL 2 RL, OWL 2 EL, SKOS, RDF, RDB2RDF, SPARQL 1.1, RDFa
• Scales with hardware – petabytes of triples
• Works with OBIEE, Oracle BPM, Advanced Analytics
• Exploits: Exadata, RAC, Parallelism, Label Security
Oracle Spatial and Graph RDF Triple Store
Leverages Oracle Manageability:
• RAC & Exadata scalability
• Compression & partitioning
• SQL*Loader direct path load
• Parallel load, inference, query
• High Availability
• Triple-level label security
• Ladder based inference
• Choice of SPARQL, SQL, or Java
• Native inference engine
• Enterprise Manager
Load / Storage
• Native RDF graph data store
• Manages billions of triples
• Optimized storage architecture
Query
• SPARQL-Jena/Joseki, Sesame
• SQL/graph query, B-tree indexing
• Ontology assisted SQL query
Reasoning
• RDFS, OWL2 RL, EL, SKOS
• User-defined rules
• Incremental, parallel reasoning
• User-defined inferencing
• Plug-in architecture
Analytics
• Semantic indexing framework
• Integration with
  • OBIEE, Oracle R Enterprise
  • Oracle Data Mining
New functions in Oracle Spatial and Graph
• Open Geospatial Consortium (OGC) GeoSPARQL
• Native SPARQL 1.1 query support
– 40+ new query functions/operators: IF, COALESCE, STRBEFORE, REPLACE, ABS, …
– Aggregates: COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT, SAMPLE
– Subqueries
– Value Assignment: BIND, GROUP BY Expressions, SELECT Expressions
– Negation: NOT EXISTS, MINUS
– Improved Path Searching with Property Paths
New functions in Oracle Spatial and Graph
• RDF views on relational tables (through RDB2RDF)
– RDF views can be created on a set of relational tables and/or views
– SPARQL queries access data from both a relational and RDF store
– Support RDF view creation using
• Direct Mapping: simple and straightforward to use
• R2RML Mapping: customizations allowed
• Inference
– Native OWL 2 EL inference support
– User defined inferencing
– Ladder Based Inference
– Performance optimization for user defined rules
– Integration with TrOWL, an external OWL 2 reasoner
* http://trowl.eu/
Jena Adapter for Oracle Database
• Requires Apache Jena 2.7.2, ARQ 2.9.2, Joseki 3.4.4, Oracle Database
release 11.2.0.3 or higher
• SPARQL 1.1 compliance
• Named Graph (quads) support: DatasetGraphOracleSem
• N-QUADS, TriG data format
• Updated StatusListener interface
• Named graph queries through Joseki web service endpoint
• SPARQL Update through Joseki web service endpoint
• JSON output
Jena Adapter for Oracle Database (2)
• An efficient approach to better support OntModel APIs: OracleGraphWrapperForOntModel
• Support for reserved SQL (PL/SQL) keywords
select ?date { :event :happenedOn ?date }
• Named graph based local inference: Attachment
- public void setUseLocalInference(boolean useLocalInference)
- public boolean getUseLocalInference()
- public void setDefGraphForLocalInference(String defaultGraphName)
- public String getDefGraphForLocalInference()
- public String getInferenceOption()
- public void setInferenceOption(String inferenceOption)
Jena Adapter for Oracle Database (3)
• Analytical functions for RDF data: SemNetworkAnalyst
• Integrating Oracle Spatial and Graph network data model (NDM) with
RDF Semantic Graph feature
• Provides functions including shortest path, within cost, partitioning, …
• Extensible architecture
Oracle Spatial and Graph Native Inference Engine
Core Inference Features in Oracle Database
• Oracle provides native inference in the database for
• RDFS, RDFS++
• OWLPRIME, OWL2RL, OWL2EL, SKOS
• User-defined rules
• Inference done using forward chaining
• Triples inferred and stored ahead of query time
• Removes on-the-fly reasoning and results in fast query times
• Proof generation
• Shows one deduction path
Native Inference Engine in Oracle: APIs
SEM_APIS.CREATE_ENTAILMENT(
• entailment_name,
• sem_models('GraphTBox', 'GraphABox', …),
• sem_rulebases('OWL2RL'),
• passes,
• inf_components,
• options, …
)
Use "PROOF=T" to generate inference proof
Typical Usage:
• First load RDF/OWL data
• Call create_entailment to generate inferred graph
• Query both original graph and inferred data
Inferred graph contains only new triples! Saves time & resources
SEM_APIS.VALIDATE_ENTAILMENT(
• sem_models('GraphTBox', 'GraphABox', …),
• sem_rulebases('OWL2RL'),
• criteria,
• max_conflicts,
• options
)
Typical Usage:
• First load RDF/OWL data
• Call create_entailment to generate inferred graph
• Call validate_entailment to find inconsistencies
Java API: performInference, deleteInference, setInferenceOption, analyze methods in GraphOracleSem, DatasetGraphOracleSem (Jena Adapter)
Native Inference Engine in Oracle
• Leverage SQL and relational technologies (partitioning, compression)
• Parallel Execution
e.g. RDFS9 Rule Implemented in SQL
select distinct T2.SID, ID(rdf:type), T1.OID
from <IVIEW> T1, <IVIEW> T2
where T1.PID=ID(rdfs:subClassOf)
and T2.PID=ID(rdf:type)
and T1.SID=T2.OID
and not exists (
select 1 from <IVIEW> m
where m.SID=T2.SID
and m.PID=ID(rdf:type)
and m.OID=T1.OID)
- Implementing an Inference Engine for RDFS/OWL Constructs, ICDE 2008
- Optimizing Enterprise-scale OWL 2 RL Reasoning in a Relational Database System, ISWC 2010
- Advancing the Enterprise-class OWL Inference Engine in Oracle Database, ORE 2012
- Making the Most of your Triple Store: Query Answering in OWL 2 Using an RL Reasoner, WWW 2013
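The RDFS9 query is a self join of the triple view plus an anti-join that discards triples already present. The Python sketch below reproduces that logic on an in-memory set of triples to make the join structure visible. It is only an illustration under assumed data: the sample triples and the function names are hypothetical, and the database evaluates the rule in SQL over ID-encoded triples, not in application code.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs9_step(triples):
    """One pass of RDFS9: (x type C), (C subClassOf D) => (x type D), new triples only."""
    existing = set(triples)
    # T1: subclass axioms indexed by the subclass (T1.SID).
    superclasses = {}
    for s, p, o in existing:
        if p == SUBCLASS_OF:
            superclasses.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in existing:  # T2: type assertions
        if p == RDF_TYPE:
            for sup in superclasses.get(o, ()):  # join on T1.SID = T2.OID
                candidate = (s, RDF_TYPE, sup)
                if candidate not in existing:  # the NOT EXISTS anti-join
                    inferred.add(candidate)
    return inferred


def materialize(triples):
    """Forward chaining: repeat the rule until a pass produces nothing new."""
    closure = set(triples)
    while True:
        new = rdfs9_step(closure)
        if not new:
            return closure
        closure |= new


# Hypothetical sample data.
data = {
    (":rex", RDF_TYPE, ":Dog"),
    (":Dog", SUBCLASS_OF, ":Mammal"),
    (":Mammal", SUBCLASS_OF, ":Animal"),
}
first_pass = rdfs9_step(data)
closure = materialize(data)
```

With only this one rule, the first pass yields the :Mammal typing and a second pass is needed to reach :Animal, which is why the rule is applied repeatedly until a fixpoint.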
Native Parallel Inference Engine in Oracle (3)
• Transitive closure calculation
– E.g. ?a owl:sameAs ?b . ?b owl:sameAs ?c → ?a owl:sameAs ?c
– Partition (distance d) based iterative approach
– P1 has :a => :b, :b => :c (asserted)
– Join P1 with P1 to produce P2
– P2 has :a => :c (assign distance 2)
– Join P1 with P2 to produce P3
– …
[Figure: :a owl:sameAs :b (d=1), :b owl:sameAs :c (d=1), inferred :a owl:sameAs :c (d=2)]
- Implementing an Inference Engine for RDFS/OWL Constructs, ICDE 2008
- On the Computation of the Transitive Closure of Relational Operators, VLDB 1986
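The distance-partitioned iteration can be sketched in a few lines of Python. Each round joins the asserted partition P1 with the newest partition and keeps only pairs that have not been seen at a shorter distance. This is a simplified, single-threaded illustration with assumed names; it treats the edges as directed, as the slide does, and does not model how the database partitions or parallelizes the work.

```python
def transitive_closure(edges):
    """Return {(a, c): distance} for every pair reachable through the asserted edges."""
    p1 = set(edges)                  # P1: asserted pairs, distance 1
    known = {pair: 1 for pair in p1}
    frontier = p1                    # the most recently produced partition
    distance = 1
    while frontier:
        distance += 1
        new_pairs = set()
        for a, b in p1:              # join P1 with the latest partition
            for b2, c in frontier:
                if b == b2 and (a, c) not in known:
                    new_pairs.add((a, c))
        for pair in new_pairs:
            known[pair] = distance   # assign the partition's distance
        frontier = new_pairs
    return known


# The two asserted pairs from the slide.
closure = transitive_closure([(":a", ":b"), (":b", ":c")])
```

Because only the newest partition is joined in each round, pairs already derived at distance d are not re-derived in later rounds.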
Native Parallel Inference Engine in Oracle (4)
• Incremental Maintenance when there is a small addition Delta to a big graph G
?x :hasParent ?y . ?y :hasBrother ?z → ?x :hasUncle ?z
Graph G self join Graph G
[Figure: graph G with nodes family:John, family:Jack, family:Mary, ns:Jill, ns:Jedi and edges :hasParent, :hasBrother, :relatedTo, :hasUncle]
How to efficiently handle a new edge :Mary :hasBrother :Tony ?
Native Parallel Inference Engine in Oracle (4)
• Incremental Maintenance when there is a small addition Delta to a big graph G
?x :hasParent ?y . ?y :hasBrother ?z → ?x :hasUncle ?z
Graph G self join Graph G
– To update the closure when Graph G becomes Graph G’, we can perform 3 efficient small joins rather than 1 big join.
Adding a new edge :Mary :hasBrother :Tony
[Figure: graph G before and after adding family:Tony via :hasBrother, with the newly inferred :hasUncle edge]
Approach              ?x :hasParent ?y    ?y :hasBrother ?z
I: A single big join  Graph G’            Graph G’
II: 3 small joins     Delta               Graph G
                      Graph G             Delta
                      Delta               Delta
- Optimizing Enterprise-scale OWL 2 RL Reasoning in a Relational Database System, ISWC 2010
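The two approaches in the table can be compared directly on a toy graph. In the Python sketch below, the full rejoin recomputes the rule over G' = G + Delta, while the incremental version runs the three small joins that each involve Delta on at least one side. The edge endpoints are assumptions (the figure's exact edges are not recoverable from the transcript), and the function names are hypothetical.

```python
def join_uncle(parent_edges, brother_edges):
    """?x :hasParent ?y . ?y :hasBrother ?z  =>  (?x, ?z) pairs for :hasUncle."""
    brothers = {}
    for y, z in brother_edges:
        brothers.setdefault(y, set()).add(z)
    return {(x, z) for x, y in parent_edges for z in brothers.get(y, ())}


def full_rejoin(g_parent, g_brother, d_parent, d_brother):
    """Approach I: one big join over G' = G + Delta."""
    return join_uncle(g_parent | d_parent, g_brother | d_brother)


def incremental(g_parent, g_brother, d_parent, d_brother):
    """Approach II: three small joins, each touching Delta on at least one side."""
    return (join_uncle(d_parent, g_brother)
            | join_uncle(g_parent, d_brother)
            | join_uncle(d_parent, d_brother))


# Assumed contents of G and Delta.
g_parent = {("family:John", "family:Mary")}
g_brother = {("family:Mary", "family:Jack")}
d_parent = set()
d_brother = {("family:Mary", "family:Tony")}   # the new edge

already_inferred = join_uncle(g_parent, g_brother)
newly_inferred = incremental(g_parent, g_brother, d_parent, d_brother)
```

The union of the previously inferred pairs and the incremental result equals the full rejoin, but the incremental joins never pair the large graph G with itself.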
Extending Semantics Supported by Native OWL Inference Engine
• Option 1: Add user-defined rules
• Oracle supports, from release 10g, user-defined rules:
Antecedents: ?z :parentOf ?x . ?z :parentOf ?y . ?x owl:differentFrom ?y .
→
Consequents: ?x :siblingOf ?y
• Option 2: Leverage external DL reasoners
• Option 3: User-defined inferencing in Oracle Database Release 12c
Extensible Architecture for External Reasoners
[Figure: components shown: external in-memory OWL DL reasoners (TrOWL/REL); Jena APIs through Oracle’s Jena Adapter; native inference engine for OWL 2 RL, EL & user-defined rules; materialized inference]
Enabling Advanced Inference Capabilities
• Parallel inference option
EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'),
sem_rulebases('OWLPRIME'), null, null, 'DOP=x');
– Where 'x' is the degree of parallelism (DOP)
• Incremental inference option
EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'),
sem_rulebases('OWLPRIME'),null,null, 'INC=T');
• Enabling owl:sameAs option to limit duplicates
EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'),
sem_rulebases('OWLPRIME'),null,null,'OPT_SAMEAS=T');
• Compact data structures
EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'),
sem_rulebases('OWLPRIME'),null,null, 'RAW8=T');
• OWL2RL/SKOS inference
EXECUTE sem_apis.create_entailment('M_IDX',sem_models('M'), sem_rulebases(x),null,null…);
• x in ('OWL2RL','SKOSCORE')
Querying RDF Semantic Graph
SPARQL Query Architecture
[Figure: Java clients reach the Jena Adapter through the Jena API and the Sesame Adapter through the Sesame API; HTTP clients use a standard SPARQL endpoint enhanced with query management control; SQL clients call SEM_MATCH; all routes share the SPARQL-to-SQL core logic]
SPARQL vs. SQL
SPARQL NULL-accepting JOIN vs. SQL NULL-rejecting JOIN
{ ?p :name ?n
OPTIONAL {?p :email ?e}
OPTIONAL {?p :mbox ?e} }
Left side of the second OPTIONAL:
?p ?n ?e
<p1> "Jon Stewart" "[email protected]"
<p2> "John Smith" NULL
Right side (?p :mbox ?e):
?p ?e
<p1> "[email protected]"
<p2> "[email protected]"
Join condition: ?p = ?p AND ?e = ?e
SPARQL result:
?p ?n ?e
<p1> "Jon Stewart" "[email protected]"
<p2> "John Smith" "[email protected]"
SQL result:
?p ?n ?e
<p1> "Jon Stewart" "[email protected]"
<p2> "John Smith" NULL
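The difference between the two join semantics can be reproduced with plain dictionaries. In the Python sketch below, an unbound variable is modeled as None: the SPARQL-style join treats an unbound shared variable as compatible with any value, while the SQL-style join rejects a NULL key. The e-mail addresses are placeholders (the originals are redacted in the transcript) and the function names are hypothetical; this illustrates the semantics, not the SPARQL-to-SQL translation.

```python
def compatible(left, right):
    """SPARQL compatibility: shared variables agree wherever both are bound."""
    return all(left[v] == right[v]
               for v in left.keys() & right.keys()
               if left[v] is not None and right[v] is not None)


def merge(left, right):
    """Combine two rows, letting a bound value fill in an unbound one."""
    merged = dict(left)
    for var, value in right.items():
        if merged.get(var) is None:
            merged[var] = value
    return merged


def sparql_left_join(left_rows, right_rows):
    """NULL-accepting left join, as used for OPTIONAL."""
    out = []
    for l in left_rows:
        matches = [merge(l, r) for r in right_rows if compatible(l, r)]
        out.extend(matches or [dict(l)])
    return out


def sql_left_join(left_rows, right_rows, keys):
    """NULL-rejecting left join: equality on a NULL key never matches."""
    out = []
    for l in left_rows:
        matches = [merge(l, r) for r in right_rows
                   if all(l[k] is not None and l[k] == r[k] for k in keys)]
        out.extend(matches or [dict(l)])
    return out


# Placeholder data shaped like the slide's tables.
left = [{"p": "p1", "n": "Jon Stewart", "e": "jon@example.org"},
        {"p": "p2", "n": "John Smith", "e": None}]
right = [{"p": "p1", "e": "jon@example.org"},
         {"p": "p2", "e": "john@example.org"}]

sparql_rows = sparql_left_join(left, right)
sql_rows = sql_left_join(left, right, keys=("p", "e"))
```

Only the SPARQL-style join fills in the second person's address, because the unbound ?e from the first OPTIONAL does not block the match.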
SPARQL vs. SQL
• SPARQL weak typing vs. SQL strong typing
FILTER (?a + ?b > 10) … doesn’t complain if ?a, ?b are not numbers.
Also, when ?a and/or ?b are not numbers, the expression raises a type error, so
FILTER(?a + ?b > 10) and FILTER(!(?a + ?b > 10)) both reject the solution (both act as FALSE)
FILTER(?a != ?b), etc. are complicated
• No Boolean type in SQL
• SPARQL UNION
SPARQL UNION is actually SQL UNION ALL
SPARQL graph patterns DO NOT have to be UNION compatible
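The FILTER behavior described above follows from SPARQL's error semantics: a type error is neither true nor false, and a FILTER keeps a solution only when its condition is true. The Python sketch below models that with an exception. The class and function names are assumptions chosen for the example.

```python
class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error raised by an operator."""


def plus(a, b):
    """Numeric addition that raises a type error for non-numeric operands."""
    for v in (a, b):
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise SparqlTypeError
    return a + b


def keep(solution, condition):
    """A FILTER keeps a solution only if the condition is true; errors drop it."""
    try:
        return condition(solution) is True
    except SparqlTypeError:
        return False


def gt10(s):
    return plus(s["a"], s["b"]) > 10


def not_gt10(s):
    return not gt10(s)  # negating an error is still an error


numeric = {"a": 7, "b": 5}
non_numeric = {"a": "seven", "b": 5}
```

For the numeric solution exactly one of the two filters passes; for the non-numeric one neither does, which has no direct counterpart in two-valued Boolean logic.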
SEM_MATCH: SPARQL in SQL
SEM_MATCH: Adding SPARQL to SQL
• Extends SQL with SPARQL constructs
– Graph Patterns, OPTIONAL, UNION
– Dataset Constructs
– FILTER – including SPARQL built-ins
– Solution Modifiers
• Benefits:
– Integrates graph data with existing enterprise data
– JOINs with other object-relational data
– Allows SQL constructs/functions
– DDL Statements: create tables/views
SEM_MATCH: Adding SPARQL to SQL
SELECT n1, n2
FROM TABLE(SEM_MATCH(
'PREFIX foaf: <http://...>
SELECT ?n1 ?n2
FROM <http://g1>
WHERE {?p foaf:name ?n1
OPTIONAL {?p foaf:knows ?f .
?f foaf:name ?n2 }
FILTER (REGEX(?n1, "^A")) }
ORDER BY ?n1 ?n2',
SEM_MODELS('M1'),…));
SEM_MATCH: Adding SPARQL to SQL
SQL Table Function
Result of the SEM_MATCH query:
n1      n2
Alex    Jerry
Alex    Tom
Alice   Bill
Alice   Jill
Alice   John
SEM_MATCH: Adding SPARQL to SQL
SQL Table Function, Rewritable
The SEM_MATCH call is rewritten into:
( SELECT v1.value AS n1, v2.value AS n2
FROM VALUES v1, VALUES v2,
TRIPLES t1, TRIPLES t2, …
WHERE t1.obj_id = v1.value_id
AND t1.pred_id = 1234
AND …
)
Get 1 unified SQL query
- Query optimizer sees 1 query
- Get all the performance of Oracle SQL Engine: optimizer, compression, indexes, parallelism, etc.
SEM_MATCH Table Function Arguments
SEM_MATCH(
query,
models,
rulebases,
options
);
• query: 'SELECT ?a WHERE { ?a foaf:name ?b }'
• models: container(s) for asserted quads; basic unit of access control
• rulebases: built-in (e.g. OWL2RL) and user-defined rulebases
• models + rulebases: entailed triples
• options: 'ALLOW_DUP=T STRICT_TERM_COMP=F'
GovTrack RDF Data
RDF/OWL data about activities of US Congress
• Political Party Membership
• Voting Records
• Bill Sponsorship
• Committee Membership
• Offices and Terms
GovTrack in Oracle
• Semantic Models: GOV_TBOX, GOV_PEOPLE, GOV_BILLS_110, GOV_BILLS_111, GOV_VOTES_110, GOV_VOTES_111, GOV_DISTRICTS (US Census)
• Rulebases: OWL2RL
• Entailments: GOV_TRACK_OWL (inference)
• Virtual Models: GOV_ASSERT_VM, asserted data only (4.3M triples); GOV_ALL_VM, asserted + inferred (4.7M triples)
Live Demo: SPARQL SELECT Modifiers
Find all distinct family names for senators
SELECT f$rdfterm
FROM TABLE(SEM_MATCH(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
SELECT DISTINCT ?f
WHERE
{ ?p vcard:N ?vn .
?vn vcard:Family ?f .
?p foaf:title "Sen." .
}'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: SPARQL FILTER
Find information about people named Adams who were born before the American Revolution
select n$rdfterm, b$rdfterm, g$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?n ?b ?g
WHERE
{ ?p vcard:N ?vn .
?vn vcard:Family ?f .
?p foaf:name ?n .
?p vcard:BDAY ?b .
?p foaf:gender ?g
FILTER ( ?f = "Adams" &&
xsd:dateTime(?b) < xsd:dateTime("1776-07-04") )
}'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: SPARQL MINUS
Find information about people named Kennedy who don’t have a homepage
select n$rdfterm, b$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?n ?b
WHERE
{ ?p vcard:N ?vn .
?vn vcard:Family "Kennedy" .
?p foaf:name ?n .
?p vcard:BDAY ?b .
MINUS { ?p foaf:homepage ?h }
}'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: Aggregation
Who sponsored the most bills?
select n$rdfterm, cnt$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
SELECT ?n (COUNT(*) AS ?cnt)
WHERE
{ ?s foaf:name ?n .
?b bill:sponsor ?s .
}
GROUP BY ?n
ORDER BY DESC(?cnt)
LIMIT 10'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: Named Graph Query
Find number of bills sponsored in the 110th and 111th Congress
select n$rdfterm, g$rdfterm, bcnt$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
SELECT ?n ?g (count(?b) as ?bcnt)
FROM usgov:people
FROM NAMED usgov:bills_110
FROM NAMED usgov:bills_111
WHERE
{ ?s foaf:name ?n
GRAPH ?g { ?b bill:sponsor ?s }
}
GROUP BY ?n ?g
ORDER BY ?n ?g'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: Inference
[Figure: GovTrack Bill Types]
Live Demo: Entailment
Find bills sponsored by Barack Obama and their types
select title$rdfterm, dt$rdfterm, btype$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?dt ?btype
WHERE
{ ?s foaf:name "Barack Obama" .
?b bill:sponsor ?s .
?b dc:title ?title .
?b rdf:type ?btype .
?b bill:introduced ?dt
FILTER(xsd:dateTime("2007-03-28") <= xsd:dateTime(?dt) &&
xsd:dateTime(?dt) < xsd:dateTime("2007-04-01") ) }'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: Entailment with Property Path
Find bills sponsored by Barack Obama and their types
select title$rdfterm, dt$rdfterm, btype$rdfterm, sc$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title ?dt ?btype ?sc
WHERE
{ ?s foaf:name "Barack Obama" .
?b bill:sponsor ?s .
?b dc:title ?title .
?b rdf:type ?btype .
?btype rdfs:subClassOf+ ?sc .
?b bill:introduced ?dt
FILTER(xsd:dateTime("2007-03-28") <= xsd:dateTime(?dt) &&
xsd:dateTime(?dt) < xsd:dateTime("2007-04-01") ) }'
,sem_models('gov_assert_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Full Text Indexing with Oracle Text
• Filters graph patterns based on text search string
• Indexes all RDF Terms
– URIs, Literals, Language Tags, etc.
• Provide SPARQL extension function
– orardf:textContains(?var, "Oracle text search string")
– Search String
• Group Operators: AND, OR, NOT, NEAR, …
• Term Operators: stem($), soundex(!), wildcard(%)
SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/text');
Live Demo: Full Text Search
Find information about bills related to children and taxes
select n$rdfterm, title$rdfterm, dt$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?n ?title ?dt
WHERE
{ ?b bill:sponsor ?s .
?s foaf:name ?n .
?b dc:title ?title .
?b bill:introduced ?dt
FILTER (orardf:textContains(?title, "$children AND $taxes"))
}'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
GeoSPARQL with Oracle Spatial and Graph
• Support geometries encoded as ogc:wktLiterals
:semTech2011 ogc:asWKT "POINT(-122.4192 37.7793)"^^ogc:wktLiteral .
• Provide a library of spatial functions
SELECT ?s
WHERE { ?s ogc:asWKT ?geom
FILTER(ogcf:distance(?geom,
"POINT(-122.4192 37.7793)"^^ogc:wktLiteral,
uom:KM) <= 10) }
OGC wktLiteral Datatype
• Optional leading Spatial Reference System URI followed by OGC WKT geometry string.
<http://xmlns.oracle.com/rdf/geo/srid/{srid}>
• WGS 84 Longitude, Latitude is the default SRS (assumed if SRS URI is absent)
SRS: WGS84 Longitude, Latitude
"POINT(-122.4192 37.7793)"^^ogc:wktLiteral
SRS: NAD27 Longitude, Latitude
"<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)"^^ogc:wktLiteral
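The lexical form described above (an optional SRS URI in angle brackets, then the WKT text) is easy to split apart. The Python sketch below does that and falls back to a default SRID when the URI is absent. The default of 8307 is an assumption taken from the SRID used in the spatial index example in these slides, and the function name is hypothetical; the database does this parsing internally.

```python
import re

SRID_URI_PREFIX = "http://xmlns.oracle.com/rdf/geo/srid/"
DEFAULT_SRID = 8307  # assumed SRID for WGS 84 longitude/latitude

# Optional "<uri>" prefix, then the WKT text; surrounding whitespace is ignored.
_LITERAL = re.compile(r"^\s*(?:<([^>]+)>\s*)?(.+?)\s*$", re.DOTALL)


def parse_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (srid, wkt_text)."""
    match = _LITERAL.match(lexical)
    if match is None:
        raise ValueError("empty wktLiteral")
    uri, wkt = match.groups()
    if uri is None:
        return DEFAULT_SRID, wkt  # no SRS URI: the default SRS applies
    if not uri.startswith(SRID_URI_PREFIX):
        raise ValueError("unsupported SRS URI: " + uri)
    return int(uri[len(SRID_URI_PREFIX):]), wkt


wgs84 = parse_wkt_literal("POINT(-122.4192 37.7793)")
nad27 = parse_wkt_literal(
    "<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)")
```

Keeping the SRID next to the WKT text is what allows coordinate system transformations to be applied during indexing and query.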
OGC wktLiteral Datatype
• Prepare for spatial querying by creating a spatial index for the ogc:wktLiteral datatype
SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/geo/WKTLiteral',
options=>'TOLERANCE=1.0 SRID=8307
DIMENSIONS=((LONGITUDE,-180,180)(LATITUDE,-90,90))');
What Types of Spatial Data are Supported?
• Spatial Reference Systems
– Built-in support for 1000’s of SRS
– Plus you can define your own
– Coordinate system transformations applied transparently during indexing and query
• Geometry Types
– Support OGC Simple Features geometry types
• Point, Line, Polygon
• Multi-Point, Multi-Line, Multi-Polygon
• Geometry Collection
– Up to 500,000 vertices per Geometry
Spatial Function Library
Standard OGC functions
• Topological Relations
– ogcf:relate, ogcf:sfContains, ogcf:sfCrosses, ogcf:sfDisjoint, ogcf:sfEquals, ogcf:sfIntersects, ogcf:sfOverlaps, ogcf:sfTouches, ogcf:sfWithin
• Distance-based Operations
– ogcf:distance, ogcf:buffer
• Geometry Operations
– ogcf:boundary, ogcf:convexHull, ogcf:envelope, ogcf:getSRID
• Geometry-Geometry Operations
– ogcf:difference, ogcf:intersection, ogcf:symDifference, ogcf:union
Spatial Function Library
Oracle Extensions
• Topological Relations
– orageo:relate
• Distance-based Operations
– orageo:distance, orageo:withinDistance, orageo:buffer, orageo:nearestNeighbor
• Geometry Operations
– orageo:area, orageo:length
– orageo:centroid, orageo:mbr, orageo:convexHull
• Geometry-Geometry Operations
– orageo:intersection, orageo:union, orageo:difference, orageo:xor
GovTrack Spatial Demo
• Congressional District Polygons (435)
– Complex Geometries
– Average over 1000 vertices per geometry
• Workflow: load .shp file from US Census into Oracle Spatial → generate triples using sdo_util.to_wktgeometry() → load into Oracle semantic model
Live Demo: Spatial Query
Find the representative and district for Nashua, NH
select name$rdfterm, cdist$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
SELECT ?name ?cdist
WHERE
{ ?person foaf:name ?name .
?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
?cdist ogc:asWKT ?cgeom
FILTER (ogcf:sfContains(?cgeom,
"POINT(-71.46444 42.7575)"^^ogc:wktLiteral)) } '
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));
Live Demo: Spatial Query
Find the 10 nearest congressional districts to Nashua, NH ordered by distance
select name$rdfterm, cdist$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom: <http://xmlns.oracle.com/rdf/geo/uom/>
SELECT ?name ?cdist
WHERE
{ ?person foaf:name ?name .
?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
?cdist ogc:asWKT ?cgeom
FILTER (orageo:nearestNeighbor(?cgeom,
"POINT(-71.46444 42.7575)"^^ogc:wktLiteral,
"sdo_num_res=10")) }
ORDER BY ASC(ogcf:distance(?cgeom,
"POINT(-71.46444 42.7575)"^^ogc:wktLiteral, uom:KM))'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '))
order by sem$rownum;
Live Demo: Spatial Query
Find the 10 nearest congressional districts ordered by distance to center point
select name$rdfterm, cdist$rdfterm
from table(sem_match(
'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom: <http://xmlns.oracle.com/rdf/geo/uom/>
SELECT ?name ?cdist
WHERE
{ ?person foaf:name ?name .
?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
?cdist ogc:asWKT ?cgeom
FILTER (orageo:nearestNeighbor(?cgeom,
"POINT(-71.46444 42.7575)"^^ogc:wktLiteral,
"sdo_num_res=10")) }
ORDER BY ASC(ogcf:distance(orageo:centroid(?cgeom),
"POINT(-71.46444 42.7575)"^^ogc:wktLiteral, uom:KM))'
,sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '))
order by sem$rownum;
RDB2RDF: Supporting RDF Views of Relational Data with Oracle Spatial and Graph
RDB2RDF Architecture
• Use virtual RDF data
• Benefits
– Existing relational data stays in place and corresponding applications do not need to change
– Use of virtual mapping eliminates synchronization issues
– Common vocabulary helps with data integration issues
[Figure: Applications 1, 2 and 3 on a mid-tier server access the database server through SQL or SPARQL/SQL; RDB2RDF maps the HR, Inventory and Sales schemas (HR, Inventory and Sales databases) to graphs such as the Inventory Graph and Sales Graph, connected by shared ontologies]
RDB2RDF Support in Oracle Spatial and Graph
• Support both Direct Mapping and R2RML: RDB to RDF Mapping Language
• API for creating, dropping and exporting (materializing) RDF views
– sem_apis.create_rdfview_model
– sem_apis.drop_rdfview_model
– sem_apis.export_rdfview_model
• Query like native RDF data
Relational Data: HR Schema
EMPLOYEES (
EMPLOYEE_ID NOT NULL NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(20)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
)
DEPARTMENTS (
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME NOT NULL VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
)
JOBS (
JOB_ID NOT NULL VARCHAR2(10)
JOB_TITLE NOT NULL VARCHAR2(35)
MIN_SALARY NUMBER(6)
MAX_SALARY NUMBER(6)
)
JOB_HISTORY (
EMPLOYEE_ID NOT NULL NUMBER(6)
START_DATE NOT NULL DATE
END_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
DEPARTMENT_ID NUMBER(4)
)
LOCATIONS (
LOCATION_ID NOT NULL NUMBER(4)
STREET_ADDRESS VARCHAR2(40)
POSTAL_CODE VARCHAR2(12)
CITY NOT NULL VARCHAR2(30)
STATE_PROVINCE VARCHAR2(25)
COUNTRY VARCHAR2(30)
GEO_LOCATION SDO_GEOMETRY
)
Relational Data: E-Commerce
PRODUCT_INFORMATION (
PRODUCT_ID NOT NULL NUMBER(6)
PRODUCT_NAME VARCHAR2(50)
PRODUCT_DESC VARCHAR2(2000)
CATEGORY_ID NUMBER(2)
WEIGHT_CLASS NUMBER(1)
WARRANTY_PERIOD INTERVAL YEAR(2) TO MONTH
SUPPLIER_ID NUMBER(6)
PRODUCT_STATUS VARCHAR2(20)
LIST_PRICE NUMBER(8,2)
MIN_PRICE NUMBER(8,2)
CATALOG_URL VARCHAR2(50)
)
PRODUCT_CATEGORIES (
CATEGORY_ID NOT NULL NUMBER(6)
CATEGORY_NAME VARCHAR2(50)
CATEGORY_DESCRIPTION VARCHAR2(2000)
)
INVENTORIES (
PRODUCT_ID NOT NULL NUMBER(6)
WAREHOUSE_ID NOT NULL NUMBER(3)
QUANTITY_ON_HAND NOT NULL NUMBER(8)
)
WAREHOUSES (
WAREHOUSE_ID NOT NULL NUMBER(3)
WAREHOUSE_SPEC SYS.XMLTYPE
WAREHOUSE_NAME VARCHAR2(35)
LOCATION_ID NUMBER(4)
)
CUSTOMERS (
CUSTOMER_ID NOT NULL NUMBER(6)
CUST_FIRST_NAME NOT NULL VARCHAR2(20)
CUST_LAST_NAME NOT NULL VARCHAR2(20)
CUST_LOCATION_ID NUMBER(4)
PHONE_NUMBERS PHONE_LIST_TYP
CUST_EMAIL VARCHAR2(30)
)
REVIEWS (
REVIEW_ID NOT NULL NUMBER(6)
CUSTOMER_ID NUMBER(6)
PRODUCT_ID NUMBER(6)
RATING NUMBER(1)
REVIEW_TEXT VARCHAR2(4000)
)
Creating the RDF View
BEGIN
sem_apis.create_rdfview_model(
'r2r_model_full',
SYS.ODCIVarchar2List('DEPARTMENTS','EMPLOYEES','JOBS','JOB_HISTORY',
'LOCATIONS','PRODUCT_INFORMATION','WAREHOUSES','INVENTORIES',
'CUSTOMERS','REVIEWS','PRODUCT_CATEGORIES'),
'http://mydb/',
options => ' SCALAR_COLUMNS_ONLY=T');
END;
/
Live Demo: RDB2RDF
What classes do we have?
select o$rdfterm
from table(sem_match(
'SELECT DISTINCT ?o WHERE
{ ?s rdf:type ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));
Live Demo: RDB2RDF
What properties do we have?
select p$rdfterm
from table(sem_match('SELECT DISTINCT ?p WHERE { ?s ?p ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));
Live Demo: RDB2RDF
Describe Employee 1
select p$rdfterm, o$rdfterm
from table(sem_match('SELECT ?p ?o WHERE { <http://mydb/EMPLOYEES/EMPLOYEE_ID=1> ?p ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));
Live Demo: RDB2RDF
Find all employees and their job title
select fname$rdfterm, lname$rdfterm, t$rdfterm
from table(sem_match('SELECT ?fname ?lname ?t
WHERE { ?e <http://mydb/EMPLOYEES#FIRST_NAME> ?fname .
?e <http://mydb/EMPLOYEES#LAST_NAME> ?lname .
?e <http://mydb/EMPLOYEES#ref-JOB_ID> ?j .
?j <http://mydb/JOBS#JOB_TITLE> ?t }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));
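The IRIs in these demo queries follow a recognizable pattern: a row is named prefix + TABLE/KEY=value, a column becomes the property prefix + TABLE#COLUMN, and a foreign key becomes prefix + TABLE#ref-COLUMN. The Python sketch below reproduces that naming for a single row. It is an illustration only, assuming the RDF view follows W3C Direct Mapping style conventions; the helper names and the sample row values are hypothetical, and IRI percent-encoding is left out.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_iri(prefix, table, pk_col, pk_val):
    """IRI naming one row, e.g. http://mydb/EMPLOYEES/EMPLOYEE_ID=1."""
    return f"{prefix}{table}/{pk_col}={pk_val}"


def direct_mapping_triples(prefix, table, pk_col, row, foreign_keys=None):
    """Return (subject, predicate, object) triples for one relational row.

    foreign_keys maps a column name to (referenced table, referenced key column).
    """
    foreign_keys = foreign_keys or {}
    subject = row_iri(prefix, table, pk_col, row[pk_col])
    triples = [(subject, RDF_TYPE, f"{prefix}{table}")]  # the row's class
    for col, val in row.items():
        if val is None:
            continue  # NULL columns produce no triple
        triples.append((subject, f"{prefix}{table}#{col}", str(val)))
        if col in foreign_keys:
            ref_table, ref_pk = foreign_keys[col]
            triples.append((subject, f"{prefix}{table}#ref-{col}",
                            row_iri(prefix, ref_table, ref_pk, val)))
    return triples


# Hypothetical sample row; the values are made up for illustration.
employee = {"EMPLOYEE_ID": 1, "FIRST_NAME": "Ada", "LAST_NAME": "Lovelace", "JOB_ID": "IT_PROG"}
triples = direct_mapping_triples("http://mydb/", "EMPLOYEES", "EMPLOYEE_ID", employee,
                                 foreign_keys={"JOB_ID": ("JOBS", "JOB_ID")})
for t in triples:
    print(t)
```

Read this way, the ref-JOB_ID property in the last demo query is simply the edge that lets SPARQL hop from an EMPLOYEES row to the matching JOBS row.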
Semantic Indexing: Extracting Semantic Data from Unstructured Text with Oracle Spatial and Graph
Semantic Indexing - Overview
Newsfeed table:
Rowid | docId | Article | Source
r1 | 1 | Indiana authorities filed felony charges and a court issued an arrest warrant for a financial manager who apparently tried to fake his death by crashing his airplane in a Florida swamp. Marcus Schrenker, 38 … | CNN
r2 | 2 | Major dealers and investors … | NW
Content type:
• Text
• File (path)
• URL
SemContext index on Article column (auto maintained like a B-tree index):
CREATE INDEX ArticleIndex
ON Newsfeed (Article)
INDEXTYPE IS SemContext
PARAMETERS ('my_policy')
LOCAL PARALLEL 4
Triples table with rowid references:
Subject | Property | Object | graph
p:Marcus | rdf:type | rc:Person | <…/r1>
p:Marcus | :hasName | “Marcus”^^… | <…/r1>
p:Marcus | :hasAge | “38”^^xsd:.. | <…/r1>
… | … | … | …
Analytical Queries On Graph Data:
SELECT Sem_Contains_Select(1)
FROM Newsfeed
WHERE Sem_Contains (Article,
'{?x rdf:type rc:Person .
?x :hasAge ?age .
FILTER(?age >= 35)}',1)=1
AND Source = 'CNN'
Semantic Indexing - Key Components
• Extensible Information Extractor
– Programmable API to plug in 3rd party extractors into the database.
• SemContext Indextype
– A custom indexing scheme that interacts with the extractor to manage the metadata
extracted from the documents efficiently and facilitates semantic search via SQL
queries.
• SEM_CONTAINS Operator
– To identify documents of interest based on their extracted metadata, using standard
SQL queries.
• SEM_CONTAINS_SELECT Ancillary Operator
– To return additional information (SPARQL Query Results XML) about the documents
identified using the SEM_CONTAINS operator.
Semantic Indexing - Key Concepts
• Policy
– Base Policy: <policy_name, extractor_type>
– Dependent Policy: <policy_name, base_policy_name, ontology>
• Association between indexes and policies
– Multiple policies may be associated with an index
– Metadata extracted from each base policy is stored separately
• Sem_Contains invocation is restricted to one policy
– Policy to be used can be specified by user
• Inference
– Document-centric
– Corpus-centric
Product Descr.: “Harry Potter” series book 2
• [from amazon.com]
The Dursleys were so mean and hideous that summer that all Harry Potter wanted was to get back to the Hogwarts
School for Witchcraft and Wizardry. But just as he's packing his bags, Harry receives a warning from a strange, impish
creature named Dobby who says that if Harry Potter returns to Hogwarts, disaster will strike.
And strike it does. For in Harry's second year at Hogwarts, fresh torments and horrors arise, including an outrageously
stuck-up new professor, Gilderoy Lockhart, a spirit named Moaning Myrtle who haunts the girls' bathroom, and the
unwanted attentions of Ron Weasley's younger sister, Ginny.
But each of these seem minor annoyances when the real trouble begins, and someone--or something--starts turning
Hogwarts students to stone. Could it be Draco Malfoy, a more poisonous rival than ever? Could it possibly be Hagrid,
whose mysterious past is finally told? Or could it be the one everyone at Hogwarts most suspects...Harry Potter
himself?
News report: “Solving a Riddle of Primes”
• [from nytimes.com: Solving a Riddle of Primes - Published: May 20, 2013]
Three and five are prime numbers — that is, they are divisible only by 1 and by themselves. So are 5 and 7. And 11
and 13. And for each of these pairs of prime numbers, the difference is 2.
…
The proof has been elusive.
But last month, a paper … arrived “out of the blue” at the journal Annals of Mathematics, said Peter Sarnak, a
professor of mathematics at Princeton University and the Institute for Advanced Study and a former editor at the
journal, which plans to publish it. The paper, by Yitang Zhang of the University of New Hampshire, … does show an
infinite number of prime pairs whose separation is less than a finite upper limit — 70 million, for now. (Dr. Zhang used
70 million in his proof — basically an arbitrary large number where his equations work.)
…
Dr. Zhang’s proof takes advantage of a 2005 paper by Daniel Goldston of San Jose State University, Janos Pintz of the
Alfred Renyi Institute of Mathematics in Budapest and Cem Yildirim of Bogazici University in Istanbul, which had shown
there would always be pairs of primes closer than the average distance between two primes.
…
Dr. Zhang also used techniques developed in the 1980s by Henryk Iwaniec of Rutgers, Enrico Bombieri of the Institute
for Advanced Study and John B. Friedlander of the University of Toronto, adding his own ingenuity to tie everything
together in a way others had been unable to.
…
Live Demo: Semantic Indexing
Find information about professionals and their position name if they are a professor
select docid, sem_contains_select(1) sparqlrslt
from doctable
where sem_contains (doc, 'SELECT ?nm ?orgnm ?posnm
WHERE { ?s rdf:type ctype:PersonCareer .
?s c:careertype "professional" .
?s c:person ?o . ?o c:name ?nm .
?s c:organization ?org . ?org c:name ?orgnm .
OPTIONAL{?s c:position ?pos .
?pos c:name ?posnm . FILTER (regex(?posnm,"^professor"))}
}', sem_aliases(sem_alias('ctype','http://s.opencalais.com/1/type/em/r/')
,sem_alias('c','http://s.opencalais.com/1/pred/')), 1) = 1 order by docid;
Part 6: Performance and Scalability
Graphs Are Big and Are Getting Bigger
• Social Scale*
– 1 billion vertices, 100 billion edges
• Web Scale*
– 50 billion vertices, 1 trillion edges
• Brain Scale*
– 100 billion vertices, 100 trillion edges
* An NSA Big Graph Experiment
http://www.pdl.cmu.edu/SDI/2013/slides/big_graph_nsa_rd_2013_56002v1.pdf
A Big RDF Graph Linking Many Data Sources
Graphs at Scale
Properties of Graph Problems
1. Data Driven
A. Dictated by Node and Link structure
B. Structure not known a priori
2. Unstructured
A. Irregular Structure of Graphs
B. Difficult to parallelize
C. Difficult to partition
3. Poor Locality
A. Represent relationships between entities
B. Esp. for graphs derived from analysis
4. High IO to Compute ratio
Parallel Architectures and Computing Models
1. Distributed Memory Architectures
2. Partitioned Global Address Space
3. Shared memory
4. Massively Multi-threaded
Parallel Graph Queries and Hardware
1. Task Granularity – where to introduce parallelism?
A. IO parallelism – DBWR, partitioning, parallel hints
B. CPU parallelism – multi-threading, single thread multi core
2. Memory Contention
A. Shared memory in a Global Address Space
B. Not an issue in Oracle
3. Load Balancing
A. How to load balance multi-threaded systems
B. Less pronounced on shared memory
4. Shared Graphs
A. Focus on throughput
B. IO parallelism
Setup for Performance
• Use a balanced hardware system for databases and mid-tier servers
– A single, huge physical disk for everything is not recommended.
• Multiple hard disks tied together through ASM is a good practice
– A virtual machine for multiple databases and applications is not recommended
– Make sure throughput of hardware components matches up
Component | Hardware spec | Sustained throughput
CPU core | - | 100 - 200 MB/s
1/2 Gbit HBA | 1/2 Gbit/s | 100/200 MB/s
16 port switch | 8 * 2 Gbit/s | 1,200 MB/s
Fiber channel | 2 Gbit/s | 200 MB/s
Disk controller | 2 Gbit/s | 200 MB/s
GigE NIC (interconnect) | 2 Gbit/s | 80 MB/s*
Disk (spindle) | | 30 - 50 MB/s
MEM | | 2k-7k MB/s
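One way to read "make sure throughput of hardware components matches up" is to ask how many disk spindles it takes to keep a single CPU core busy. The small Python sketch below does that arithmetic using only the per-core and per-spindle sustained throughput figures from this slide; the function name is hypothetical and the result is a rough sizing guide under those figures, not an Oracle sizing rule.

```python
import math

# Sustained throughput ranges in MB/s, taken from the component table on this slide.
CPU_CORE_MBPS = (100, 200)
DISK_SPINDLE_MBPS = (30, 50)


def spindles_per_core(core_mbps, spindle_mbps):
    """Number of spindles needed so the disks can feed one core at full rate."""
    return math.ceil(core_mbps / spindle_mbps)


# Least demanding core paired with the fastest spindles.
best_case = spindles_per_core(CPU_CORE_MBPS[0], DISK_SPINDLE_MBPS[1])
# Most demanding core paired with the slowest spindles.
worst_case = spindles_per_core(CPU_CORE_MBPS[1], DISK_SPINDLE_MBPS[0])

print(f"{best_case} to {worst_case} spindles per CPU core")  # prints "2 to 7 spindles per CPU core"
```

Under these numbers a single large disk cannot keep even one core supplied, which is consistent with the advice above against putting everything on one physical disk.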
1. Plan on Going to Disk
2. Running Out of Memory
3. Being Smart About It
4. Have IB already in place
5. Locate data close to compute nodes
6. Realizing It is Part of a Larger EcoSystem
Oracle Engineered Systems
Purpose Built General Purpose
Exadata, Exalogic, SPARC SuperCluster
Oracle Exadata Database Machine
• Designed to be the best platform for Oracle Database
• Integrates and optimizes servers, storage, network
– Using high-volume servers in a scale-out architecture
– Flash-optimized
• Special software brings database intelligence to
storage, flash, and networking
– To deliver extreme performance and data compression
• Deployments are fast, reliable, and supportable
• Optimal for all database workloads
– OLTP, Data Warehousing, and Database Clouds
Exadata Storage Software Unique Features
• Exadata Smart Scans
– 10X or greater reduction in data sent to database servers
• Exadata Storage Indexes
– Eliminate unnecessary I/Os
• Hybrid Columnar Compression
– Efficient compression increases effective storage capacity and increases user data scan bandwidths by a factor of up to 10X
– Very efficient on triple, quad
• Exadata Smart Flash Cache
– Breaks random I/O bottleneck by increasing IOPs by up to 20X
– Doubles user data scan bandwidths
• I/O Resource Manager (IORM)
– Enables storage grid by prioritizing I/Os to ensure predictable performance
• Quality of Service (QoS)
– Actively meet and maintain SLAs
– Memory Guard to protect existing current transactions from memory-based failures
Components - Exadata, Exalogic, and SuperCluster
Component | Exadata X3-8 | Exalogic X3-2 | SuperCluster
Compute Grid
- Compute Servers | 2 Sun Server X2-8 | 30 Sun Server X3-2 | 4 SPARC T4-4
- Compute Cores | 160 Xeon | 480 Xeon | 128 Sparc T4
- Compute Server Memory | 4 TB | 7.5 TB | 4 TB
- Operating Systems | Linux | Linux / Solaris | Solaris
- Oracle Virtual Machine Partitioning | - | Yes | Yes
- Exalogic Elastic Cloud Software | - | Yes | Yes
Storage Grid
- ZFS 7320 NAS Appliance | - | Yes (60TB) | Yes (60TB)
- Exadata Storage Servers | 14 (up to 504TB) | - | 6 (up to 216TB)
Networking
- Ethernet | 1 GbE / 10GbE | 10GbE | 10GbE
- InfiniBand Fabric | Yes | Yes | Yes
- Fibre Channel SAN Connectivity | - | - | Yes (optional)
Configuration for Performance
Configure OS and Network
• Network configuration is important to data integration performance
– Network MTU (TCP, Infiniband)
– net.core.rmem_max, net.core.wmem_max
• Linux OS Kernel parameters
– shmmax,
– shmall,
– aio-max-nr,
– sem, …
Configure Database
• Database parameters
– SGA, PGA, filesystemio_options,
– db_cache_size, auto dop, …
• Calibrate I/O performance
– DBMS_RESOURCE_MANAGER.CALIBRATE_IO
• Gather statistics
• Run a typical workload on a typical data set
– Check AWR report to see top waits
– Check SQL Monitor report to find bottlenecks in SQL executions
Configure Mid-Tier Server
• Understand bottleneck
– Use tools, jstack/top for example, to identify top threads
• Set a proper JVM heap size
– Pay close attention to GC activities and memory related settings
• Try -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, -XX:NewRatio, -XX:SurvivorRatio, etc.
• For Java clients using JDBC (through Jena Adapter)
– Network MTU, Oracle SQL*Net parameters including SDU, TDU, SEND_BUF_SIZE, RECV_BUF_SIZE,
– Linux Kernel parameters: net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem, …
Performance and Scalability
• Scales to 100s of billions of triples (petabytes) and more
– Scales linearly with Oracle database and hardware
– No limitations as with other in-memory approaches
• Fast loading of triples
– Incremental and bulk loading
• Parallelism is exploited
– Load, Query, Inference
• Fast access by persisting asserted and inferred triples
Oracle Spatial and Graph - LUBM 200K on 3-Node RAC
Sun Server X2-4 Load, Inference and Query Performance
• The LUBM 200K Graph has 48+ Billion triples (edges)
– Original graph has 26.6 Billion unique triples (quads)
– Inference produced another 21.4 Billion triples
• Data Loading Performance
– Triples Loaded and Indexed Per Second (TLIPS): 273K
• Inference Performance
– Triples Inferred and Indexed Per Second (TIIPS): 327K
• SPARQL Query Performance
– Query Results Per Second (QRPS): 459K
Setup:
Hardware: Sun Server X2-4, 3-node RAC
- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)
256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
Software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G
Note: Only one node in this RAC was used for performance test. Test performed in April 2013.
Oracle Spatial and Graph - LUBM 200K on 3-Node RAC
Sun Server X2-4 Load Performance
Data Set | Quads Loaded | Time | Degrees of Parallelism
LUBM200K: Load into Staging Table | 27.4 billion Quads (with duplicates) | 2 hrs 6 min. | DOP = 66
LUBM200K: Load into the RDF graph | 26.6 billion Quads (unique quads) | 22 hrs 23 min. | DOP = 80
• Data loading included de-duplication and building of two indexes on the quads. A significant portion (11 hrs 18 minutes) of the total load time was spent in building the two indexes.
• Loading from the 198 compressed N-Quad formatted files was done by defining an External Table (with gunzip preprocessor) on those files and then using sem_apis.LOAD_INTO_STAGING_TABLE
• Load flags => parse mbv_method=shadow parallel=80 parallel_create_index DEL_BATCH_DUPS=USE_INSERT
Setup:
Hardware: Sun Server X2-4, 3-node RAC
- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)
256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
Software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G
Note: Only one node in this RAC was used for performance test. Test performed in April 2013.
Oracle Spatial and Graph - LUBM 200K on 3-Node RAC
Sun Server X2-4 Inference Performance
Data Set (# quads) | Quads Inferred | Time | Degrees of Parallelism
LUBM 200K (27.4B) | 21.4 billion | 17 hrs 56 min. | DOP = 80
Inference Semantics: OWLPrime + the following components: INTERSECT, INTERSECTSCOH, SVFH, THINGH, THINGSAM, UNION
Inference Options: RAW8=T, Dynamic Sampling level 1
Inference included building 2 indexes on the inferred triples that took a little over 5 hrs.
Setup:
Hardware: Sun Server X2-4, 3-node RAC
- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)
256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
Software: Oracle Database 11.2.0.3.0, SGA_TARGET=850G and PGA_AGGREGATE_TARGET=150G
Note: Only one node in this RAC was used for performance test. Test performed in April 2013.
Oracle Spatial and Graph - LUBM 200K on 3-Node RAC
Sun Server X2-4 Query Performance
Ontology: LUBM 200K – 48B quads (26.6 billion asserted quads, 21.4 billion inferred quads), OWLPrime & new inference components
DOP = 40, Dynamic sampling level = 6. 4.18 Billion answers generated in 2.53 hrs on a single node.
LUBM Benchmark Queries
Query | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7
# answers | 4 | 494.5M | 6 | 34 | 719 | 2.067B | 67
Time (sec) | 0.01 | 1160 | 0.01 | 609.22 | 0.04 | 1105.07 | 712.48
Query | Q8 | Q9 | Q10 | Q11 | Q12 | Q13 | Q14
# answers | 7790 | 53.86M | 4 | 224 | 15 | 926088 | 1.568B
Time (sec) | 1228.95 | 3139.28 | 0.01 | 0.01 | 1.2 | 208.88 | 946.01
Setup:
Hardware: Sun Server X2-4, 3-node RAC
- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)
256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
Software: Oracle Database 11.2.0.3.0, SGA_TARGET=850G and PGA_AGGREGATE_TARGET=150G
Note: Only one node in this RAC was used for performance test. Test performed in April 2013.
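The headline figures for this query run (4.18 billion answers, 2.53 hrs, 459K query results per second) can be re-derived from the per-query answer counts and times reported for Q1 to Q14. The Python sketch below does the arithmetic; the counts given as "M" and "B" are rounded on the slide, so the totals are approximate, and the variable names are illustrative.

```python
# (answers, seconds) per LUBM query, copied from the LUBM 200K query results.
RESULTS = {
    "Q1": (4, 0.01), "Q2": (494.5e6, 1160), "Q3": (6, 0.01), "Q4": (34, 609.22),
    "Q5": (719, 0.04), "Q6": (2.067e9, 1105.07), "Q7": (67, 712.48),
    "Q8": (7790, 1228.95), "Q9": (53.86e6, 3139.28), "Q10": (4, 0.01),
    "Q11": (224, 0.01), "Q12": (15, 1.2), "Q13": (926088, 208.88),
    "Q14": (1.568e9, 946.01),
}

total_answers = sum(answers for answers, _ in RESULTS.values())
total_seconds = sum(seconds for _, seconds in RESULTS.values())
hours = total_seconds / 3600
qrps = total_answers / total_seconds  # Query Results Per Second

print(f"{total_answers / 1e9:.2f} billion answers in {hours:.2f} hrs -> {qrps / 1e3:.0f}K QRPS")
# prints "4.18 billion answers in 2.53 hrs -> 459K QRPS"
```

Note that QRPS here is an aggregate over all 14 queries run back to back; the per-query rates vary widely, from a handful of rows in 0.01 s to billions of rows in about 18 minutes.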
Parallel Execution Performance on M8000
• LUBM 25K local inference on Sun M8000
• 6.1B+ quads (3.4B asserted, 2.7B inferred)
Oracle’s Parallel Execution is completely transparent!
• Cross CPUs/Cores on a single node
• Cross multiple nodes in a cluster
[Chart: Time (in minutes) vs. Parallel Degree (DOP, 0 to 140); data labels 345, 214, 180, 150, 145, 121]
Inference Performance on Exadata V2
Data Set (# triples) | Triples Inferred | Time | Degrees of Parallelism
LUBM 100K (13B) | 5B | 1h, 58’ [1] | DOP = 32
LUBM 25K (3.3B) | 2.7B | 4h, 7’ [2] | DOP = 32
LUBM 8K (1.1B) | 869M | 46’ [2] | DOP = 64
[1] Preliminary result: 1 round of OWLPrime (OWL Horst semantics)
[2] Inference: OWLPrime + components: INTERSECT, INTERSECTSCOH, SVFH, THINGH, THINGSAM, UNION
Setup:
Hardware: Full Rack Sun Oracle Database Machine X2-2 (8 nodes, 72GB RAM per node), and Exadata Storage Server
Storage required: LUBM8K: 330GB or LUBM25K 1TB + 110GB temp table space
Software: Oracle Database 11.2.0.1.0 + Patch 9819833: SEMANTIC TECHNOLOGIES 11G R2 FIX BUNDLE 2
Each node: SGA_TARGET=32G and PGA_AGGREGATE_TARGET=31G
Query Performance on Exadata V2
Auto DOP used. 465,849,803 answers generated for LUBM 25K in 274.2 sec.
Ontology: LUBM 25K, 3.3 billion triples & 2.7 billion inferred, OWLPrime & new inference components
LUBM Benchmark Queries
Query | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7
# answers | 4 | 2528 | 6 | 34 | 719 | 260M | 67
Complete? | Y | Y | Y | Y | Y | Y | Y
Time (sec) | 0.01 | 20.65 | 0.01 | 0.01 | 0.02 | 23.07 | 4.99
Query | Q8 | Q9 | Q10 | Q11 | Q12 | Q13 | Q14
# answers | 7790 | 6.8M | 4 | 224 | 15 | 0.11M | 197M
Complete? | Y | Y | Y | Y | Y | Y | Y
Time (sec) | 0.48 | 203.06 | 0.01 | 0.02 | 0.02 | 2.40 | 19.45
Part 7: Demonstration
Managing and Mining RDF Graph Data with Oracle Spatial and Graph
Tools: Ontology Editing and Engineering
Integration with Protégé
Protégé 4.1
http://protege.stanford.edu/
• Open source ontology editor
• Java APIs (Jena Adapter) allow an easy integration of Protégé with Oracle Spatial and Graph
Tools: Ontology Editing and Engineering
Integration with TopQuadrant Tools
• Commercial tool
• Java APIs (Jena Adapter) allow an easy integration of TopQuadrant TopBraid Suite with Oracle Spatial and Graph
Tools: Navigation and Visualization of Graphs
Integration with Cytoscape
http://www.cytoscape.org/
• Open source visualizer
• Visualizes RDF and OWL stored in Oracle Database
• Enables Fish Eye views by building summary-detail graphs
• Performant with large RDF data sets
Tools: Navigation and Visualization of Graphs
Integration with Visualization
Tom Sawyer Perspectives
• Commercial tool
• Standards-based SPARQL web service endpoint allows an easy integration with Tom Sawyer Perspectives
Tools: Reporting RDF in Business Intelligence
Integration with Oracle Business Intelligence EE
• Standards-based SPARQL web service endpoint and SPARQL Gateway feed XML (transformed from SPARQL XML response) to BI
Tools: Statistical Graph Analytics
Oracle R Enterprise feature of Oracle Advanced Analytics
Integration with Oracle R Enterprise
• Open source language
• Statistical computing and charting for graph data
• Produces publication quality plots
• Highly extensible with open source R packages
Tools: Discovery & Predictive Analysis
Oracle Data Mining feature of Oracle Advanced Analytics
Problem Classification | Sample Problem
Anomaly Detection | Given demographic data about a set of customers, identify customer purchasing behavior that is significantly different from the norm
Association Rules | Find the items that tend to be purchased together and specify their relationship – market basket analysis
Clustering | Segment demographic data into clusters and rank the probability that an individual will belong to a given cluster
Feature Extraction | Given demographic data about a set of customers, group the attributes into general characteristics of the customers
Using Oracle Data Mining with RDF Graph Data
• Make the semantic data available to a data mining tool in an
appropriate format
– Turn a semantic data store into yet another data source for DM tool
create view N_COUNTRY_BD_RATE as
select name
, to_number(brate) as brate
, to_number(drate) as drate
, to_number(popu) as population
, to_number(mig) as net_migration_rate
, to_number(imr) as infant_mortality_rate
, to_number(leab) as life_expectancy
from table(sem_match('{
?subject <http://www.cia.gov/cia/publications/factbook#Birth_rate> ?brate .
?subject <http://www.cia.gov/cia/publications/factbook#Death_rate> ?drate .
?subject <http://www.cia.gov/cia/publications/factbook#Name> ?name .
?subject <http://www.cia.gov/cia/publications/factbook#Population> ?popu .
?subject <http://www.cia.gov/cia/publications/factbook#Net_migration_rate> ?mig .
?subject <http://www.cia.gov/cia/publications/factbook#Infant_mortality_rate> ?imr .
?subject <http://www.cia.gov/cia/publications/factbook#Life_expectancy_at_birth> ?leab .
}' … ))
Using Oracle Data Mining with RDF Graph Data
• Tie it all together
– Turn a semantic data store into yet another data source to DM
– Follow the conventional DM process:
• Data preparation, build/evaluate model, deployment
• This is one example of what you may get:
• Some mining results can be saved back as RDF into Oracle database
Anomaly Detection output in SQL
Convert into RDF
:AbnormalCase1 :hasSubject :Dominica .
:AbnormalCase1 :probability “0.54”
Oracle Enterprise Manager
Understand exactly what is going on
• Configuration
• Storage
• Security
• Performance
– Real time monitor
– CPU
– Memory
– I/O
– Sessions
– Activity
– Workload
– …
• …
Part 8: SQL-Based Graph Analytics
Why Use SQL for Graph Analytics?
• A standard language that runs on every platform
• Modern RDBMS provides many cool features
– Cost based optimizer
– Parallel execution
– Partition pruning
– Hierarchical queries, Match_Recognize
– Smart scans, Table Compression, Index Compression, IMDB
– Temporal query
– Transparent query rewrite (OLS/VPD)
– No need to worry about memory management
– No need to worry about workload management in a cluster
• Even NoSQL folks want to have a SQL interface
Shortest Path
• Why is shortest path (SP) important?
– SP is a good measure of how closely related things are
– SP is a foundation to many other algorithms and metrics
• Distance, diameter, radius, eccentricity
• Closeness Centrality
– In G, if John is closer to others than Mary, then John is more important because John can spread
information quicker.
• Betweenness Centrality
– In G, if John lies on more shortest paths between other nodes than Mary, then John is more
important because John can better control information flow
• Many existing algorithms
– Dijkstra, Bellman Ford, BFS, A*, Johnson, …
• Proposed solution: SQL-Based Dijkstra (SBD)
http://en.wikipedia.org/wiki/Centrality
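The slides name a SQL-Based Dijkstra (SBD) but do not show it. As a point of reference for what is being computed, here is a minimal in-memory Dijkstra in Python. It is a plain textbook version using a binary heap, not the SQL-based variant from the talk; the toy graph, its vertex names, and its integer weights are made up for illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (distance, path) for the cheapest source-to-target path.

    graph maps a vertex to a list of (neighbor, weight) pairs with
    non-negative weights. Returns (None, []) if target is unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry for an already finalized vertex
        done.add(u)
        if u == target:
            break  # the target's distance is final once it is popped
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Toy directed graph with integer weights (hypothetical data).
GRAPH = {
    "a": [("b", 7), ("c", 9), ("f", 14)],
    "b": [("c", 10), ("d", 15)],
    "c": [("d", 11), ("f", 2)],
    "d": [("e", 6)],
    "e": [("f", 9)],
}

distance, path = dijkstra(GRAPH, "a", "e")
print(distance, path)  # prints: 26 ['a', 'c', 'd', 'e']
```

A SQL formulation would keep the tentative distances in a table and express the relaxation step as set-based statements, which is where the database's parallel execution and optimizer can help on graphs too large for memory.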
Performance of Finding Shortest Path in SQL
• Twitter Graph
– Nodes: 41,652,230
– Edges: 1.4+ Billion
– Max Out Degree: 2,997,469
– Max In Degree: 770,155
– Weights: integers in [1, 100]
• Hardware (Desktop): 8GB RAM Desktop, 3-SATA disk, quad core
• Hardware (Exadata): Exadata, 128 cores, 128G SGA, 64G PGA (shared with 3 other databases)
Random Tests | Desktop Run1 | Desktop Run2 | Desktop Run3 | Exadata Run1 | Exadata Run2 | Exadata Run3
1: v766 v12345 | 2.84s | 1.95s | 1.91s | 1.11s | 1.06s | 1.15s
2: v766 v667023 | 2.41s | 1.51s | 1.44s | 0.86s | 0.85s | 0.73s
3: v12 v667023 | 4.55s | 4.46s | 4.54s | 6.61s | 3.84s | 3.82s
4: v667023 v12 | 20.09s | 1.55s | 1.12s | 3.91s | 0.88s | 0.84s
5: v20 v13220032 | 8.62s | 7.36s | 7.42s | 6.52s | 5.85s | 5.84s
6: v32309592 v14293310 | 153.73s | 3.93s | 2.78s | 2.12s | 2.12s | 2.42s
7: v702013 v37083737 | 14.99s | 1.85s | 1.62s | 1.0s | 1.02s | 1.08s
8: v12041332 v37083590 | 1.55s | 1.50s | 1.03s | 0.92s | 0.92s | 1.0s
9: v57640358 v19945621 | 11.91s | 0.77s | 0.65s | 0.56s | 0.54s | 0.53s
10: v19945621 v57640358 | 315.59s | 93.44s | 5.39s | 4.05s | 3.59s | 3.76s
Test performed in Dec 2012 ~ Feb 2013
Page Ranking
• Page Ranking (PR) definition (from Google’s two founders)
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where PR(A) is the Page Rank of page (node or vertex) A, d is a
damping factor, C(Ti) is the out degree of page Ti
– PR is a variant of Eigenvector centrality which measures influence
of a node in a network
– Intuitions:
• A vertex (page) is more important if there are more incoming edges
• A vertex is more important if there are in-edges from more important vertexes
• Contribution of a vertex is diluted if out-degree is high
– Proposed Solution: iteratively update Page Ranking of vertexes
using SQL and parallel execution
[Figure: example graph with vertices A, T1, T2, T3, T4]
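To make the iterative update concrete, the Python sketch below applies the PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) formula repeatedly to a tiny graph and records the average change per iteration. It is an in-memory illustration rather than the SQL and parallel-execution implementation proposed in the talk; the damping factor d = 0.85, the starting rank of 1.0, and the three-vertex graph are assumptions chosen for the example.

```python
def page_rank(out_edges, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all T linking to A.

    out_edges maps each vertex to the list of vertices it links to (no
    duplicate edges). Returns (rank, deltas), where deltas[i] is the average
    |PR(v) - PRprev(v)| in iteration i+1.
    """
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)
    rank = {v: 1.0 for v in vertices}
    deltas = []
    for _ in range(iterations):
        incoming = {v: 0.0 for v in vertices}
        for t, targets in out_edges.items():
            if targets:
                share = rank[t] / len(targets)  # PR(T) / C(T)
                for a in targets:
                    incoming[a] += share
        new_rank = {v: (1 - d) + d * incoming[v] for v in vertices}
        deltas.append(sum(abs(new_rank[v] - rank[v]) for v in vertices) / len(vertices))
        rank = new_rank
    return rank, deltas


# Hypothetical toy graph: A links to B and C, B links to C, C links to A.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks, deltas = page_rank(GRAPH)
for vertex in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex, round(ranks[vertex], 3))
```

In this toy graph C ends up ranked highest because it has two in-edges, one of them carrying all of B's rank, while A's contribution is split across two out-edges. That matches the three intuitions listed on the slide.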
Performance of Page Ranking in SQL
• Twitter Graph
– Nodes: 41,652,230
– Edges: 1.4+ Billion
– Max Out Degree: 2,997,469
– Max In Degree: 770,155
– Weights: not relevant
• Performance evaluation
– 50 iterations of PR calculation took less than 39 minutes
– DOP = 128
– After 10 iterations, the average PR change is less than 0.003
• Hardware: Exadata, 128 cores, 128G SGA, 64G PGA (shared with 3 other databases)
[Chart: Avg | PR(v) – PRprev(v) | (0 to 0.5) vs. Iterations (1 to 49)]
Test performed in Dec 2012 ~ Feb 2013
Summary
• Graph is a flexible, intuitive data model
• W3C standards-based RDF Semantic Graph
• Tools that work with Graph data
• SQL-Based Graph Analytics
• Oracle provides an efficient, secure, and scalable platform for
– Managing graph data
– Performing graph analytics