TRANSCRIPT
WISE 2002
/department of mathematics and computer science
TU/e technische universiteit eindhoven
December 12, 2002 1
RAL: an RDF Algebra
Flavius Frasincar
Geert-Jan Houben
Richard Vdovjak
Peter Barna
Contents
1. Introduction
2. RAL Goals
3. RAL Data Model
4. RAL Operators
5. Conclusion
1. Introduction
• Metadata is machine understandable information about web resources or other things [Source: Tim Berners-Lee, “Metadata Architecture”]
• RDF (Resource Description Framework) is the metadata language for the Web
• RDF extends the syntactic interoperability of XML to semantic interoperability, making it the foundation of the Semantic Web
Semantic Web Architecture “Layer Cake”
[Source: Tim Berners-Lee, Director W3C, keynote speech at XML2000, “RDF and the Semantic Web” (Washington DC, 6 Dec. 2000)]
Hera
• Hera research project: Web Information Systems (WIS) and web (hypermedia) generation in WIS
• WIS use RDF to represent and query application data for:
– Semantic integration of data coming from heterogeneous sources
– Semantic information presentation
– Semantic querying
• Huge quantities of data and metadata need to be processed in real-time: optimization is crucial
Hera Methodology/Suite
[Figure: Hera methodology and suite, organized in three layers: Semantic Layer, Application Layer, Presentation Layer. Design steps: Conceptual Design, Integration Design, Application Design, Adaptation Design, Presentation Design. Models: Conceptual Model, Integration Model, Application Model, User Model, User/Platform Profile, Presentation Templates (XSLT). Engines: Integration Engine, Application Engine, Adaptation Engine, Presentation Engine, Cuypers Engine. Actors: End User, (Search) Agent. Flows: info request, (meta) data, RQL / RDF XML, (slice) presentation, HTML/WML/SMIL]
RDF Representations
Primitive semantics: Subject, Predicate, Object
Three alternative notations:
• Triple: (http://example.com/sb.jpg, painted_by, “Rembrandt”)
• RDF/XML:
<rdf:Description rdf:about="http://example.com/sb.jpg">
<painted_by>Rembrandt</painted_by>
</rdf:Description>
• Graph: http://example.com/sb.jpg --painted_by--> “Rembrandt”
RDF Query Languages
• Triple-based:
– Triple [successor of SiLRI] (Horn logic)
– Metalog (Datalog)
• XML-based:
– RDF Query
– RQuery (XQuery)
• Graph-based (but not graphical):
– RQL (OQL)
2. RAL Goals
• Support the formal specification of RDF query languages
• Provide a reference framework to compare different RDF query languages
• Consider the result construction phase
– presently neglected by RDF query languages which focus only on extraction
• Enable algebraic query optimization
RAL
• RAL Data Model: specify what information is accessible (for RAL operators) in an RDF graph
– Nodes: Resources and Literals
– Edges: Properties
• RAL Operators: define operators working on collections of nodes from the RAL Data Model
– Extraction Operators
– Loop Operators
– Construction Operators
3. RAL Data Model
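• R is the set of resources, $R = U \cup B$
• U is the set of URI references, $\text{rdf:Property} \in U$
• B is the set of blank nodes
• L is the set of literals; U, B, and L are pairwise disjoint
• P is the set of properties, $P \subseteq R$, $\text{rdf:type} \in P$
[Figure: Venn diagram of the sets R (split into U and B), L, and P, with rdf:Property placed in U and rdf:type placed in P]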
• An RDF model M is a finite set of triples (statements)
$M \subseteq R \times U \times (R \cup L)$
• The set of properties of an RDF model M
$P_M = \{\, p \mid (s, p, o) \in M \lor (p, \text{rdf:type}, \text{rdf:Property}) \in M \,\}$
• The RDF graph model is similar to a directed labeled graph (DLG)
– It is not a DLG since it allows for multiple edges between two nodes
– It is not a general multigraph because different edges between two nodes cannot share the same label
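The definitions above can be tried out on a toy model. The following is a minimal Python sketch, not part of RAL itself: it assumes that the two conditions in the definition of P_M are joined by a disjunction, represents every term as a plain string, and uses a hypothetical `created_by` declaration for illustration.

```python
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"

# A tiny model M: a finite set of (subject, predicate, object) triples.
# Storing M as a Python set means identical triples collapse into one.
M = {
    ("http://example.com/sb.jpg", "painted_by", "Rembrandt"),
    ("created_by", RDF_TYPE, RDF_PROPERTY),  # hypothetical declaration
}

def properties(model):
    """P_M: predicates used in some triple, plus resources declared as rdf:Property."""
    used = {p for (_, p, _) in model}
    declared = {s for (s, p, o) in model if p == RDF_TYPE and o == RDF_PROPERTY}
    return used | declared

P_M = properties(M)
```

Because M is a set, two edges between the same pair of nodes can only coexist when their labels differ, which mirrors the remark that the RDF graph is not a general multigraph.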
• The RDF graph model corresponding to an RDF model M is defined by
$G_M = (N, E, l_N, l_E)$, $l_N : N \to R \cup L$, $l_E : E \to P$
using the following construction mechanism:
for each $(s, p, o) \in M$
  add nodes $n_s$, $n_o$ to N (different only if $s \neq o$)
  assign $l_N(n_s) = s$, $l_N(n_o) = o$
  add $e_p$ to E as a directed edge from $n_s$ to $n_o$
  assign $l_E(e_p) = p$
Observations:
• $l_N$ is an injective partial function
• $l_E$ is a total function
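The construction mechanism can be sketched in a few lines of Python. This is an illustrative sketch under simplifying assumptions: nodes are numbered with integers, every term (including blank nodes) carries a string label so that l_N becomes total here, and the function name `build_graph` is hypothetical.

```python
def build_graph(model):
    """Build (N, E, l_N, l_E) from a set of (s, p, o) triples."""
    node_of = {}            # label -> node id; reuse keeps l_N injective
    l_N, l_E, E = {}, {}, []

    def node(label):
        # Create the node only the first time a label is seen.
        if label not in node_of:
            node_of[label] = len(node_of)
            l_N[node_of[label]] = label
        return node_of[label]

    for s, p, o in sorted(model):
        # The third field keeps parallel edges between two nodes apart.
        edge = (node(s), node(o), len(E))
        E.append(edge)
        l_E[edge] = p       # every edge receives a label: l_E is total
    return set(node_of.values()), E, l_N, l_E

M = {("r4", "paints", "r2"), ("r4", "paints", "r3"), ("r2", "painted_by", "r4")}
N, E, l_N, l_E = build_graph(M)
```

With this input the graph has three nodes and three edges, two of which connect r2 and r4 under different labels.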
Basic Properties

Nodes
Basic Property    Result for resources       Result for literals
id                $l_N(u)$, $u \in U$        $l_N(s)$, $s \in L$
type              Resource                   Literal

Edges
Basic Property    Result
name              $l_E(p)$, $p \in P$
subject           r, $r \in R$
object            o, $o \in R \cup L$

• Two non-blank nodes are equal if they have the same id
• Two blank nodes are equal if they have the same properties and the corresponding property values are equal
RDF(S)-Closure
• RDF Model Theory defines the RDF-closure and RDFS-closure of an RDF Model M by proposing a set of rules for generating new triples
• Extensional data: the original model M triples
• Intensional data: the new triples generated by the RDF(S)-closure
• RAL operators work on extensional + intensional data
• Variants of the operators can be defined to neglect the intensional data (similar to the RQL strict interpretation)
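To make the extensional/intensional split concrete, here is a small Python sketch that applies only two of the RDFS rules (transitivity of rdfs:subClassOf and inheritance of rdf:type along rdfs:subClassOf) until nothing new is produced. It is not the full RDF(S)-closure, and the assumption that Painter is a subclass of Creator and that r4 is a Painter is taken from the example graph used later in the talk.

```python
SUB = "rdfs:subClassOf"
TYPE = "rdf:type"

def rdfs_closure(model):
    """Fixpoint of two RDFS rules: subclass transitivity and type inheritance."""
    closed = set(model)
    while True:
        new = set()
        for (a, p, b) in closed:
            for (c, q, d) in closed:
                if b != c or q != SUB:
                    continue
                if p == SUB:        # a subClassOf b, b subClassOf d
                    new.add((a, SUB, d))
                elif p == TYPE:     # a type b, b subClassOf d
                    new.add((a, TYPE, d))
        if new <= closed:           # nothing new: fixpoint reached
            return closed
        closed |= new

extensional = {("Painter", SUB, "Creator"), ("r4", TYPE, "Painter")}
intensional = rdfs_closure(extensional) - extensional
```

An operator variant that neglects intensional data would simply be evaluated over `extensional` instead of over the closure.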
4. RAL Operators
• All operators have the following form
o[f](x1, x2, …, xn: expression)
where an expression is a collection of nodes and f is a function that takes a collection of nodes as input and returns a collection of nodes
• Extraction Operators: retrieve the needed information from an RDF graph
• Loop Operators: control the repetitive application of certain operators
• Construction Operators: build new RDF graphs from the extracted data
[Figure: example RDF graph used in the operator examples. Schema level: classes Technique, Artifact, Painter, Creator, Painting, Image, and Literal; properties tname, cname, name, year, image, exemplified_by / exemplifies, creates / created_by, paints / painted_by. Instance level: resources r1, r2, r3, r4 with edges exemplified_by, paints, tname, cname, name, year, image; values “Chiaroscuro”, “Rembrandt”, “Stone Bridge”, 1638, “Self Portrait”, 1628, http://example.com/sb.jpg, http://example.com/sp.jpg. Legend: schema, instance, rdf:type, inferred rdf:type, rdfs:subClassOf, rdfs:subPropertyOf]
4.1 Extraction Operators
Projection
π[re_name](e: expression)
computes the values of the properties whose name matches the regular expression re_name (over strings) for the input collection given by e
Example
π[(P|p)aint[s]#](r4)
returns the resources painted by r4
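The projection operator can be mimicked over a set of triples. The sketch below is a hedged illustration in Python: the triples are an assumed reading of the example graph (r4 paints r2 and r3), the result is modeled as a set, and the Python pattern `(P|p)aints?` is a stand-in for the pattern on the slide rather than a translation of RAL's own regular expression syntax.

```python
import re

# Assumed fragment of the example graph.
TRIPLES = {
    ("r4", "paints", "r2"),
    ("r4", "paints", "r3"),
    ("r4", "cname", "Rembrandt"),
    ("r1", "exemplified_by", "r2"),
}

def project(re_name, nodes, triples=TRIPLES):
    """Values of every property whose name fully matches re_name, for the input nodes."""
    pattern = re.compile(re_name)
    return {o for (s, p, o) in triples if s in nodes and pattern.fullmatch(p)}

painted = project(r"(P|p)aints?", {"r4"})
```

Using `fullmatch` keeps a property such as `painted_by` from being picked up by a pattern meant for `paints`.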
Selection
σ[condition](e: expression)
selects the nodes of the input collection that fulfill the given condition
Example
σ[π[tname] = “Chiaroscuro”](c)
where c is the collection of input resources r1, r2, r3, and r4, returns the resources representing the painting technique with the name “Chiaroscuro”
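A selection is a filter over the input collection. The following Python sketch assumes a particular reading of the example graph (r1 is the technique, r2 and r3 are paintings, r4 is the painter) and models the condition as an ordinary predicate; the helper names `values` and `select` are hypothetical.

```python
# Assumed fragment of the example graph.
TRIPLES = {
    ("r1", "tname", "Chiaroscuro"),
    ("r2", "name", "Stone Bridge"),
    ("r3", "name", "Self Portrait"),
    ("r4", "cname", "Rembrandt"),
}

def values(node, prop, triples=TRIPLES):
    """Values of one property of one node (a single-property projection)."""
    return {o for (s, p, o) in triples if s == node and p == prop}

def select(condition, nodes):
    """Keep the input nodes for which condition(node) holds, preserving order."""
    return [n for n in nodes if condition(n)]

c = ["r1", "r2", "r3", "r4"]
techniques = select(lambda n: "Chiaroscuro" in values(n, "tname"), c)
```

Nodes without a tname property yield an empty value set, so they fail the condition rather than raising an error.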
Cartesian Product
(x: expression) × (y: expression)
for each element in the Cartesian product of the input collections, a blank node that has all properties of both originating nodes is added to the result
Example
σ[π[rdf:type] = Technique](c) × σ[π[rdf:type] = Painter](c)
returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product (the new nodes have both types Technique and Painter)
[Figure: the example RDF graph extended with the blank-node result of the Cartesian product; the new edges are labeled exemplified_by, paints, tname, and cname]
Join
(x: expression) ⋈[condition] (y: expression) = σ[condition](x × y)
is a Cartesian product followed by a selection
Example
(x: σ[π[rdf:type] = Technique](c)) ⋈[π[exemplified_by](x) = π[paints](y)] (y: σ[π[rdf:type] = Painter](c))
returns a collection of blank nodes, each blank node having all the properties of the corresponding pair from the Cartesian product that satisfies the given condition
Union, Difference, Intersection
(x: expression) θ (y: expression)
where θ ∈ {∪, −, ∩}
defined as in set theory
Example
σ[π[rdf:type] = Technique](c) ∪ σ[π[rdf:type] = Painter](c)
returns the collection of resources obtained by combining the two collections (each obtained with a selection)
4.2 Loop Operators
Map
map[f](e: expression)
applies the function f to each element of the input collection; the function results are added to the output collection
Example
map[π[rdfs:subClassOf]](Painting, Painter)
computes the parent classes using the property rdfs:subClassOf for the collection consisting of Painting and Painter
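The map operator can be sketched as a loop that feeds each element to f and concatenates the results. This Python sketch assumes that Painting is a subclass of Artifact and Painter a subclass of Creator, as the example graph suggests; `ral_map` and `project` are hypothetical helper names.

```python
# Assumed schema fragment of the example graph.
TRIPLES = {
    ("Painting", "rdfs:subClassOf", "Artifact"),
    ("Painter", "rdfs:subClassOf", "Creator"),
}

def project(prop, nodes, triples=TRIPLES):
    """Values of one property for the given nodes, in a deterministic order."""
    return [o for (s, p, o) in sorted(triples) if s in nodes and p == prop]

def ral_map(f, nodes):
    """Apply f to each input element and collect all results in one output collection."""
    out = []
    for n in nodes:
        out.extend(f([n]))   # f takes and returns a collection of nodes
    return out

parents = ral_map(lambda e: project("rdfs:subClassOf", e), ["Painting", "Painter"])
```

Because f is applied element by element, the output keeps the order of the input collection.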
Kleene Star
*[f](e: expression)
repeats the function f, possibly infinitely many times, starting with the given input collection; at each iteration the results of the function are added to the next function input
Example
*[π[rdfs:subClassOf]](Painting)
computes the transitive closure of the property rdfs:subClassOf starting from Painting, i.e. Painting and all its superclasses
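On a finite graph the repetition can stop as soon as an iteration yields no new node. The Python sketch below takes that fixpoint reading, which is an assumption about how an implementation would terminate rather than something stated on the slide; the schema triples are likewise an assumed reading of the example graph.

```python
# Assumed schema fragment of the example graph.
TRIPLES = {
    ("Painting", "rdfs:subClassOf", "Artifact"),
    ("Painter", "rdfs:subClassOf", "Creator"),
}

def superclasses(nodes):
    """Direct superclasses of the given classes."""
    return {o for (s, p, o) in TRIPLES if s in nodes and p == "rdfs:subClassOf"}

def kleene_star(f, nodes):
    """Iterate f from the input collection until no new node appears."""
    reached = set(nodes)
    frontier = set(nodes)
    while frontier:
        # Only nodes not seen before are fed into the next iteration.
        frontier = set(f(frontier)) - reached
        reached |= frontier
    return reached

closure = kleene_star(superclasses, {"Painting"})
```

Tracking the already reached nodes also guarantees termination when the property forms a cycle.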
4.3 Construction Operators
Create Node
node[type, id]()
adds a new node to the graph with the given type and id (id is missing for blank nodes) and returns this node; if a resource is created, an rdf:type edge is added between the resource and the node representing rdfs:Resource
The Create Node operator assigns a unique (in the resulting RDF graph) internal identifier to each created node
Example
node[Resource]() and node[Literal, “Caravagio”]()
create a Resource representing a blank node and a Literal representing the string “Caravagio”
[Figure: a blank node with an rdf:type edge to rdfs:Resource, and a literal node “Caravagio”]
Create Edge
edge[name, subject](object: expression)
adds edges between the subject node and each of the nodes in the object collection, and returns the subject node; the label of the edges is given by name, which is the id of a property resource
The Create Node and Create Edge operators abort if the “well-formed RDF(S) graph” conditions (e.g. rdf:type cannot refer to a literal, literals cannot have properties, etc.) are not met after construction
Example
edge[name, node[Resource]()](node[Literal, “Caravagio”]())
creates an edge labeled name between the nodes defined in the previous example
[Figure: the blank node with an rdf:type edge to rdfs:Resource and a name edge to the literal “Caravagio”]
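The two construction operators can be sketched as methods of a small graph class. This is a hedged Python illustration, not the RAL implementation: the class name `Graph` and its fields are hypothetical, only the two well-formedness conditions named in the talk are checked, and the checks run before the edges are added instead of after construction.

```python
from itertools import count

class Graph:
    def __init__(self):
        self.kinds = {}      # internal id -> "Resource" or "Literal"
        self.labels = {}     # internal id -> URI or literal value; blank nodes have none
        self.edges = set()   # (subject, name, object) over internal ids
        self._fresh = count()
        self.rdfs_resource = self._new("Resource", "rdfs:Resource")

    def _new(self, kind, ident):
        n = next(self._fresh)            # unique internal identifier
        self.kinds[n] = kind
        if ident is not None:
            self.labels[n] = ident
        return n

    def node(self, kind, ident=None):
        """Create Node: add a node of the given type and return it."""
        if kind not in ("Resource", "Literal"):
            raise ValueError("type must be Resource or Literal")
        n = self._new(kind, ident)
        if kind == "Resource":           # every resource is typed rdfs:Resource
            self.edges.add((n, "rdf:type", self.rdfs_resource))
        return n

    def edge(self, name, subject, objects):
        """Create Edge: link subject to every object node and return the subject."""
        if self.kinds[subject] == "Literal":
            raise ValueError("literals cannot have properties")
        if name == "rdf:type" and any(self.kinds[o] == "Literal" for o in objects):
            raise ValueError("rdf:type cannot refer to a literal")
        self.edges.update((subject, name, o) for o in objects)
        return subject

g = Graph()
painter = g.edge("name", g.node("Resource"), [g.node("Literal", "Caravagio")])
```

Since both operators return a node, calls nest in the same way as in the edge example above.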
5. Conclusion
• The RAL algebra is developed from a DB perspective and proposes a set of operators similar to their relational algebra counterparts:
– Extraction Operators: Projection, Selection, Cartesian Product, Join, Union, Difference, Intersection
• Similar to the existing semi-structured query languages, RAL considers powerful repetition operators:
– Loop Operators: Map, Kleene Star
• As opposed to present RDF query languages, RAL supports result construction:
– Construction Operators: Create Node, Create Edge
Future Work
• Analyze the expressive power of RAL compared to RQL, currently a popular RDF query language (build a translation scheme from RQL to RAL)
• Formally specify the semantics of other RDF query languages in terms of RAL
• Compare the expressive power of different RDF query languages using RAL as the reference language
• Explore equivalence rules for RAL expressions to be used in query optimization
• Develop an RDF query optimization algorithm on RAL