Model-Driven Design of Graph Databases
Roberto De Virgilio, Antonio Maccioni and Riccardo Torlone
33rd edition of the International Conference on Conceptual Modeling (ER2014) – Atlanta, GA (U.S.A.)
Context (Theory)
ER 2014 Model-Driven Design of Graph Databases Atlanta, USA, 28th Oct 2014
Logics
ER
Models
Concepts
Meta-Models
NoDB
NoSQL
NoSPARQL
Agile Development
Schema-free
Semantics
Context (Practice)
Software engineers still reason at different abstraction levels
Data engineers still model their databases
We cannot give up modeling, even with NoSQL systems
Graph Databases
[Figure: example graph database; edge labels: admin, works, belongs, friend, married, likes, worked, follows]
Property Graph Model
[Figure: property graph]
Nodes: n1, n2
Node properties: Uname: Date, Uid: u01; Bname: Database, Bid: b02
Edge properties: label: follower; label: admin
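A minimal sketch of the property graph on this slide, written as plain Python data classes. The class names `Node` and `Edge`, the helper `labels_between`, the assignment of the user properties to n1 and the blog properties to n2, and the edge direction from n1 to n2 are all assumptions made for illustration; the slide only shows the property values and the two edge labels.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    properties: dict = field(default_factory=dict)


# Assumption: n1 carries the user properties and n2 the blog properties.
n1 = Node("n1", {"Uname": "Date", "Uid": "u01"})
n2 = Node("n2", {"Bname": "Database", "Bid": "b02"})

# Assumption: both edges go from n1 to n2.
edges = [
    Edge("n1", "n2", {"label": "follower"}),
    Edge("n1", "n2", {"label": "admin"}),
]


def labels_between(edges, source, target):
    """Return the labels of all edges going from source to target."""
    return [
        e.properties["label"]
        for e in edges
        if e.source == source and e.target == target
    ]
```

Both nodes and edges carry key-value properties, which is the defining trait of the property graph model.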
Graph DB Modeling: How?
Strategies: Compact, Sparse, Dense
Reduces the number of data accesses
Can violate property graph constraints
Accesses and updates can be inefficient
Reduces number of joins
Needs human intervention for a semantic enrichment
Our 3-Step Approach
1) Generation of an oriented ER diagram
2) Partitioning of the elements (entities and relationships) of the obtained diagram
3) Definition of a template over the resulting partition.
Use case: ER
[Figure: ER diagram]
Entities: User (uid, uname), Blog (bid, bname), Comment (cid, msg), Category (ctid, description), ExternalLink (eid, url)
Relationships: follower, admin, tag, post, publish, about, contains
Relationship attributes: date (shown twice)
Cardinalities used: (0:N), (1:1)
Orienting the ER
Each relationship between ENTITY 1 and ENTITY 2 is annotated with a weight that depends on its two cardinalities:
(0:1) and (0:1): RELATIONSHIP : 0
(0:N) and (0:1): RELATIONSHIP : 1
(0:N) and (0:N): RELATIONSHIP : 2
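The three weighting cases can be sketched as a small function that counts how many sides of a relationship have maximum cardinality N. This is only an illustration: the function name `relationship_weight` and the `(min, max)` tuple encoding are hypothetical, the edge direction of the oriented diagram is not modeled, and treating other minimum cardinalities such as (1:1) like (0:1) is an assumption.

```python
def relationship_weight(card_a, card_b):
    """Weight of a relationship given its two (min, max) cardinalities.

    Counts the sides whose maximum cardinality is "N":
    one-to-one -> 0, one-to-many -> 1, many-to-many -> 2.
    Assumption: the minimum cardinality does not affect the weight.
    """
    return sum(1 for (_min, max_) in (card_a, card_b) if max_ == "N")


one_to_one = relationship_weight(("0", "1"), ("0", "1"))
one_to_many = relationship_weight(("0", "N"), ("0", "1"))
many_to_many = relationship_weight(("0", "N"), ("0", "N"))
```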
Use case: O-ER
[Figure: O-ER diagram]
Entities: User, Blog, Comment, ExternalLink, Category
Weighted relationships: tag:2, post:1, publish:1, admin:1, follower:2, contains:0, about:1
Partitioning the O-ER
Rule 1: if a node n is disconnected then it forms a group by itself.
Rule 2: if a node n has w−(n)>1 and w+(n)>0 then n forms a group by itself.
Rule 3: if a node n has w−(n)<2 and w+(n)<2 then n is added to the group of a node m such that there exists the edge (m, n) in the O-ER diagram.
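A rough sketch of how the three rules might be applied. The slides do not define w−(n) and w+(n), so the sketch takes both as precomputed inputs. The function name `partition`, the order in which rules are applied, and the fallback that puts any node matched by no rule into its own group are assumptions, not part of the method as presented.

```python
def partition(nodes, edges, w_minus, w_plus):
    """Assign each node to a group, identified by a representative node.

    nodes:   list of node names
    edges:   set of directed pairs (m, n) of the O-ER diagram
    w_minus: dict node -> w-(n), taken as given
    w_plus:  dict node -> w+(n), taken as given
    """
    group_of = {}
    connected = {x for edge in edges for x in edge}

    # Rules 1 and 2: these nodes form a group by themselves.
    for n in nodes:
        if n not in connected or (w_minus[n] > 1 and w_plus[n] > 0):
            group_of[n] = n

    # Rule 3: join the group of a node m with an edge (m, n).
    # Repeated until no node can be added any more.
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n in group_of or not (w_minus[n] < 2 and w_plus[n] < 2):
                continue
            for m, target in sorted(edges):
                if target == n and m in group_of:
                    group_of[n] = group_of[m]
                    changed = True
                    break

    # Assumption: nodes covered by no rule become singleton groups.
    for n in nodes:
        group_of.setdefault(n, n)
    return group_of


# Hypothetical toy diagram, not the blog use case.
toy_nodes = ["A", "B", "C", "D"]
toy_edges = {("A", "B"), ("B", "D")}
toy_w_minus = {"A": 2, "B": 1, "C": 0, "D": 1}
toy_w_plus = {"A": 1, "B": 1, "C": 0, "D": 0}
toy_groups = partition(toy_nodes, toy_edges, toy_w_minus, toy_w_plus)
```

In the toy input, A satisfies Rule 2, C is disconnected (Rule 1), and B and D are absorbed into A's group through Rule 3.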
Use case: partitioned O-ER
[Figure: partitioned O-ER diagram]
Entities: User, Blog, Comment, ExternalLink, Category
Weighted relationships: tag:2, post:1, publish:1, admin:1, follower:2, contains:0, about:1
Template of the Graph Database
A template describes homogeneous nodes occurring in a graph database and the ways they are connected.
A template is similar to a logical schema, but it is not a schema!
A template is derived by grouping together attributes of nodes in the partitioning.
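The grouping of attributes can be sketched as follows. The entity attributes and the three groups are read off the blog use case in these slides; the function name `build_template` and the list-of-lists representation are hypothetical, and edges between template nodes are left out.

```python
def build_template(groups, attributes):
    """Build one template node per group of entities.

    Each template node lists the attributes of the entities in its
    group, qualified as "Entity.attribute".
    """
    return [
        [f"{entity}.{attr}" for entity in group for attr in attributes[entity]]
        for group in groups
    ]


# Entity attributes from the use case.
attributes = {
    "User": ["uid", "uname"],
    "Blog": ["bid", "bname"],
    "Comment": ["cid", "msg"],
    "Category": ["ctid", "description"],
    "ExternalLink": ["eid", "url"],
}

# Groups as they appear in the use-case template.
groups = [["ExternalLink", "Comment"], ["User"], ["Blog", "Category"]]

template = build_template(groups, attributes)
```

Each inner list corresponds to one kind of homogeneous node in the resulting graph database.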
Use case: the Template
[Figure: template]
Template node: ExternalLink.eid, ExternalLink.url, Comment.cid, Comment.msg
Template node: User.uid, User.uname
Template node: Blog.bid, Blog.bname, Category.ctid, Category.description
Edge properties: label, date
Empirical Results
[Chart legend: sparse native strategy, our approach]
Conclusion and Future Work
Conceptual modeling of graph databases is useful and possible
Our methodology is system-independent and aims at minimizing data accesses
We want to involve more aspects in the design process and verify the approach with other NoSQL systems
We are developing a tool that allows the developer to customize the modeling of this methodology by tuning its parameters