Download - Querying UML Class Diagrams - FoSSaCS 2012
Querying UML Class Diagrams
Georg Gottlob
Department of Computer Science
University of Oxford
joint work with Andrea Calì, Giorgio Orsi and Andreas Pieris
member
leads
works
[1,2]
1 2
1 2 student professor
group (1,1)
(1,N)
(1,N)
(1,1)
m_name g_name since
student v member
leads¡ v works
student v :professor
Context C1 inv: C1.allInstances -> forAll ( x1: C1 | C2.allInstances ->
forAll ( x2: C2 | x1=x2 implies x2.oclIsTyeOf(C)))
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Issues Index[0..1]:Str
getIndex():List
Major Modeling Formalisms for Data and Objects
UML Class Diagrams
ER Diagrams
Description Logics
Object Constraint Language (OCL)
XML Schemas
Relational Data Dependencies
person: SSN Name
employee[1] person[1]
Datalog± : A Unifying Logical Framework
Description Logics (DL-Lite, EL,…) Relational Constraints
(IDs, FKDs,…)
Datalog±
Datalog
… providing: Logical foundations, semantics, decidability and
complexity results for reasoning and query-answering,
identification of tractable fragments, …
Conceptual Models (UML, ER,…)
Datalog§
8X8Y (X,Y) (X)
• Extend Datalog with additional features such as:
• Existential quantification (9): TGDs
• Equality atoms (=): EGDs
• Constant false (?): Negative constraints
• But query answering under Datalog[9] is undecidable
[see, e.g., Beeri & Vardi, ICALP 81]
• Datalog[9,=,?] is syntactically restricted ! Datalog§
8X8Y (X,Y) 9Z (X,Z)
8X (X) Xi = Xj
8X (X) ?
Restriction: Guardedness
8X8Y8Z R(X,Y,Z) ^ S(Y) ^ P(X,Z) 9W Q(X,W)
• All 8-variables occur in one body atom - guard atom
guard
• Query answering is PTIME-complete in data complexity
[Calì, G. & Lukasiewicz, PODS 09]
• Models of finite treewidth ) decidability of query answering
[Calì, G. & Kifer, KR 08] related to work by [Andréka, Németi & van Benthem] and [Grädel]
Reasoning over UML Class Diagrams
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Index[0..1]:Str
getIndex():List
Executive Person
Satisfiability: Does the diagram admit at least one instantiation?
Reasoning over UML Class Diagrams
• Satisfiability - the diagram has a (possibly infinite) non-empty
instantiation
• Full Satisfiability - the diagram has a (possibly infinite) instantiation
where each class and association is non-empty
• Finite Satisfiability - the diagram has a finite instantiation
Reasoning over UML Class Diagrams
Student Worker
Person
{disjoint}
finitely satisfiable, e.g., {Worker(john),Person(john)}
but not fully - student class is necessarily empty
Complexity of Reasoning over UML Class Diagrams
• Satisfiability is EXPTIME-complete
• Full Satisfiability is EXPTIME-complete
• Finite Satisfiability is EXPTIME-complete
[Artale, Calvanese & Ibánez-García, ER 10]
[Berardi, Calvanese & De Giacomo, Artificial Intelligence 05]
[implicit in Berardi, Calvanese & De Giacomo, Artificial Intelligence 05]
Querying UML Class Diagrams
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Index[0..1]:Str
getIndex():List
Issues
Which persons have a potential conflict of interest?
Querying UML Class Diagrams
Stock
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Index[0..1]:Str
getIndex():List
Conflict(P) Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Querying UML Class Diagrams
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Index[0..1]:Str
getIndex():List
Issues
Does anybody have a potential conflict of interest?
Querying UML Class Diagrams
Stock
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Index[0..1]:Str
getIndex():List
Conflict Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Querying UML Class Diagrams
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Person(john)
Issues Index[0..1]:Str
getIndex():List
Conflict Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Querying UML Class Diagrams
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Person(john)
Company(LU)
Company(BA)
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Issues Index[0..1]:Str
getIndex():List
Conflict Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Querying UML Class Diagrams
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Person(john)
Company(LU)
Company(BA)
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
{P ! john, C1 ! LU, C2 ! BA, S ! BAY}
Issues Index[0..1]:Str
getIndex():List
Conflict Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Querying UML Class Diagrams: Existential Rules
Group(DB)
Is there a professor who works in the database group?
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
since: Date
Group(DB)
Ans Professor(P), WorksIn(P,DB)
Querying UML Class Diagrams: Existential Rules
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
since: Date
since: Date
Group(DB)
Leads(z1,DB)
Professor(z1)
R*(z1,DB,z2)
CLeads(z2)
Ans Professor(P), WorksIn(P,DB)
Querying UML Class Diagrams: Existential Rules
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
Group(DB)
Leads(z1,DB)
Professor(z1)
R*(z1,DB,z2)
CLeads(z2)
WorksIn(z1,DB)
Ans Professor(P), WorksIn(P,DB)
Querying UML Class Diagrams: Existential Rules
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
since: Date
Group(DB)
Leads(z1,DB)
Professor(z1)
R*(z1,DB,z2)
CLeads(z2)
WorksIn(z1,DB)
…
Ans Professor(P), WorksIn(P,DB)
{P ! z1, DB ! DB}
Querying UML Class Diagrams: Existential Rules
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
since: Date
Querying UML Class Diagrams
D
8M (M ² D [ ! M ² Q) D [ ² Q ,
Q
Querying UML Class Diagrams
D
8M (M ² D [ ! M ² Q) D [ ² Q ,
Q
M ¶ D Æ M ²
From Diagrams to First-Order Logic (Datalog§)
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Index[0..1]:Str
getIndex():List
Executive Person
From Diagrams to First-Order Logic (Datalog§)
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Executive Person
8X8Y Member(X,Y) Company(X) ^ Executive(Y)
8X Company(X) 9Y9Z Member(X,Y) ^ Member(X,Z) ^ Y ≠ Z
8X Executive(X) 9Y Member(Y,X)
Index[0..1]:Str
getIndex():List
[Satoh & Kaneiwa, TCS 10
and Berardi et al., AI 05]
From Diagrams to First-Order Logic (Datalog§)
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Executive Person
8X Company(X) 9Y Issues(X,Y)
8X8Y8Z Stock(X) ^ Issues(Y,X) ^ Issues(Z,X) Y = Z
8X8Y Stock(X) ^ Index(X,Y) Str(Y)
8X8Y Stock(X) ^ getIndex(X,Y) List(Y)
8X Stock(X) 9Y Issues(Y,X)
Index[0..1]:Str
getIndex():List
[Satoh & Kaneiwa, TCS 10
and Berardi et al., AI 05]
From Diagrams to First-Order Logic (Datalog§)
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Executive Person
8X Executive(X) Person(X)
Index[0..1]:Str
getIndex():List
Issues
[Satoh & Kaneiwa, TCS 10
and Berardi et al., AI 05]
Complexity of Query Answering
• EXPTIME-complete in combined complexity (everything is part of the input)
• coNP-complete in data complexity (the diagram and the query are fixed)
• Undecidable when diagrams are combined with arbitrary OCL
(Object Constraint Language) constraints
[implicit in Berardi et al., AI 05 and Lutz, IJCAR 08]
[implicit in Ortiz, Calvanese & Eiter, AAAI 06]
[folklore]
Research Challenge: Reduce High Complexity
• Diagrams often have very large instantiations
• Some applications require very large diagrams
• OCL constraints, that are not expressible diagrammatically,
lead to undecidability or high complexity
Our Goals
• Restrict UML class diagrams to achieve tractability of query
answering in data complexity
• Better understanding of combined complexity
• Add relevant OCL constraints without losing tractability of
query answering in data complexity
• For each attribute assertion Attribute[ i..j ]:Type j 2 {1,1}
Lean UML Class Diagrams
• For each attribute assertion Attribute[ i..j ]:Type j 2 {1,1}
• For each association A:
- upper bounds mU ,nU 2 {1,1},
- if A generalizes some other association, then mU = nU = 1
Lean UML Class Diagrams
C1 C2 mL..mU nL..nU A
• For each attribute assertion Attribute[ i..j ]:Type j 2 {1,1}
• For each association A:
- upper bounds mU ,nU 2 {1,1},
- if A generalizes some other association, then mU = nU = 1
• Completeness constraints are forbidden
Lean UML Class Diagrams
C1 C2
C
{complete}
8X C(X) C1(X) _ C2(X)
C1 C2 mL..mU nL..nU A
Lean UML Class Diagrams
8X C(X) C1(X) _ C2(X) C1 C2
C
{complete}
• For each attribute assertion Attribute[ i..j ]:Type j 2 {1,1}
• For each association A:
- upper bounds mU ,nU 2 {1,1},
- if A generalizes some other association, then mU = nU = 1
• Completeness constraints are forbidden
C1 C2 mL..mU nL..nU A
Lean UML Class Diagrams: Example
0..1
Member
Issues
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company Stock
Index[0..1]:Str
getIndex():List
Executive Person
Lean UML Class Diagrams: Example
WorksIn
Leads
1..1
3..1
Student Professor
Member
{disjoint} Group
1..1
0..1
CLeads
since: Date
Add Some Non-Diagrammatic Constraints (OCL)
C1 C2
C
C3
disjoint classes
8X C2(X) ^ C3(X) ?
We need negative constraints of the form
8X C1(X) ^ … ^ Cn(X) ?
C1 C2
C
most-specific class
We need most-specific class constraints of the form
8X C1(X) ^ … ^ Cn(X) C(X)
8X C1(X) ^ C2(X) C(X)
Add Some Non-Diagrammatic Constraints (OCL)
type the domain of Enrolled
We need domain-type constraints of the form
8X8Y C(X) ^ Attr(Y,X) T(Y)
8X8Y CS-Course(X) ^ Enrolled(Y,X) CS-Student(Y)
Enrolled[0..1]:CS-Course
CS-Student
Enrolled[0..1]:Course
Student
Add Some Non-Diagrammatic Constraints (OCL)
type the domain of Enrolled
We need domain-type constraints of the form
8X8Y C(X) ^ Attr(Y,X) T(Y)
8X8Y CS-Course(X) ^ Enrolled(Y,X) CS-Student(Y)
Enrolled[0..1]:CS-Course
CS-Student
Enrolled[0..1]:Course
Student
Add Some Non-Diagrammatic Constraints (OCL)
pullback rule
type the domain of Enrolled
We need domain-type constraints of the form
8X8Y C(X) ^ Attr(Y,X) T(Y)
8X8Y CS-Course(X) ^ Enrolled(Y,X) CS-Student(Y)
Enrolled[0..1]:CS-Course
CS-Student
Enrolled[0..1]:Course
Student
Add Some Non-Diagrammatic Constraints (OCL)
pullback rule will make a difference!
OCL (Object Constraint Language)
Context C1 inv:
C1.allInstances -> forAll ( x1: C1 |
C2.allInstances -> forAll ( x2: C2 |
x1=x2 implies x2.oclIsTypeOf(C)
)
)
Context C1 inv:
C1.allInstances -> forAll ( x1: C1 |
C2.allInstances -> forAll ( x2: C2 |
x1<>x2
)
)
Context Object inv:
Object.allInstances -> forAll ( y: Object |
y.a.oclIsTypeOf(C) implies y.oclTypeOf(T)
)
8X C1(X) ^ C2(X) ?
8X C1(X) ^ C2(X) C(X)
8X8Y C(X) ^ a(Y,X) T(Y)
8X C(X) 9Y1…9Yn A(X,Y1) ^ … ^ A(X,Yn) Yi ≠ Yj1 ≤ i < j ≤ n
8X C(X) 9Y1…9Yn A(Y1,X) ^ … ^ A(Yn,X) Yi ≠ Yj1 ≤ i < j ≤ n
8X C(X) 9Y1…9Yn Attr(X,Y1) ^ … ^ Attr(X,Yn) Yi ≠ Yj1 ≤ i < j ≤ n
8X1…8Xn A(X1,…,Xn) C1(X1) ^ … ^ Cn(Xn)
8X1…8Xn8Y A(X1,…,Xn) ^ R*(X1,…,Xn,Y) CA(Y)
8X1…8Xn A(X1,…,Xn) 9Y R*(X1,…,Xn,Y)
8X1…8Xn8Y8Z A(X1,…,Xn) ^ R*(X1,…,Xn,Y) ^ R*(X1,…,Xn,Z) Y = Z
8X1…8Xn8Y1…8Yn8Z R*(X1,…,Xn,Z) ^ R*(Y1,…,Yn,Z) ^ CA(Z) X1 = Y1 ^ … ^ Xn = Yn
8X8Y8Z C(X) ^ A(X,Y) ^ A(X,Z) Y = Z
8X8Y8Z C(X) ^ A(Y,X) ^ A(Z,X) Y = Z
8X1…8Xn A1(X1,…,Xn) A2(X1,…, Xn)
8X8Y C(X) ^ Attr(X,Y) T(Y)
8X8Y8Z C(X) ^ Attr(X,Y) ^ Attr(X,Z) Y = Z
8X8Y1…8Yn8Z C(X) ^ Op(X,Y1, …,Yn,Z) T1(Y1) ^ … ^ Tn(Yn) ^ T(Z)
8X8Y1…8Yn8Z18Z2 C(X) ^ Op(X,Y1, …,Yn,Z1) ^ Op(X,Y1, …,Yn,Z2) Z1 = Z2
8X C1(X) C2(X)
8X C1(X) ^ … ^ Cn(X) ?
Lean UML Class Diagrams as Rules
8X18X2 A(X1,X2) C1(X1) ^ C2(X2)
8X18X28Y A(X1,X2) ^ R*(X1,X2,Y) CA(Y)
8X18X2 A(X1,X2) 9Y R*(X1,X2,Y)
8X18X2 A1(X1,X2) A2(X1,X2)
8X C(X) 9Y1…9Yn A(X,Y1) ^ … ^ A(X,Yn) Yi ≠ Yj1 ≤ i < j ≤ n
8X C(X) 9Y1…9Yn A(Y1,X) ^ … ^ A(Yn,X) Yi ≠ Yj1 ≤ i < j ≤ n
8X C(X) 9Y1…9Yn Attr(X,Y1) ^ … ^ Attr(X,Yn) Yi ≠ Yj1 ≤ i < j ≤ n
8X1…8Xn A(X1,…,Xn) C1(X1) ^ … ^ Cn(Xn)
8X1…8Xn8Y A(X1,…,Xn) ^ R*(X1,…,Xn,Y) CA(Y)
8X1…8Xn A(X1,…,Xn) 9Y R*(X1,…,Xn,Y)
8X1…8Xn8Y8Z A(X1,…,Xn) ^ R*(X1,…,Xn,Y) ^ R*(X1,…,Xn,Z) Y = Z
8X1…8Xn8Y1…8Yn8Z R*(X1,…,Xn,Z) ^ R*(Y1,…,Yn,Z) ^ CA(Z) X1 = Y1 ^ … ^ Xn = Yn
8X8Y8Z C(X) ^ A(X,Y) ^ A(X,Z) Y = Z
8X8Y8Z C(X) ^ A(Y,X) ^ A(Z,X) Y = Z
8X1…8Xn A1(X1,…,Xn) A2(X1,…, Xn)
8X8Y C(X) ^ Attr(X,Y) T(Y)
8X8Y8Z C(X) ^ Attr(X,Y) ^ Attr(X,Z) Y = Z
8X8Y1…8Yn8Z C(X) ^ Op(X,Y1, …,Yn,Z) T1(Y1) ^ … ^ Tn(Yn) ^ T(Z)
8X8Y1…8Yn8Z18Z2 C(X) ^ Op(X,Y1, …,Yn,Z1) ^ Op(X,Y1, …,Yn,Z2) Z1 = Z2
8X C1(X) C2(X)
8X C1(X) ^ … ^ Cn(X) ?
Lean UML Class Diagrams as Rules
check via query
pre-process
8X18X2 A(X1,X2) C1(X1) ^ C2(X2)
8X18X28Y A(X1,X2) ^ R*(X1,X2,Y) CA(Y)
8X18X2 A(X1,X2) 9Y R*(X1,X2,Y)
8X18X2 A1(X1,X2) A2(X1,X2)
8X8Y C(X) ^ Attr(X,Y) T(Y)
8X C(X) 9Y1…9Yn Attr(X,Y1) ^ … ^ Attr(X,Yn)
8X C1(X) C2(X)
8X18X2 A(X1,X2) C1(X1) ^ C2 (X2)
8X18X28Y A(X1,X2) ^ R*(X1,X2,Y) CA(Y)
8X18X2 A(X1,X2) 9Y R*(X1,X2,Y)
8X C(X) 9Y1…9Yn A(X,Y1) ^ … ^ A(X,Yn)
8X C(X) 9Y1…9Yn A(Y1,X) ^ … ^ A(Yn,X)
8X18X2 A1(X1,X2) A2(X1,X2)
8X C1(X) ^ … ^ Cn(X) C(X)
8X8Y C(X) ^ Attr(Y,X) T(Y)
Lean UML Class Diagrams as Rules
additional constraints
associations
classes
8X8Y C(X) ^ Attr(X,Y) T(Y)
8X C(X) 9Y1…9Yn Attr(X,Y1) ^ … ^ Attr(X,Yn)
8X C1(X) C2(X)
8X18X2 A(X1,X2) C1(X1) ^ C2 (X2)
8X18X28Y A(X1,X2) ^ R*(X1,X2,Y) CA(Y)
8X18X2 A(X1,X2) 9Y R*(X1,X2,Y)
8X C(X) 9Y1…9Yn A(X,Y1) ^ … ^ A(X,Yn)
8X C(X) 9Y1…9Yn A(Y1,X) ^ … ^ A(Yn,X)
8X18X2 A1(X1,X2) A2(X1,X2)
8X C1(X) ^ … ^ Cn(X) C(X)
8X8Y C(X) ^ Attr(Y,X) T(Y)
Lean UML Class Diagrams as Rules
guard atoms
Data Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative
& most-specific class & domain-type constraints is PTIME-complete
Data Complexity of Query Answering
Proof:
• in PTIME: reduction to assertions 8X8Y body(X,Y) 9Z head(X,Z),
where body(X,Y) has a guard-atom [Calì, G. & Lukasiewicz, PODS 09]
• PTIME-hardness (even without domain-type constraints): reduction
from Path System Accessibility
Theorem: Query answering under Lean UML class diagrams + negative
& most-specific class & domain-type constraints is PTIME-complete
Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams +
negative & most-specific class constraints is PSPACE-complete
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
Rel
evan
t p
art
:
poly
nom
ial
dep
th
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
Q= R(X1,X2) ^ S(X2,X3) ^ B(X4,X5) ^ T(X5)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Rel
evan
t p
art
:
poly
nom
ial
dep
th
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model (Chase)
A(y,z)
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
Q= R(X1,X2) ^ S(X2,X3) ^ B(X4,X5) ^ T(X5)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Rel
evan
t p
art
:
poly
nom
ial
dep
th
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
Q= R(X1,X2) ^ S(X2,X3) ^ B(X4,X5) ^ T(X5)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
Q= R(X1,X2) ^ S(X2,X3) ^ B(X4,X5) ^ T(X5)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
C(x)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
With each current atom A we need to compute
and memorize its type(A). This is in NP
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
Combined Complexity of Query Answering
Proof:
• in PSPACE: there exists a tree-like universal model
A(y,z)
C(x)
D
R(a,v) S(v,w)
B(y,z) T(z)
R(a,v) S(v,w)
Q= R(X1,X2) ^ S(X2,X3) ^ B(X4,X5) ^ T(X5) 8X8Y C(X) ^ A(X,Y) T(Y)
8X8Y A(X,Y) C1(X) ^ C2(Y)
8X8Y A(X,Y) B(X,Y)
With each current atom A we need to compute
and memorize its type(A). This is in NP
Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α0α1...αn-1, assuming that it uses m = nk cells
• Initialization rules 8X initial(X) initial-state(X)
8X initial(X) cell0[α0,1](X)
8X initial(X) celli[αi,0](X), for each i 2 {1,…,n-1}
8X initial(X) celli[0,0](X), for each i 2 {n,…,m-1}
initial-state cell0[α0,1] cell1[α1,0] celln-1[αn-1,0] … celln[0,0] cellm-1[0,0] …
initial
8X config(X) 9Y succ(X,Y)
8X8Y config(X) ^ succ(X,Y) config(Y)
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α0α1...αn-1, assuming that it uses m = nk cells
• Configuration generation rules
8X initial(X) config(X)
Combined Complexity of Query Answering
initial
config
succ[1..1]:config
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α0α1...αn-1, assuming that it uses m = nk cells
• Rules to describe the transition from one configuration to another,
e.g., state transition rules
Combined Complexity of Query Answering
8X8Y s1-celli [α1,1](X) ^ succ(X,Y) state-s2(Y), for each i 2 {0,…,m-1}
for each δ(hs1,α1i) = hs2,α2,di:
in configuration X, which has state s1, the i-th
cell contains α1, and the cursor is over cell i s1-celli [α1,1]
succ[0..1]:state-s2
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine
on input I = α0α1...αn-1, assuming that it uses m = nk cells
• Acceptance rule 8X accept-state(X) accept(X)
• Initial database D = {initial(c)} - c is the initial configuration
• Boolean CQ Q= accept(X)
• Turing machine accepts I iff D [ ² Q
Combined Complexity of Query Answering
accept
accept-state
Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative &
most-specific class & domain-type constraints is EXPTIME-complete
Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative &
most-specific class & domain-type constraints is EXPTIME-complete
Proof:
• in EXPTIME: reduction to assertions 8X8Y body(X,Y) 9Z head(X,Z), where
body(X,Y) has a guard-atom, and all the predicates are of bounded arity
[Calì, G. & Kifer, KR 08]
Combined Complexity of Query Answering
Proof:
• EXPTIME-hardness: simulation of an alternating PSPACE Turing machine
• Acceptance rules - q 2 {9,} and i 2 {1,2}:
8X q-accept-state(X) accept(X)
8X8Y accept(X) ^ succ1(Y,X) accept1(Y)
8X 9-state(X) ^ accept1(X) accept(X)
8X -state(X) ^ accept1(X) ^ accept2(X) accept(X)
accept
q-accept-state
domain-type constraints
most-specific class
constraints
8X8Y accept(X) ^ succ2(Y,X) accept2(Y)
8X 9-state(X) ^ accept2(X) accept(X)
pullback rules
Further Restrictions
Enrolled [0..1]:CS-Course
CS-Student
Enrolled [0..1]:Course
Student
Further Restrictions
different classes have disjoint sets
of attributes and operations
instead of 8X8Y Student(X) ^ Enrolled(X,Y) Course(Y)
we have 8X8Y Student-Enrolled(X,Y) Course(Y)
Data complexity in AC0 and combined complexity NP-complete
Enrolled [0..1]:CS-Course
CS-Student
Enrolled [0..1]:Course
Student
Complexity of Querying UML Class Diagrams
Data
Complexity
Combined
Complexity
Full
Lean
coNP-complete
PTIME-complete
EXPTIME-complete
PSPACE-complete
in AC0 NP-complete
+ negative
+ specific-class
+ domain-type
UML
Formalism
Additional
Constraints
none
Lean + negative
+ specific-class
Restricted
Lean
+ negative
+ specific-class
PTIME-complete
EXPTIME-complete
Datalog± : A Unifying Logical Framework
Description Logics (DL-Lite, EL,…) Relational Constraints
(IDs, FKDs,…)
Datalog±
Datalog
… without losing tractable data complexity
Conceptual Models (UML, ER,…)
Datalog± : A Unifying Logical Framework
Description Logics (DL-Lite, EL,…) Relational Constraints
(IDs, FKDs,…)
Datalog±
Datalog
… without losing tractable data complexity
Conceptual Models (UML, ER,…)
Ontological Reasoning and Datalog
DL Assertion Datalog Rule
Concept Inclusion
emp v person emp(X) person(X)
(Inverse) Role Inclusion
reports¡ v mgr reports(X,Y) mgr(Y,X)
Role Transitivity
trans(mgr) mgr(X,Y) ^ mgr(Y,Z) mgr(X,Z)
Concept Product sen-emp £ emp v moreThan sen-emp(X) ^ emp(Y) moreThan(X,Y)
Ontological Reasoning and Datalog
DL Assertion Datalog Rule
Concept Inclusion
emp v person emp(X) person(X)
(Inverse) Role Inclusion
reports¡ v mgr reports(X,Y) mgr(Y,X)
Role Transitivity
trans(mgr) mgr(X,Y) ^ mgr(Y,Z) mgr(X,Z)
Participation
emp v 9report emp(X) 9Y report(X,Y)
Disjointness
emp u customer v ? emp(X) ^ customer(X) ?
Functionality
funct(reports) reports(X,Y) ^ reports(X,Z) Y = Z
Concept Product sen-emp £ emp v moreThan sen-emp(X) ^ emp(Y) moreThan(X,Y)
Guardedness
8X8Y8Z R(X,Y,Z) ^ S(Y) ^ P(X,Z) 9W Q(X,W)
• All 8-variables occur in one body atom - guard atom
guard
• Query answering is PTIME-complete in data complexity
[Calì, G. & Lukasiewicz, PODS 09]
• Models of finite treewidth ) decidability of query answering
[Calì, G. & Kifer, KR 08]
• Properly extends ELH (same data complexity)
A v 9R.B
Ontology Querying
ELH TBox Datalog§ Representation
A v B
A u B v C
8X A(X) B(X)
8X A(X) ^ B(X) C(X)
8X A(X) 9Y R(X,Y) ^ B(Y)
ELH: Popular DL with PTIME data complexity [Baader, IJCAI 03 and Rosati, DL 07]
9R.A v B 8X8Y R(X,Y) ^ A(Y) B(X)
R v P 8X8Y R(X,Y) P(X,Y)
First-Order Rewritable Datalog± Fragments
Q
Q
compilation
Q
D
First-Order Rewritable Datalog± Fragments
Q
Q
compilation
Q*
translation
D
SQL
First-Order Rewritable Datalog± Fragments
FO+
8D: D [ ² Q , D ² Q*
D
Q
Q
compilation
Q*
translation evaluation
SQL
First-Order Rewritable Datalog± Fragments
Linearity
• Just one atom in the body
• Linear TGDs are first-order rewritable [Calì, G. & Lukasiewicz, PODS 09]
Query answering in AC0 data complexity
• Linear TGDs are trivially guarded
• Properly extends DL-Lite (same data complexity)
guard
8X8Y R(X,Y) ! 9Z (X,Z)
• Polynomial-size rewriting (SQL DL = NR Datalog) [G. & Schwentick, KR 12]
Ontology Querying
DL-Lite TBox Datalog§ Representation
A v B
A v 9R
8X A(X) B(X)
8X A(X) 9Y R(X,Y)
8X8Y R(X,Y) A(X)
DL-Lite: Popular family of DLs with AC0 data complexity (OWL 2 QL) [Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, JAR 07]
9R v A
R v P 8X8Y R(X,Y) P(X,Y)
Finite Controllability
D [ ² Q , D [ ²fin Q ?
• Holds for inclusion dependencies
[Rosati, PODS 06]
• Holds for guarded Datalog± (in fact, for the guarded fragment)
[Bárány, G. & Otto, LICS 10]
• Different from finite-model property of the guarded fragment:
If D [ [ Q has a model, then D [ [ Q it has a finite one.
• For each attribute assertion Attribute[ i..j ]:Type: j = 1
• For each association A: mU = nU = 1
• Disjointness and negative constraints are forbidden
• Most-specific class and domain-type constraints are allowed
Finite Controllability and Lean UML Class Diagrams
C1 C2 mL..mU nL..nU A
set of guarded TGDs ) finite controllability holds
Datalog§: Overview
8X8Y R(X,Y,Y) ! 9Z P(X,Z)
Linear Guarded
Datalog§: Overview
8X8Y R(X,Y,Y) ! 9Z P(X,Z) 8X8Y8Z R(X,Y) ^ S(Y,Z) ! 9W P(Y,W)
Linear Guarded Sticky-join
Datalog§: Overview
Sticky-join Linear Sticky Guarded
DL-Lite
ELH
PTIME-complete FO-rewritable
IDs + FKDs Lean UCDs
Datalog§: Summary of Complexity Results
Data Fixed Combined
Guarded
Linear
PTIME-complete
in AC0
NP-complete
NP-complete
2EXPTIME-complete
PSPACE-complete
Same complexity with negative constraints and non-conflicting EGDs
Sticky-join in AC0 NP-complete EXPTIME-complete
Datalog§: Next Steps
Linear Guarded Sticky-join
PTIME-complete FO-rewritable
?
PTIME-complete
…with disjunction in the head
… finite-model reasoning
Thank you!
But…
• What about joins in rule bodies?
8E8M elephant(E) ^ mouse(M) biggerThan(E,M)
• What about the DL assertion concept product?
8A8D8P runs(D,P) ^ area(P,A) 9E employee(E,D,P,A)
But…
• What about joins in rule bodies?
8E8M elephant(E) ^ mouse(M) biggerThan(E,M)
• What about the DL assertion concept product?
8A8D8P runs(D,P) ^ area(P,A) 9E employee(E,D,P,A)
8X8Y R(X,Y) 9Z R(Y,Z)
8X8Y R(X,Y) S(X)
8X8Y S(X) ^ S(Y) P(X,Y)
Infinitely many symbols in S
P forms an infinite clique
No tree-like models guaranteed
Stickiness
8A8D8P runs(D,P) ^ area(P,A) 9E employee(E,D,P,A)
8E8M elephant(E) ^ mouse(M) biggerThan(E,M)
Stickiness
8X8Y8Z R(X,Y) ^ P(Y,Z) 9W T(X,Y,W)
8X8Y8Z T(X,Y,Z) 9W S(Y,W)
8X8Y8Z R(X,Y) ^ P(Y,Z) 9W T(X,Y,W)
8X8Y8Z T(X,Y,Z) 9W S(X,W)
Stickiness
• Properly generalize inclusion dependencies
• Backward-resolution terminates
• Query answering is in AC0 in data complexity (first-order rewritability)
[Calì, G. & Pieris, VLDB 10]
• Properly extends DL-Lite (same data complexity)
Additional Features
• EGDs, e.g., 8X8Y8Z reports(X,Y) ^ reports(X,Z) ! Y = Z
Non-Conflicting EGDs: do not interact with TGDs
Preliminary check without adding complexity
• Negative constraints, e.g., 8X emp(X) ^ customer(X) ! ?
Check without adding complexity
8X8Y R(X,Y) 9Z R(Y,Z)
8X8Y8Z R(Y,X) ^ R(Z,X) Y = Z
D = {R(a,b)}
Q R(A,a)
=
D [ ² Q
but
D [ ²fin Q
Finite controllability does not hold
Additional Features
• EGDs, e.g., 8X8Y8Z reports(X,Y) ^ reports(X,Z) ! Y = Z
Non-Conflicting EGDs: do not interact with TGDs
Preliminary check without adding complexity
• Negative constraints, e.g., 8X emp(X) ^ customer(X) ! ?
Check without adding complexity
Comparison with ER§ Schemata
Lean UML ER§
IS-A among classes
IS-A among associations
¸n participation
·1 participation
Permutation on IS-A
Values of attributes
Operations
Attribute re-use
ER§: Extended ER formalism with AC0 data complexity [Calì, G. & Pieris, Information Systems 2012]
n ¸ 0 n 2 {0,1}
complex atomic
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
person(P) 9F father(F,P) father(F,P) person(F)
D
chase(D,) = D [ ?
person(P) 9F father(F,P) father(F,P) person(F)
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
D
chase(D,) = D [ {father(z1,john)
person(P) 9F father(F,P) father(F,P) person(F)
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
D
chase(D,) = D [ {father(z1,john), person(z1)
person(P) 9F father(F,P) father(F,P) person(F)
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
D
chase(D,) = D [ {father(z1,john), person(z1), father(z2,z1)
person(P) 9F father(F,P) father(F,P) person(F)
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
D
chase(D,) = D [ {father(z1,john), person(z1), father(z2,z1), …}
The Chase Procedure
Input: Database D, set of TGDs
Output: A model of D [
person(john)
D
chase(D,) = D [ {father(z1,john), person(z1), father(z2,z1), …}
infinite instance
person(P) 9F father(F,P) father(F,P) person(F)
Query Answering via Chase
[see, e.g., Deutsch, Nash & Remmel, PODS 08]
D [ ² Q , chase(D,) ² Q
D
. . .
C = chase(D,)
M1
M2
h1 h2
h1(C) h2(C)
Q h
Bounded Derivation-Depth Property (BDDP)
Q
D
chase(D,)
P
constant depth
w.r.t. D
chase(D,) ² Q ) P ² Q
[Calì, G. & Lukasiewitcz, PODS 09]
BDDP ) First-Order Rewritability
Bounded Derivation-Depth Property (BDDP)
Q
D
chase(D,)
P
[Calì, G. & Lukasiewitcz, PODS 09]
constant depth
w.r.t. D
First-Order Rewritable TGDs
rewriting
Q
Q
8D: D [ ² Q , D ² Q
Query answering is in AC0 in data complexity [Vardi, PODS 95]
Standard conceptual modeling tool for software design
Reasoning over UML models
Model checking
Specification recovery
Software maintenance
OMG UML (Unified Modeling Language)
Stock
0..1
Member
Owns
Competes
0..1
1..1
0..1
1..1
1..1
1..1
2..1
Company
Executive Person
Issues Index[0..1]:Str
getIndex():List
Class Diagrams