dept. of cis, temple univ. cis616/661 – principles of data management

59
1 Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management V. Megalooikonomou Integrity Constraints (based on slides by C. Faloutsos at CMU)

Upload: forbes

Post on 22-Mar-2016

42 views

Category:

Documents


2 download

DESCRIPTION

Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management. V. Megalooikonomou Integrity Constraints (based on slides by C. Faloutsos at CMU). Constraints:. Integrity constraints in the E-R Model: Key cardinalities of a Relationship. Overview. Domain; Ref. Integrity constraints - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

1

Dept. of CIS, Temple Univ.CIS616/661 – Principles of Data Management

V. MegalooikonomouIntegrity Constraints

(based on slides by C. Faloutsos at CMU)

Page 2: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Constraints:Integrity constraints in the E-R

Model: Key cardinalities of a Relationship

Page 3: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Overview Domain; Ref. Integrity constraints Assertions and Triggers Functional dependencies

Page 4: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Domain Constraints Domain Types, eg, SQL

Fixed Length characters Int; float; (date)

null values eg., create table student( ssn char(9) not

null, ...)

Page 5: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Referential Integrity constraints‘foreign keys’ - eg:

create table takes(ssn char(9) not null,

c-id char(5) not null, grade integer, primary key(ssn, c-id), foreign key ssn references student, foreign key c-id references class)

Page 6: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Referential Integrity constraints …

foreign key ssn references student, foreign key c-id references class)

Effect: expects that ssn exists in ‘student’ table blocks ops that violate that - how??

insertion? deletion/update?

Page 7: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Referential Integrity constraints …

foreign key ssn references student on delete cascade on update cascade,...

-> eliminate all student enrollments other options (set to null, to default

etc)

Page 8: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Weapons for IC: assertions

create assertion <assertion-name> check <predicate>

triggers (~ assertions with ‘teeth’) on operation, if condition, then

action

Page 9: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Triggers - exampledefine trigger zerograde on

update takes(if new takes.grade < 0 then takes.grade = 0)

Page 10: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Triggers - discussion more complicated: “managers

have higher salaries than their subordinates” - a trigger can automatically boost mgrs salaries

triggers: tricky (infinite loops…)

Page 11: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Overview Domain; Ref. Integrity constraints Assertions and Triggers Functional dependencies

why definition Armstrong’s “axioms” closure and cover

Page 12: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies motivation: ‘good’ tables

takes1 (ssn, c-id, grade, name, address)

‘good’ or ‘bad’?

Page 13: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependenciestakes1 (ssn, c-id, grade, name,

address)Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 14: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies‘Bad’ - why?

Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 15: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional Dependencies Redundancy

space inconsistencies insertion/deletion anomalies (later…)

What caused the problem?

Page 16: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies ‘name’ depends on the ‘ssn’ define ‘depends’

Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 17: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependenciesDefinition: ‘a’ functionally determines ‘b’

Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

ba

Page 18: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependenciesInformally: ‘if you know ‘a’, there is

only one ‘b’ to match’Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 19: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependenciesformally:

if two tuples agree on the ‘X’ attribute,they *must* agree on the ‘Y’ attribute, too(eg., if ssn is the same, so should address)

… a functional dependency is a generalization of the notion of a key

])[2][1][2][1( ytytxtxtYX

Page 20: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies ‘X’, ‘Y’ can be sets of attributes other examples??

Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 21: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies ssn -> name, address ssn, c-id -> grade

Ssn c-id Grade Name Address

123 413 A smith Main

123 415 B smith Main

123 211 A smith Main

Page 22: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependencies

K is a superkey for relation R iff K -> R

K is a candidate key for relation R iff:K -> Rfor no a K, a -> R

Page 23: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Functional dependenciesClosure of a set of FD: all implied

FDs - eg.:ssn -> name, addressssn, c-id -> grade

implyssn, c-id -> grade, name, addressssn, c-id -> ssn

Page 24: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsClosure of a set of FD: all implied

FDs - eg.:ssn -> name, addressssn, c-id -> grade

how to find all the implied ones, systematically?

Page 25: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axioms“Armstrong’s axioms” guarantee

soundness and completeness: Reflexivity: eg., ssn, name -> ssn Augmentation

eg., ssn->name then ssn,grade-> ssn,grade

YXXY

YWXWYX

Page 26: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axioms Transitivity

ssn->address address-> county-tax-rateTHEN:

ssn-> county-tax-rate

ZXZYYX

Page 27: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsReflexivity: Augmentation: Transitivity:

ZX

ZYYX

YXXY

YWXWYX

‘sound’ and ‘complete’

Page 28: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs – finding the closure F+

F+ = Frepeat

for each functional dependency f in F+

apply reflexivity and augmentation rules on f add the resulting functional dependencies to F+

for each pair of functional dependencies f1and f2 in F+

if f1 and f2 can be combined using transitivity then add the resulting functional dependency to F+

until F+ does not change any further

We can further simplify manual computation of F+ by using the following additional rules.

Page 29: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsAdditional rules: Union

Decomposition Pseudo-transitivity

ZXW

ZYWYX

ZXYX

YZX

YZXZXYX

Page 30: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsProve ‘Union’ from three axioms:

YZXZXYX

?

Page 31: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsProve ‘Union’ from three axioms:

YZXtytransitiviandthusXisXXbut

XZXXXwaugmYZXZZwaugm

ZXYX

)4()3(;

)4(/.)2()3(/.)1(

)2()1(

Page 32: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsProve Pseudo-transitivity:

ZXWZYWYX

?

ZXZYYX

YXXY

YWXWYX

Page 33: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Armstrong’s axiomsProve Decomposition

ZXZYYX

YXXY

YWXWYX

ZXYX

YZX?

Page 34: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure F+Given a set F of FD (on a schema)F+ is the set of all implied FD. Eg.,takes(ssn, c-id, grade, name,

address)ssn, c-id -> grade

ssn-> name, address}F

Page 35: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure F+

ssn, c-id -> grade ssn-> name, address ssn-> ssn ssn, c-id-> address c-id, address-> c-id ...

F+

Page 36: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure F+ R=(A,B,C,G,H,I)

F= { A->BA->CCG->HCG->IB->H}

Some members of F+:A->HAG->ICG->HI

Page 37: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+Given a set F of FD (on a schema)A+ is the set of all attributes determined

by A:takes(ssn, c-id, grade, name, address)

ssn, c-id -> grade ssn-> name, address

{ssn}+ =??

}F

Page 38: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+takes(ssn, c-id, grade, name,

address)ssn, c-id -> grade

ssn-> name, address

{ssn}+ ={ssn, name, address }

}F

Page 39: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+takes(ssn, c-id, grade, name,

address)ssn, c-id -> grade

ssn-> name, address

{c-id}+ = ??

}F

Page 40: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+takes(ssn, c-id, grade, name,

address)ssn, c-id -> grade

ssn-> name, address

{c-id, ssn}+ = ??

}F

Page 41: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+if A+ = {all attributes of table}then ‘A’ is a candidate key

Page 42: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+ Algorithm to compute +, the closure of

under Fresult := ;while (changes to result) do

for each in F do begin

if result then result := result

end

Page 43: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - Closure A+ (example)

R = (A, B, C, G, H, I) F = {A B, A C, CG H, CG I, B H} (AG)+

1.result = AG2.result = ABCG (A C and A B)3.result = ABCGH (CG H and CG AGBC)4.result = ABCGHI (CG I and CG AGBCH)

Is AG a candidate key? 1. Is AG a super key?

1. Does AG R? 2. Is any subset of AG a superkey?

1. Does A+ R?2. Does G+ R?

Page 44: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - A+ closureDiagrams

AB->C (1)A->BC (2)B->C (3)A->B (4)

CA

B

Page 45: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcGiven a set F of FD (on a schema)Fc is a minimal set of equivalent FD. Eg.,takes(ssn, c-id, grade, name, address)

ssn, c-id -> grade ssn-> name, address ssn,name-> name, address ssn, c-id-> grade, name

F

Page 46: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ Fcssn, c-id -> grade

ssn-> name, address ssn,name-> name, address ssn, c-id-> grade, name

F

Fc

Page 47: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ Fc why do we need it? define it properly compute it efficiently

Page 48: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ Fc why do we need it?

easier to compute candidate keys define it properly compute it efficiently

Page 49: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ Fc define it properly - three properties

every FD a->b has no extraneous attributes on the RHS

ditto for the LHS all LHS parts are unique

Page 50: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

‘extraneous’ attribute: if the closure is the same, before and

after its elimination or if F-before implies F-after and vice-

versa

FDs - ‘canonical cover’ Fc

Page 51: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ Fcssn, c-id -> grade

ssn-> name, address ssn,name-> name, address ssn, c-id-> grade, name

F

Page 52: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcAlgorithm:

examine each FD; drop extraneous LHS or RHS attributes

merge FDs with same LHS repeat until no change

Page 53: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcTrace algo for

AB->C (1)A->BC (2)B->C (3)A->B (4)

Page 54: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcTrace algo for

AB->C (1)A->BC (2)B->C (3)A->B (4) (4) and (2)

merge:

AB->C (1)A->BC (2)B->C (3)

Page 55: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcAB->C (1)A->BC (2)B->C (3)

in (2): ‘C’ is extr.

AB->C (1)A->B (2’)B->C (3)

Page 56: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcAB->C (1)A->B (2’)B->C (3)

in (1): ‘A’ is extr.

B->C (1’)A->B (2’)B->C (3)

Page 57: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcB->C (1’)A->B (2’)B->C (3)

(1’) and (3) merge

A->B (2’)B->C (3)

nothing is extraneous: ‘canonical cover’

Page 58: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

FDs - ‘canonical cover’ FcAFTER

A->B (2’)B->C (3)

BEFOREAB->C (1)A->BC (2)B->C (3)A->B (4)

Page 59: Dept. of CIS, Temple Univ. CIS616/661 – Principles of Data Management

Overview - conclusions Domain; Ref. Integrity constraints Assertions and Triggers Functional dependencies

why definition Armstrong’s “axioms” closure and cover