ch 4: relational database design -...
TRANSCRIPT
Ch 4: Relational Database
Design
4.1 Features of Good Relational
designs
Four Informal measures
•Semantics of the relation attributes
•Reducing the redundant values in tuples.
•Reducing the null values
•Disallowing the possibility of generating spurious(wrong) tuples
1. Semantics of the relation attributes
• Design a reln schema so that it is easy to explain its
meaning .Do not combine attributes from multiple entity
types & relationship into single relation.
• In general ,the easier it is to explain the semantics of
relation, the better the relation schema design will be.
3
4
2. Redundant Information in Tuples & Update Anomalies
• Goal of dbase is to reduce storage space used by
relationship.
• Grouping attributes into relation schemas has a
significant effect on storage space.
• For e.g. if we combine Employee with department &
project , works_on will result into EMP_DEPT &
EMP_PRJ.
• Resultant relation shows repetition of several values
leading to higher storage.
• Other serious problems is of Update anomalies which is
classified as insert, delete & modification anomalies.
5
Insertion Anomalies
• Insert a new emp , we must include value for dept or need to
place NULL (if emp doesnot work for dept yet.) & need to enter
correctly so consistency problem donot occur.
• It is difficult to enter new dept that has no employee as we
cannot insert Null in ENO as its is primary key.
• Consider the relation:
– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
• Insert Anomaly:
– Cannot insert a project unless an employee is assigned to it.
• Conversely
– Cannot insert an employee unless a he/she is assigned to a
project.
6
7
8
Delation Anomalies
• If we delete from EMP_DEPT an employee that happens
to represent the last employee in that dept , the info abt
dept is also lost from dbase.
Consider the relation:
EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)
Delete Anomaly:
When a project is deleted, it will result in deleting all the
employees who work on that project.
Alternately, if an employee is the sole employee on a
project, deleting that employee would result in deleting
the corresponding project.
9
10
Modification Anomalies
• If we change the value of one of attributes say that of
manager of dept 5 , we need to change in each tuple
where dept no is 5, else it will lead to inconsistent.
• Design Dbase so that no insertion, deletion &
modification anomalies are present . 11
12
13
3. Null values in tuples
• NULL have multiple interpretations such as:
• Attributes does not apply to this tuple.
• Attributes value for this tuple is unknown.
• Value is known but absent, i.e. it has not been
recorded yet.
• Problem may occur in JOIN , n aggregrate operations.
14
4. Generation of Spurious Tuples
• Design relation schemas so that they can be joined with
equality condition on attributes that are primary or
foreign key.
• E.g. cross join leads to spurious tuples
15
16
17
4.2 Functional Dependencies (FD)
• A functional dependency, denoted by X Y (Read
X functionally determines Y), between two sets of
attributes X and Y that are subsets of R specifies a
constraint on possible tuples that can be form a
relation state r of R.
• The constraint is that for for all pairs of tuples t1
and t2 in r such that
t1 [X ] = t2 [X ] , they must also have,
t1 [Y] = t2 [Y].
In other words
Whenever two tuples of r agree on their X value, they
also agree on their Y value.
18
Functional Dependency
• Main concept associated with normalization.
• Functional Dependency
– Describes relationship between attributes in a
relation.
– If A and B are attributes of relation R, B is
functionally dependent on A (denoted A B), if
each value of A in R is associated with exactly one
value of B in R.
Example: Consider r(A,B ) with the following instance of r.
On this instance, A B does NOT hold, but B A does hold.
1 4
1 5
3 7
19
Functional Dependency
• Property of the meaning (or semantics) of the
attributes in a relation.
• Diagrammatic representation:
Determinant of a functional dependency refers
to attribute or group of attributes on left-hand
side of the arrow.
20
Example - Functional Dependency
21
Functional Dependencies (Cont.)
• K is a superkey for relation schema R if and only if K R
• K is a candidate key for R if and only if
– K R, and
– for no K, R
• Functional dependencies allow us to express constraints that
cannot be expressed using superkeys. Consider the schema:
bor_loan = (customer_id, loan_number, amount ).
We expect this functional dependency to hold:
loan_number amount
but would not expect the following to hold:
amount customer_name
22
Functional Dependencies (Cont.)
• Main use of FD is to describe further relation
schema R by specifying constraints on its
attributes that must hold all times.
• Certain FD can be specified without referring to
specific relation.
• {state, driving_licence} ENO
• {pincode) area
• {telephone code}city
23
Functional Dependencies (Cont.)
ENOEname
Pnumber{pname,plocation}
{eno,pnumber}hours
• Eno uniquely determines emp name
• Pnumber uniquely determines project name & location
• Combination of eno , pnumber uniquely determines
numbers of hours that employee had worked on that
project.
24
EXAMPLE :: TEACH
TEACHER COURSE TEXT
GUPTA CHEMISTRY SAHANI
GUPTA MATHS NAVATHE
KUMAR BIOLOGY HOFFMAN
GOYAL CHEMISTRY KAHATE
Possible FD’s
TEXTCOURSE hold
But,
TEACHERCOURSE is ruled out.
25
Trivial & Non trivial FD
• Trivial FD
– A FD XY is trivial if Y , right hand side of FD is subset of X.
• Non Trivial FD
– A FD XY is non trivial if Y , right hand side of FD is not subset
of X.
• F= ENO{ ENAME,DOB,ADDRESS,DNUM}
DNUMBER{DNAME,MGRNO}
ENOENO trivial
DUMBERDNAME trivial
ENO Ename trivial
DUMBER mgrno trivial
ENO DNAME,MGRNO NON Trivial
26
• In practice, trivial dependencies are not really
interesting, we are interested in practice non
trivial FD only. And goal should be to reduce
trivial FD from given set of FD.
• FIND FD
27
A B C
a1 b1 c1
a1 b1 c2
a2 b1 c1
a2 b1 c3
Closure (of set of dependencies)
• The set of all FDs that include F as well as all
dependencies that are implied by a given set F of
FDs is called the closure of F, denoted by F+.
• F= ENO{ ENAME,DOB,ADDRESS,DNUM}
DNUMBER{DNAME,MGRNO}
SOME ADDITIONAL FD’S ARE
ENO DNAME,MGRNO
ENOENO
DUMBERDNAME
28
Inference rules for FD Or AXIOMS
1. Reflexive: if B is a subset of A, then A B.
2. Augmentation: if A B then AC BC
3. Transitivity: it A B and B C then A C.
4. Self – determination: A A.
5. Decomposition: If A BC, then AB, AC.
6. Union: it A B and A C, then A BC
7. Composition: if A B, C D then AC BD.
8. Pseudo transitive : if A B and rBC then
Ar C.
• To find closure need to apply this rules
repeatedly until we stop producing new FD
29
ARMSTRONG AXIOMS
• First three axioms
1. Reflexive: if B is a subset of A, then A B.
2. Augmentation: if A B then AC BC
3. Transitivity: it A B and B C then A C.
are sound & complete
By sound, we mean that given a set of FD on relation R,
any dependency that can infer from F holds in
every reln satisfies the dependencies. They do not
generate incorrect FD.
By complete, we mean that using 3 FD repeatedly to a
complete set of all possible dependencies that can
be inferred from F.
30
Example: find closure of F
• R = (A, B, C, G, H, I)
F = { A B
A C
CG H
CG I
B H}
• some extra members of F+
– A H
• by transitivity from A B and B H
– AG I
• by augmenting A C with G, to get AG CG
and then transitivity with CG I
– CG HI
• by union CG I & CG H
• OR
• by augmenting CG I to infer CG CGI,
and augmenting of CG H to infer CGI HI,
and then transitivity 31
Example: find closure of F
• R = (A, B, C, D,E,F)
F = { A BC
B E
CD EF
}
• some extra members of F+
– A BC GIVEN
– A C DECOMPOSITION
– AD CD AUGMENTATION
– CD EF GIVEN
– AD EF TRANSITIVITY
– AD F DECOMPOSITION
32
Example of Attribute Set Closure
• R = (A, B, C, G, H, I)
• F = {A B A C CG H CG I B H}
• (AG)+
1. result = AG
2. result = ABCG (A C and A B)
3. result = ABCGH (CG H)
4. result = ABCGHI (CG I)
33
result := ;
while (changes to result) do
for each in F do
begin
if result then result :=result
end
Example of Attribute Set Closure
• R = (A, B, C, D ,E ,F)
• F = {A BC E CF B E
CD EF}
COMPUTE (AB)+
1. result = AB
2. result = ABC (A BC)
3. result = ABC (E CF ) LHS RESULT
4. result = ABCE (B E)
5. result = ABCE (CD EF) LHS RESULT
6. result = ABCEF (E CF ) AGAIN LOOP
(AB)+ = {A,B,C,E,F}
34
result := ;
while (changes to result) do
for each in F do
begin
if result then result :=result
end
Example
• F={A,B,C,D,E,F}
FD={ ABC , BE, CDEF}
PROVE THAT ADF
** AB AC FROM ABC
CDE CDF FROM CDEF
AB , BE == AE
AE CDF
A BC
ADBCD AUGMENTATION ADDING C
ADEEF FROM CDEF & BE
ADEF ADE
ADF 35
EXAMPLE
• F={A,B,C,D}
FD={ AB , AC, BCD}
PROVE THAT AD
• R={A,B,C,D,E,F}
FD={ AB , AC, CDE, CDF, BE}
FIND CLOSURE OF FD
• R={A,B,C,D,E}
FD={ ABC , BD, EA, CDE}
FIND CLOSURE OF FD
• R={A,B,C,D,E,F}
FD={ ABC , ECF, BE, CDEF}
FIND CLOSURE OF {A,B}+
36
Canonical Cover
• Sets of functional dependencies may have redundant
dependencies that can be inferred from the others
– For example: A C is redundant in: {A B, B C}
– Parts of a functional dependency may be redundant
• E.g.: on RHS: {A B, B C, A CD} can be
simplified to
{A B, B C, A D}
• E.g.: on LHS: {A B, B C, AC D} can be
simplified to
{A B, B C, A D}
• Intuitively, a canonical cover of F is a ―minimal‖ set of
functional dependencies equivalent to F, having no
redundant dependencies or redundant parts of
dependencies
37
Extraneous Attributes
• Consider a set F of functional dependencies and the functional
dependency in F.
– Attribute A is extraneous in if A
and F logically implies (F – { }) {( – A) }.
– Attribute A is extraneous in if A
and the set of functional dependencies
(F – { }) { ( – A)} logically implies F.
• Note: implication in the opposite direction is trivial in each of the
cases above, since a ―stronger‖ functional dependency always
implies a weaker one
• Example: Given F = {A C, AB C }
– B is extraneous in AB C because {A C, AB C} logically
implies A C (I.e. the result of dropping B from AB C).
• Example: Given F = {A C, AB CD}
– C is extraneous in AB CD since AB C can be inferred even
after deleting C
38
Testing if an Attribute is Extraneous
• Consider a set F of functional dependencies and the functional
dependency in F.
• To test if attribute A is extraneous in
1. compute ({} – A)+ using the dependencies in F
2. check that ({} – A)+ contains ; if it does, A is extraneous in
• To test if attribute A is extraneous in
1. compute + using only the dependencies in
F’ = (F – { }) { ( – A)},
2. check that + contains A; if it does, A is extraneous in
39
Computing a Canonical Cover
• R = (A, B, C)
F = {A BC
B C
A B
AB C}
• Combine A BC and A B into A BC
– Set is now {A BC, B C, AB C}
• A is extraneous in AB C
– Check if the result of deleting A from AB C is implied by the other
dependencies
• Yes: in fact, B C is already present!
– Set is now {A BC, B C}
• C is extraneous in A BC
– Check if A C is logically implied by A B and the other
dependencies
• Yes: using transitivity on A B and B C.
– Can use attribute closure of A in more complex cases
• The canonical cover is: A B
B C 40
e.g.
• Now we define a set of FD to be irreducible as minimal; if and only if it satisfies the following two properties.
(1) The right hand side of every FD in S involve just one attribute (i.e., it is a singleton set)
(2) The left hand side of every FD in S is irreducible in turn meaning that no attribute can be discarded from the determinant without changing the CLOSURE S+.
41
Minimal Sets of FDs
• A set of FDs is minimal if it satisfies the following conditions:
1. Every dependency in F has a single attribute for its RHS.
2. We cannot remove any dependency from F and have a set of dependencies that is equivalent to F.
3. We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y proper-subset-of X ( Y subset-of X) and still have a set of dependencies that is equivalent to F.
42
Example
• A BC,
• B C
• A B
• AB C
• AC D
Compute an irreducible set of FD that is equivalent
to this given set.
Bring answer in reducible
form
43
Solution
(1) The step is to rewrite the FD such that each has a singleton right hand side.
• A B
• A C
• B C
• A B
• AB C
• AC D
We observe that the FD A B occurs twice. So one occurrence will be eliminated.
A BC,
B C
A B
AB C
AC D
44
Solution
2. Next, attributed C can be eliminated from the
left hand side of the FD AC D
• Because we have A C,
• By augmentation AA AC
• A AC
(Augmentation: if X Y then XZ YZ)
• And we are given AC D,
• A AC AC D
• So A D by transitivity;
Thus C on the left hand side is redundant.
A C B C A B
AB C AC D
45
Solution
3. Next, we observe that the FD AB C can be eliminated, because again we have
A C
By augmentation AB CB
By decomposition AB C AB B
4. Finally, the FD A C is implied by the FD A B and B C, by transitivity so it can be eliminated.
Now we have A B
B C
A D
This set is irreducible.
A C B C A B
AB C A D
46
Computing the Minimal Sets of FDs
Let the given set of FDs be
E : {B → A, D → A, AB → D}.
We have to find the minimum cover of E.
47
E : {B → A, D → A, AB → D }
■ All above dependencies are in canonical form; (RHS has single value)so we have completed step 1 and can proceed to step 2.
In step 2 we need to determine
if AB → D has any redundant attribute on the left-hand side; that is, can it be
replaced by B → D or A → D?
■ Since B → A, by augmenting with B on both sides (IR2), we have BB → AB, or
B → AB (i). However, AB → D as given (ii).
■ Hence by the transitive rule (IR3), we get from (i) and (ii), B → D. Hence
AB → D may be replaced by B → D.
48
Old E : {B → A, D → A, AB → D }
NEW:
We now have a set equivalent to original E , say E′ : {B → A, D → A, B → D}.
No further reduction is possible in step 2 since all FDs have a single attribute
on the left-hand side.
■ In step 3 we look for a redundant FD in E′. By using the transitive rule on
B → D and D → A, we derive B → A. Hence B → A is redundant in E’ and can
be eliminated.
■ Hence the minimum cover of E is
{B → D, D → A}.
49
Normalization of Relations
• Normalization:
– The process of decomposing unsatisfactory "bad"
relations by breaking up their attributes into smaller
relations
• Normal form:
– Condition using keys and FDs of a relation to certify
whether a relation schema is in a particular normal form
50
Normalization of Relations
• 2NF, 3NF, BCNF
– based on keys and FDs of a relation schema
• 4NF
– based on keys, multi-valued dependencies : MVDs; 5NF
based on keys, join dependencies : JDs
• Additional properties may be needed to ensure a
good relational design (lossless join, dependency
preservation)
Practical Use of Normal Forms
• Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties
• The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect
• The database designers need not normalize to the highest possible normal form – (usually up to 3NF, BCNF or 4NF)
• Denormalization: – The process of storing the join of higher normal
form relations as a base relation—which is in a lower normal form
52
Definitions of Keys and Attributes
Participating in Keys (1)
• A superkey of a relation schema R = {A1, A2, ....,
An} is a set of attributes S subset-of R with the
property that no two tuples t1 and t2 in any legal
relation state r of R will have t1[S] = t2[S]
• A key K is a superkey with the additional property
that removal of any attribute from K will cause K
not to be a superkey any more.
Definitions of Keys and Attributes
Participating in Keys (2)
• If a relation schema has more than one key, each
is called a candidate key.
– One of the candidate keys is arbitrarily designated to be
the primary key, and the others are called secondary
keys.
• A Prime attribute must be a member of some
candidate key
• A Nonprime attribute is not a prime attribute—
that is, it is not a member of any candidate key.
First Normal Form
• Disallows
– composite attributes
– multivalued attributes
– nested relations; attributes whose values for an
individual tuple are non-atomic
• Considered to be part of the definition of relation
Normalization into 1NF
Slide 10- 57
Normalization of nested relations into 1NF
3.3 Second Normal Form (1)
• Uses the concepts of FDs, primary key
• Definitions
– Prime attribute: An attribute that is member of the primary key K
– Full functional dependency: a FD Y -> Z where removal of any attribute from Y means the FD does not hold any more
• Examples:
– {SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor PNUMBER -> HOURS hold
– {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency ) since SSN -> ENAME also holds
58
Second Normal Form (2)
• A relation schema R is in second normal form
(2NF) if every non-prime attribute A in R is fully
functionally dependent on the primary key
• R can be decomposed into 2NF relations via the
process of 2NF normalization
Normalizing into 2NF
4 General Normal Form Definitions (For
Multiple Keys) (1)
• The above definitions consider the primary key
only
• The following more general definitions take into
account relations with multiple candidate keys
• A relation schema R is in second normal form
(2NF) if every non-prime attribute A in R is fully
functionally dependent on every key of R
General Normal Form Definitions (3NF)
• Definition:
– Superkey of relation schema R - a set of attributes S of
R that contains a key of R
– A relation schema R is in third normal form (3NF) if
whenever a FD X -> A holds in R, then either:
• (a) X is a superkey of R, or
• (b) A is a prime attribute of R
i.e. either LHS or RHS should have some key
attributes.
• NOTE: Boyce-Codd normal form disallows
condition (b) above
3.4 Third Normal Form (1)
• Definition: – Transitive functional dependency: a FD X -> Z that can
be derived from two FDs X -> Y and Y -> Z
• Examples: – SSN -> DMGRSSN is a transitive FD
• Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
– SSN -> ENAME is non-transitive
• Since there is no set of attributes X where SSN -> X and X -> ENAME
Third Normal Form (2)
• A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key
• R can be decomposed into 3NF relations via the process of 3NF normalization
• NOTE:
– In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.
– When Y is a candidate key, there is no problem with the transitive dependency .
– E.g., Consider EMP (SSN, Emp#, Salary ).
• Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
65
Normal Forms Defined Informally
• 1st normal form
– All attributes depend on the key
• 2nd normal form
– All attributes depend on the whole key
• 3rd normal form
– All attributes depend on nothing but the key
Successive Normalization of LOTS into 2NF and 3NF
SUMMARY OF NORMAL FORMS
based on Primary Keys
5 BCNF (Boyce-Codd Normal Form)
• A relation schema R is in Boyce-Codd Normal
Form (BCNF) if whenever an FD X -> A holds in R,
then X is a superkey of R
• Each normal form is strictly stronger than the
previous one
– Every 2NF relation is in 1NF
– Every 3NF relation is in 2NF
– Every BCNF relation is in 3NF
• There exist relations that are in 3NF but not in
BCNF
• The goal is to have each relation in BCNF (or 3NF)
Boyce-Codd Normal Form
Area is not
superkey
A relation TEACH that is in 3NF but not in
BCNF
Achieving the BCNF by Decomposition
(1)
• Two FDs exist in the relation TEACH:
– fd1: { student, course} -> instructor
– fd2: instructor -> course
• {student, course} is a candidate key for this relation and that the dependencies shown follow the pattern in Figure 10.12 (b).
– So this relation is in 3NF but not in BCNF
• A relation NOT in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations.
Achieving the BCNF by Decomposition
(2)
• Three possible decompositions for relation TEACH
– {student, instructor} and {student, course}
– {course, instructor } and {course, student}
– {instructor, course } and {instructor, student}
• All three decompositions will lose fd1.
– We have to settle for sacrificing the functional dependency preservation. But we cannot sacrifice the non-additivity property after decomposition.
• Out of the above three, only the 3rd decomposition will not generate spurious tuples after join.(and hence has the non-additivity property).
Slide 11- 74
3. Multivalued Dependencies and Fourth
Normal Form (1)
(a) The EMP relation with two MVDs: ENAME —>>
PNAME and ENAME —>> DNAME.
(b) Decomposing the EMP relation into two 4NF relations
EMP_PROJECTS and EMP_DEPENDENTS.
Slide 11- 75
3. Multivalued Dependencies and Fourth
Normal Form (1)
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if
it has the JD(R1, R2, R3). (d) Decomposing the relation SUPPLY
into the 5NF relations R1, R2, and R3.
Fourth Normal Fourth:
definition
• Fourth normal form (or 4NF) requires that
there are no non-trivial multi-valued
dependencies of attribute sets on
something other than a superset of a
candidate key. A table is said to be in 4NF
if and only if it is in the BCNF and multi-
valued dependencies are functional
dependencies. The 4NF removes
unwanted data structures ie.. multi-valued
dependencies
Slide 11- 77
Multivalued Dependencies and Fourth
Normal Form (4)
Definition:
• A relation schema R is in 4NF with respect to a set of dependencies F (that includes functional dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X —>> Y in F+, X is a superkey for R.
– Note: F+ is the (complete) set of all dependencies (functional or multivalued) that will hold in every relation state r of R that satisfies F. It is also called the closure of F.
Slide 11- 78
Multivalued Dependencies and Fourth Normal
Form (5)
Decomposing a relation state of EMP that is not in 4NF:
(a) EMP relation with additional tuples.
(b) Two corresponding 4NF relations EMP_PROJECTS and
EMP_DEPENDENTS.
Fourth Normal Form (4NF)
• MVD between attributes A, B, and C in a relation
using the following notation:
A B
A C
Example
• The table has no non-key attributes because its
only key is {Restaurant, Pizza Variety, Delivery
Area}. Therefore it meets all normal forms up to
BCNF. If we assume, however, that pizza varieties
offered by a restaurant are not affected by
delivery area, then it does not meet 4NF. The
problem is that the table features two non-trivial
multivalued dependencies on the {Restaurant}
attribute (which is not a superkey). The
dependencies are:
• {Restaurant} →→ {Pizza Variety}
• {Restaurant} →→ {Delivery Area}
Data is given
Restaurant Pizza Variety Delivery Area
A1 Pizza Thick Crust Springfield
A1 Pizza Thick Crust Shelbyville
A1 Pizza Thick Crust Capital City
A1 Pizza Stuffed Crust Springfield
A1 Pizza Stuffed Crust Shelbyville
A1 Pizza Stuffed Crust Capital City
Elite Pizza Thin Crust Capital City
Elite Pizza Stuffed Crust Capital City
Vincenzo's Pizza Thick Crust Springfield
Vincenzo's Pizza Thick Crust Shelbyville
Vincenzo's Pizza Thin Crust Springfield
Vincenzo's Pizza Thin Crust Shelbyville
Restaurant Pizza Variety Restaurant Delivery Area
A1 Pizza Thick Crust A1 Pizza Springfield
A1 Pizza Stuffed Crust A1 Pizza Shelbyville
Elite Pizza Thin Crust A1 Pizza Capital City
Elite Pizza Stuffed Crust Elite Pizza Capital City
Vincenzo's
Pizza
Thick Crust
Vincenzo's
Pizza
Springfield
Vincenzo's
Pizza
Thin Crust
Vincenzo's
Pizza
Shelbyville
Varieties By Restaurant Delivery Areas By
Restaurant
4NF - Example
Slide 11- 84
4. Join Dependencies and Fifth Normal
Form (1)
Definition:
• A join dependency (JD), denoted by JD(R1, R2, ...,
Rn), specified on relation schema R, specifies a
constraint on the states r of R.
– The constraint states that every legal state r of R
should have a non-additive join decomposition into
R1, R2, ..., Rn; that is, for every such r we have
– * (R1(r), R2(r), ..., Rn(r)) = r
Note: an MVD is a special case of a JD where n = 2.
• A join dependency JD(R1, R2, ..., Rn), specified on
relation schema R, is a trivial JD if one of the
relation schemas Ri in JD(R1, R2, ..., Rn) is equal to
R.
Slide 11- 85
Join Dependencies and Fifth Normal Form
(2)
Definition:
• A relation schema R is in fifth normal form (5NF) (or
Project-Join Normal Form (PJNF)) with respect to a
set F of functional, multivalued, and join
dependencies if,
– for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+
(that is, implied by F),
• every Ri is a superkey of R.
Slide 11- 86
Relation SUPPLY with Join Dependency
and conversion to Fifth Normal Form
• Whenever a supplier supply part p, & a project j
uses part p, the supplier s supplies at least one
part to project j, then supplier s will be supplying
part p to project j.
87
Join dependency and 5th
normal form
• There are some relations ,which cannot be
decomposed into two projections. But this
relation can be decomposed into 3 projection,
this is in 5NF. Suppose there is a relation
Dealers,parts,customer which is in 4NF.
Teacher Class Subject
100 4th English
100 5th hindi
• Problem is although it is in 4th NF , it involves join dependency(which is neither FD or MVD), thus it is required to decompose such a relation into smaller ones. All projections are in 5th NF
Teacher Class
100 4th
100 5th
Class Subject
4th English
5th hindi
Subject Teacher
English 100
Hindi 100
DESIGNING A SET OF RELATIONS (2)
• Goals:
– Lossless join property (a must)
– Dependency preservation property
– Additional normal forms
• 4NF (based on multi-valued dependencies)
• 5NF (based on join dependencies)
Why do we preserve the dependency?
• We would like to check easily that updates to the
database do not result in illegal relations being created.
• It would be nice if our design allowed us to check
updates without having to compute natural joins.
• We want to preserve dependencies bcoz each
dependencies in F represents a constraints on
dbase.
•
Dependency Preservation
Decomposition
• Definition: Each FD specified in F either
appears directly in one of the relations in the
decomposition, or be inferred from FDs that
appear in some relation.
For example…….
• R=(A,B,C,D) and F =(AB,AC,CD)
Here R is decomposed into
R1=(A,B,C)with the FDs
F1=(AB,AC)
R2=(C,D) with the FDs
F2=(CD)
F’=F1UF2…U Fn = (AB,AC,CD)
Hence we can say decomposition is DP
Properties of Relational
Decompositions (3)
• Dependency Preservation Property of a Decomposition:
– Definition: Given a set of dependencies F on R, the projection of F on Ri, denoted by pRi(F) where Ri is a subset of R, is the set of dependencies X Y in F+ such that the attributes in X υ Y are all contained in Ri.
– Hence, the projection of F on each relation schema Ri in the decomposition D is the set of functional dependencies in F+, the closure of F, such that all their left- and right-hand-side attributes are in Ri.
Properties of Relational
Decompositions (4)
• Dependency Preservation Property of a Decomposition (cont.): – Dependency Preservation Property:
• A decomposition D = {R1, R2, ..., Rm} of R is dependency-preserving with respect to F if the union of the projections of F on each Ri in D is equivalent to F; that is ((R1(F)) υ . . . υ (Rm(F)))+ = F+
• (See examples in Fig 10.12a and Fig 10.11)
• Claim 1: – It is always possible to find a dependency-preserving
decomposition D with respect to F such that each relation Ri in D is in 3nf.
Properties of Relational
Decompositions (5)
• Lossless (Non-additive) Join Property of a Decomposition: – Definition: Lossless join property: a decomposition D
= {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F, the following holds, where * is the natural join of all the relations in D:
* ( R1(r), ..., Rm(r)) = r
– : The word loss in lossless refers to loss of information, not to loss of tuples. In fact, for ―loss of information‖ a better term is ―addition of spurious information‖
Slide 11- 97
Properties of Relational
Decompositions (9)
• Testing Binary Decompositions for Lossless Join Property – Binary Decomposition: Decomposition of a relation R
into two relations.
– PROPERTY LJ1 (lossless join test for binary decompositions): A decomposition D = {R1, R2} of R has the lossless join property with respect to a set of functional dependencies F on R if and only if either
• The f.d. ((R1 ∩ R2) (R1- R2)) is in F+, or
• The f.d. ((R1 ∩ R2) (R2 - R1)) is in F+.
OR
• The f.d. ((R1 ∩ R2) (R1)
• The f.d. ((R1 ∩ R2) (R2 )
NONLOSS DECOMPOSITION AND
FUNCTIONAL DEPENDENCIES
Supplier _no. Status City
S3 30 Chennai
S5 30 delhi
SUPPLIERS:- TABLE(RELATION)
Supplier_ no. Status
S3 30
S5 30
SST:TABLE(A)
Supplier _no. City
S3 Chennai
S5 delhi
SC:TABLE
No information is lost. This first decomposition is indeed
nonloss
Lossy decomposition or lossy-join
decomposition
Supplier_ no. Status
S3 30
S5 30
Status City
30 Chennai
30 delhi
SST: TABLE
(B)
STC:TABL
E
Here we cannot tell which supplier has which city
This decomposition is nonloss but lossy
Example
• Leading_schema= { branchname, city,assets, customer,
loan, amount}
• R1.branch={branchname, city,assets}
• R2. loan={branchname, customer,loan,amount}
• R1 ∩ R2
• =======branch name
• R2===R21 & R22
• R21={loan, branchname, amount}
• R22={customer, loan}
• R21 ∩ R22=====loan
• It’s a lossless join dependencies.
100
Find its lossy or lossless??
101
102
Slide 11- 103
Properties of Relational
Decompositions (8)
Lossless (nonadditive) join test for n-ary decompositions.
(a) Case 1: Decomposition of EMP_PROJ into
EMP_PROJ1 and EMP_LOCS fails test.
(b) A decomposition of EMP_PROJ that has the lossless
join property.
Slide 11- 104
Properties of Relational Decompositions (8)
Lossless (nonadditive)
join test for n-ary
decompositions.
(c) Case 2:
Decomposition of
EMP_PROJ into EMP,
PROJECT, and
WORKS_ON satisfies
test.