schema refinement and normal formsefreidoc.fr/l3/bdd/cours/2011-12 : cours complet en...•...

37
Schema Refinement and Normal Forms V2.0 2011/09/09 Jeffrey D. Ullman – Jean-Michel Busca

Upload: others

Post on 22-Apr-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Schema Refinement and Normal Forms

V2.0 2011/09/09 Jeffrey D. Ullman – Jean-Michel Busca

Page 2: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Outline

1. Problem statment

• Redundancy, Anomalies

• Refinement Process, Properties

2. Functional Dependencies

3. Normal Forms

4. Decomposition Algorithm

1

Page 3: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Why Refine Schemas?

Problems:

• Sally's address is recorded multiple times

• the same for Miller's manufacturer

• Bob does not like beer: NULL values

2

name addr beer manf Sally 123 Maple Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL

Page 4: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Why Refine Schemas? (2)

Consequences:

1. redundancy and NULL values waste space

in the database

2. redundancy can lead to several anomalies

when updating data

3

Page 5: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL

Update Anomalies

If Sally moves:

• we need to change all her tuples

• otherwise an inconsistency is created

4

Page 6: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

If Sally does not like Bud anymore:

• we loose track of Bud's manufacturer

(if Sally was the only one to like Bud)

name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River NULL NULL

Delete Anomalies

5

Page 7: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

If we now state that Bob likes Bud Lite:

• we need to know Bud Lite's manufacturer

(if Bob is the first to like Bud Lite)

name addr beer manf Sally 10 Bank Bud A.B. Sally 123 Maple Miller Pete's Tom 45 Bank Miller Pete's Bob 10 River Bud Lite ???

Insert Anomalies

6

Page 8: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Schema Refinement

The goal of schema refinement is to redesign the database schema so as to eliminate:

• redundancy and NULL values

• update, delete, insert anomalies

Schema refinement relies on the theory of Functional Dependencies.

7

Page 9: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Example: Functional Dependencies

Two functional dependencies here:

• name determines addr (name -> addr)

• beer determines manf (beer -> manf)

• ??? can be deduced from name/beer

• this is why data is redundant

8

name addr beer manf Sally 123 Maple Bud A.B. Sally ??? Miller Pete's Tom 45 Bank Miller ???

Page 10: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Refinement Process

How do we know refinement is needed?

• Redundancy arises when a relation schema forces an association between attributes that is not natural.

• Functional dependencies help identify such situations and suggest one or more refinements.

9

Page 11: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Refinement Process (2)

Redundant relations are then decomposed into smaller non-redundant ones.

• Non-redundant relations are projections of the original relation.

• Example: BadDesign(name, addr, beer, manuf) is decomposed into:

- Drinkers(name, addr)

- Beers(beer, manf)

- Likes(name, beer)

10

Page 12: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Decomposition Properties

There are two important properties of a decomposition:

1. Lossless Join : it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.

2. Dependency Preservation : it should be possible to check in the projected relations whether all the given FD’s are satisfied. 11

Page 13: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Outline

1. Problem statment

2. Functional Dependencies

• Definition, Axiomes, Closure Set

• Super Keys, Keys

3. Normal Forms

4. Decomposition Algorithm

12

Page 14: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

13

Functional Dependencies

X ->Y reads "Y functionally depends on X" or "X determines Y"

X ->Y is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they also agree on all attributes of Y:

• For any tuples t1, t2 of R

• If t1[X] = t2[X] then t1[Y] = t2[Y]

• Say "X -> Y holds in R"

Page 15: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

14

Functional Dependencies (2)

Conventions:

• X, Y, Z,… represent sets of attributes.

• A, B, C,… represent single attributes.

• no set formers in sets of attributes:

- just ABC, rather than {A,B,C }.

Page 16: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Armstrong's Axioms

Reflexivity

If Y ⊆ X then X -> Y

Augmentation

If X -> Y then XZ -> YZ

Transitivity

If X -> Y and Y -> Z then X -> Z

15

Page 17: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Armstrong's Axioms (2)

Derived properties:

Union

If X -> Y and X -> Z then X -> YZ

Decomposition

If X -> YZ then X -> Y and X -> Z

Pseudo-transitivity

If X -> Y and V,Y -> Z then VX -> Z

16

Page 18: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Elementary FD

X -> A is an elementary FD iif:

1. A is a single attribute

2. A does not belong to X

3. There is no X' X such that X' -> A

When looking for FDs, we are only concerned by elementary FDs.

17

Page 19: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

18

Example: FD’s

Drinkers(name, addr, beersLiked, manf, favBeer)

Reasonable FD’s to assert:

1. name -> addr favBeer

Note this FD is the same as

name -> addr

name -> favBeer

2. beersLiked -> manf

Page 20: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

19

Example: Possible Data

name addr beersLiked manf favBeer Janeway Voyager Bud A.B. WickedAle Janeway Voyager WickedAle Pete’s WickedAle Spock Enterprise Bud A.B. Bud

Because name -> addr Because name -> favBeer

Because beersLiked -> manf

Page 21: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

20

Keys of Relations

FDs generalize the notion of key.

K is a superkey for relation R if

K determines all attributes of R.

K is a key for R if K is a superkey,

but no proper subset of K is a

superkey: K is a minimal superkey.

Page 22: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

21

Example: Superkey

Drinkers(name, addr, beersLiked, manf, favBeer)

name, beersLiked is a superkey because together these attributes determine all the other attributes.

• name -> addr favBeer

• beersLiked -> manf

Page 23: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

22

Example: Key

name, beersLiked is a key because neither name nor beersLiked is a superkey.

• name doesn’t -> manf

• beersLiked doesn’t -> addr

There are no other keys, but lots of superkeys.

• Any superset of name, beersLiked.

Page 24: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

23

How to find relevant FDs?

1. Look for IDs:

• SSn is an ID for People.

• Thus SSn->name, SSn->firstName, etc.

2. Look for properties:

• "No two courses can meet in the same room at the same time."

• Thus: hour room -> course

Page 25: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

24

How to find relevant FDs? (2)

3. Infer other elementary FDs using

Armstrong's axioms and properties.

• If A -> B and B -> C,

• Then A -> C also holds.

Page 26: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Outline

1. Problem statment

2. Functional Dependencies

3. Normal Forms

• Definition

• 1NF, 2NF, 3NF

4. Decomposition Algorithm

25

Page 27: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Normal Forms

A Normal Form defines the quality of a database schema with respect to redundancy.

Several Normal Forms have been proposed, based on FDs:

• 1NF, 2NF, 3NF

• Boyce-Codd NF, 4NF, 5NF, …

26

Page 28: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Normal Forms (2)

Note: Other NFs exist that are based on more sophisticated integrity constraints, such as multivalued dependencies and join dependencies.

NFs have increasingly restrictive requirements, i.e.

• a schema in 2NF is also in 1NF

• a schema in 3NF is also in 2NF

27

Page 29: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

First Normal Form

A schema is 1FN if all attributes are atomic, i.e. neither compound or multi-valued.

Every relational schema is 1NF by definition.

28

Page 30: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Second Normal Form

A schema is in 2NF iif:

1. It is in 1NF

2. For every non-key attribute A, for every candidate key K, A depends on the whole key K rather than just a part of it.

Counter example:

R(A, B, C, D) with A -> C

R is not in 2NF

29

Page 31: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

30

Example: 2NF?

Drinkers(name, addr, beersLiked, manf, favBeer)

FD’s: name->addr, name->favBeer, beersLiked->manf

The only key is name, beersLiked.

adr depends on a subset of the key.

Conclusion: Drinkers is not in 2NF and thus not in 3NF either.

Page 32: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Third Normal Form

A schema is in 3NF iif:

1. It is in 2NF

2. For every non-key attribute A, A does not depend on another non-key attribute.

Counter example:

R(A, B, C) with B -> C

R is not in 3NF

31

Page 33: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

32

Example: 3NF?

Beers(name, manf, manfAddr) FD’s: name->manf, manf->manfAddr, name->manfAddr

Only key is name.

The key has only one attribute: Beers is in 2NF.

manf->manfAddr: there is a FD between two non-key attributes.

Conclusion: Beers is not in 3NF.

Page 34: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

Outline

1. Problem statment

2. Functional Dependencies

3. Normal Forms

4. Decomposition Algorithm

33

Page 35: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

2FN Decomposition Algorithm

Start: a relation R in 1NF

For every key subset X that determines non-key attributes A, B, C, …:

• create a new relation Rx with primary key X and A, B, C, … as non-key attributes

• remove attributes A, B, C, … from original relation R

34

Page 36: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

3FN Decomposition Algorithm

Start: a relation R in 2NF

For every non-key attribute set X that determines non-key attributes A, B, C, …:

• create a new relation Rx with primary key X and A, B, C, … as non-key attributes

• remove attributes A, B, C, … from original relation R

35

Page 37: Schema Refinement and Normal Formsefreidoc.fr/L3/BDD/Cours/2011-12 : Cours complet en...• Definition, Axiomes, Closure Set • Super Keys, Keys 3. Normal Forms 4. Decomposition Algorithm

3FN Decomposition Algorithm (2)

Note that Rx is not necessarily in 3NF.

If so, repeat the process on Rx.

36