database normalization. designing good schemas we know how to create schemas, but... how do we...

30
Database Normalization

Upload: jonah-hubbard

Post on 31-Dec-2015

232 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Database

Normalization

Page 2: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Designing Good Schemas

We know how to create schemas, but ... how do we create good schemas? what does good mean?

Schema quality measurements: semantics of the attributes minimal redundancy minimal frequency of null values

Page 3: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Functional Dependences

A column Y of relational table R is functionally dependent up on column X of relational table R if and only if:

Each value of X in R associated with each value of Y at any given time

Page 4: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Functional dependences

Y is functional dependent up on X same as values of X identify values of Y

If X Y then XZYZ IF XY and Y Z then XZ X Y means that Y depend on X or

X identify Y

Page 5: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Examples

S# Ename {S#, P#} Hours If for each value of S#, there are exactly one

corresponding value for sname, state, city then:

Sname Sate CityS#

Page 6: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Example

If {S#, p#} Qty

P# QTYS#

Page 7: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Redundancy Example Where’s the redundancy?

Page 8: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Redundancy Example

Page 9: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Example FDs

Transitive FDsProper FDs

Partial Key FDs

Partial Key FD

Page 10: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

10

Normal Forms

Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies)

The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF)

Page 11: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Normalization0NF

removemulti-valued

attributes1NF 2NF 3NF

removepartial

dependencies

removetransitive

dependencies

BCNF 4NF 5NFremove

remainingFD anomal

dependencies

removemultivalue

dependencies

removeremaininganomalies

Page 12: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

1 NF

First normal form is

NO multi-valued attributes

No composite attribute

No nested relation

We create new table or new field (telephone, visiting)

Page 13: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

1NF Normalization

Proper translation from ER multi-value attributes will achieve 1NF.

Still not a good solution,since we have redundancy in Dnumber and Dmgr_ssn.(This will be handled by 2NF.)

Page 14: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

2 NF form

Second normal form that if primary key is multiple attribute and non-key attribute depend on part of primary key

P# HoursS# Cname pname Loc

Page 15: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

2NF Normalization

Move the partial key and dependent attributes to a new relation.

Page 16: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Transitive Dependencies

X → Y is a transitive dependency (PD)

if there exists Z ⊈ any key

such that X → Z → Y TDs can cause redundancy if there are multiple

values of X that determine the same value of Z the value of Y for that value of Z is stored multiple times

3NF normalization: move (Z,Y) to new relation in which Z is the primary key

Page 17: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

3 NF

The relation in 3NF if it is 2 NF and every

non-key attribute is non-transitively

dependent on primary key

Page 18: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

3NF Normalization

Create new relation to hold the attributes in the transitive FD.

LHS of transitive FD becomes PK of new relation.

Page 19: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Transitive Dependency Example

I_OFFICE (instructor's office) is determinedby the non-PK attribute INSTR

DEPT COURSE SECTION ROOM INSTR I_OFFICE

DEPT COURSE SECTIONCOMP 51 1COMP 51 2COMP 163 1COMP 53 1COMP 53 2

ROOMWPC122WPC219WPC122WPC130WPC130

INSTRDOHERTYCLIBURNDOHERTYBOWRINGCARMAN

I_OFFICECSB109CSB107CSB109CSB108CSB104

Page 20: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

NF Decomposition: Foreign Keys

DEPT COURSE SECTION ROOM INSTR I_OFFICE

DEPT COURSE SECTION ROOM INSTR

INSTR I_OFFICE

Decomposition:

Page 21: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Arthur Keller – CS 180

NormalizationGoal = BCNF = Boyce-Codd Normal Form =all FD’s follow from the fact “key

everything.” Formally, R is in BCNF if for every nontrivial

FD for R, say X A, then X is a superkey. “Nontrivial” = right-side attribute not in left side.

Why?1. Guarantees no redundancy due to FD’s.

2. Guarantees no update anomalies = one occurrence of a fact is updated, not all.

3. Guarantees no deletion anomalies = valid fact is lost when tuple is deleted.

Page 22: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Boyce-Codd Normal Form Sample data for Course Section table

Because Prefix Department, we know that (Prefix, Num, SecNum) could also be a primary key for this table.

Department Prefix Num SecNum CourseName Instructor

Mathematics Math 101 1 Algebra I Al Jeebra

Mathematics Math 101 2 Algebra I Al Jeebra

Mathematics Math 201 1 Calculus I Kal Kuelus

Philosophy Phil 201 1 Greek Thought Arie Stottle

Philosophy Phil 202 1 Euro Thought Mike Angelo

Marketing Mktg 410 1 Marketing Strategy

Marc Ekking

Marketing SpMkg 401 1 Advanced Sports Marketing

Hulk Hogan

Page 23: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

23

Example

Students(name, addr, phones, CarLiked) A student’s phones are independent of the cars

they like. Thus, each of a student’s phones appears with

each of the cars they like in all combinations. This repetition is unlike redundancy due to

FD’s, of which name->addr is the only one.

Page 24: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

24

Example

Students(name, addr, CarLiked, manf, favCar) FD’s: name->addr favCar, carsLiked->manf

Only key is {name, CarsLiked}. In each FD, the left side is not a superkey. Any one of these FD’s shows Students is not

in BCNF

Page 25: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

25

Boyce-Codd Normal Form

We say a relation R is in BCNF if whenever X ->A is a nontrivial FD that holds in R, X is a superkey. Remember: nontrivial means A is not a member

of set X. Remember, a superkey is any superset of a key

(not necessarily a proper superset).

Page 26: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

26

Example

Students(name, addr, CarsLiked, manf, favCar) F = name->addr, name -> favCar, CarsLiked->manf Pick BCNF violation name->addr. Close the left side: {name}+ = {name, addr, favCar}. Decomposed relations:

1. Students1(name, addr, favCar)

2. Students2(name, CarsLiked, manf)

Page 27: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

27

3NF and BCNF

3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation.

X ->A violates 3NF if and only if X is not a superkey, and also A is not prime.

Page 28: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

ExercisesThe following relation schema

is not in third normal form (3NF).

Is this an example of a transitive dependency or a partial key dependency?

Give an equivalent schema that is in 3NF.

SID FROM_CITY TO_CITY DISTANCE

SHIPMENT

WEIGHT

Page 29: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

Exercises

This relation has been proposed to track Pacific alumni:

Alumni( SID, LastName, FirstName, Degree, YearAwarded, Phone).

Pacific allows students to receive multiple degrees,possibly in different years. Identify all FDs.

Give a new schema that is in third normal form.

Page 30: Database Normalization. Designing Good Schemas We know how to create schemas, but... how do we create good schemas? what does good mean? Schema quality

ExercisesConsider the following relation schema:

Movie(title, genre, length, actor, sag_id, studio, studio_addr) 

Every movie has a unique title. A movie may have multiple actors. Each actor has a unique sag_id. An actor may appear in multiple movies. A movie has exactly one studio,

but a studio may produce more than one movie. Each studio has exactly one address.

Identify all functional dependencies.

Normalize the schema to 3NF.