logical model agenda - informal mapping er-diagram to schemas - functional dependencies - definition...

25
Logical Model Agenda - Informal Mapping ER-Diagram to Schemas - Functional Dependencies - Definition of ‘Good Design’ - Normalization (1NF, 2NF, 3NF, BCNF)

Post on 21-Dec-2015

224 views

Category:

Documents


2 download

TRANSCRIPT

Logical Model

Agenda

- Informal Mapping ER-Diagram to Schemas

- Functional Dependencies

- Definition of ‘Good Design’

- Normalization (1NF, 2NF, 3NF, BCNF)

Relational Model: definitions

- All data is stored in ‘relations’

- Relation TableColumns: AttributesRows: Tuples

- Domain of an Attribute, A allowed set of values of A

- Relational Schema NAME( A1, A2, …, An)

- Tuple, t, of R(A1, A2, …, An) ORDERED set of values, < v1, v2, v3, …, vn>vi dom( Ai)

- Relation Instance, r( R) a set of tuples

Relational model: example

7.52199234567MAN Ko Yee

8.92199888899LEE Chi Cheung

9.82299000011Alvin LAM

10.22199012233LAM Wai Kin

11.192299001122CHAN Kin Ho

GPAAgeSIDName

Each row is one tuple;This instance of schema STUDENT has 5 tuplest3 = <MAN Ko Yee, 99234567, 21, 7.5>

Attributes dom(GPA) = [0.0, 12.0]dom( Name) = character string, max length 100.

7.52199234567MAN Ko Yee

8.92199888899LEE Chi Cheung

9.82299000011Alvin LAM

10.22199012233LAM Wai Kin

11.192299001122CHAN Kin Ho

GPAAgeSIDName

Each row is one tuple;This instance of schema STUDENT has 5 tuplest3 = <MAN Ko Yee, 99234567, 21, 7.5>

Attributes dom(GPA) = [0.0, 12.0]dom( Name) = character string, max length 100.

Student( Name, SID, Age, GPA)

Constraints on Relational Schemas

A. Domain constraints

t[Ai] dom( Ai), for all t, Ai

B. Key constraints

Superkey of R: A set of attributes, SK, of R such that t1[ SK] != t2[SK] whenever t1 ≠ t2

Key: minimal Superkey of R

minimal:removal of any attribute from Key

no longer a Superkey of R

Constraints on Relational Schemas..

A. Domain constraintsB. Key constraints, examples:

CAR( State, LicensePlateNo, VehicleID, Model, Year, Manufacturer)

K1 = { State, LicensePlateNo}K1 is a minimal Superkey Key

K2 = { VehicleID } K2 Key (Why ?)

K3 = { VehicleID, Manufacturer}

Superkey ?

Key ?

Constraints on Relational Schemas..

A. Domain constraintsB. Key constraints

C. Entity Integrity constraints

If PK is the Primary Key, thent[PK] != NULL for any tuple t r( R)

D. Referential constraints

- All referential constraints must be defined

- X(Ai) references Y(Bj) dom(Ai) = dom(Bj)

- Foreign Key attributes that reference a Primary Key

D EPT D no D N a m e Lo c n M grE N o

EM PLO Y EE E N o N a m e A d d re s s D e p tN o S u p E N oExample:

Informal Mapping of ER-diagram to Schemas

1. For each regular entity, E, One relation E with all the simple attributes of E.

Select a primary key for E, and mark it.

2. For each binary relation type, R, between entity types, S and T:

For 1:1 relationship between S and TEither add PK(S) as FK(T), or add PK(T) as FK(R)

For 1:N relationship between S and T (S: the N-side)Add PK(T) as a foreign key in S.

For M:N relationship, R, between S and TCreate a new relation, R, withthe PK’s of S and T as FK’s of P, plus any attributes of R

Informal Mapping of ER-diagram to Schemas..

3. For each weak entity type, W, whose identifying entity is EOne relation W with all attributes of W and the primary key of EMark the Primary Key

4. For each multi-valued attribute A,Create a new relation, R, including A, plus PK of the entity/relationship containing A

5. For each n-ary relationship, R, with degree > 2Create a relation R, withPK of each participating entity as FK, plus all simple attributes of R

Informal Mapping of ER-diagram to Schemas…

S UP ER VIS IO N

W O R KS _ F O R

M ANAG ES

W O R KS _ O N

C O NT R O LS

DEP ENDENT S _ O F

E M P LO Y E ES S N

Bd ate

S ex

Ad d ress

S alary

F name M init Lname

Name

D E P A R T M E N T

Name

Numb er

Lo c atio ns

No O fEmp lo yees

S tartDate

Name Numb er

Lo c atio n

DEP ENDENT

Name R elatio nshipBirthDateS ex

N 1

1 1

MN

1

N

1

N

1 N

sup erviso r

Hrs

P R O JE C T

Examples:

Formal DB Design

How can we tell if a DB design is ‘Good’ ?

A DB Design is good if:

(1) it provides a way to store all information in the system

(2) the design is not bad

How can we tell if a DB design is ‘Bad’ ?

Bad DB DesignsEM PLO Y EE_D EPT

M grS S NS S N LN a m e A d d re s s D N u m b e r D N a m eB D a te

EM PLO Y EE_PRO J

S S N P N u m b e r H o u rs LN a m e P Lo c a tio nP N a m e

(a) Information is stored redundantly

(b) Insertion anomalies

(c) Deletion Anomalies

(d) Modification Anomalies

Example:

Bad DB Designs..

- Avoid too many NULL values in tuples

STUDENT( SID, Name, Phone, Email, SocietyName, MembershipNo)

OR

STUDENT( SID, Name, Phone, Email)

MEMBERSHIP( SID, SocietyName, MembershipNo)

Bad DB Designs..

- Spurious Tuples must not be created when ‘join’-ing tables

P1Proj2

P1Proj1

PartNoProjectNo

Proj2 P2

P2S2

P1S1

PartNoSupplierNo

S2 P1

Qty

25

10

20

PROJECT_PARTS SUPPLIER_PARTS

P1Proj2

P1Proj1

PartNoProjectNo

Proj2 P2

P2S2

P1S1

PartNoSupplierNo

S2 P1

Qty

25

10

20

PROJECT_PARTS SUPPLIER_PARTSExample:

- Who supplied P2 to Proj2 ?

-- the answer requires us to ‘join’ the two tables

- Who supplied P1 to Proj2 ?

Formal DB Design: Functional Dependencies

A set of attributes, X, functionally determines a set of attributes Yif the value of X determines a unique value for Y.

X Y implies thatfor any two tuples, t1 and t2,if t1[X] = t2[X], then t1[ Y] = t2[ Y]

NOTATION: X Y

Examples:

{SSN} {Employee name}

{Employee SSN, Project Number} {Hours per week}

FD’s: Armstrong’s Rules

A1. (Reflexive). If Y X, then X Y

A2. (Augmentation). If X Y, then XZ YZ

(XZ == X union Z)

A3. (Transitive). If X Y and Y Z, then X Z

Common methods of proving: Construction, Induction, Contradiction

Common methods of disproving: Construction, Counterexamples

More Theorems about FD’s

A4. (Decomposition). If X YZ, the X Y and X Z

A5. (Union). If X Y and X Z, then X YZ

A6. (Pseudotransitive). If X Y and WY Z, then WX Z

Definition:

Two sets of FDs, F and G are said to be equivalent if

every FD in F can be inferred from G, and

every FD in G can be inferred from F.

FD’s are critical in our definition of Normalized DB designs

First Normal Form: 1NF

A schema is in 1NF if it does not contain - any composite attributes,- any multi-valued attributes,- any nested relations

Any non-1NF schema can be converted into a set of 1NF schemas

SID Name SemYr Courses

0401 John Smith Fall 05 ie110, ie215

0402 Jane Doe Fall 05 ie110, ie317

STUDENT_COURSES

Not 1NF

Composite Multi-valued

SID Lname Fname Sem Yr Course

0401 Smith John Fall 05 ie110

0401 Smith John Fall 05 ie215

0402 Doe Jane Fall 05 ie110

0402 Doe Jane Fall 05 ie317

STUDENT_COURSES_1NF

1NF

1NF..

5P2

10P1JohnSmith1123

JaneDoe3312P3

P2

ProjNo

5

10

HoursFnameLnameSSN

Projects

EMPLOYEE_PROJECTS Nested

Not 1NF

JaneDoe3312

JohnSmith1123

FnameLnameSSN

5P21123

10P23312

5P33312

10P11123

HoursProjNoSSN

EMPLOYEE EMP_PROJECTS

1NF

Second Normal Form, 2NF

Prime Attribute:An attribute that is a member of the primary key

Full functional Dependency:A FD, Y Z, such that X Z is false for all X Y

{SSN, PNumber} {Hours} Full FD ?

{SSN, PNumber} EName Full FD ?

2NF..

A schema R is in 2NF if every non-prime attribute A in R isfully functionally dependent on the primary key.

Any non-2NF schema can be converted into a set of 2NF schemas

EMP_PROJ

SSN Pnumber Hours EName PName PLocation

EMP_PROJ1

SSN PNumber Hours

EMP_PROJ2

SSN EName

EMP_PROJ3

PNumber PLocationPNameNot 2NF

2NF

Third Normal Form, 3NF

A Transitive Functional Dependency is an FD, Y Zthat can be derived from two FDs Y X and X Z.

Examples:

SSN MgrSSN is a transitive dependency

[SSN DNumber, and DNumber MgrSSN]

SSN LName is NOT a transitive dependency

[there is no set of attributes X, s.t. SSN X and X LName]

3NF..

A schema is in 3NF if- it is in 2NF, and- no non-prime attribute A in R is transitively dependent on the primary key.

EMP_DEPT

SSN DnoAddressEName DName MgrSSN

EMP_DEPT1

SSN AddressEName Dno

EMP_DEPT2

DNo MgrSSNDName

Not 3NF

3NF

General normal forms

Our previous definitions depend on our choice of the PK

Problem ?

A schema is in general 2NF if:- it is in 1NF, and- every non-prime attribute FFD on every key of R.

General 1NF: -same as before-

A schema is in general 3NF if:- whenever a FD X A holds in R, then

either X is a superkey of R, or A is a prime attribute of R.

General normal forms..

Example: Property Lots DB

LOTS2

District TaxRate

LOTS1A

PropertyID AreaLot#District

FD1

FD2

LOTS1B

Area Price

LOTS

PropertyID AreaLot#District Price TaxRate

FD1

FD4FD3

FD2

1NF

2NF

3NF

LOTS1

PropertyID AreaLot#District Price

FD1

FD4FD2

LOTS2

District TaxRate

Keys?

Slightly stricter than the general 3NF definition:

A schema is in BCNF, 2NF if:- it is in 1NF, and- every non-prime attribute FFD on every key of R.

BCNF, 1NF: -same as before-

A schema is in BCNF, 3NF if:- whenever a FD X A holds in R, then

X is a superkey of R

Boyce Codd Normal Form, BCNF