chapter 7 normalization. mcgraw-hill/irwin © 2004 the mcgraw-hill companies, inc. all rights...
TRANSCRIPT
![Page 1: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/1.jpg)
Chapter 7Chapter 7Normalization
![Page 2: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/2.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Outline Outline
Modification anomaliesFunctional dependenciesMajor normal formsRelationship independencePractical concerns
![Page 3: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/3.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Modification AnomaliesModification Anomalies
Unexpected side effectInsert, modify, and delete more data than
desiredCaused by excessive redundanciesStrive for one fact in one place
![Page 4: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/4.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Big University Database TableBig University Database Table
StdSSN StdClass OfferNo OffYear EnrGrade CourseNo CrsDesc
S1 JUN O1 2003 3.5 C1 DB S1 JUN O2 2003 3.3 C2 VB S2 JUN O3 2003 3.1 C3 OO S2 JUN O2 2003 3.4 C2 VB
![Page 5: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/5.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Functional DependenciesFunctional Dependencies
Constraint on the possible rows in a tableValue neutral like FKs and PKsAssertedUnderstand business rules
![Page 6: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/6.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
FD DefinitionFD Definition
X YX (functionally) determines YX: left-hand-side (LHS) or determinantFor each X value, there is at most one Y
valueSimilar to candidate keys
![Page 7: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/7.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
FD Diagrams and ListsFD Diagrams and Lists
StdSSN StdCity StdClass OfferNo OffTerm OffYear EnrGradeCourseNo CrsDesc
StdSSN StdCity, StdClass
OfferNo OffTerm, OffYear, CourseNo, CrsDesc
CourseNo CrsDesc
StdSSN, OfferNo EnrGrade
![Page 8: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/8.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
FDs in DataFDs in Data
• Prove non existence (but not existence) by looking at data
• Two rows that have the same X value but a different Y value
StdSSN StdClass OfferNo OffYear EnrGrade CourseNo CrsDesc
S1 JUN O1 2003 3.5 C1 DB S1 JUN O2 2003 3.3 C2 VB S2 JUN O3 2003 3.1 C3 OO S2 JUN O2 2003 3.4 C2 VB
![Page 9: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/9.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
NormalizationNormalization
Process of removing unwanted redundancies
Apply normal forms– Identify FDs– Determine whether FDs meet normal form– Split the table to meet the normal form if there
is a violation
![Page 10: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/10.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Relationships of Normal FormsRelationships of Normal Forms1NF
2NF
3NF/BCNF
4NF
5NF
DKNF
![Page 11: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/11.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
1NF1NF
Starting point for SQL:1999 databasesNo repeating groups: flat rows
StdSSN StdClass OfferNo OffYear EnrGrade CourseNo CrsDesc
S1 JUN O1 2003 3.5 C1 DB O2 2003 3.3 C2 VB S2 JUN O3 2003 3.1 C3 OO O2 2003 3.4 C2 VB
![Page 12: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/12.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Combined Definition of Combined Definition of 2NF/3NF2NF/3NFKey column: candidate key or part of
candidate keyAnalogy to the traditional justice oathEvery non key depends on a key, the
whole key, and nothing but the keyUsually taught as separate definitions
![Page 13: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/13.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
2NF2NF
Every nonkey column depends on a whole key, not part of a key
Violations– Part of key nonkey– Violations only for combined keys
![Page 14: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/14.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
2NF Example2NF ExampleMany violations for the big university
database table– StdSSN StdCity, StdClass– OfferNo OffTerm, OffYear, CourseNo,
CrsDescSplitting the table
– UnivTable1 (StdSSN, StdCity, StdClass) – UnivTable2 (OfferNo, OffTerm, OffYear,
CourseNo, CrsDesc)
![Page 15: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/15.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
3NF3NF
Every nonkey column depends only on a key not on non key columns
Violations: Nonkey NonkeyAlterative formulation
– No transitive FDs– A B, B C then A C– OfferNo CourseNo, CourseNo CrsDesc
then OfferNo CrsDesc
![Page 16: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/16.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
3NF Example3NF ExampleOne violation in UnivTable2
– CourseNo CrsDescSplitting the table
– UnivTable2-1 (OfferNo, OffTerm, OffYear, CourseNo)
– UnivTable2-2 (CourseNo, CrsDesc)
![Page 17: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/17.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
BCNFBCNF
Every determinant must be a candidate key.
Simpler definitionApply with simple synthesis procedureSpecial cases not covered by 3NF
– Part of key Part of key– Nonkey Part of key– Special cases are not common
![Page 18: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/18.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
BCNF ExampleBCNF ExampleMany violations for the big university
database table– StdSSN StdCity, StdClass– OfferNo OffTerm, OffYear, CourseNo– CourseNo CrsDesc
Splitting into four tables
![Page 19: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/19.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Simple Synthesis ProcedureSimple Synthesis Procedure
1. Eliminate extraneous columns from the LHSs
2. Remove derived FDs 3. Arrange the FDs into groups with each
group having the same determinant. 4. For each FD group, make a table with the
determinant as the primary key.5. Merge tables in which one table contains
all columns of the other table.
![Page 20: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/20.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Simple Synthesis ExampleSimple Synthesis Example
Begin with FDs shown in Slide 7Step 1: no extraneous columnsStep 2: eliminate OfferNo CrsDescStep 3: already arranged by LHSStep 4: four tables (Student, Enrollment,
Course, Offering)Step 5: no redundant tables
![Page 21: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/21.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Multiple Candidate KeysMultiple Candidate Keys
Multiple candidate keys do not violate either 3NF or BCNF
Step 5 of the Simple Synthesis Procedure creates tables with multiple candidate keys.
You should not split a table just because it contains multiple candidate keys.
Splitting a table unnecessarily can slow query performance.
![Page 22: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/22.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Relationship Independence and Relationship Independence and 4NF4NFM-way relationship that can be derived
from binary relationships Split into binary relationshipsSpecialized problem4NF does not involve FDs
![Page 23: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/23.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Relationship Independence Relationship Independence ProblemProblem
StdSSNStdName
StudentOfferNoOffLocation
Offering
TextNoTextTitle
Textbook
EnrollStd-Enroll
Offer-Enroll
Text-Enroll
![Page 24: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/24.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Relationship Independence Relationship Independence SolutionSolution
StdSSNStdName
Student
OfferNoOffLocation
Offering
TextNoTextTitle
Textbook
Enroll Orders
![Page 25: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/25.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Extension to the Relationship Extension to the Relationship Independence SolutionIndependence Solution
StdSSNStdName
StudentOfferNoOffLocation
Offering
TextNoTextTitle
Textbook
Enroll Orders
PurchaseStd-Purch
Offer-Purch
Text-Purch
![Page 26: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/26.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
MVDs and 4NFMVDs and 4NF
MVD: difficult to identify– A B | C (multi-determines)– A associated with a collection of B and C
values– B and C are independent– Non trivial MVD: not also an FD
4NF: no non trivial MVDs
![Page 27: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/27.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
MVD RepresentationMVD Representation
A B C A1 B1 C1 A1 B2 C2 A1 B2 C1 A1 B1 C2
A B | C
OfferNo StdSSN TextNo O1 S1 T1 O1 S2 T2 O1 S2 T1 O1 S1 T2
OfferNo StdSSN | TextNo
Given the two rows above the line, the two rows below the line arein the table if the MVD is true.
![Page 28: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/28.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Higher Level Normal FormsHigher Level Normal Forms
5NF for M-way relationshipsDKNF: absolute normal formDKNF is an ideal, not a practical normal
form
![Page 29: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/29.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Role of NormalizationRole of Normalization
Refinement– Use after ERD– Apply to table design or ERD
Initial design– Record attributes and FDs– No initial ERD– May reverse engineer an ERD after
normalization
![Page 30: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/30.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
Normalization ObjectiveNormalization Objective
Update biasedNot a concern for databases without
updates (data warehouses)Denormalization
– Purposeful violation of a normal form– Some FDs may not cause anomalies– May improve performance
![Page 31: Chapter 7 Normalization. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Modification anomalies Functional dependencies](https://reader036.vdocuments.us/reader036/viewer/2022081511/56649eb55503460f94bbe421/html5/thumbnails/31.jpg)
McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved.
SummarySummary
Beware of unwanted redundanciesFDs are important constraintsStrive for BCNFUse a CASE tool for large problemsImportant tool of database developmentFocus on the normalization objective