dbms
DESCRIPTION
DbmsTRANSCRIPT
![Page 1: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/1.jpg)
1
Normalization of Database Tables
CHAPTER 4
![Page 2: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/2.jpg)
2Chapter Objectives
Understand concepts of normalization Learn how to normalize tables Understand normalization and database design issues
![Page 3: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/3.jpg)
3Database Tables and Normalization Normalization is a process for assigning attributes to entities. It reduces data redundancies. An un-normalized relation (table) stores redundant data, which
can cause insertion, deletion, and modification anomalies. In simple words: Normalization means keeping a single copy
of data in your database. Normalization theory provides a step by step method to
remove redundant data and undesirable table structures.
![Page 4: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/4.jpg)
4Normal Forms Tables are normalized by applying rules to create a series of
normal forms: First normal form (1NF) Second normal form (2NF) Third normal form (3NF) Boyce/Codd normal form (BCNF) Fourth normal form (4NF) Projection Join normal form (PJNF, aka 5NF)
A table or relation in a higher level normal form always confirms to lower level normal forms.
![Page 5: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/5.jpg)
5Normal Forms
While higher level normal forms are available, normalization up to BCNF is often found to be adequate for business data.
PJ/NF (5NF) Relations
4NF Relations
BCNF Relations
3NF Relations
2NF Relations
1NF Relations
![Page 6: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/6.jpg)
6First Normal Form A relation is in 1NF if all underlying domains contain atomic
values only, i.e., the intersection of each row and column contains one and only one value.
The relation must not contain repeating groups.
PNo PName ENo EName Jcode ChgHr Hrs 1 Alpha 101 John Doe NE $65 20
105 Jane Vo SA $80 15110 Bob Lund CP $60 40
2 Beta 101 John Doe NE $65 20108 Jeb Lee NE $65 15106 Sara Lee SA $80 20
3 Omega 102 Beth Reed PM $125 20105 Jane Vo SA $80 10
Is the above relation in 1NF?
![Page 7: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/7.jpg)
7First Normal Form
The previous relation can be converted into first normal form by adding Pno and Pname to each row.
PNo PName ENo EName Jcode ChgHr Hrs1 Alpha 101 John Doe NE $65 201 Alpha 105 Jane Vo SA $80 151 Alpha 110 Bob Lund CP $60 402 Beta 101 John Doe NE $65 202 Beta 108 Jeb Lee NE $65 152 Beta 106 Sara Lee SA $80 203 Omega 102 Beth Reed PM $125 203 Omega 105 Jane Vo SA $80 10
What is the primary key in this relation?
Do you see redundant data in this table?What anomalies could be caused?
![Page 8: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/8.jpg)
8Functional Dependency Revisited If A and B are attributes (or group of attributes) of a relation R,
B is functionally dependent on A (denoted A B), if each value of A in R is associated with exactly one value of B in R.
A is called a determinant. Consider the relation
Student (ID, Name, Soc Sec Nbr, Major, Deptmt) Assume a department offers several majors, e.g. INSY
department offers, INSY, MASI, and POMA majors. How many determinants can you identify in Student?
![Page 9: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/9.jpg)
9Functional Dependency Revisited
ID
A Dependency diagram
Name Soc_Sec_Nbr Major Dept
![Page 10: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/10.jpg)
10Functional Dependency Revisited Full functional dependency
Attribute B is fully functionally dependent on attribute A if it is functionally dependent on A and not functionally dependent on any proper subset of A.
This becomes an issue only with composite keys. Transitive dependency
A, B and C are attributes of a relation such that A B and B C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C)
![Page 11: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/11.jpg)
11Second Normal Form Dependency diagram for Project
PNo PName ENo EName JCode ChgHr Hrs
![Page 12: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/12.jpg)
12Second Normal Form A relation is in 2NF if:
It is in 1NF and every nonkey attribute is fully dependent on the primary
key, i.e., no partial dependency. A nonkey attribute is one that is not a primary key or part of a
primary key. We create new relations that are in 2NF through projection
of the original relation. Project(PNo, PName) Employee(ENo, EName, Jcode, ChgHr) Charge(PNo, ENo, Hrs)
![Page 13: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/13.jpg)
132NF2NF
PNo
ENo EName JCode ChgHr
PNo ENo Hrs
PName
![Page 14: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/14.jpg)
14Second Normal Form Tables in 2NF
ProjectPNo PName 1 Alpha2 Beta3 Omega
Charge PNo ENo Hrs 1 101 20 1 105 15 1 110 40 2 101 20 2 108 15 2 106 20 3 102 20 3 105 10
Employee ENo EName JCode ChgHr101 John Doe NE $65102 Beth Reed PM $125105 Jane Vo SA $80106 Sara Lee SA $80108 Jeb Lee NE $65110 Bob Lund CP $60
![Page 15: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/15.jpg)
15Second Normal Form Note that the original relation can be recreated through natural
join of the new relation. Thus, no information is lost in the process of creating 2NF
relations from a 1NF relation. This is called nonloss decomposition.
If a relation that is in 1NF has a non composite primary key (i.e., the primary key consists of a single attribute) what can you say about its status with regard to 2NF?
Do you see any redundant data in the tables that are in 2NF? What anomalies could be caused by such redundancy?
![Page 16: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/16.jpg)
16Third Normal Form A relation is in 3NF if:
It is in 2NF and every nonkey attribute is nontransitively dependent on the
primary key (i.e., no transitive dependency). Relation Employee has a transitive dependency:
ENo JCode ChgHr Employee can be replaced by two relations, that are in 3NF:
Employee(ENo, EName, Jcode) Job(JCode, ChgHr)
![Page 17: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/17.jpg)
173NF3NF
PNo
ENo EName JCode
PNo ENo Hrs
PName
JCode ChgHr
![Page 18: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/18.jpg)
18Third Normal Form Tables in 3NF
ProjectPNo PName 1 Alpha2 Beta3 Omega
Charge PNo ENo Hrs 1 101 20 1 105 15 1 110 40 2 101 20 2 108 15 2 106 20 3 102 20 3 105 10Employee
ENo EName Jcode101 John Doe NE102 Beth Reed PM105 Jane Vo SA106 Sara Lee SA108 Jeb Lee NE110 Bob Lund CP
JobJcode ChgHrCP $60NE $65PM $125SA $80
![Page 19: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/19.jpg)
19Boyce-Codd Normal Form A relation is in BCNF if
every determinant is a candidate key. A determinant is an attribute (combination of attributes) on which some
other attribute is fully functionally dependent. BCNF is a special case of 3NF. The potential to violate BCNF may occur in a relation that:
contains two (or more) composite candidate keys, these keys overlap and share at least one attribute.
Thus, if a table contains only one candidate key or only non-composite keys, then 3NF and BCNF are equivalent.
![Page 20: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/20.jpg)
203NF Table Not in BCNFFigure 4.7
![Page 21: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/21.jpg)
21Decomposition of Table Structure to Meet BCNF
Figure 4.8
![Page 22: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/22.jpg)
22Boyce-Codd Normal Form Consider the following example:
The members of a recruiting team interview candidates on a one-to-one basis. Each member is assigned a particular room on a given date. Each candidate is interviewed only once on a specific date. He/she may return for follow up interviews on later dates.
Interview (CID, IDate, ITime, StaffID, RmNo)
CID IDate ITime StaffID RmNoC01 8-22-99 10:00 S01 B107C02 8-22-99 11:00 S01 B107C03 8-22-99 10:00 S05 B108C01 8-29-99 3:00 S06 B108
![Page 23: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/23.jpg)
23Boyce-Codd Normal Form This relation has following functional dependencies:
CID, IDate ITime, StaffID, RmNo StaffID, IDate, ITime CID, RmNo RmNo, Idate, Itime StaffID, CID StaffID, IDate RmNo
This relation does not have any partial or transitive dependencies on the primary key (CID, IDate)
It is not in BCNF because (StaffID, Idate) is a determinant but not a candidate key.
The new relations in BCNF are: Interview (CID, IDate, ITime, StaffID) Room(StaffID, IDate, RmNo)
![Page 24: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/24.jpg)
24Dependency DiagramDependency diagram
CID IDate ITime StaffID RmNo
CID IDate ITime StaffID RmNo
CID IDate ITime StaffID RmNoFig 1
Fig 2
Fig 3
![Page 25: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/25.jpg)
25Fourth Normal Form A table is in 4NF if
it is in 3NF and has no multiple sets of multivalued dependencies.
Consider the following example: Each course is taught by many teachers and requires many texts.
CTXU (Unnormalized)Course Teacher TextPhysics Green Basic Mechanics
Brown Intro to OpticsMath White Modern Algebra
Intro to Calculus
CTXN (Normalized)Course Teacher TextPhysics Green Basic MechanicsPhysics Green Intro to OpticsPhysics Brown Basic MechanicsPhysics Brown Intro to OpticsMath White Modern AlgebraMath White Intro to Calculus
![Page 26: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/26.jpg)
26Fourth Normal Form CTXN is in BCNF, because it is all key and there are no other
functional dependencies. It, however, has redundant data that could cause update
anomalies. This table shows two multivalued dependencies:
Each course has a defined set of teachers and Course Teacher
Each course has a defined set of textbooks. Course Text
MVDs can exist only when the relation has at least three attributes.
An FD is a special case of MVD when the set of dependent values has a single value.
![Page 27: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/27.jpg)
27Fourth Normal Form Tables in 4NF
CTCourse TeacherPhysics GreenPhysics BrownMath White
CXCourse TextPhysics Basic MechanicsPhysics Intro to OpticsMath Modern AlgebraMath Intro to Calculus
![Page 28: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/28.jpg)
28Conversion to 4NF
Figure 4.14Multivalued Dependencies
Figure 4.15Set of Tables in 4NF
![Page 29: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/29.jpg)
29Normalization and Database Design
Normalization should be part of the design process E-R Diagram provides macro view Normalization provides micro view of entities
Focuses on characteristics of specific entities May yield additional entities
Difficult to separate normalization from E-R diagramming Business rules must be determined Normalization purity is difficult to sustain due to conflict
in:– Design efficiency– Information requirements– Processing
![Page 30: Dbms](https://reader030.vdocuments.us/reader030/viewer/2022020202/577cc0321a28aba7118f35d9/html5/thumbnails/30.jpg)
30Denormalization Normalized (decomposed) tables require additional processing,
thus reducing system speed. Sometimes normalization is not done keeping in mind
processing speed requirements and practical aspects of the situation.
A good example is: storing Zip code and City as attributes in a Customer relation violates 3NF because City is transitively dependent on Cust ID via Zip Code. Why should we not create a separate relation ZIP (ZipCode,
City)?