normalization of database lecture - ans yong choi school of business csub
TRANSCRIPT
Normalization of DatabaseLecture - ANS
Yong Choi
School of Business
CSUB
2
1NF Example
Unnormalized TablePK
3
1NF Example (con’t.)
Conversion to 1NFPK
4
Another 1NF Example
Cust_ID Name Street Phone
104 Mr. Ray Suchecki
123 Pond Hill Road, Detroit, MI, 48161
(313) 792-3546
Cust_ID SalesRep_Name Rep_Office Order_1 Order_2 Order_3
1022 Jones 412 10 14 19
PK
PK
5
2NF ExamplePK PK
Each arrow shows partial dependency
6
2NF Example
7
Example of 3NF
PK: Cust_ID
8
Relation with transitive dependency
PK
9
Transitive dependency
• All attributes are functionally dependent on Cust_ID. – Cust_ID -> Name– Cust_ID -> Salesperson– Cust_ID -> Region
• However, there is a transitive dependency.– Region is functionally dependent on Salesperson.– Salesperson -> Region
10
Problems with Transitive dependency
• A new sales person (Yong) assigned to the North region cannot be entered until a customer has been assigned to that salesperson (since a value for Cust_ID must be provided to insert a row in the relation).
• If customer number 6837 is deleted from the table, we lose the information that salesperson Hernandez is assigned top the Easy region.
• If sales person Smith is reassigned to the East region, several rows must be changed to reflect that fact.
11
Decomposing the SALES relation
12
Relations in 3NF
Now, there are no transitive dependencies…Both relations are in 3rd NF
CustID Name
CustID Salesperson
Salesperson Region
13
Dependency Diagram
14
Boyce-Codd Normal Form (BCNF)
• Special case of 3NF.• A relation is in BCNF if it’s in 3NF and there is no
hidden dependencies. • Below is in 3NF but not in BCNF
15
BCNF
Stu_ID Advisor Major GPA
123 Nasa Physics 4.0
123 Elvis Music 3.3
456 King Literature 3.2
789 Jackson Music 3.7
678 Nasa Physics 3.5
Student
16
BCNF
Stu_ID Advisor Major GPA
Don’t confuse with Transitive Dependency!
Advisor is functionally dependent on Major.
17
BCNF
• In Physics the advisor Nasa is replaced by Einstein. This change must be made in two ( or more) rows in the table.
• If we want to insert a row with the information that Choi advises in MIS. This cannot be done until at least one student majoring in MIS is assigned Choi as an advisor.
• If student number 789 withdraw from school, we lose the information that Jackson advises in Music.
18
Conversion to BCNF
Stu_ID Advisor GPA
123 Nasa 4.0
123 Elvis 3.3
456 King 3.2
789 Jackson 3.7
678 Nasa 3.5
Advisor Major
Nasa Physics
Elvis Music
King Literature
Jackson Music
Student Advisor
19
Decomposition into BCNF
20
3NF and BCNF
• In practice, most relation schemas that are in 3NF are also in BCNF. Only if a hidden dependency X -> A exists in a relation.
• In general, it is best to have relation schemas in BCNF. If that is not possible, 3NF will do. However, 2NF and 1NF are not considered good relation schema designs.
21
4NF
• A relation is in 4NF if it is already in 3NF and does not contain two multi-valued dependencies that are independent.- it’s a different meaning than not having multi-valued attributes for 1NF.
e.g.,
Smith can cook and type.
Smith speaks French, German, and Greek
22
4NF
E_Name Skill Language
Smith cook null
Smith type null
Smith null French
Smith null German
Smith null Chinese
E-Name ->-> SkillE-Name ->-> Language
PK
23
4NF
The values for Skill and the values for Language are independent.
E_Name Skill E_Name Language
Smith Cook Smith French
Smith Type Smith German
Smith Chinese
PK PK
24
Faculty (A) relation
FacultyNum StudentNum CommitteeCode
123 12805
24139
ADV
HSG
PER
444 57384 HSG
456 24139
36273
37573
CUR
Faculty (A) – normalized table
25
Faculty (A) relation
• Each FacultyNum has a well-defined set of StudentNums.
• Each FacultyNum has a well-defined set of CommitteeCodeses.
• The STUDENTNUM and the COMMITTEECODE are independent of each other.
26
Faculty (B) relationFaculty (B)
27
Faculty (B) relation
• Has a composite PK– FacultyNum, StudentNum , and CommitteeCode
• Since there are no determinants other than the PKs, the relation is in BCNF.
• Yet it does contain much redundant data that can easily lead to update anomalies because of multi-valued dependencies.
28
Problems with Faculty (B) relation
• Changing the CommitteeCode for faculty member requires more than one change.
• Suppose that a new faculty member 555 but does not yet serve on any committee. When a facultyNum 555 begins advising student 44332, there is a problem because the CommitteeCode is a part of PK.
• If faculty member 444 no longer advises student 57384 and delete appropriate record from the relation, we lose the information that faculty member serves on the Housing committee (HSG).
29
Conversion to 4NF