normalization a337. a337 - reed smith2 structure what is a database? ◦ tables of information rows...
TRANSCRIPT
NormalizationA337
A337 - Reed Smith 2
StructureWhat is a database?
◦Tables of information Rows are referred to as records Columns are referred to as fields or
attributes Record identifier is referred to as a record
key
Types◦Relational - Most common, Object-
Oriented◦Hierarchical, Network (much older
types)
A337 - Reed Smith 3
Database structure
Two approaches to the structure issue:◦ Conceptual (you start with the question of
“what information should I have?”) ERD from “scratch”
◦ Empirical (you already know what data there will be - you just want to organize it into tables) – NORMALIZATION
Database Tables and NormalizationNormalization - Process for evaluating
and correcting table structures to minimize data redundancies◦Works through a series of stages called
normal forms: Normal form (1NF) Second normal form (2NF) Third normal form (3NF)
◦There are higher forms but are rarely necessary
A337 - Reed Smith 5
NormalizationWhy?
◦Data structures need to: Minimize redundancy Avoid insertion, update, and deletion
anomalies
How?◦Restructure information such that:
Only flat (rectangular) files exist (1st normal form) – No Nulls
All items in each record depend upon) the primary record key (2nd normal form) – No Partial Dependencies
If a field depends upon another then the “other” must be a primary key (3rd normal form) – No Transitive Dependencies
A337 - Reed Smith 6
Normalize the following table:
SALES_ORDERSSO_Number Item_Number Item_Name Qty_Ordered Cust_Code Cust_Name
1010 2010-0050 Formed Handlebar 2 WHEEL Wheelaway Cycle Center1000-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center
1011 1002-1 24 in. Bicycle 5 ETC Bikes Et Cetera1001-1 26 in. Bicycle 10 ETC Bikes Et Cetera
1012 1003-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1001-1 26 in. Bicycle 10 WHEEL Wheelaway Cycle Center
1013 1001-1 26 in. Bicycle 50 IBS Inter. Bicycle Sales1014 1003-1 20 in. Bicycle 25 ETC Bikes Et Cetera1015 1003-1 20 in. Bicycle 25 WHEEL Wheelaway Cycle Center1016 3961-1041 Tire Tube, 26 in. 5 ETC Bikes Et Cetera
3965-1050 Spoke Reflector 50 ETC Bikes Et Cetera1003-1 20 in. Bicycle 5 ETC Bikes Et Cetera1000-1 20 in. Bicycle 4 ETC Bikes Et Cetera
A337 - Reed Smith 7
What is wrong with this solution?
First Normal Form
Eliminate Nulls/Repeating Groups –
◦Eliminate repeating groups by eliminating nulls, filling in cells with implied values with actual values
Select a primary key
◦may be a composite key
A337 - Reed Smith 9
1NF:
SALES_ORDERSSO_Number Item_Number Item_Name Qty_Ordered Cust_Code Cust_Name
1010 2010-0050 Formed Handlebar 2 WHEEL Wheelaway Cycle Center1010 1000-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1011 1002-1 24 in. Bicycle 5 ETC Bikes Et Cetera1011 1001-1 26 in. Bicycle 10 ETC Bikes Et Cetera1012 1003-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1012 1001-1 26 in. Bicycle 10 WHEEL Wheelaway Cycle Center1013 1001-1 26 in. Bicycle 50 IBS Inter. Bicycle Sales1014 1003-1 20 in. Bicycle 25 ETC Bikes Et Cetera1015 1003-1 20 in. Bicycle 25 WHEEL Wheelaway Cycle Center1016 3961-1041 Tire Tube, 26 in. 5 ETC Bikes Et Cetera1016 3965-1050 Spoke Reflector 50 ETC Bikes Et Cetera1016 1003-1 20 in. Bicycle 5 ETC Bikes Et Cetera1016 1000-1 20 in. Bicycle 4 ETC Bikes Et Cetera
A337 - Reed Smith 10
What is wrong with this solution?Partial Dependencies
◦For example, the sales order number is not relevant in the determination of the item name
◦Similarly, the customer code and customer name do not depend upon the Item ID, they only depend upon the sales order number.
Second Normal FormEliminate Partial Dependencies
◦ Write each key component on separate line, and then write the original (composite) key on the last line
◦ Each component will become the key in a new table
Identify the Dependent Attributes
◦ Determine which attributes are dependent on which other attributes
Creating 3 TablesSO_Number, Item_Number
SO_Number,
Item_Number
A337 - Reed Smith 13
2NF:
SALES_ORDER line item INVENTORYSO_Number Item_Number Qty_Ordered
1010 2010-0050 21010 1000-1 51011 1002-1 51011 1001-1 101012 1003-1 51012 1001-1 101013 1001-1 501014 1003-1 251015 1003-1 251016 3961-1041 51016 3965-1050 501016 1003-1 51016 1000-1 4
SALES_ORDERSSO_Number Cust_Code Cust_Name
1010 WHEEL Wheelaway Cycle Center1011 ETC Bikes Et Cetera1012 WHEEL Wheelaway Cycle Center1013 IBS Inter. Bicycle Sales1014 ETC Bikes Et Cetera1015 WHEEL Wheelaway Cycle Center1016 ETC Bikes Et Cetera
INVENTORY_ITEMSItem_Number Item_Name1000-1 20 in. Bicycle1001-1 26 in. Bicycle1002-1 24 in. Bicycle1003-1 20 in. Bicycle1003-1 20 in. Bicycle2010-0050 Formed Handlebar3961-1041 Tire Tube, 26 in.3965-1050 Spoke Reflector
A337 - Reed Smith 14
What is wrong with this solution?Transitive Dependencies
◦Notice that the third column of the Sales Orders file has the Customer name and that depends upon the customer number.
◦But the customer number is not the primary key
Database Systems: Design, Implementation, & Management,
6th Edition, Rob & Coronel
Third Normal Form
1. For every transitive dependency, write its determinant as a PK for a new table
2. Identify the attributes dependent on each determinant identified in Step 1 and identify the dependency
3. Remove the dependent attributes in transitive relationship(s) from each table that has such a transitive relationship
A337 - Reed Smith 16
3NF:
SALES_ORDER line item INVENTORYSO_Number Item_Number Qty_Ordered
1010 2010-0050 21010 1000-1 51011 1002-1 51011 1001-1 101012 1003-1 51012 1001-1 101013 1001-1 501014 1003-1 251015 1003-1 251016 3961-1041 51016 3965-1050 501016 1003-1 51016 1000-1 4
SALES_ORDERSSO_Number Cust_Code
1010 WHEEL1011 ETC1012 WHEEL1013 IBS1014 ETC1015 WHEEL1016 ETC
INVENTORY_ITEMSItem_Number Item_Name1000-1 20 in. Bicycle1001-1 26 in. Bicycle1002-1 24 in. Bicycle1003-1 20 in. Bicycle1003-1 20 in. Bicycle2010-0050 Formed Handlebar3961-1041 Tire Tube, 26 in.3965-1050 Spoke Reflector
CUSTOMERSCust_Code Cust_Name
ETC Bikes Et CeteraIBS Inter. Bicycle SalesWHEEL Wheelaway Cycle Center
DenormalizationCreation of normalized relations is
important database design goal
Processing requirements should also be a goal
If tables decomposed to conform to normalization requirements
◦Number of database tables expands
17
Denormalization (continued)Joining larger number of tables takes
additional disk input/output (I/O) operations and processing logic ◦Reduces system speed
Conflicts among design efficiency, information requirements, and processing speed are often resolved through compromises that may include denormalization
A337 Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel
18