normalization a337. a337 - reed smith2 structure what is a database? ◦ tables of information rows...

19
Normalization A337

Upload: hector-davis

Post on 21-Dec-2015

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

NormalizationA337

Page 2: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 2

StructureWhat is a database?

◦Tables of information Rows are referred to as records Columns are referred to as fields or

attributes Record identifier is referred to as a record

key

Types◦Relational - Most common, Object-

Oriented◦Hierarchical, Network (much older

types)

Page 3: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 3

Database structure

Two approaches to the structure issue:◦ Conceptual (you start with the question of

“what information should I have?”) ERD from “scratch”

◦ Empirical (you already know what data there will be - you just want to organize it into tables) – NORMALIZATION

Page 4: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

Database Tables and NormalizationNormalization - Process for evaluating

and correcting table structures to minimize data redundancies◦Works through a series of stages called

normal forms: Normal form (1NF) Second normal form (2NF) Third normal form (3NF)

◦There are higher forms but are rarely necessary

Page 5: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 5

NormalizationWhy?

◦Data structures need to: Minimize redundancy Avoid insertion, update, and deletion

anomalies

How?◦Restructure information such that:

Only flat (rectangular) files exist (1st normal form) – No Nulls

All items in each record depend upon) the primary record key (2nd normal form) – No Partial Dependencies

If a field depends upon another then the “other” must be a primary key (3rd normal form) – No Transitive Dependencies

Page 6: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 6

Normalize the following table:

SALES_ORDERSSO_Number Item_Number Item_Name Qty_Ordered Cust_Code Cust_Name

1010 2010-0050 Formed Handlebar 2 WHEEL Wheelaway Cycle Center1000-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center

1011 1002-1 24 in. Bicycle 5 ETC Bikes Et Cetera1001-1 26 in. Bicycle 10 ETC Bikes Et Cetera

1012 1003-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1001-1 26 in. Bicycle 10 WHEEL Wheelaway Cycle Center

1013 1001-1 26 in. Bicycle 50 IBS Inter. Bicycle Sales1014 1003-1 20 in. Bicycle 25 ETC Bikes Et Cetera1015 1003-1 20 in. Bicycle 25 WHEEL Wheelaway Cycle Center1016 3961-1041 Tire Tube, 26 in. 5 ETC Bikes Et Cetera

3965-1050 Spoke Reflector 50 ETC Bikes Et Cetera1003-1 20 in. Bicycle 5 ETC Bikes Et Cetera1000-1 20 in. Bicycle 4 ETC Bikes Et Cetera

Page 7: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 7

What is wrong with this solution?

Page 8: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

First Normal Form

Eliminate Nulls/Repeating Groups –

◦Eliminate repeating groups by eliminating nulls, filling in cells with implied values with actual values

Select a primary key

◦may be a composite key

Page 9: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 9

1NF:

SALES_ORDERSSO_Number Item_Number Item_Name Qty_Ordered Cust_Code Cust_Name

1010 2010-0050 Formed Handlebar 2 WHEEL Wheelaway Cycle Center1010 1000-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1011 1002-1 24 in. Bicycle 5 ETC Bikes Et Cetera1011 1001-1 26 in. Bicycle 10 ETC Bikes Et Cetera1012 1003-1 20 in. Bicycle 5 WHEEL Wheelaway Cycle Center1012 1001-1 26 in. Bicycle 10 WHEEL Wheelaway Cycle Center1013 1001-1 26 in. Bicycle 50 IBS Inter. Bicycle Sales1014 1003-1 20 in. Bicycle 25 ETC Bikes Et Cetera1015 1003-1 20 in. Bicycle 25 WHEEL Wheelaway Cycle Center1016 3961-1041 Tire Tube, 26 in. 5 ETC Bikes Et Cetera1016 3965-1050 Spoke Reflector 50 ETC Bikes Et Cetera1016 1003-1 20 in. Bicycle 5 ETC Bikes Et Cetera1016 1000-1 20 in. Bicycle 4 ETC Bikes Et Cetera

Page 10: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 10

What is wrong with this solution?Partial Dependencies

◦For example, the sales order number is not relevant in the determination of the item name

◦Similarly, the customer code and customer name do not depend upon the Item ID, they only depend upon the sales order number.

Page 11: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

Second Normal FormEliminate Partial Dependencies

◦ Write each key component on separate line, and then write the original (composite) key on the last line

◦ Each component will become the key in a new table

Identify the Dependent Attributes

◦ Determine which attributes are dependent on which other attributes

Page 12: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

Creating 3 TablesSO_Number, Item_Number

SO_Number,

Item_Number

Page 13: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 13

2NF:

SALES_ORDER line item INVENTORYSO_Number Item_Number Qty_Ordered

1010 2010-0050 21010 1000-1 51011 1002-1 51011 1001-1 101012 1003-1 51012 1001-1 101013 1001-1 501014 1003-1 251015 1003-1 251016 3961-1041 51016 3965-1050 501016 1003-1 51016 1000-1 4

SALES_ORDERSSO_Number Cust_Code Cust_Name

1010 WHEEL Wheelaway Cycle Center1011 ETC Bikes Et Cetera1012 WHEEL Wheelaway Cycle Center1013 IBS Inter. Bicycle Sales1014 ETC Bikes Et Cetera1015 WHEEL Wheelaway Cycle Center1016 ETC Bikes Et Cetera

INVENTORY_ITEMSItem_Number Item_Name1000-1 20 in. Bicycle1001-1 26 in. Bicycle1002-1 24 in. Bicycle1003-1 20 in. Bicycle1003-1 20 in. Bicycle2010-0050 Formed Handlebar3961-1041 Tire Tube, 26 in.3965-1050 Spoke Reflector

Page 14: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 14

What is wrong with this solution?Transitive Dependencies

◦Notice that the third column of the Sales Orders file has the Customer name and that depends upon the customer number.

◦But the customer number is not the primary key

Page 15: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

Database Systems: Design, Implementation, & Management,

6th Edition, Rob & Coronel

Third Normal Form

1. For every transitive dependency, write its determinant as a PK for a new table

2. Identify the attributes dependent on each determinant identified in Step 1 and identify the dependency

3. Remove the dependent attributes in transitive relationship(s) from each table that has such a transitive relationship

Page 16: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

A337 - Reed Smith 16

3NF:

SALES_ORDER line item INVENTORYSO_Number Item_Number Qty_Ordered

1010 2010-0050 21010 1000-1 51011 1002-1 51011 1001-1 101012 1003-1 51012 1001-1 101013 1001-1 501014 1003-1 251015 1003-1 251016 3961-1041 51016 3965-1050 501016 1003-1 51016 1000-1 4

SALES_ORDERSSO_Number Cust_Code

1010 WHEEL1011 ETC1012 WHEEL1013 IBS1014 ETC1015 WHEEL1016 ETC

INVENTORY_ITEMSItem_Number Item_Name1000-1 20 in. Bicycle1001-1 26 in. Bicycle1002-1 24 in. Bicycle1003-1 20 in. Bicycle1003-1 20 in. Bicycle2010-0050 Formed Handlebar3961-1041 Tire Tube, 26 in.3965-1050 Spoke Reflector

CUSTOMERSCust_Code Cust_Name

ETC Bikes Et CeteraIBS Inter. Bicycle SalesWHEEL Wheelaway Cycle Center

Page 17: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

DenormalizationCreation of normalized relations is

important database design goal

Processing requirements should also be a goal

If tables decomposed to conform to normalization requirements

◦Number of database tables expands

17

Page 18: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred

Denormalization (continued)Joining larger number of tables takes

additional disk input/output (I/O) operations and processing logic ◦Reduces system speed

Conflicts among design efficiency, information requirements, and processing speed are often resolved through compromises that may include denormalization

A337 Database Systems: Design, Implementation, & Management, 6th Edition, Rob & Coronel

18

Page 19: Normalization A337. A337 - Reed Smith2 Structure What is a database? ◦ Tables of information  Rows are referred to as records  Columns are referred