database normalisation - extended · pdf filelevel table using a foreign key ... boyce-codd...

28
Database Normalisation Lecturer: Mark Gillan

Upload: vunhan

Post on 13-Mar-2018

219 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Database Normalisation Lecturer: Mark Gillan

Page 2: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Example data

Student ID First Name Surname Date of Birth Unit No Unit Name Level Unit Duration Completion Date Result80300123 Benjamin Boateng 25/06/1996 100 Intro to IT 1 30 10/10/2014 Pass

80300123 Benjamin Boateng 25/06/1996 200 Hardware Foundation 2 60 23/06/2014 Pass

80300123 Benjamin Boateng 25/06/1996 250 Applications Foundation 2 60 05/03/2014 Pass

80300140 Tatiana McDonald 02/03/1997 330 Core Skills 3 100 10/10/2014 Merit

80300140 Tatiana McDonald 02/03/1997 345 Web Design 3 100 05/03/2014 Distinction

80300140 Tatiana McDonald 02/03/1997 210 Searching Online 2 60 13/01/2015 Pass

80300110 Jimmy Chan 14/04/1996 200 Hardware Foundation 2 60 31/05/2014 Unclassified

80300145 Selina Khan 09/10/1997 100 Intro to IT 1 30 19/10/2014 Pass

80300145 Selina Khan 09/10/1997 102 Sending Email 1 30 02/07/2014 Merit

80300145 Selina Khan 09/10/1997 250 Applications Foundation 2 60 12/03/2014 Merit

Page 3: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Understand the data

• List of student details, units, results and date of completion

• Duplication of various data

• Unnecessarily large table

• Slow operating database

• Locating information takes longer than required

• Time consuming entries

• Data could easily consist of errors

• Anomalies with regards to modification

Page 4: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Database Modification Anomalies

• Deletion Anomaly – deleting data means information being lost from another data set

• Insertion Anomaly – inserting a piece of data means duplication of information across two or more entities

• Update Anomaly – updating data within one data set requires multiple updates within other data sets

Page 5: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Levels of Normalisation

• UNF – Un-normalised Form

• 1NF – First Normal Form

• 2NF – Second Normal Form

• 3NF – Third Normal Form

• BCNF – Boyce Codd Normal Form

• 4NF – Fourth Normal Form (also regarded same as BCNF or 3.5 NF)

• 5NF – Fifth Normal Form (hardly ever used)

Page 6: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

UNF – Un-Normalised Form

• Data with possible issues due to flat file database / large chunk of information:

Duplication

Difficulty Inserting, Editing, Deleting

• Benefits could be:

Reduced duplication

More efficient keying in, editing or deleting of data

Reduces errors and inconsistencies

Faster database due to size being less (less duplication of data held)

Page 7: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Example data - UNF

Student ID First Name Surname Date of Birth Unit No Unit Name Level Unit Duration Completion Date Result80300123 Benjamin Boateng 25/06/1996 100 Intro to IT 1 30 10/10/2014 Pass

80300123 Benjamin Boateng 25/06/1996 200 Hardware Foundation 2 60 23/06/2014 Pass

80300123 Benjamin Boateng 25/06/1996 250 Applications Foundation 2 60 05/03/2014 Pass

80300140 Tatiana McDonald 02/03/1997 330 Core Skills 3 100 10/10/2014 Merit

80300140 Tatiana McDonald 02/03/1997 345 Web Design 3 100 05/03/2014 Distinction

80300140 Tatiana McDonald 02/03/1997 210 Searching Online 2 60 13/01/2015 Pass

80300110 Jimmy Chan 14/04/1996 200 Hardware Foundation 2 60 31/05/2014 Unclassified

80300145 Selina Khan 09/10/1997 100 Intro to IT 1 30 19/10/2014 Pass

80300145 Selina Khan 09/10/1997 102 Sending Email 1 30 02/07/2014 Merit

80300145 Selina Khan 09/10/1997 250 Applications Foundation 2 60 12/03/2014 Merit

Page 8: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

UNF – list of fields from example data

• All within one table

• Data table shows duplication

• No structure

• 1NF will have requirements

Page 9: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

1NF – First Normal Form

• Eliminate repeating fields within tables

• Create a separate table for each set of related data

• Identify each set of related data with a primary key (unique field)

Page 10: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Let’s see the data again

Student ID First Name Surname Date of Birth Unit No Unit Name Level Unit Duration Completion Date Result80300123 Benjamin Boateng 25/06/1996 100 Intro to IT 1 30 10/10/2014 Pass

80300123 Benjamin Boateng 25/06/1996 200 Hardware Foundation 2 60 23/06/2014 Pass

80300123 Benjamin Boateng 25/06/1996 250 Applications Foundation 2 60 05/03/2014 Pass

80300140 Tatiana McDonald 02/03/1997 330 Core Skills 3 100 10/10/2014 Merit

80300140 Tatiana McDonald 02/03/1997 345 Web Design 3 100 05/03/2014 Distinction

80300140 Tatiana McDonald 02/03/1997 210 Searching Online 2 60 13/01/2015 Pass

80300110 Jimmy Chan 14/04/1996 200 Hardware Foundation 2 60 31/05/2014 Unclassified

80300145 Selina Khan 09/10/1997 100 Intro to IT 1 30 19/10/2014 Pass

80300145 Selina Khan 09/10/1997 102 Sending Email 1 30 02/07/2014 Merit

80300145 Selina Khan 09/10/1997 250 Applications Foundation 2 60 12/03/2014 Merit

Page 11: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

1NF – example fields to first normal form

• The unit is not dependent upon student details

• Separation into individual tables

Page 12: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

1NF – more action required (primary key)

• First name, surname and date of birth could be duplicated with students having the same first name and / or surname and / or date of birth

• Therefore, student id should be unique to identify the record as being unique

• Primary key designated within the first master table

Page 13: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

1NF – more action (second table primary key)

• Each table should have a unique identifier / primary key

• Therefore, unit number is designated as a primary key identifying each unit as being unique to the others because some fields can have duplicated data within records

Page 14: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

1NF – primary key to link tables

• The Student ID primary key is provided within the second table to link a unit record to students

• Relationships between tables provide required links to other tables

• One student can have many units as shown within our original data

• Therefore, we could say a “One to Many” relationship

Page 15: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Reminder: 1NF – First Normal Form

• Eliminate repeating fields within tables

• Create a separate table for each set of related data

• Identify each set of related data with a primary key (unique field)

Page 16: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Moving onto 2NF

Page 17: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

2NF – Second Normal Form

• Meet all requirements for the first normal form

• Remove subsets of data that apply to multiple rows (records) and place them in separate tables

• If required, ensure relationships between the new tables are created

Page 18: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

2NF – example fields to second normal form

• Remove subsets of data that refers to multiple rows (records)

• Create new tables for any subsets that will have data within multiple records

• The completion date and result is obviously related to the unit but it is also related to individual students

• Therefore, another table for this data, linking unit number to the unit table and the student id to the student table

Page 19: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

3NF – Third Normal Form

• Meet all requirements of the second normal form

• Remove fields that are not dependent upon the primary key within the same table

• Normally, normalisation does not require proceeding any further than 3NF as most incidents of possible anomalies have been fixed

Page 20: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Reminder of tables from 2NF • 3NF requires in addition to 2NF, that

we “remove fields that are not fully dependent upon the primary key within the same table”

• Student table: First Name, Surname and Date of Birth are all dependent upon the individual Student

• Unit completion: Dependent upon each student completing and each student having a result upon completion

• However, is the unit duration dependent upon the unit number or could it be dependent upon the level? Look at the original data again.

Page 21: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Reminder of original data

Student ID First Name Surname Date of Birth Unit No Unit Name Level Unit Duration Completion Date Result80300123 Benjamin Boateng 25/06/1996 100 Intro to IT 1 30 10/10/2014 Pass

80300123 Benjamin Boateng 25/06/1996 200 Hardware Foundation 2 60 23/06/2014 Pass

80300123 Benjamin Boateng 25/06/1996 250 Applications Foundation 2 60 05/03/2014 Pass

80300140 Tatiana McDonald 02/03/1997 330 Core Skills 3 100 10/10/2014 Merit

80300140 Tatiana McDonald 02/03/1997 345 Web Design 3 100 05/03/2014 Distinction

80300140 Tatiana McDonald 02/03/1997 210 Searching Online 2 60 13/01/2015 Pass

80300110 Jimmy Chan 14/04/1996 200 Hardware Foundation 2 60 31/05/2014 Unclassified

80300145 Selina Khan 09/10/1997 100 Intro to IT 1 30 19/10/2014 Pass

80300145 Selina Khan 09/10/1997 102 Sending Email 1 30 02/07/2014 Merit

80300145 Selina Khan 09/10/1997 250 Applications Foundation 2 60 12/03/2014 Merit

Page 22: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

3NF – example fields to third normal form • Remember 3NF is to ensure 2NF is

completed and also states we “remove fields that are not fully dependent upon the primary key within the same table”

• Review the dependencies • Student table: already identified all fields

dependent upon the primary key • Unit completion: already identified the

completion date and result are dependent upon the unit being completed and dependent upon each student completing

• Unit details: unit name is dependent upon the unit identifier primary key

• Unit duration: dependent upon the level … so another table can be created

Page 23: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

3NF continued

• The level table requires a primary key

• The level itself would be unique as only the different levels would be listed within the data of this table along with the unit duration that is dependent upon the primary key – satisfying 3NF

• However, a wee problem!

Page 24: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

3NF continued – problem solved

• Unit table: now has a link to the level table using a foreign key being the level itself

• So…

• Unit table: linked to level table

• Level table: linked to unit table

• Unit completion: linked to both the unit table and the main student table

Page 25: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

BCNF or 3.5NF – Boyce-Codd Normal Form • Meet all requirements of the third normal form

• Every determinant must be a candidate key

• A determinant in a database table is any attribute (data within a record for a given field) that can be used to determine the values assigned to other attributes within the same row (record)

• For example, if a table has the attributes of employee_ID, first_name, surname. The name fields do not determine the employee_ID because there might be more than one employee with the same first name and surname as another.

Page 26: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

4NF – Fourth Normal Form

• Meet all requirements of the third normal form

• A relationship is in 4NF if it has no multi-valued dependencies

• For example, consider a computer case manufacturer always making the same model and design with black as well as white coloured cases. If a table is present that contains the case model identifier, design model, colour … there is a multi-valued dependency. Why? Because, if there is a record for a case model in black, there must also be a similar record to the white version of the same case model.

Page 27: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

5NF – Fifth Normal Form

• Meet all requirements for fourth normal form

• Deal with any issues remaining from 4NF, such as joining two attributes rather than splitting them into separate tables.

Page 28: Database Normalisation - Extended · PDF filelevel table using a foreign key ... Boyce-Codd Normal Form ... •Every determinant must be a candidate key •A determinant in a database

Revise all theory

• Carefully read notes

• Try exercises

• Answer questions at the end of each section

• Homework

• Remember to illustrate as well as provide written statements