rda 1 - 1 relational data analysis stems from work of ted codd in early 1970s led to relational...
TRANSCRIPT
RDA 1 - 1
Relational Data Analysis
• Stems from work of Ted Codd in early 1970s
• Led to relational database theory– Detailed mathematical theory
• Relational data analysis for database design– Bottom-up technique based on analysing the inter-relationships between attributes
Logical Data Structuring
Relational Data Analysis
Enhanced Logical Data
Model
RDA 1 - 2
Relations (also known as Tables )
Patient No.
Patient Surname
Patient Forename Sex Date of
birthWard No.
923 109 854 231 459
Moneybags Foot Hare Knee Legg
Maurice Ivor Susan Boris Brian
Male Male Female Male Male
23/7/53 3/4/41
13/11/61 4/2/31
10/2/70
10 11 10
7 10
Primary key
Foreign key
Column name
Rows
Patient No. Surname Forename Sex Date of birth Ward No.
Male Female 1–12
The Patient table
Ward No.
Ward Name
Type Number of beds
3 11 10
Nightingale Fleming Barnard
Medical Medical Surgical
8 12 21
The Ward table
Domains
RDA 1 - 3
Properties of Relations
• Rows– show occurrences of Patient– each row must be uniquely identifiable– order of rows must NOT be significant
• Columns– column name = attribute type – each value = attribute occurrence – order of columns must not be significant– only one value should be associated with each column/row intersection in the table
• Domain– pool of possible values from which the actual values appearing in the columns of the table are
drawn
e.g. domain of Patient Numbers includes all of the possible Patient Numbers, not just the ones currently in hospital.
– important for comparing values from different tables
RDA 1 - 4
Normalised Relations Objective
The organisation of a system's attributes into a set of
‘well normalised’ relations.
• Well normalised relations avoid:– unnecessary duplication of data
i.e. no redundant data– problems with modifying, inserting and deleting data
N.B. sometimes referred to as the update anomalies
RDA 1 - 5
Stages of Normalisation
• Normalised takes place in stages
• Each stage is known as a normal form
• Each stage is a development from the previous stage
Un-Normalised Form
First Normal Form
Second Normal Form
Third Normal Form
RDA 1 - 6
Sample Source of Data
DRUG CARD
Patient No. Surname Forename
Ward No. Ward Name
Drugs Prescribed
Date Drug Code Drug Name DosageLength of
Treatment
923 Moneybags Maurice
10 Barnard
20/5/88 CO2355P Cortisone2 pills 3 x day
after meals 14 days
20/5/88 MO3416T MorphineInjection every 4 hours 5
25/5/88 MO3416T Morphine Injection
every 8 hours3
26/5/88 PE8694N Penicillin 1 pill 3 x day 7
for additional drugs continue on another card
DRUG CARD
Patient No. Surname Forename
Ward No. Ward Name
Drugs Prescribed
Date Drug Code Drug Name DosageLength of
Treatment
109 Foot Ivor
11 Fleming
15/5/88 AS473A2 pills 3 x dayafter meals 7 days
20/5/88 VA231M Valium 2 per day 5
for additional drugs continue on another card
Aspirin
RDA 1 - 7
Steps in Normalisation - 1Un-normalised Form
Represent the data in un-normalised form and pick a key.
• Column headings (attribute names) should be meaningful
• Choice of key attribute/attributes– must be unique for the particular data source
– may require two or more attributes
e.g. Invoice No. and Supplier, because different suppliers may send invoices having the same number
– use smallest combination of attributes possible
– avoid textual keys wherever possible
RDA 1 - 8
Data Re-organised from Drug Cards
• Patient Number is chosen as primary key and underlined
Prescn Date Drug Code Drug Name Dosage
Length Trtmnt
20/5/88 CO2355P Cortisone2 pills 3 x day after meals
14
20/5/88 MO3416T MorphineInjection every 4 hours 5
25/5/88 MO3416T Morphine Injection every 8 hours
3
26/5/88 PE8694N Penicillin 1 pill 3 x day 7
15/5/88 AS473A2 pills 3 x day after meals
7
20/5/88 Valium 2 per day 5
AspirinFleming
Barnard10
11
Maurice
Ivor
Moneybags
Foot109
923
Pat No.
Surname Fore -name
Wd No.
Ward Name
VA231M
RDA 1 - 9
Steps in Normalisation - 2First Normal Form
Remove any repeating groups of ATTRIBUTES to separate relations
• Pick keys for any relations identified
Drug Code Drug Name DosageLength Trtmnt
20/5/88 CO2355P Cortisone 2 pills 3 x dayafter meals
14
20/5/88 MO3416T Morphine Injection every 4 hours
5
25/5/88 MO3416T Morphine Injection every 8 hours
3
26/5/88 PE8694N Penicillin 1 pill 3 x day 7
15/5/88 AS473A 2 pills 3 x dayafter meals
7
20/5/88 Valium 2 per day 5
Aspirin
Pat No.
923
923
923
923
109
109
Prescn Date
VA231M
RDA 1 - 10
Repeating Groups
Any attribute or group of attributes that may occur with multiple values for a single value of the primary key attribute.
• Several values of Drug Code, Drug Name, Prescription Date, Dosage and Treatment Length for one value of Patient Number
– Remove to new relation
• Key of new relation will be a compound key involving several attributes:
– key identified in Step 1 (Patient No.)
– further items required to make each row in the new relation unique (Drug Code and Prescription Date )
RDA 1 - 11
Non-repeating Attributes
• Attributes that do not repeat remain as a relation
Fleming
Barnard10
11
Maurice
Ivor
Moneybags
Foot109
923
Pat
No.
Surname Fore
-name
Wd
No.
Ward
Name
RDA 1 - 12
Functional Dependency
• For any two attributes A and B, A is dependent on B if and only if
– for a given value of B there is precisely one associated value of A at any one time.
e.g. Patient Surname is totally dependent on Patient Number because each patient is given a unique Patient Number
• Another way of describing this is to say that:– Attribute B determines attribute A
i.e. Patient Number determines Patient Surname
NB The opposite is not true:For a given value of Patient Surname, say Moneybags, there may be several Patient Numbers, as there may be several patients called Moneybags in the hospital together.
Dependency can also occur with groups of attributes.
RDA 1 - 13
Steps in Normalisation - 3Second Normal Form
Remove ATTRIBUTES that only depend upon part of the key to separate relations
• Only applies to compound key relations
NB. Others are already in Second Normal Form
• Thus– Patient Number, Prescription Date, and Drug Code in combination determine attributes
Dosage and Treatment Length
• However – Drug Name is only dependent on part of the key (Drug Code) and can be removed from the
relation
– Drug Code and Drug Name form a new relation with Drug Code as the key.
RDA 1 - 14
Development to Second Normal Form
Drug Code Drug NameDosageLength Trtmnt
20/5/88 CO2355P Cortisone2 pills 3 x day after meals
14
20/5/88 MO3416T MorphineInjection every 4 hours
5
25/5/88 MO3416T Injection every 8 hours
3
26/5/88 PE8694N
Penicillin
1 pill 3 x day 7
15/5/88 AS473A2 pills 3 x day after meals
7
20/5/88
Valium
2 per day 5
Aspirin
Pat No.
923
923
923
923
109
109
Drug Code
CO2355P
MO3416T
PE8694N
AS473A
Prescn Date
VA231M
VA231M
RDA 1 - 15
Steps in Normalisation - 4Third Normal Form
Remove any ATTRIBUTES not directly and fully dependent on the key to separate relations.
• Create a separate relation for attributes which can be uniquely determined by any other attribute in the relation
• The key attribute in the separate relation must uniquely determine the remaining attributes in a one-to-one relationship
Fleming
Barnard10
11
Maurice
Ivor
Moneybags
Foot109
923
Pat
No.
Surname Forename Ward
No.
Ward
Name
RDA 1 - 16
Example in Third Normal Form
Fleming
Barnard10
11
Maurice
Ivor
Moneybags
Foot109
923
Pat No.
Surname Fore -name
Wd No.
Ward Name
10
11
Wd No.
• Patient No. does not uniquely determine Ward No. or Ward Name, because the patient might be in different wards at different times.
• Ward No uniquely determines Ward Name, and these attributes can therefore be removed to a separate relation.
• Ward No becomes a foreign key in the Patient relation.
RDA 1 - 17
Full Set of Well Normalised Relations
Drug Name
Cortisone
Morphine
Penicillin
Valium
Aspirin
Drug Code
CO2355P
MO3416T
PE8694N
AS473A
VA231M
Drug Code DosageLgth Treat
20/5/88 CO2355P2 pills 3 x day after meals
14
20/5/88 MO3416T Injection every 4 hours
5
25/5/88 MO3416T Injection every 8 hours
3
26/5/88 PE8694N 1 pill 3 x day 7
15/5/88 AS473A2 pills 3 x day after meals
7
20/5/88 2 per day 5
Pat No.
923
923
923
923
109
109 VA231M
Presc Date
Maurice
Ivor
Moneybags
Foot109
923
Pat No
Surname Forename
10
11
Wd No.
Fleming
Barnard10
11
Wd No.
Ward Name
RDA 1 - 18
SSADM Notation
SYSTEM: DATE / / AUTHOR
Source ID No.: Name of Source:
UNF 1NF 2NF 3NF
Hospital
Drug Card
Patient NumberPatient Surname Patient Forename Ward Number Ward Name Prescription Date Drug Code Drug Name Dosage Length of Treatment
Patient NumberPatient Surname Patient Forename Ward Number Ward Name
Patient Number Prescription Date Drug CodeDrug Name Dosage Length of Treatment
Patient NumberPatient Surname Patient Forename Ward Number Ward Name
Patient Number Prescription Date Drug CodeDosage Length of Treatment
Drug NameDrug Code
Patient Number Prescription Date Drug CodeDosage Length of Treatment
Drug NameDrug Code
Patient Surname Patient Forename Ward Number
Patient Number
Ward Name
Ward Number
*
RDA 1 - 19
Relations as a Logical Data Structure
Drug Code Dosage Trt Lgth
20/5/88 CO2355P2 pills 3 x dayafter meals 14
20/5/88 MO3416T Injection every 4 hours 5
25/5/88 MO3416TInjection every 8 hours 3
26/5/88 PE8694N 1 pill 3 x day 7
15/5/88 AS473A 2 pills 3 x day after meals 7
20/5/88 2 per day 5
Pat No
923
923
923
923
109
109 VA231M
Prescr Date
Maurice
Ivor
Moneybags
Foot109
923
Pat No Surname Forename
10
11
Wd No
Patient
Fleming
Barnard10
11
Wd No Ward Name
Ward
Prescription
Drug Name
Cortisone
Morphine
Penicillin
Valium
Aspirin
Drug Code
CO2355P
MO3416T
PE8694N
AS473A
VA231M
Drug