rda 1 - 1 relational data analysis stems from work of ted codd in early 1970s led to relational...

19
RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory Detailed mathematical theory Relational data analysis for database design Bottom-up technique based on analysing the inter-relationships between attributes Logical Data Structuring Relational Data Analysis Enhanced Logical Data Model

Upload: gabrielle-daley

Post on 28-Mar-2015

215 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 1

Relational Data Analysis

• Stems from work of Ted Codd in early 1970s

• Led to relational database theory– Detailed mathematical theory

• Relational data analysis for database design– Bottom-up technique based on analysing the inter-relationships between attributes

Logical Data Structuring

Relational Data Analysis

Enhanced Logical Data

Model

Page 2: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 2

Relations (also known as Tables )

Patient No.

Patient Surname

Patient Forename Sex Date of

birthWard No.

923 109 854 231 459

Moneybags Foot Hare Knee Legg

Maurice Ivor Susan Boris Brian

Male Male Female Male Male

23/7/53 3/4/41

13/11/61 4/2/31

10/2/70

10 11 10

7 10

Primary key

Foreign key

Column name

Rows

Patient No. Surname Forename Sex Date of birth Ward No.

Male Female 1–12

The Patient table

Ward No.

Ward Name

Type Number of beds

3 11 10

Nightingale Fleming Barnard

Medical Medical Surgical

8 12 21

The Ward table

Domains

Page 3: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 3

Properties of Relations

• Rows– show occurrences of Patient– each row must be uniquely identifiable– order of rows must NOT be significant

• Columns– column name = attribute type – each value = attribute occurrence – order of columns must not be significant– only one value should be associated with each column/row intersection in the table

• Domain– pool of possible values from which the actual values appearing in the columns of the table are

drawn

e.g. domain of Patient Numbers includes all of the possible Patient Numbers, not just the ones currently in hospital.

– important for comparing values from different tables

Page 4: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 4

Normalised Relations Objective

The organisation of a system's attributes into a set of

‘well normalised’ relations.

• Well normalised relations avoid:– unnecessary duplication of data

i.e. no redundant data– problems with modifying, inserting and deleting data

N.B. sometimes referred to as the update anomalies

Page 5: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 5

Stages of Normalisation

• Normalised takes place in stages

• Each stage is known as a normal form

• Each stage is a development from the previous stage

Un-Normalised Form

First Normal Form

Second Normal Form

Third Normal Form

Page 6: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 6

Sample Source of Data

DRUG CARD

Patient No. Surname Forename

Ward No. Ward Name

Drugs Prescribed

Date Drug Code Drug Name DosageLength of

Treatment

923 Moneybags Maurice

10 Barnard

20/5/88 CO2355P Cortisone2 pills 3 x day

after meals 14 days

20/5/88 MO3416T MorphineInjection every 4 hours 5

25/5/88 MO3416T Morphine Injection

every 8 hours3

26/5/88 PE8694N Penicillin 1 pill 3 x day 7

for additional drugs continue on another card

DRUG CARD

Patient No. Surname Forename

Ward No. Ward Name

Drugs Prescribed

Date Drug Code Drug Name DosageLength of

Treatment

109 Foot Ivor

11 Fleming

15/5/88 AS473A2 pills 3 x dayafter meals 7 days

20/5/88 VA231M Valium 2 per day 5

for additional drugs continue on another card

Aspirin

Page 7: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 7

Steps in Normalisation - 1Un-normalised Form

Represent the data in un-normalised form and pick a key.

• Column headings (attribute names) should be meaningful

• Choice of key attribute/attributes– must be unique for the particular data source

– may require two or more attributes

e.g. Invoice No. and Supplier, because different suppliers may send invoices having the same number

– use smallest combination of attributes possible

– avoid textual keys wherever possible

Page 8: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 8

Data Re-organised from Drug Cards

• Patient Number is chosen as primary key and underlined

Prescn Date Drug Code Drug Name Dosage

Length Trtmnt

20/5/88 CO2355P Cortisone2 pills 3 x day after meals

14

20/5/88 MO3416T MorphineInjection every 4 hours 5

25/5/88 MO3416T Morphine Injection every 8 hours

3

26/5/88 PE8694N Penicillin 1 pill 3 x day 7

15/5/88 AS473A2 pills 3 x day after meals

7

20/5/88 Valium 2 per day 5

AspirinFleming

Barnard10

11

Maurice

Ivor

Moneybags

Foot109

923

Pat No.

Surname Fore -name

Wd No.

Ward Name

VA231M

Page 9: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 9

Steps in Normalisation - 2First Normal Form

Remove any repeating groups of ATTRIBUTES to separate relations

• Pick keys for any relations identified

Drug Code Drug Name DosageLength Trtmnt

20/5/88 CO2355P Cortisone 2 pills 3 x dayafter meals

14

20/5/88 MO3416T Morphine Injection every 4 hours

5

25/5/88 MO3416T Morphine Injection every 8 hours

3

26/5/88 PE8694N Penicillin 1 pill 3 x day 7

15/5/88 AS473A 2 pills 3 x dayafter meals

7

20/5/88 Valium 2 per day 5

Aspirin

Pat No.

923

923

923

923

109

109

Prescn Date

VA231M

Page 10: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 10

Repeating Groups

Any attribute or group of attributes that may occur with multiple values for a single value of the primary key attribute.

• Several values of Drug Code, Drug Name, Prescription Date, Dosage and Treatment Length for one value of Patient Number

– Remove to new relation

• Key of new relation will be a compound key involving several attributes:

– key identified in Step 1 (Patient No.)

– further items required to make each row in the new relation unique (Drug Code and Prescription Date )

Page 11: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 11

Non-repeating Attributes

• Attributes that do not repeat remain as a relation

Fleming

Barnard10

11

Maurice

Ivor

Moneybags

Foot109

923

Pat

No.

Surname Fore

-name

Wd

No.

Ward

Name

Page 12: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 12

Functional Dependency

• For any two attributes A and B, A is dependent on B if and only if

– for a given value of B there is precisely one associated value of A at any one time.

e.g. Patient Surname is totally dependent on Patient Number because each patient is given a unique Patient Number

• Another way of describing this is to say that:– Attribute B determines attribute A

i.e. Patient Number determines Patient Surname

NB The opposite is not true:For a given value of Patient Surname, say Moneybags, there may be several Patient Numbers, as there may be several patients called Moneybags in the hospital together.

Dependency can also occur with groups of attributes.

Page 13: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 13

Steps in Normalisation - 3Second Normal Form

Remove ATTRIBUTES that only depend upon part of the key to separate relations

• Only applies to compound key relations

NB. Others are already in Second Normal Form

• Thus– Patient Number, Prescription Date, and Drug Code in combination determine attributes

Dosage and Treatment Length

• However – Drug Name is only dependent on part of the key (Drug Code) and can be removed from the

relation

– Drug Code and Drug Name form a new relation with Drug Code as the key.

Page 14: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 14

Development to Second Normal Form

Drug Code Drug NameDosageLength Trtmnt

20/5/88 CO2355P Cortisone2 pills 3 x day after meals

14

20/5/88 MO3416T MorphineInjection every 4 hours

5

25/5/88 MO3416T Injection every 8 hours

3

26/5/88 PE8694N

Penicillin

1 pill 3 x day 7

15/5/88 AS473A2 pills 3 x day after meals

7

20/5/88

Valium

2 per day 5

Aspirin

Pat No.

923

923

923

923

109

109

Drug Code

CO2355P

MO3416T

PE8694N

AS473A

Prescn Date

VA231M

VA231M

Page 15: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 15

Steps in Normalisation - 4Third Normal Form

Remove any ATTRIBUTES not directly and fully dependent on the key to separate relations.

• Create a separate relation for attributes which can be uniquely determined by any other attribute in the relation

• The key attribute in the separate relation must uniquely determine the remaining attributes in a one-to-one relationship

Fleming

Barnard10

11

Maurice

Ivor

Moneybags

Foot109

923

Pat

No.

Surname Forename Ward

No.

Ward

Name

Page 16: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 16

Example in Third Normal Form

Fleming

Barnard10

11

Maurice

Ivor

Moneybags

Foot109

923

Pat No.

Surname Fore -name

Wd No.

Ward Name

10

11

Wd No.

• Patient No. does not uniquely determine Ward No. or Ward Name, because the patient might be in different wards at different times.

• Ward No uniquely determines Ward Name, and these attributes can therefore be removed to a separate relation.

• Ward No becomes a foreign key in the Patient relation.

Page 17: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 17

Full Set of Well Normalised Relations

Drug Name

Cortisone

Morphine

Penicillin

Valium

Aspirin

Drug Code

CO2355P

MO3416T

PE8694N

AS473A

VA231M

Drug Code DosageLgth Treat

20/5/88 CO2355P2 pills 3 x day after meals

14

20/5/88 MO3416T Injection every 4 hours

5

25/5/88 MO3416T Injection every 8 hours

3

26/5/88 PE8694N 1 pill 3 x day 7

15/5/88 AS473A2 pills 3 x day after meals

7

20/5/88 2 per day 5

Pat No.

923

923

923

923

109

109 VA231M

Presc Date

Maurice

Ivor

Moneybags

Foot109

923

Pat No

Surname Forename

10

11

Wd No.

Fleming

Barnard10

11

Wd No.

Ward Name

Page 18: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 18

SSADM Notation

SYSTEM: DATE / / AUTHOR

Source ID No.: Name of Source:

UNF 1NF 2NF 3NF

Hospital

Drug Card

Patient NumberPatient Surname Patient Forename Ward Number Ward Name Prescription Date Drug Code Drug Name Dosage Length of Treatment

Patient NumberPatient Surname Patient Forename Ward Number Ward Name

Patient Number Prescription Date Drug CodeDrug Name Dosage Length of Treatment

Patient NumberPatient Surname Patient Forename Ward Number Ward Name

Patient Number Prescription Date Drug CodeDosage Length of Treatment

Drug NameDrug Code

Patient Number Prescription Date Drug CodeDosage Length of Treatment

Drug NameDrug Code

Patient Surname Patient Forename Ward Number

Patient Number

Ward Name

Ward Number

*

Page 19: RDA 1 - 1 Relational Data Analysis Stems from work of Ted Codd in early 1970s Led to relational database theory –Detailed mathematical theory Relational

RDA 1 - 19

Relations as a Logical Data Structure

Drug Code Dosage Trt Lgth

20/5/88 CO2355P2 pills 3 x dayafter meals 14

20/5/88 MO3416T Injection every 4 hours 5

25/5/88 MO3416TInjection every 8 hours 3

26/5/88 PE8694N 1 pill 3 x day 7

15/5/88 AS473A 2 pills 3 x day after meals 7

20/5/88 2 per day 5

Pat No

923

923

923

923

109

109 VA231M

Prescr Date

Maurice

Ivor

Moneybags

Foot109

923

Pat No Surname Forename

10

11

Wd No

Patient

Fleming

Barnard10

11

Wd No Ward Name

Ward

Prescription

Drug Name

Cortisone

Morphine

Penicillin

Valium

Aspirin

Drug Code

CO2355P

MO3416T

PE8694N

AS473A

VA231M

Drug