44220: database design & implementation avoiding database anomalies ian perry room: c49 tel...
TRANSCRIPT
44220: Database Design & Implementation
Avoiding Database Anomalies
Ian PerryRoom: C49 Tel Ext.: 7287
E-mail: [email protected]
http://itsy.co.uk/ac/0506/sem2/44220_DDI/
Ian Perry Slide 244220: Database Design & Implementation: Avoiding Database Anomalies
Avoiding Database Anomalies This lecture concentrates upon building
a ‘robust’ Logical Data Model. i.e.: Transforming a Conceptual Data Model
into a set of Relations. Checking these Relations for any
Anomalies. Documenting them as a Database Schema.
Most Database books have a section describing a mathematically-based technique called Normalisation: I will show you a much easier way of
achieving the same result, i.e. a robust database design.
Ian Perry Slide 344220: Database Design & Implementation: Avoiding Database Anomalies
What is an Anomaly? Anything we try to do with a database that
leads to unexpected and/or unpredictable results.
Three types of Anomaly to guard against: insert delete update
Need to check your database design carefully: the only good database is an anomaly free
database.
Ian Perry Slide 444220: Database Design & Implementation: Avoiding Database Anomalies
Insert Anomaly When we want to enter a value into a data
cell but the attempt is prevented, as another value is not known.
e.g. We have built a new Room (e.g. B123), but it has not yet been timetabled for any courses or members of staff.
CoNo Tutor Room RSize EnLimit 353 Smith A532 45 40 351 Smith C320 100 60 355 Clark H940 400 300 456 Turner H940 400 45
Ian Perry Slide 544220: Database Design & Implementation: Avoiding Database Anomalies
Delete Anomaly When a value we want to delete also means
we will delete values we wish to keep.
CoNo Tutor Room RSize EnLimit353 Smith A532 45 40351 Smith C320 100 60355 Clark H940 400 300456 Turner H940 400 45
e.g. CoNo 351 has ended, but Room C320 will be used elsewhere.
Ian Perry Slide 644220: Database Design & Implementation: Avoiding Database Anomalies
Update Anomaly When we want to change a single data item
value, but must update multiple entries
CoNo Tutor Room RSize EnLimit 353 Smith A532 45 40 351 Smith C320 100 60 355 Clark H940 400 300 456 Turner H940 400 45
e.g. Room H940 has been improved, it is now of RSize = 500.
Ian Perry Slide 744220: Database Design & Implementation: Avoiding Database Anomalies
A Conceptual Model Consider the following ‘simple’ conceptual
data model:
Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB, ...)
Student(Enrol-No, Name, Address, OLevelPoints, ...)
Course(CourseCode, Name, Duration, ...)
Staff Course Student1 MM M
Ian Perry Slide 844220: Database Design & Implementation: Avoiding Database Anomalies
The ‘Translation’ Process Entities become Relations Attributes become Attributes(?) Key Attribute(s) become Primary
Key(s) Relationships are represented by
additional Foreign Key Attributes: for those Relations that are at the ‘M’ end
of each 1:M Relationship.
Ian Perry Slide 944220: Database Design & Implementation: Avoiding Database Anomalies
The ‘Staff’ & ‘Student’ Relations
Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB, ...)becomes:
Staff Staff-ID Name Address ScalePoint RateOfPay DOB
Student(Enrol-No, Name, Address, OLevelPoints, ...)
becomes:
StudentEnrol-No Name Address OLevelPoints Tutor
NB. Foreign Key Tutor references Staff.Staff-ID
Ian Perry Slide 1044220: Database Design & Implementation: Avoiding Database Anomalies
The ‘Staff’ & ‘Course’ Relations
Course(CourseCode, Name, Duration, ...)becomes:
Course CourseCode Name Duration
NB. Can’t add a Foreign Key; as BOTH Relations have a ‘M’ end: I warned you about leaving M:M relationships
in your Conceptual Data Model. Must create an ‘artificial’ linking Relation.
Staff Staff-ID Name Address ScalePoint RateOfPay DOB
Ian Perry Slide 1144220: Database Design & Implementation: Avoiding Database Anomalies
‘Staff’, ‘Course’ & ‘Team’ Relations
NB. In the ‘artificial’ Relation (i.e. Team):The Primary Key is a composite of CourseCode & Staff-ID
Foreign Key CourseCode references Course.CourseCode
Foreign Key Staff-ID references Staff.Staff-ID
CourseCode Staff-ID
Team
Course
Staff
CourseCode Name Duration
Staff-ID Name Address ScalePoint RateOfPay DOB
Ian Perry Slide 1244220: Database Design & Implementation: Avoiding Database Anomalies
4 Relations from 3 Entities?
BUT - are they anomaly free?
Student
Team
Course
Staff
Enrol-No Name Address OLevelPoints Tutor
CourseCode Staff-ID
CourseCode Name Duration
Staff-ID Name Address ScalePoint RateOfPay DOB
Ian Perry Slide 1344220: Database Design & Implementation: Avoiding Database Anomalies
Check Relations for Anomalies!
every Tuple unique? no hidden meaning from location? data cells atomic? for Relations with single-attribute
keys: every Attribute depends upon the
Primary Key? for Relations with composite keys:
every Attribute depends upon all of the Composite Key?
Ian Perry Slide 1444220: Database Design & Implementation: Avoiding Database Anomalies
What if the checks fail? If any Relation fails ‘checks’:
especially those checking dependency. we MUST split that Relation into
multiple Relations: until they pass the tests.
but MUST remember to leave behind a Foreign Key: to ‘point’ forwards to the Primary Key of
the ‘new’ split-off Relation.
Ian Perry Slide 1544220: Database Design & Implementation: Avoiding Database Anomalies
Are they Anomaly Free?
Student
Team
Course
Staff
Enrol-No Name Address OLevelPoints Tutor
CourseCode Staff-ID
CourseCode Name Duration
Staff-ID Name Address ScalePoint RateOfPay DOB
NOT this one! as RateOfPay does NOT depend upon Staff-ID
Ian Perry Slide 1644220: Database Design & Implementation: Avoiding Database Anomalies
Fixing this ‘Problem’
The Attribute ‘RateOfPay’ depends upon ‘ScalePoint’ NOT ‘Staff-ID’. So, we need to split this Relation:
NB. Foreign Key ScalePoint references Pay.ScalePoint
Staff
Pay
Staff-ID Name Address ScalePoint DOB
ScalePoint RateOfPay
Staff Staff-ID Name Address ScalePoint RateOfPay DOB
Ian Perry Slide 1744220: Database Design & Implementation: Avoiding Database Anomalies
5 Relations from 3 Entities
Student Enrol-No Name Address OLevelPoints Tutor
Team CourseCode Staff-ID
Course CourseCode Name Duration
Pay ScalePoint RateOfPay
Staff Staff-ID Name Address ScalePoint DOB
an ‘artificial’ Relation- to ‘solve’ a M:M ‘problem’
a ‘split-off’ Relation- to ‘solve’ a Dependency ‘problem’
Ian Perry Slide 1844220: Database Design & Implementation: Avoiding Database Anomalies
Don’t change Conceptual Model
Remember, we can chose from one of a range of Database Theories with which to build our Logical Data Model: Hierarchical Relational Object
Each of these Database Theories may require different compromises (i.e. at the Logical Modelling stage); from the ‘pure’ meaning captured by
your Conceptual Model.
Ian Perry Slide 1944220: Database Design & Implementation: Avoiding Database Anomalies
Document Relations as a Database Schema
A Database Schema: defines all Relations, lists all Attributes (with their Domains), and identifies all Primary & Foreign Keys.
We should have ‘captured’ the Business situation (assumptions and constraints) in the Conceptual Data Model, e.g: a College only delivers 10 Courses. a Hospital only has 12 Wards.
These assumptions and constraints need to be expressed as the Domains of the Database Schema.
Ian Perry Slide 2044220: Database Design & Implementation: Avoiding Database Anomalies
Logical Schema 1 - Domains Schema College Domains
StudentIdentifiers = 1 - 9999; StaffIdentifiers = 1001 - 1199; PersonNames = TextString (15 Characters); Addresses = TextString (25 Characters); CourseIdentifiers = 101 - 110; CourseNames = Comp, IS, Law, Mkt, ...; OLevelPoints = 0 - 100; ScalePoints = 1 - 12; PayRates = £14,005, £14,789, £15,407, ...; StaffBirthDates = Date (dd/mm/yyyy), >21 Years
before Today;
Ian Perry Slide 2144220: Database Design & Implementation: Avoiding Database Anomalies
Logical Schema 2 - Relations Relation Student
Enrol-No: StudentIdentifiers; Name: PersonNames; Address: Addresses; OLevelPoints: OLevelPoints; Tutor: StaffIdentifiers;
Primary Key: Enrol-No Foreign Key Tutor references Staff.Staff-
ID
Ian Perry Slide 2244220: Database Design & Implementation: Avoiding Database Anomalies
Logical Schema 3 - Relations Relation Staff
Staff-ID: StaffIdentifiers; Name: PersonNames; Address: Addresses; ScalePoint: ScalePoints; DOB: StaffBirthDates;
Primary Key: Staff-ID Foreign Key ScalePoint references
Pay.ScalePoint
Ian Perry Slide 2344220: Database Design & Implementation: Avoiding Database Anomalies
Logical Schema ... Relation Course
CourseCode: CourseIdentifiers; Name: CourseNames; … etc.
Continue to define each of the Relations in a similar manner. All Relations MUST have a Primary Key. Any Relation at the M-end of a 1:M
Relationship MUST have a Foreign Key. Make sure that you define ALL of the
Relations, including: ‘artificial’ ones (e.g. Team) ‘split-off’ ones (e.g. Pay)
Ian Perry Slide 2444220: Database Design & Implementation: Avoiding Database Anomalies
This Week’s Workshop In this Workshops session we will;
1. test a logical data model; to ensure that it is anomaly free (i.e. robust),
2. practice documenting a Database Schema; based on a small conceptual model (as represented by an ER Diagram).
1. Examine a table of data: Explain the ‘potential’ for insert, delete & update
anomalies in a table of data. Define what a ‘better’ set of tables (Relations?) to store
the data look like?
2. Examine an ER Diagram: Identify suitable Attributes for each Relation; as a
minimum those that will act as the Primary & Foreign Keys.
Document as a Database Schema; starting with the Relations first, then coming back to document suitable Domains.