cm20145 database design dr alwyn barry dr joanna bryson

32
CM20145 CM20145 Database Design Database Design Dr Alwyn Barry Dr Joanna Bryson

Upload: hunter-mcintyre

Post on 28-Mar-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

CM20145CM20145Database DesignDatabase Design

Dr Alwyn BarryDr Joanna Bryson

Page 2: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Lecture PlanLecture Plan

1. Basic Concepts

2. Data, Information & Knowledge

3. Data Models (The E-R Model)

4. The Relational Algebra

5. Introduction to SQL

6. Further SQL (Joins, RA Equivalences)

7. Database Design

8. Further DB Design – Normalisation

9. Architectures and Implementations

10. Integrity and Security

Page 3: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Lecture PlanLecture Plan

11. Ethics and Professional Conduct

12. Legal Issues

13. Transactions

14. Recovery

15. Concurrency Control

16. Storage and File Structure

17. Indexing and Hashing

18. Query Processing & Optimisation

January… Review Session

`Reading’ (PROJECT) Week

Page 4: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

OverviewOverview

Database Design and Software Engineering.

Example from real life. Views and Users. Redundancy and

Decomposition.

Loosely based on A. Keller, 2002, USCU

Page 5: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

What Is Relational Algebra For?What Is Relational Algebra For?

Fundamental Operators: Restriction Projection Cartesian Product Union Difference

Useful Operators: Join Intersection Division

Unary

Binary

Fun

dam

en

tal

Not

Prim

itive

Page 6: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

File Problems DBs Should SolveFile Problems DBs Should Solve

Data redundancy & inconsistency Duplication of information; Different formats

Data isolation Different formats; Different locations

Limited access Need new program for each query

Cannot support “Business Rules” Consistency Validity

Atomicity Failures leave data in inconsistent state; Two users

cannot update the database at once

Security Needs bespoke security in each application

Page 7: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Database DesignDatabase Design

DB design usually covers formal methods for trying to ensure databases meet all these goals.

No formal system is perfect: Formal methods take a long time,

may be computationally intractable. Human error: mistakes in proofs. Formally correct databases can still

have errors (will be examples.) Design is an art; but science can

help.

Page 8: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

E-R Model – System DesignE-R Model – System Design

Problemdomain

Problemdomain

Data RequirementsFunctional Requirements

RequirementsCollection & Analysis

Conceptual DesignFunctional Analysis

Conceptual SchemaHigh-level Transaction Spec.

Logical Schema

Logical Design(Data model mapping)

Physical Design

Application ProgramDesign

Internal Schema

TransactionImplementation

Application Programs

DB

MS

In

de

pe

nd

en

tD

BM

S D

ep

en

de

ntt

From Elmasri & Navathe, 2003, pg 51

Page 9: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Software Development IteratesSoftware Development Iterates

http://www.extremeprogramming.org/

Page 10: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Waterfall Feigning IterationWaterfall Feigning Iteration

This is not agile SE. People don’t want to know, but

iteration happens. The question is how well you cope.

Page 11: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Software Development IteratesSoftware Development Iterates

Models change as programmers understand the problem better.

Requirements change as users understand possibilities better.

More resources become available. So:

Save and maintain all modelling and planning tools.

Interact with users frequently. Learn rules to recognize common

failures.

Page 12: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

OverviewOverview

Database Design and Software Engineering.

Example from real life. Views and Users. Redundancy and

Decomposition.

Loosely based on A. Keller, 2002, USCU

Page 13: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Where Do Models Come From?Where Do Models Come From?

Demand – What do the users want to see?

Data – Don’t throw anything away.

Design – Never store anything twice. Efficiency. Integrity. Clarity. Security.

Page 14: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Automating Trading OperationsAutomating Trading Operations

Page 15: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Automating Trading OperationsAutomating Trading Operations

Data from: trading cards, market prices, underlying

values.

Demand from: Knowing what is

owned for determining risk.

Knowing what individuals did for determining pay.

Page 16: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Entities and attributesEntities and attributes

Trades Instrument, trade,

price, 2 traders, 2 clearing firms.

Traders Positions, trades.

Trading cards Trades, trader,

number, time.

Instruments Daily values,

volatilities, expiration dates.

(draw on board)

Page 17: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

New Requirement: ReconcilingNew Requirement: Reconciling

Business process – dual entry: Operations: primary concerns are

risk, execution on trading floor, relations with individual traders.

Clearing: primary concerns are accounting, banks, law, relations with other clearing firms.

The same trading cards resulted in different trade quantities.

The same trade has different dates!

Page 18: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Models will change!Models will change!

Design – Find out models are inefficient or clumsy to maintain.

Data – Discover new categories, salient values.

Demand – Users see new potential.

Page 19: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

OverviewOverview

Database Design and Software Engineering.

Example from real life. Views and Users. Redundancy and

Decomposition.

Loosely based on A. Keller, 2002, USCU

Page 20: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Views: Why they’re important.Views: Why they’re important.

The same data may be seen by different users in different ways.

Shorthand for frequent joins, formulas – may be more efficient.

Automate / enforce security – make access to tables and views depend on user’s function.

Keep reports logically independent from underlying representation – protect the users!

Page 21: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Why Limit / Protect Users?Why Limit / Protect Users?

Databases contain all data for a company. Limits on access eliminate

suspects for errors, crime. Information overload.

Smart, authoritative users still need to find things quickly.

Some users really are naïve.

Programmers are users.

Page 22: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

OverviewOverview

Database Design and Software Engineering.

Example from real life. Views and Users. Redundancy and

Decomposition.

Loosely based on A. Keller, 2002, USCU

Page 23: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Software Development IteratesSoftware Development Iterates

Models change as programmers understand the problem better.

Requirements change as users understand possibilities better.

More resources become available. So:

Save and maintain all modelling and planning tools.

Interact with users frequently. Learn rules to recognize common

failures.

Page 24: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Software Development IteratesSoftware Development Iterates

Models change as programmers understand the problem better.

Requirements change as users understand possibilities better.

More resources become available. So:

Save and maintain all modelling and planning tools.

Interact with users frequently. Learn rules to recognize common

failures.

Page 25: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Pitfalls in Relational DB DesignPitfalls in Relational DB Design

A bad design may lead to: redundant information, difficulty in representing certain

information, or difficulty in checking integrity

constraints.

Design goals: Avoid redundant data. Ensure that relationships among

attributes are represented. Facilitate the checking of updates for

violation of integrity constraints.

©Silberschatz, Korth and Sudarshan

Modifications & additions by S Bird, Melbourne

Page 26: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Example of Bad DesignExample of Bad Design Consider the relation schema:

Lending-schema = (branch-name,branch-city,assets, customer-name,loan-

number,amount)

Redundant Information: Data for branch-name, branch-city, assets are

repeated for each loan that a branch makes. Wastes space and complicates updates, introducing

possibility of inconsistency of assets value. Difficulty representing certain information:

Cannot store info. about a branch if no loans exist. Can use null values, but they are difficult to handle.

Page 27: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Solution: DecompositionSolution: Decomposition

Break up redundant tables into multiple tables - this operation is called decomposition. E.g. Lending-schema = (branch-

name, branch-city, assets, customer-name, loan-number, amount)

Branch-schema = (branch-name, branch-city,assets)

Loan-info-schema = (customer-name, loan-number, branch-name, amount)

Page 28: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Lossless-Join Decomposition:Lossless-Join Decomposition: Want to ensure that the original

data is recoverable.1.All attributes of the original schema

(R) must appear in the decomposition (R1, R2), i.e. R = R1 R2

2.Decomposition must be a lossless-join decomposition.

Definition: R1,R2 is a lossless-join decomposition of Rif, for all possible relations r(R) r = R1 (r) ⋈ R2 (r)

Page 29: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Bad Decomposition ExampleBad Decomposition Example

A Non Lossless-Join Decomposition R = (A, B) R1 = (A), R2 = (B)

A B

121

A

B

12

rA(r)

B(r)

A (r) ⋈ B (r)

A B

1212

Thus, r is different to A (r) ⋈ B (r)

So A,B is not a lossless-join decomposition of R.

Page 30: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

OverviewOverview

Database Design and Software Engineering.

Example from real life. Views and Users. Redundancy and

Decomposition.

Loosely based on A. Keller, 2002, USCU

Page 31: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

SummarySummary

Database design is an ongoing, iterative process. Requirements come from data, user

demands, design issues. Change occurs:

Corporations & technologies grow. Programmers & users learn.

Views / security. Lossless-join decomposition

Next: Science for improving design.

Page 32: CM20145 Database Design Dr Alwyn Barry Dr Joanna Bryson

Reading & ExercisesReading & Exercises

Reading Connolly & Begg Chapter 9, (13, 14) Silberschatz Chapter 7.

Much of 7 will be in the next lecture!

Exercises: C&B: 9.2, 9.3/11, 9.9, 9.10 Silberschatz:7.1, 7.2, 7.16

These need functional dependencies, which are covered next lecture.