dbdesign 5 / 1 some aspects of database design. dbdesign 5 / 2 5. aims of database design reduce...

69
DBDesign 5 / 1 Some Aspects of Database Design

Upload: jayson-sherman

Post on 27-Dec-2015

223 views

Category:

Documents


1 download

TRANSCRIPT

DBDesign 5 / 1

Some Aspects of Database Design

DBDesign 5 / 2

5. Aims of Database Design5. Aims of Database Design

Reduce data redundancy.

Provide stable data structures that can be readily changed with changing user requirements.

Allow users to make ad hoc requests for data.

Maintain complex relationships between data elements.

Support a large variety of decision needs.

DBDesign 5 / 3

A Practical Development ScenarioA Practical Development Scenario

In 198n , a Company decided to develop and introduce a Payroll system using database technology. It looked this this

Payroll System

PayrollData

DBDesign 5 / 4

A Practical Development ScenarioA Practical Development Scenario

The Company grew in numbers and expertise and decided to introduce a Personnel System. The ‘new’

design was this

Payroll System

Payroll Data

Payroll System

PersonnelData

Personnel System

DBDesign 5 / 5

A Practical Development ScenarioA Practical Development Scenario

In the next few years, these components were addedPayroll System

Payroll Data

Payroll System

PersonnelData

Personnel System

Job History Labor Distribution

EmployeeTracing Data

LaborAnalysis Data

DBDesign 5 / 6

A Practical Development ScenarioA Practical Development Scenario

Notice the complex nature of related data

The design of the databases must be such

that data can be accessed :

Correctly

Completely

Clearly

Without Anomalies

Access = Insert, update, modify, delete and retrieve

DBDesign 5 / 7

Design CriteriaDesign Criteria

- data availability

- data reliability

- data currency

- data consistency

- data flexibility

- data efficiency

Verify these criteria are satisfied via technical review

DBDesign 5 / 8

End Product of DataBase DesignEnd Product of DataBase Design

A database which will:

Accurately reflect the ‘real world’ data in all required aspects

Be responsive to Management Information demands

Reflect Business Rules and Controls

Be capable of modification to meet changes in Management needs - as in the previous example

Be an asset to the Organisation/Enterprise

DBDesign 5 / 9

Business Functions and ProcessesBusiness Functions and Processes

• Business Functions :

Broad groups of closely related activities and decisions which contribute to a product or service like cycle. (e.g.. planning, materials management, production planning, quality assurance).

• Business Processes : Decision related activities which occur within a function. They are related to management of people, money, material and information.

• Materials Management (Business Function)could be

subdivided into: requirements planning, purchasing, good received, material accounting, stockkeeping

Business Processes should reflect related activity groupings

DBDesign 5 / 10

Business ActivitiesBusiness Activities

• Business Activities : Specific operations or transactions

required to carry out a process

Some guidelines:

An activity should produce some clearly defined (identifiable) result - a product, a decision, a plan ......

An activity has clear boundaries - a clear beginning and end. Activities do not overlap.

An activity is carried out as a unit, by a single agent or a team

Once initiated, an activity proceeds independently of and from other activities.

DBDesign 5 / 11

Business Rules and their Impact on DataBusiness Rules and their Impact on Data

Business Rules are the defining force between application behaviour and database content

Conditions/rules/behaviour/standards/requirements govern

what data

what domains

what processes

what relationships

what variations may or can occur

Business rules provide a formal foundation for understanding how a business operates

- with interesting implications for the database design

DBDesign 5 / 12

Business RulesBusiness Rules

An example - an active employee must have a position - no problems with this in modelling and understanding

The ‘position’ is optional (or non-allowable) for an employee with a different status - terminated, retired, on leave

So is the relation between Employee and Position optional or mandatory ? Or when does the relationship alter from mandatory to optional ? Is there some ‘intelligence’ built into the status attribute ?

DBDesign 5 / 13

Business RulesBusiness Rules

Another one:

Automatic re-ordering in a retail business :

Rules exist to determine what levels of stock trigger off automatic re-ordering for stock.

Is this a common rule for all stock levels ?

Are there ‘different’ rules for different items ?

Can the rules alter depending on

season

production or supply variables

competition

new lines

phasing out ‘old’ items ?

DBDesign 5 / 14

Business RulesBusiness Rules

Some more thoughts about Business Rules:

1. They should be declarative

2. There must be only 1 interpretation

3. They must be atomic - it must be a single complete thought

4. Consistent - A set of rules must not be contradictory

5. Non-redundant - The rules must not contain rules stating the same information

6. They must be stated in business terms

7. The rules must be stated by business people. They own the rule and are the source of modification and deletion.

A reference: The Business Rule Book: Classifying, Defining and Modelling Rules (Database research Group Inc.,1994)

DBDesign 5 / 15

Business RulesBusiness Rules

In a project such as the one you are developing, the Business Rules may seem trivial - balance ^< 0, names must not be null, automatic date compliance ….

However in a business environment, the influence of Rules has an impact on the data and processing of data in the database from Conceptual stage, to implementation, to operational use, and to termination or integration into a wider database.

DBDesign 5 / 16

Business RulesBusiness Rules

There are many Business Rules, and the database designers (and the users) need to know much about them

Who is responsible for them ?

What caused the rules to be created ?

Why were they put in place ?

Have the rules changed over time ?

What is the inference of this change (s) on ‘legacy’ data ?

What business processes do they control ?

Where have they been (or will be) implemented in the current application

DBDesign 5 / 17

Business RulesBusiness Rules

Business rules need to be managed - and a Business Rule database provides the means of doing this

The business rule statements can be captures and cross referenced to the business environment

Additional detail on the Rule can be captured :

Business topic,

People, Departments … associated with the rule,

Which information systems and databases implement the rule

DBDesign 5 / 18

Business RulesBusiness Rules

Categories of Business Rule details

1. Basic : the actual business rule statement, its name and any aliases, associated terms, rule management such as versions, related change efforts, milestone dates during the business rule’s life cycle

2. Business Information : Relevance with the business (areas of application), primary focus and purpose, basis of the rule. Also provides links the business rile to associated reference material (policy manual, system documentation, legislation - e.g. the ‘new’ privacy legislation).

DBDesign 5 / 19

Business RulesBusiness Rules

3. Categories. The means by which the rule is classified - enforcement level, type of rule, system aspect

4. Context Information : Cross references into business systems, business workflows and the systems and databases within the applications which support, or are influenced by, the rule

5. Relationships with other Rules : Cross references 2 business rules - one rule may replace, conflict with, or is a subset of another rule (wait for the ‘issue date’ of an Insurance policy).

DBDesign 5 / 20

Business RulesBusiness Rules

6. Parties to Rules : Identifies all parties involved with the business rule and the roles (authorising agent, business sponsor, enforcer, implementer, maintainer)

7. Evaluation : Use of metrics to quantify specific characteristics about the rule, (volatility, confidence level, sensitivity, effectiveness).

DBDesign 5 / 21

And What About the Data ?And What About the Data ?

DBDesign 5 / 22

ModelsModels

A Model is : an abstract representation a subject which looks and behaves like all or part of the original

The model may be physical (a mock up of a container, space shuttle, car ….)

A drawing, blueprint

It may be conceptual - mathematical model of weather forecasting, behaviour of vehicles at speed, financial projections

Modelling is the process of creating an abstract representation of a subject.

DBDesign 5 / 23

ModelsModels

Models are used to

Studied more cheaply than the full scale process or subject

Time freeze activities of process - and rerun

Allow manipulation, modification, alteration of the subject without interfering with the original subject

Some interesting examples :

turnstiles at the M.C.G and Docklands

the Tennis Stadium alterations (new roof, courts)

the City Link tunnel system

DBDesign 5 / 24

Process ModellingProcess Modelling

Process modelling is directed at

formalising the construction of models

formalising the analysis and design of application models

The results are:

1. End-user needs are understood and documented

2. Critical issues are recognised

3. Development staff have relevant input to support the successful completion of analysis, design and programming steps

DBDesign 5 / 25

Data ModellingData Modelling

Process Modelling reflects ‘Business Functions’ and their Interdependencies

It does not clearly communicate the DATA which organisations need and use

This is a database design requirement

4 reasons:

1. Processes are dynamic, data is typically static

2. Processes are more volatile than data

3. Database designers need different information than application programmers

4. Data requires a data-oriented method of documentation

DBDesign 5 / 26

Process Modelling / Data ModellingProcess Modelling / Data Modelling

Process Characteristics

Dynamic (movement)

Frequent definition changes

Principal use is in programming

Documentation methods show algorithms, movement of data, flow of controls

Data Characteristics

Static

Definition changes rare

Principal use is in database design

Documentation methods show data definitions and data relationships

DBDesign 5 / 27

Data ModellingData Modelling

The objectives of a Developers model are

1. To accurately and fully show facts about information and to communicate and verify these facts with the user(s)

2. Separation of logical from physical leads to the recognition of the data definitions an organisation requires. These are isolated from changes in storage technology, and access methods

3. Integration across applications (life cycle, creates, updates, how and when data is used)

DBDesign 5 / 28

AdvantagesAdvantages

1. Duplication of data is easier to detect

2. By separating data and use, data modelling reduces the chance of small changes to data or processes causing major alteration to applications and databases

3. Minimises the risk of Inconsistent processing - for example the updating of a customer’s payment history without the flow onto the customer’s credit status

Data modelling provides a secure basis for developers to fully understand the dependencies of data

DBDesign 5 / 29

TerminologyTerminology

A model : an abstract representation of a subject which looks or behaves, in one or more ways, like all or part of the original

A data model : represents the definition, characterisation, and relationships of data in a given environment

A logical data model : a model of the information used in an organisation from the end-user perspective, without regard for its functional or physical aspects

A database design model : a model configured to reflect data usage in a particular physical environment

DBDesign 5 / 30

TerminologyTerminology

Logical design : represents all of the phases which identify the subject from the user perspective (planning , requirements definition, analysis)

Physical design : represents the phases where the user’s view of the application is converted into technical design specifications

End users : commission the building of, and then use, the information system

Technique : is a series of steps applied to a subject to alter its representation. (e.g. data modelling, process modelling, prototyping)

DBDesign 5 / 31

TerminologyTerminology

Tool : a physical or conceptual construct which assists in the application of techniques (CASE products, flow-charting templates, DBMS, Powerpoint ….)

Methodology : is the approach made to apply one or more techniques (e.g. Chen Entity-Relationship modelling)

There are others :

Nijssen’s Information Analysis Method (NIAM)

Semantic Object Modelling (SOM)

There is an article on these methodologies.

DBDesign 5 / 32

TerminologyTerminology

The Entity-Relationship model reflects the real world which consists of collection of basic objects.

These objects are called entities. This is an object which exists in the real world and can be distinguished from other objects ( such as bank accounts, persons, loans).

Entities are described by a set of attributes - such as an account could be described by attributes such as account number and balance.

Another entity (customer) could be described by the attributes customer_name, address, customer_identity_ number

DBDesign 5 / 33

TerminologyTerminology

A Relationship is an association among several entities (or in a special case among attributes in a single entity (such as the Telephone directory).

The set of all entities of the same type is known as an entity set (what else ?)

and the set of all relationships of the same type is known as a relationship set.

DBDesign 5 / 34

More TerminologyMore Terminology

The Relational Model uses a collection of tables to represent» Data» The Relationships between the sets of data

Each table has at least 1 column (normally more than 1)

and each column has a unique name in its table

The relational model is an example of a record-based model. The database is structured in fixed format records, generally with many types

Each record defines a fixed number of attributes (fields). The columns correspond to the attributes of the record type

DBDesign 5 / 35

We’re getting thereWe’re getting there

The diagram shows the relationship of a set of customers and a set of accounts

customer account

Although it’s a bit early, the customer set reflects the attributes

Customer_name, customer_address, and the customer_ID

The account set reflects (or could reflect) the attributes

Account_number, balance

DBDesign 5 / 36

ModellingModelling

Here is some supporting data :

The customer table

customer_id customer_name address

1831567 Johnson, R 19 Green Street Toorak

1723746 Norman, G 13 The Avenue Newport

1189345 Shomari, V 234 West Street Melton

The account table

Account number Balance What is missing ?

73145 900 We can’t tell which customer

45893 1200 owns which account

36587 70

DBDesign 5 / 37

ModellingModelling

• We need another table - depositor. This table allows us to associate customers, accounts and account numbers.

The Depositor table

customer_id account_number

1831567 73145

1723746 45893

1189345 36587

This arrangement allows for more than one customer ‘owning’ an account

And yes, you’re correct - it’s not the only way of modelling the sets

DBDesign 5 / 38

More TermsMore Terms

In real life, modelling requires much analysis.

In the example shown, the address would be broken into subparts - probably number, street/road, suburb, post code

The attribute name could be structured as first_name, initial, last_name, and there would most likely be another attribute title (Ms. Mrs. Dr. Mr. …..)

- these are known as composite attributes

DBDesign 5 / 39

Business EntitiesBusiness Entities

Are persons, objects or events about which Information is, or will be, recorded in the Information Data Base

Many of these Entities can be identified with Business Activities (e.g. suppliers, purchase orders, customer)

And, we gain an insight into Business Rules, which determine how data exists or co-exists with other dataor events.

DBDesign 5 / 40

Critical Success FactorsCritical Success Factors

* Key factors which must be performed well to ensure the success of an organisation

* Also known as Critical Performance Items

production failure rate < 0.01% of total production unitsproduction cost increases <= c.p.i. increasescustomer service complaints < 1% of all customer transactionsabsenteeism < 1% of staff in any 24 hour periodproduct quality => advertised standards (water,power)no more than 1% of trains > 3 mins late at destination

DBDesign 5 / 41

Database DesignDatabase Design

- the process of developing database structures from user requirements for data

Structured Methodology - a number of ordered formalprocesses with known inputs and expected outputs

Objectives

1. derive relationships2. evolve to meet user requirements3. user requests are met within reasonable time limits

DBDesign 5 / 42

Database Design ContextDatabase Design Context

System Analysis

Abstract LOGICAL from PHYSICAL

Real world data (data modelling)

Real world functions & behaviour (procedure modelling)

System Design

Transform new LOGICAL into PHYSICAL model

Design database, programs, screens etc.

Implement

Build new PHYSICAL model and install

DBDesign 5 / 43

The Stages of Database DesignThe Stages of Database Design

Requirements Analysis

Conceptual Design

Implementation Design

Physical Design

DBDesign 5 / 44

Conceptual DesignConceptual Design

- Create an enterprise model

- Normalise user views

- Integrate user views

- Create a database

– Review

DBDesign 5 / 45

Modelling User RequirementsModelling User Requirements

DATA

Entity Relationship Modelling (Semantic)

Normalisation (Minimality)

PROCESS

Data Flow Analysis

Procedure Formation

Functional Analysis

DBDesign 5 / 46

Importance of DataImportance of Data

The structure of data in an information system is fundamental to the effectiveness and scope of the system.

Data structures are more stable than the ways in which data may be used within an organisation.

Data centered system development methods provide more integrated information systems more quickly and efficiently than process centered techniques.

DBDesign 5 / 47

Well Structured DataWell Structured Data

ADVANTAGES

Better integration of applications through shared access to common data.

More adaptable systems because files and databases will follow the natural structures inherent in data rather than application oriented relationships.

Processing should be simpler if data is held in its simplest least redundant form.

Uniform application of Business Rules

DBDesign 5 / 48

Data Focussed DevelopmentData Focussed Development

1. Develop broad data model and broad function model.

2. Partition data model by function.

3. Develop detailed data model by each functional partition.

4. Develop the detailed procedure models by functional partition.

DBDesign 5 / 49

Entity Relationship ModellingEntity Relationship Modelling

Represents major data objects and the relationships between them.

Widely used commercially in Australia.

Originated by Chen in 1976 and adapted in the 'Information Engineering' approach.

An Entity model is composed of textual descriptions of objects and a graphical representation called an Entity Relationship Diagram

DBDesign 5 / 50

Entity Relationship ModellingEntity Relationship Modelling

Entity types form relationship types with one another and each entity type may be described by one or more attribute types.

Customer Sales Order Places

(Placed By)

ENTITY TYPE RELATIONSHIP TYPE ATTRIBUTE TYPES

CUSTOMER

Any person or organisation who purchases goods

Customer Number, Customer Name, Customer Phone

DBDesign 5 / 51

Entity TypeEntity Type

In practice Entity Instance is often shortened to 'Instance' and Entity Type is shortened to 'Entity'.

Entity names should be brief but unambiguous i.e. clearly reflect the meaning .

Entities are often recognised as nouns or noun phrases.

Entities can model any object set of interest i.e. physical, conceptual, events etc.

DBDesign 5 / 52

Relationship TypesRelationship Types

A named association between two entity types.

CUSTOMER

SALES ORDER

Places

(Placed By)

Entity type

Relationship Type

Entity Type

DBDesign 5 / 53

Relationship InstancesRelationship Instances

A relationship instance is an association between two entity instances.

CUSTOMER SALES ORDER

Smith Places OrderNo 651

Jones Places OrderNo 375

A relationship type is a classification of relationship instances.

CUSTOMER SALES ORDER

Places

(Placed By)

DBDesign 5 / 54

Degree of RelationshipDegree of Relationship

Binary

Binary Employee Salary

Order Part

Employee Employee

Computer Op. System

Software

1 1

1 m

1

m

m

n

m

n

p

is paid

contain

is aleader/member

is aleader/member

runs on

Unary

Ternary

DBDesign 5 / 55

Connectivity of RelationshipConnectivity of Relationship

Employee

Project

Leads

(Lead by)

Customer

Sales Order

Places

(Placed by)

Supplier

Item

Supplies

(Supplied by)

One to One One to Many Many to Many

Also known as Maximal Cardinalities

Relationships represent the 'Business Rules' of the organisation.

DBDesign 5 / 56

Nature of RelationshipsNature of Relationships

The entity participation in the relationship may be mandatory or optional. (Minimal Cardinalities)

CUSTOMER SALES ORDER

Places

(Placed By)

A Customer may place many Sales Orders.

A Sales Order must be placed by one Customer.

DBDesign 5 / 57

Logical Data ModellingLogical Data Modelling

Mandatory One

Mandatory Many - at least One

Optional One - No less than One

Optional Many - May be none

There is NO standard notation for E-R Modelling

Modelling symbols - McFadden, Hoffer and Prescott (P 87)

DBDesign 5 / 58

Attribute TypesAttribute Types

A data item or element that describes one entity type.

STUDENT RESULT

Studentno Unit Code Year Semester Grade

S1 CSE4230 2000 1 N

S1 CSE4230 2000 2 HD

PRIMARY KEY or IDENTIFIER

Some or all of the attributes describing an entity type serve to uniquely identify each entity instance.

DBDesign 5 / 59

Value Sets or DomainsValue Sets or Domains

Each attribute type describes one entity type and is based on an underlying domain or value set.

EMPLOYEE

Empno Emp_Gender Emp_Name Emp_Hire_Date

E1 Female Jones 20/6/1997

DEPENDENT

Empno Depend_Name Depend_Gender Birthdate

E1 Katherine Female 23/6/1994

DBDesign 5 / 60

Synonyms or AliasesSynonyms or Aliases

Be careful to differentiate between things and their names.

Item = Part = Product ???

Synonym or Alias

The same object called different names by different people.

This problem concerns entity types, relationship types and attribute types.

DBDesign 5 / 61

HomonymsHomonyms

Different things called the same name by different people ??

Employee Start Date

In the Personnel area this may mean the date an employee started with the company.

In a given department this may mean the date an employee started with the department.

This problem also concerns entity types, relationships types and attribute types.

DBDesign 5 / 62

Identifying Entity TypesIdentifying Entity Types

Entity type names are generally nouns or noun phrases.

Can the entity type be fully defined in one or

two sentences ?

What are some typical attributes describing the entity type ?

Which attributes uniquely identify each entity instance ?

DBDesign 5 / 63

What are the Entities Here ?What are the Entities Here ?

Order No ____________ Order Date __ / __ / __

Customer Name

Customer Address______________________________

ItemNo Description Qty-ord Unit-Price Ext-Price

ABC COMPANY SALES ORDER

DBDesign 5 / 64

Building an E-R ModelBuilding an E-R Model

– List the major entity types in the system and note the relationship types between them.

– Represent the entity types by named rectangles.

– Identify relationship types between entity types including relationship degree and nature.

DBDesign 5 / 65

E-R ModellingE-R Modelling

Normalisation

Verification Attributes

Initial E-R Model

Data AnalysisUser views and Business Rules

DatabaseInitial Study

Database Life CycleProcesses anddatabase transactions

FinalE-R Model

DBDesign 5 / 66

Conceptual Design ToolsConceptual Design Tools

Tools Available Sources of Informationto Designer for the Designer

Entity-Relationship Business rulesDiagram and data constraints

Normalisation Data flow diagrams

Data dictionary Process functionaldescriptions (user views)(output generated by the system analysis and designphases)

Conceptual Model

DefinitionandValidation

DBDesign 5 / 67

Modelling TechniquesModelling Techniques

These are some of the methods used in data modelling

Network Model (Codasyl)

Relational Model

Entity-Set model

Entity-Relationship (based on Chen - many variations)

Smith’s Abstraction

Semantic Data Model

IDEFI

DBDesign 5 / 68

Modelling TechniquesModelling Techniques

Functional Data Model - (Dapter)

Data Designer (Automated Design Tool)

Design Manager (Automated support for describing data models -- leads to database design ……….

And the ‘latest’ is the Unified Modelling Language (UML)

which is very much favoured with Object Oriented approach to software development.

Terms such as Class, Object, Statechart, Activity, Sequence, Collaboration, Use-case, Component and Deployment are included in its use

DBDesign 5 / 75

Who is studying whom/what ?