dbdesign 5 / 1 some aspects of database design. dbdesign 5 / 2 5. aims of database design reduce...
TRANSCRIPT
DBDesign 5 / 2
5. Aims of Database Design5. Aims of Database Design
Reduce data redundancy.
Provide stable data structures that can be readily changed with changing user requirements.
Allow users to make ad hoc requests for data.
Maintain complex relationships between data elements.
Support a large variety of decision needs.
DBDesign 5 / 3
A Practical Development ScenarioA Practical Development Scenario
In 198n , a Company decided to develop and introduce a Payroll system using database technology. It looked this this
Payroll System
PayrollData
DBDesign 5 / 4
A Practical Development ScenarioA Practical Development Scenario
The Company grew in numbers and expertise and decided to introduce a Personnel System. The ‘new’
design was this
Payroll System
Payroll Data
Payroll System
PersonnelData
Personnel System
DBDesign 5 / 5
A Practical Development ScenarioA Practical Development Scenario
In the next few years, these components were addedPayroll System
Payroll Data
Payroll System
PersonnelData
Personnel System
Job History Labor Distribution
EmployeeTracing Data
LaborAnalysis Data
DBDesign 5 / 6
A Practical Development ScenarioA Practical Development Scenario
Notice the complex nature of related data
The design of the databases must be such
that data can be accessed :
Correctly
Completely
Clearly
Without Anomalies
Access = Insert, update, modify, delete and retrieve
DBDesign 5 / 7
Design CriteriaDesign Criteria
- data availability
- data reliability
- data currency
- data consistency
- data flexibility
- data efficiency
Verify these criteria are satisfied via technical review
DBDesign 5 / 8
End Product of DataBase DesignEnd Product of DataBase Design
A database which will:
Accurately reflect the ‘real world’ data in all required aspects
Be responsive to Management Information demands
Reflect Business Rules and Controls
Be capable of modification to meet changes in Management needs - as in the previous example
Be an asset to the Organisation/Enterprise
DBDesign 5 / 9
Business Functions and ProcessesBusiness Functions and Processes
• Business Functions :
Broad groups of closely related activities and decisions which contribute to a product or service like cycle. (e.g.. planning, materials management, production planning, quality assurance).
• Business Processes : Decision related activities which occur within a function. They are related to management of people, money, material and information.
• Materials Management (Business Function)could be
subdivided into: requirements planning, purchasing, good received, material accounting, stockkeeping
Business Processes should reflect related activity groupings
DBDesign 5 / 10
Business ActivitiesBusiness Activities
• Business Activities : Specific operations or transactions
required to carry out a process
Some guidelines:
An activity should produce some clearly defined (identifiable) result - a product, a decision, a plan ......
An activity has clear boundaries - a clear beginning and end. Activities do not overlap.
An activity is carried out as a unit, by a single agent or a team
Once initiated, an activity proceeds independently of and from other activities.
DBDesign 5 / 11
Business Rules and their Impact on DataBusiness Rules and their Impact on Data
Business Rules are the defining force between application behaviour and database content
Conditions/rules/behaviour/standards/requirements govern
what data
what domains
what processes
what relationships
what variations may or can occur
Business rules provide a formal foundation for understanding how a business operates
- with interesting implications for the database design
DBDesign 5 / 12
Business RulesBusiness Rules
An example - an active employee must have a position - no problems with this in modelling and understanding
The ‘position’ is optional (or non-allowable) for an employee with a different status - terminated, retired, on leave
So is the relation between Employee and Position optional or mandatory ? Or when does the relationship alter from mandatory to optional ? Is there some ‘intelligence’ built into the status attribute ?
DBDesign 5 / 13
Business RulesBusiness Rules
Another one:
Automatic re-ordering in a retail business :
Rules exist to determine what levels of stock trigger off automatic re-ordering for stock.
Is this a common rule for all stock levels ?
Are there ‘different’ rules for different items ?
Can the rules alter depending on
season
production or supply variables
competition
new lines
phasing out ‘old’ items ?
DBDesign 5 / 14
Business RulesBusiness Rules
Some more thoughts about Business Rules:
1. They should be declarative
2. There must be only 1 interpretation
3. They must be atomic - it must be a single complete thought
4. Consistent - A set of rules must not be contradictory
5. Non-redundant - The rules must not contain rules stating the same information
6. They must be stated in business terms
7. The rules must be stated by business people. They own the rule and are the source of modification and deletion.
A reference: The Business Rule Book: Classifying, Defining and Modelling Rules (Database research Group Inc.,1994)
DBDesign 5 / 15
Business RulesBusiness Rules
In a project such as the one you are developing, the Business Rules may seem trivial - balance ^< 0, names must not be null, automatic date compliance ….
However in a business environment, the influence of Rules has an impact on the data and processing of data in the database from Conceptual stage, to implementation, to operational use, and to termination or integration into a wider database.
DBDesign 5 / 16
Business RulesBusiness Rules
There are many Business Rules, and the database designers (and the users) need to know much about them
Who is responsible for them ?
What caused the rules to be created ?
Why were they put in place ?
Have the rules changed over time ?
What is the inference of this change (s) on ‘legacy’ data ?
What business processes do they control ?
Where have they been (or will be) implemented in the current application
DBDesign 5 / 17
Business RulesBusiness Rules
Business rules need to be managed - and a Business Rule database provides the means of doing this
The business rule statements can be captures and cross referenced to the business environment
Additional detail on the Rule can be captured :
Business topic,
People, Departments … associated with the rule,
Which information systems and databases implement the rule
DBDesign 5 / 18
Business RulesBusiness Rules
Categories of Business Rule details
1. Basic : the actual business rule statement, its name and any aliases, associated terms, rule management such as versions, related change efforts, milestone dates during the business rule’s life cycle
2. Business Information : Relevance with the business (areas of application), primary focus and purpose, basis of the rule. Also provides links the business rile to associated reference material (policy manual, system documentation, legislation - e.g. the ‘new’ privacy legislation).
DBDesign 5 / 19
Business RulesBusiness Rules
3. Categories. The means by which the rule is classified - enforcement level, type of rule, system aspect
4. Context Information : Cross references into business systems, business workflows and the systems and databases within the applications which support, or are influenced by, the rule
5. Relationships with other Rules : Cross references 2 business rules - one rule may replace, conflict with, or is a subset of another rule (wait for the ‘issue date’ of an Insurance policy).
DBDesign 5 / 20
Business RulesBusiness Rules
6. Parties to Rules : Identifies all parties involved with the business rule and the roles (authorising agent, business sponsor, enforcer, implementer, maintainer)
7. Evaluation : Use of metrics to quantify specific characteristics about the rule, (volatility, confidence level, sensitivity, effectiveness).
DBDesign 5 / 22
ModelsModels
A Model is : an abstract representation a subject which looks and behaves like all or part of the original
The model may be physical (a mock up of a container, space shuttle, car ….)
A drawing, blueprint
It may be conceptual - mathematical model of weather forecasting, behaviour of vehicles at speed, financial projections
Modelling is the process of creating an abstract representation of a subject.
DBDesign 5 / 23
ModelsModels
Models are used to
Studied more cheaply than the full scale process or subject
Time freeze activities of process - and rerun
Allow manipulation, modification, alteration of the subject without interfering with the original subject
Some interesting examples :
turnstiles at the M.C.G and Docklands
the Tennis Stadium alterations (new roof, courts)
the City Link tunnel system
DBDesign 5 / 24
Process ModellingProcess Modelling
Process modelling is directed at
formalising the construction of models
formalising the analysis and design of application models
The results are:
1. End-user needs are understood and documented
2. Critical issues are recognised
3. Development staff have relevant input to support the successful completion of analysis, design and programming steps
DBDesign 5 / 25
Data ModellingData Modelling
Process Modelling reflects ‘Business Functions’ and their Interdependencies
It does not clearly communicate the DATA which organisations need and use
This is a database design requirement
4 reasons:
1. Processes are dynamic, data is typically static
2. Processes are more volatile than data
3. Database designers need different information than application programmers
4. Data requires a data-oriented method of documentation
DBDesign 5 / 26
Process Modelling / Data ModellingProcess Modelling / Data Modelling
Process Characteristics
Dynamic (movement)
Frequent definition changes
Principal use is in programming
Documentation methods show algorithms, movement of data, flow of controls
Data Characteristics
Static
Definition changes rare
Principal use is in database design
Documentation methods show data definitions and data relationships
DBDesign 5 / 27
Data ModellingData Modelling
The objectives of a Developers model are
1. To accurately and fully show facts about information and to communicate and verify these facts with the user(s)
2. Separation of logical from physical leads to the recognition of the data definitions an organisation requires. These are isolated from changes in storage technology, and access methods
3. Integration across applications (life cycle, creates, updates, how and when data is used)
DBDesign 5 / 28
AdvantagesAdvantages
1. Duplication of data is easier to detect
2. By separating data and use, data modelling reduces the chance of small changes to data or processes causing major alteration to applications and databases
3. Minimises the risk of Inconsistent processing - for example the updating of a customer’s payment history without the flow onto the customer’s credit status
Data modelling provides a secure basis for developers to fully understand the dependencies of data
DBDesign 5 / 29
TerminologyTerminology
A model : an abstract representation of a subject which looks or behaves, in one or more ways, like all or part of the original
A data model : represents the definition, characterisation, and relationships of data in a given environment
A logical data model : a model of the information used in an organisation from the end-user perspective, without regard for its functional or physical aspects
A database design model : a model configured to reflect data usage in a particular physical environment
DBDesign 5 / 30
TerminologyTerminology
Logical design : represents all of the phases which identify the subject from the user perspective (planning , requirements definition, analysis)
Physical design : represents the phases where the user’s view of the application is converted into technical design specifications
End users : commission the building of, and then use, the information system
Technique : is a series of steps applied to a subject to alter its representation. (e.g. data modelling, process modelling, prototyping)
DBDesign 5 / 31
TerminologyTerminology
Tool : a physical or conceptual construct which assists in the application of techniques (CASE products, flow-charting templates, DBMS, Powerpoint ….)
Methodology : is the approach made to apply one or more techniques (e.g. Chen Entity-Relationship modelling)
There are others :
Nijssen’s Information Analysis Method (NIAM)
Semantic Object Modelling (SOM)
There is an article on these methodologies.
DBDesign 5 / 32
TerminologyTerminology
The Entity-Relationship model reflects the real world which consists of collection of basic objects.
These objects are called entities. This is an object which exists in the real world and can be distinguished from other objects ( such as bank accounts, persons, loans).
Entities are described by a set of attributes - such as an account could be described by attributes such as account number and balance.
Another entity (customer) could be described by the attributes customer_name, address, customer_identity_ number
DBDesign 5 / 33
TerminologyTerminology
A Relationship is an association among several entities (or in a special case among attributes in a single entity (such as the Telephone directory).
The set of all entities of the same type is known as an entity set (what else ?)
and the set of all relationships of the same type is known as a relationship set.
DBDesign 5 / 34
More TerminologyMore Terminology
The Relational Model uses a collection of tables to represent» Data» The Relationships between the sets of data
Each table has at least 1 column (normally more than 1)
and each column has a unique name in its table
The relational model is an example of a record-based model. The database is structured in fixed format records, generally with many types
Each record defines a fixed number of attributes (fields). The columns correspond to the attributes of the record type
DBDesign 5 / 35
We’re getting thereWe’re getting there
The diagram shows the relationship of a set of customers and a set of accounts
customer account
Although it’s a bit early, the customer set reflects the attributes
Customer_name, customer_address, and the customer_ID
The account set reflects (or could reflect) the attributes
Account_number, balance
DBDesign 5 / 36
ModellingModelling
Here is some supporting data :
The customer table
customer_id customer_name address
1831567 Johnson, R 19 Green Street Toorak
1723746 Norman, G 13 The Avenue Newport
1189345 Shomari, V 234 West Street Melton
The account table
Account number Balance What is missing ?
73145 900 We can’t tell which customer
45893 1200 owns which account
36587 70
DBDesign 5 / 37
ModellingModelling
• We need another table - depositor. This table allows us to associate customers, accounts and account numbers.
The Depositor table
customer_id account_number
1831567 73145
1723746 45893
1189345 36587
This arrangement allows for more than one customer ‘owning’ an account
And yes, you’re correct - it’s not the only way of modelling the sets
DBDesign 5 / 38
More TermsMore Terms
In real life, modelling requires much analysis.
In the example shown, the address would be broken into subparts - probably number, street/road, suburb, post code
The attribute name could be structured as first_name, initial, last_name, and there would most likely be another attribute title (Ms. Mrs. Dr. Mr. …..)
- these are known as composite attributes
DBDesign 5 / 39
Business EntitiesBusiness Entities
Are persons, objects or events about which Information is, or will be, recorded in the Information Data Base
Many of these Entities can be identified with Business Activities (e.g. suppliers, purchase orders, customer)
And, we gain an insight into Business Rules, which determine how data exists or co-exists with other dataor events.
DBDesign 5 / 40
Critical Success FactorsCritical Success Factors
* Key factors which must be performed well to ensure the success of an organisation
* Also known as Critical Performance Items
production failure rate < 0.01% of total production unitsproduction cost increases <= c.p.i. increasescustomer service complaints < 1% of all customer transactionsabsenteeism < 1% of staff in any 24 hour periodproduct quality => advertised standards (water,power)no more than 1% of trains > 3 mins late at destination
DBDesign 5 / 41
Database DesignDatabase Design
- the process of developing database structures from user requirements for data
Structured Methodology - a number of ordered formalprocesses with known inputs and expected outputs
Objectives
1. derive relationships2. evolve to meet user requirements3. user requests are met within reasonable time limits
DBDesign 5 / 42
Database Design ContextDatabase Design Context
System Analysis
Abstract LOGICAL from PHYSICAL
Real world data (data modelling)
Real world functions & behaviour (procedure modelling)
System Design
Transform new LOGICAL into PHYSICAL model
Design database, programs, screens etc.
Implement
Build new PHYSICAL model and install
DBDesign 5 / 43
The Stages of Database DesignThe Stages of Database Design
Requirements Analysis
Conceptual Design
Implementation Design
Physical Design
DBDesign 5 / 44
Conceptual DesignConceptual Design
- Create an enterprise model
- Normalise user views
- Integrate user views
- Create a database
– Review
DBDesign 5 / 45
Modelling User RequirementsModelling User Requirements
DATA
Entity Relationship Modelling (Semantic)
Normalisation (Minimality)
PROCESS
Data Flow Analysis
Procedure Formation
Functional Analysis
DBDesign 5 / 46
Importance of DataImportance of Data
The structure of data in an information system is fundamental to the effectiveness and scope of the system.
Data structures are more stable than the ways in which data may be used within an organisation.
Data centered system development methods provide more integrated information systems more quickly and efficiently than process centered techniques.
DBDesign 5 / 47
Well Structured DataWell Structured Data
ADVANTAGES
Better integration of applications through shared access to common data.
More adaptable systems because files and databases will follow the natural structures inherent in data rather than application oriented relationships.
Processing should be simpler if data is held in its simplest least redundant form.
Uniform application of Business Rules
DBDesign 5 / 48
Data Focussed DevelopmentData Focussed Development
1. Develop broad data model and broad function model.
2. Partition data model by function.
3. Develop detailed data model by each functional partition.
4. Develop the detailed procedure models by functional partition.
DBDesign 5 / 49
Entity Relationship ModellingEntity Relationship Modelling
Represents major data objects and the relationships between them.
Widely used commercially in Australia.
Originated by Chen in 1976 and adapted in the 'Information Engineering' approach.
An Entity model is composed of textual descriptions of objects and a graphical representation called an Entity Relationship Diagram
DBDesign 5 / 50
Entity Relationship ModellingEntity Relationship Modelling
Entity types form relationship types with one another and each entity type may be described by one or more attribute types.
Customer Sales Order Places
(Placed By)
ENTITY TYPE RELATIONSHIP TYPE ATTRIBUTE TYPES
CUSTOMER
Any person or organisation who purchases goods
Customer Number, Customer Name, Customer Phone
DBDesign 5 / 51
Entity TypeEntity Type
In practice Entity Instance is often shortened to 'Instance' and Entity Type is shortened to 'Entity'.
Entity names should be brief but unambiguous i.e. clearly reflect the meaning .
Entities are often recognised as nouns or noun phrases.
Entities can model any object set of interest i.e. physical, conceptual, events etc.
DBDesign 5 / 52
Relationship TypesRelationship Types
A named association between two entity types.
CUSTOMER
SALES ORDER
Places
(Placed By)
Entity type
Relationship Type
Entity Type
DBDesign 5 / 53
Relationship InstancesRelationship Instances
A relationship instance is an association between two entity instances.
CUSTOMER SALES ORDER
Smith Places OrderNo 651
Jones Places OrderNo 375
A relationship type is a classification of relationship instances.
CUSTOMER SALES ORDER
Places
(Placed By)
DBDesign 5 / 54
Degree of RelationshipDegree of Relationship
Binary
Binary Employee Salary
Order Part
Employee Employee
Computer Op. System
Software
1 1
1 m
1
m
m
n
m
n
p
is paid
contain
is aleader/member
is aleader/member
runs on
Unary
Ternary
DBDesign 5 / 55
Connectivity of RelationshipConnectivity of Relationship
Employee
Project
Leads
(Lead by)
Customer
Sales Order
Places
(Placed by)
Supplier
Item
Supplies
(Supplied by)
One to One One to Many Many to Many
Also known as Maximal Cardinalities
Relationships represent the 'Business Rules' of the organisation.
DBDesign 5 / 56
Nature of RelationshipsNature of Relationships
The entity participation in the relationship may be mandatory or optional. (Minimal Cardinalities)
CUSTOMER SALES ORDER
Places
(Placed By)
A Customer may place many Sales Orders.
A Sales Order must be placed by one Customer.
DBDesign 5 / 57
Logical Data ModellingLogical Data Modelling
Mandatory One
Mandatory Many - at least One
Optional One - No less than One
Optional Many - May be none
There is NO standard notation for E-R Modelling
Modelling symbols - McFadden, Hoffer and Prescott (P 87)
DBDesign 5 / 58
Attribute TypesAttribute Types
A data item or element that describes one entity type.
STUDENT RESULT
Studentno Unit Code Year Semester Grade
S1 CSE4230 2000 1 N
S1 CSE4230 2000 2 HD
PRIMARY KEY or IDENTIFIER
Some or all of the attributes describing an entity type serve to uniquely identify each entity instance.
DBDesign 5 / 59
Value Sets or DomainsValue Sets or Domains
Each attribute type describes one entity type and is based on an underlying domain or value set.
EMPLOYEE
Empno Emp_Gender Emp_Name Emp_Hire_Date
E1 Female Jones 20/6/1997
DEPENDENT
Empno Depend_Name Depend_Gender Birthdate
E1 Katherine Female 23/6/1994
DBDesign 5 / 60
Synonyms or AliasesSynonyms or Aliases
Be careful to differentiate between things and their names.
Item = Part = Product ???
Synonym or Alias
The same object called different names by different people.
This problem concerns entity types, relationship types and attribute types.
DBDesign 5 / 61
HomonymsHomonyms
Different things called the same name by different people ??
Employee Start Date
In the Personnel area this may mean the date an employee started with the company.
In a given department this may mean the date an employee started with the department.
This problem also concerns entity types, relationships types and attribute types.
DBDesign 5 / 62
Identifying Entity TypesIdentifying Entity Types
Entity type names are generally nouns or noun phrases.
Can the entity type be fully defined in one or
two sentences ?
What are some typical attributes describing the entity type ?
Which attributes uniquely identify each entity instance ?
DBDesign 5 / 63
What are the Entities Here ?What are the Entities Here ?
Order No ____________ Order Date __ / __ / __
Customer Name
Customer Address______________________________
ItemNo Description Qty-ord Unit-Price Ext-Price
ABC COMPANY SALES ORDER
DBDesign 5 / 64
Building an E-R ModelBuilding an E-R Model
– List the major entity types in the system and note the relationship types between them.
– Represent the entity types by named rectangles.
– Identify relationship types between entity types including relationship degree and nature.
DBDesign 5 / 65
E-R ModellingE-R Modelling
Normalisation
Verification Attributes
Initial E-R Model
Data AnalysisUser views and Business Rules
DatabaseInitial Study
Database Life CycleProcesses anddatabase transactions
FinalE-R Model
DBDesign 5 / 66
Conceptual Design ToolsConceptual Design Tools
Tools Available Sources of Informationto Designer for the Designer
Entity-Relationship Business rulesDiagram and data constraints
Normalisation Data flow diagrams
Data dictionary Process functionaldescriptions (user views)(output generated by the system analysis and designphases)
Conceptual Model
DefinitionandValidation
DBDesign 5 / 67
Modelling TechniquesModelling Techniques
These are some of the methods used in data modelling
Network Model (Codasyl)
Relational Model
Entity-Set model
Entity-Relationship (based on Chen - many variations)
Smith’s Abstraction
Semantic Data Model
IDEFI
DBDesign 5 / 68
Modelling TechniquesModelling Techniques
Functional Data Model - (Dapter)
Data Designer (Automated Design Tool)
Design Manager (Automated support for describing data models -- leads to database design ……….
And the ‘latest’ is the Unified Modelling Language (UML)
which is very much favoured with Object Oriented approach to software development.
Terms such as Class, Object, Statechart, Activity, Sequence, Collaboration, Use-case, Component and Deployment are included in its use