strategic alliance, www. stral. in fundamentals of good database design by: tejesh patel me, mba,...

29
STRATEGIC ALLIANCE, www . Stral . in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor] MCDBA, MCSE, MCSA, MCP+I, DCNI

Upload: luke-cook

Post on 24-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

STRATEGIC ALLIANCE, www . Stral . in

Fundamentals of Good Database Design

By:Tejesh PatelME, MBA, PGD[Cyber Laws]CISA, CISM, ISO 9000:2000[Lead Auditor]MCDBA, MCSE, MCSA, MCP+I, DCNI

Page 2: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

The Agenda

RDBMS Terminologies Database Design goals Database Modeling Life-Cycle Database Anomalies The Keys Normalisation Denormalisation Database Integrity

Page 3: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

RDBMS TerminologiesA Brief Introduction

Entities : things in the real world that we will store information about in the database

Attributes : Inherent characteristics of an Entity Object : Instance of an entity

Relationships :are the links between Entities

Tables : Physical Representation of an Entity Columns / Fields : Physical Representation of

Attributes Rows / Tuples /Records : Physical Representation

of Instances of an Entity

Page 4: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database Design GoalsThe Ultimate Desires

What information needs to be stored? That is, what things or entities do we need to store information about?

What questions will we ask of the database? How fast shall we get the job done? What shall be the level of redundancy of data stored?

The First Two aspects being of mandatory nature doesnot affect the schema much

The schema design is always the function of the last two aspect, with both having inverse relationship.

Page 5: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database Modelling Life CycleStep-By-Step

1. Write down the problem definition / requirements in detail.2. Identify the nouns, they are the candidates for being entities3. Identify the characteristic properties of these nouns, they are

the candidates for being the attributes of the entities4. Decide various relationships between these entities5. Try to model one table for each entity6. Try to model one field for each attribute7. Carryout Normalisation8. Do Responsible Denormalisation9. Establish relationship with proper cardinality10. Apply Business logic in form of various integrity constraints11. Voila, your database model is ready

Page 6: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database Anomalies

Comedy of Errors By Design

Anomalies : are problems that arise in the data due to a flaw in the database design Insertion Anomalies: occurs when new

data is inserted into flawed schema Deletion Anomalies: occurs when data is

deleted from a flawed schema Update Anomalies: occurs when new

data in flawed schema is changed

Page 7: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database Anomalies:A Sample Data Table

Page 8: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Update Anomalies

Page 9: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Deletion Anomalies

On Deletion of all the products with Categoryid =2, the category itself isAutomatically deleted

Page 10: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Update Anomalies

Page 11: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Normalisation The Solution

Normalization is a process we can use to remove design flaws from a database

sets of rules describing what we should and should not do in our table structures

process consists of breaking tables into smaller tables that form a better design

In this process : Take database design through the different forms

in order form subsumes the one below it At each stage, we add more rules that the

schema must satisfy

Page 12: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

NORMALISATION

Its about the key, the whole key and nothing but the key

Page 13: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

First Normal Form

Requirements All rows shall be uniquely identifiable All the values shall be atomic No columns of repetitive nature No redundant data

CONCEPT : THE KEY

Page 14: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

First Normal Form

Page 15: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Keys The Key to the Database

Unique Key : Ensures that all rows are uniquely identifiable, there can be more than one unique keys in a table

Composite Key : Collection of more than one columns forming a unique key

Candidate Key : A key offering candidature for being a primary key, necessarily a unique key.

Primary Key : One of the candidate keys selected for the primary purpose of establishing relationships

Surrogate Key : A key which is imported into the table to act as a primary key

Page 16: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Primary Keys:The Good, The Bad & The Ugly

Just keep it short Effects database and query size Effects size of index / indices Is very bad when its cascading It is also the foreign key somewhere

Prefer integer over characters Have surrogate instead of large composite key or

large character-based PK Small primary key acting as a foreign key with a

clustering key acts as a good PK For big FKs. use Surrogate key

Consider the key order for the purpose of indexing

Page 17: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Second Normal Form

Requirements Shall be in First Normal Form [1NF] All Non Key members fully and functionally

dependent on the Primary Key

CONCEPT : THE WHOLE KEY

Page 18: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Second Normal Form

Table in First Normal Form

Table Split into Two and now in 2NF

Page 19: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Third Normal Form

Requirements Shall be in Second Normal Form[2NF] No attribute shall be transitively

dependent on the Primary key

CONCEPT : NOTHING BUT THE KEY

Page 20: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Third Normal Form

Page 21: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Higher Normal FormsThe Untouchables

There are higher normal forms (BCNF, fourth, fifth, and so on),

but these are more useful for academic pursuits than practical database design.

3NF is sufficient to avoid the data redundancy problems you will encounter

Page 22: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

NormalisationThe Face Off

Merits Removes Anomalies Removes database redundancy Better for OLTP Systems

Demerits Slow input Even Slower Output Not suited for OLAP Systems

Page 23: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

DenormalisationThe Rationale

The higher the level of normalization, the greater the number of tables in the database;

The greater the number of tables in the database, the more joins are necessary for data manipulation;

Joins slow down performance; Denormalization reduces the number of tables

and, therefore, the reliance on joins, which speeds up performance.

YEP, BUT NOT ALWAYS

Page 24: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

DenormalisationExamples In order to improve report performance, you decide to create a

column named TotalOrderCost that contains a sum of the cost of each order item in tblOrder.

This violates 2NF because TotalOrderCost is dependent on the primary key of tblOrderDetail, not on tblOrder's primary key.

TotalOrderCost is calculated on a form by summing the column TotalCost for each item.

you've broken 2NF to avoid having to join these two tables every time this report needs to be generated.

be careful to avoid update anomalies. Whenever a record in tblOrderDetail is inserted, updated, or deleted, you will need to update tblOrder, or the information stored there will be erroneous.

You include a column, SalesPerson, in the tblInvoice table, even though SalesId is also included in tblInvoice.

This violates 3NF because the two non-key columns are mutually dependent, but it significantly improves the performance of certain commonly run reports.

Once again, this is done to avoid a join to the tblEmployee table, but introduces redundancies and adds the risk of update anomalies.

Page 25: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

DenormalisationExecutive Decision

Denormalised ≠ Unnormalised Consider balancing normalised database [logical

design] with a good physical design for performance rather than denormalising.

If you have to, Break the rules deliberately; have a good reason for denormalizing.

Be fully aware of the tradeoffs this decision entails. i.e. Performance vs. Anomalies

Thoroughly document this decision. Create the necessary application adjustments to

avoid anomalies

Page 26: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database IntegrityControl Points

Domain Integrity:Ensures that values stored within a column are consistent and in accordance with business rules

Entity Integrity : Primary key shall not null be null and shall no allow duplicate

Referential Integrity :shall not contain any unmatched foreign key values

Disallow. The change is completely disallowed. Cascade. For updates, the change is cascaded to all

dependent tables. For deletions, the rows in all dependent tables are deleted.

Nullify. For deletions, the dependent foreign key values are set to Null

Page 27: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database IntegrityControl Devices NOT NULL [Domain Integrity] - A value must be specific for

this column. CHECK [Domain Integrity] - The values supplied must meet

the criteria specified in the constraint. A database developer can define a CHECK constraint to ensure that an entered integer is an even, positive value.

UNIQUE [Domain Integrity]- No values in this column may be duplicates of another. This might be used, for example, in an employee information table to prevent duplicate employee numbers from being entered.

PRIMARY KEY [Entity Integrity] - Defines which column or columns uniquely identify each row in the database. No two rows can have the same values for the primary key.

FOREIGN KEY [Referential Integrity] - Enforces data integrity between tables by creating a reference for specific types of information

Page 28: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

August 12, 2005 Strategic Alliance, www . stral . in

Database IntegrityThe Concept

Page 29: STRATEGIC ALLIANCE, www. Stral. in Fundamentals of Good Database Design By: Tejesh Patel ME, MBA, PGD[Cyber Laws] CISA, CISM, ISO 9000:2000[Lead Auditor]

STRATEGIC ALLIANCE, www . Stral . in

THANK YOUQueries Expected

For offline queries :[email protected]

[email protected] : 93745 13500

Please Visit:

www.stal.in/mcdba for queries related to advance training related to RDBMS modelling, administration and development