ibm data model1

59
Data Modeling Introduction Name Experience in Database Expectation from the class Rajesh Kumar V.S [email protected] w.ilearnonline.co.in

Upload: maheshumbarkar

Post on 21-Oct-2015

20 views

Category:

Documents


1 download

DESCRIPTION

IBM Data Model1

TRANSCRIPT

Page 1: IBM Data Model1

Data Modeling

Introduction Name Experience in Database Expectation from the class

Rajesh Kumar V.S [email protected]

www.ilearnonline.co.in

Page 2: IBM Data Model1

Agenda Data Modeling Different types of models Operational Systems

3rd NF (ER Modeling) Enterprise Data Modeling Modeling Lifecycle Modeling stakeholders Validating models Questions…..

www.ilearnonline.co.in

Page 3: IBM Data Model1

Model Many perspectives of business

How it works can be defined by the process model How the organization is organized can be defined by

the org chart Where the organizations operate can be defined /

explained by location or Geo map To understand the data flow of the application you can

create data flow diagram What information needs to run the business is

defined by Data Model

www.ilearnonline.co.in

Page 4: IBM Data Model1

Types of Models Subject Area / Contextual

Define main terms and definitions for the high level entities. Not every project will have this, only if the scope is big then

we can have this. Conceptual (Overview)

Important entities which holds the application together Business concepts and rules Identifying the relationship between each of the entities.

Logical Complete detail requirements, resolving M:M relationships

identified in Conceptual Model. Detailing all the attributes needed for every entity Providing relationships in terms of PK and FK.

Physical Model Exploiting the RDBMS to take the advantage of the

model.www.ilearnonline.co.in

Page 5: IBM Data Model1

Business to Physical ModelingThe Business

Contextual

Conceptual

Logical

Physical

Usual Data ModelingReverse Engineering

Usually 1:5 to 1:7 ratio of entities betweendifferent models.

www.ilearnonline.co.in

Page 6: IBM Data Model1

About Data Modeling

A data model is the place through which you maintain information about things (entities)

The facts about those things are nothing but rows. It’s a real thing, not a technology It’s stable until business does not change, if business

changes then the data model has to be flexible enough to accommodate that change.

It drives the consistency of dealing with information It should always start simple. Data model is relevant to the business As the business rules and requirements gets complex,

the painful detail comes

www.ilearnonline.co.in

Page 7: IBM Data Model1

About Data Modeling To be a better data modeler you should have both domain

knowledge and good exposure to data modeling concepts It’s a living thing, so expect changes for sure. Its our

visualization of information and proactive vision of business is key to build model which does not change quite often.

Data model is a non technical description of your business in terms of things it need to know about

By following few techniques a business analyst can build great data models

Collect all the facts about business through knowledge transfer sessions, reading about your business and the process at least 10% of your time.

Involve in discussions with various business line mangers to get overview of what they are doing.

www.ilearnonline.co.in

Page 8: IBM Data Model1

Information Visualization

Business

Line of Business

Business Process

Activity

Information

DATA

www.ilearnonline.co.in

Page 9: IBM Data Model1

Information Visualization

Business Line of Business Business Process Activity Information Data

HDFC Credit Card Customer Care Customer verification Verifier, address etc Data

www.ilearnonline.co.in

Page 10: IBM Data Model1

Information Visualization

Business Line of Business Business Process Activity Information Data

IBM ???? ???? ???? ???? ????

www.ilearnonline.co.in

Page 11: IBM Data Model1

Zachman Framework

Page 12: IBM Data Model1

Zachman Framework -- stakeholders

Page 13: IBM Data Model1

Contextual Data Model

It explains the major processes Focuses on specific problem area (SCOPE) Master entities are defined At this level we never talk about the data what you

are going to store (No details) Excellent starting point for the data modeling Vocabulary, Define the terms.

(Classic example of confusing the terms are vendor / supplier).

www.ilearnonline.co.in

Page 14: IBM Data Model1

Conceptual / Logical Model Conceptual Model

Entities are identified Not Normalized M:M relationships No keys Can list the important attributes

Logical data model M:M resolved Shows PK, FK and all attributes Its fully Normalized All attributes are atomic

www.ilearnonline.co.in

Page 15: IBM Data Model1

Generalization of the data model

Customers /markets Products

Services

Any business have to have the following entities through which it can work.What is the product you sell in a Software service company?What is the product you sell in Airlines?

www.ilearnonline.co.in

Page 16: IBM Data Model1

Create simple data model Create a model to hold all the employee information, career path

individual choose. Each career path has certain set of training programs. Individuals can choose training programs based on the current

capabilities and future needs. Assign to be participants of individuals to up coming training

programs based on the request individual mentioned. You should look for manager recommended candidates for the

training program first then fill the requested resources based on the seats in the training program.

Create the set of training programs and associate the proposed candidates based on the business rules.

System should be able to capture the attendance for the training program.

www.ilearnonline.co.in

Page 17: IBM Data Model1

Visualizing the functionality

www.ilearnonline.co.in

Page 18: IBM Data Model1

Attributes – conceptual vs logical

Customer--------------------

ContactsAddressesBank info

Customer_IDCustomer_nm

Effective dtIn_eff_dt

Cust_crdit

Cntc_typ_idShort_dscdescription

Addr_typ_idShort_dscdescription

Cust_idCntc_typ_idContact_nm

Emailphone

Cust_idaddr_typ_idAddress1Address2

CityState

Cust_idAccount_id

Bank_idBank_nmAddress1

City

Conceptual

Logical Model

www.ilearnonline.co.in

Page 19: IBM Data Model1

How to model (Steps)

Requirement gathering and understanding the business process is the foundation to get the right data model.

Gather the information Questions Answers Analyze and confirm Add to model

Make the above 4 steps as an iterative process before freezing the model.

Apply the business scenarios and see whether the data model accommodates the same.

www.ilearnonline.co.in

Page 20: IBM Data Model1

Depicting and Documenting

Customer

Order

Product

Order Line

places

containsspecifies

Data ModelDiagram

Data ModelDocument

Order Data Model

IntroductionEntity definitionAttribute definitionAdditional Bus Rules

www.ilearnonline.co.in

Page 21: IBM Data Model1

About entities

An entity is a person, place, thing, event or any of the interest to the enterprise, about which facts may be recorded

You should name it in a real world term Eventually entity becomes a table in relational database Examples

Employee Region Department Customer

Entities are not supposed to be designed by using the input or output formats (screens and reports). This wont give us enough flexibility in the data model.

www.ilearnonline.co.in

Page 22: IBM Data Model1

Entity Definition

Answers the question Definition Interesting points Exceptions Examples (demonstrate the data)

List of instances Scenarios

www.ilearnonline.co.in

Page 23: IBM Data Model1

Entity example Building

A ground based structure which supports the company business operations by housing employees, equipment, or supplies

Examples: office buildings, retail store, plants and warehouse

Can be owned or leased May or many not have street address

Key Points To avoid misunderstandings between entities we have to

do this Its crucial, so do them early

www.ilearnonline.co.in

Page 24: IBM Data Model1

Entity Types

Kernels / Master The central object in the model

Every thing else either further describes, associates or classifies the kernel entities

Can exists independently Ideally, it’s the starting point for modeling

Associative Relates two other entities Evolves from resolving M:M relationships Important associates shown in conceptual, remaining will be

detailed in the logical model Types

Kind of Kernel entity that classifies or categorizes other entities Example (Attribute Customer Type in customer table , or Account

Type and Account table) Transaction tables

www.ilearnonline.co.in

Page 25: IBM Data Model1

Attributes

A property of a thing that can be expressed as a piece of information – one of the facts about things that must be maintained

Properties of the entities Example for customer entity, following are the attributes

Cust_name Cust_contact Cust_address Cust_city

Questions to ask Is it a fundamental attribute or could it be derived attribute Business Rules association at the attribute level. Exceptions

www.ilearnonline.co.in

Page 26: IBM Data Model1

Relationships in the RDBMS

An association between two things (entities) is called a relation

We have three different types of relationships in RDBMS 1:1 (One to One) – rare 1:M (One to Many) – common M:M (Many to Many) – more in conceptual model, none in

Logical model and Physical model Examples

1:1 (Person to PAN ID) 1:M (Customer to Phone) M:M (Doctor and Patient)

Page 27: IBM Data Model1

M:M relationships

Only RDBMS supports the M:M relationship Only in the conceptual model, we can have this

relationship In the logical model, we have to resolve this

relationship. We resolve M:M by using associate table concept.

doctor Patientdoctor Patient

doc_pat

associate table

www.ilearnonline.co.in

Page 28: IBM Data Model1

M:M relationship Steps in resolving M:M

Split it into two 1:M relationships Example: Customer and Product has M:M

relationship Always M:M is intersected by some other parameter, in

this case it’s the orders. Example: Customer and Survey.

In the case in survey you will have multiple questions, so the set of questions becomes the parameter.

Example: Course and Branches. In this case a specific course can be conducted in more

than one branch. In one branch you will conduct different courses.

www.ilearnonline.co.in

Page 29: IBM Data Model1

Special cases Recursion

Empno and mgrno are stored in the same entity Mgrno is also an employee number Usually this goes as self referential integrity constraint We define this as a foreign key which refers the PK of the

same table. A recursive relationship is fully optional.

www.ilearnonline.co.in

customer account phone

Call records

Page 30: IBM Data Model1

Sample TablesCust_id Cust_name Email id Contact name Cust_since

100 Citi [email protected] Bill H 10-JAN-08

101 HSBC [email protected] Tim D 15-APR-10

102 SBI [email protected] Ram K 12-APR-11

Act_id Cust_id No_of_phones

1234 100 2

1235 101 3

1236 100 2

1237 102 1

Phone_no Act_id

123456776 1234

987654321 1235

456780987 1234

098756782 1234

Call_id Duration Phone_no

56789 14 123456776

56790 15 987654321

--------- --------- ---------

Current system is generating bills based on the customer id. Today, Bank A bought the Bank B, then we should generate the bills bank B tobank A only. How do we change this model.

www.ilearnonline.co.in

Page 31: IBM Data Model1

Normalization It’s a methodology we follow in order to make sure there is

no redundant data available in the data model. It’s a part of your logical model process with in database

design Advantages:

Reduced space in the db Transaction speed increases

Disadvantage Complex queries i.e. a query which has more number of

table joins will tend to impact the query performance.

Cust_idCust_nameCust_dobCust_phoneCust_emailCust_city

customer

•Assume we have 10000 rows in this table. We have customersfrom 5 different cities. Mumbai has 2000 customers.•One of the city name changed again… (Mumbai -- > ……)•In the real world one activity happen. What statement you will issueto record the same change on our data. How many records you changed?

www.ilearnonline.co.in

Page 32: IBM Data Model1

Normalization process Its about breaking entities into their most granular form

1st Normal Form (1NF) Every attribute must be atomic Repeating attributes moved to a separate entity

2nd Normal form (2NF)All the attributes should be functionally / partially dependent on

the primary key / concatenated 3rd Normal form (3NF)

All the attributes should be primarily dependent on the primary key

No repeating elements or groups of elements – 1st NF No partial dependencies on a concatenated key -- 2nd NF No dependencies on non-key attributes – 3rd NF

www.ilearnonline.co.in

Page 33: IBM Data Model1

Table to be normalized

www.ilearnonline.co.in

This excel kind of data is tracking all the transactions happens when a customerrents the vehicle. It has some customer information, vehicle information and therented out and returned back data.

Normalize this table based on your assumptions.

Page 34: IBM Data Model1

Normalized model

www.ilearnonline.co.in

Page 35: IBM Data Model1

Tables – to analyzeStore name Address Jan_sales Feb_Sales Mar_sales Apr_Sales May_sales June_sales

ABC Stores 123, MGRoad, Chennai

345609 94040 45958 748490 849938 84949

BBC Stores 124, 5th Cross, Anna Nagar, CHN

849409 440400 9840940 89989 456655 23455

cust_nm Address Email 1 Email 2 Mobile1 mobile2

Ram Kumar 123, Housur Road, BLR

[email protected] [email protected] 84940390 484040398

Ramesh K 34,ABC St [email protected]

Null 89490490 null

www.ilearnonline.co.in

Page 36: IBM Data Model1

What should we have in requirements

Page 37: IBM Data Model1

Agent Management -- Requirements Agent is resource through we which we sell the products to our end customers. With out agent you cannot sell any product we have. Products have different life spans, minimum payment, minimum terms etc.

Product can be sold during its offer only. Company provides the training to agents when a new product has been

launched. Only agents who completed the training can sell that product. When a customer buys a product through an agent, then we create a contract

called as policy. All the policy numbers will uniquely identify a product, a customer, agent.

We get revenue after getting the contract and have to pay the commission to agent. This commission depends on premium then pay.

System should be able to track the premium payment data and intimate to respective agents to do a follow up for the payments which are due.

Need to stored the customer multiple address, agents family, addresses and bank /account information for payment processing.

Organization should be able to generate the expected commission for next month based on the policy premiums we expect.

www.ilearnonline.co.in

Page 38: IBM Data Model1

Conceptual Model for Agent System

AgentBankfamily

ProductTermsPlans

Comm. plan

Training program

customer policy

Policy transactions

agent commission

www.ilearnonline.co.in

Page 39: IBM Data Model1

Normalization

1NF, 2NF, 3NF are breaking entities into their most granular form

4NF and 5NF also about granularity, but only the granularity of associative entities with 3 or more parents

A B

AB

A B

ABC

C

No 4NF or 5NF concerns Possible 4NF or 5NF concerns

Page 40: IBM Data Model1

4NF and 5NFA B

ABC

C

Assume this is wrong

A B

AB

C

4NF

BC

A B

AB

C

5NF

BC CA

Page 41: IBM Data Model1

Example

Program Worker Role

Assignment

This is OK if…….Any combination is valid

BUT…..Workers have defined rolePrograms require certain rolesWorkers are assigned programs independent of roles

Then we will end up writing lots and lots of coding to make sure to implement thesebusiness rules. If we normalize it some of the rules are automatically taken care of

Page 42: IBM Data Model1

Modeling ApproachTop Down Good way of getting a model We will be thinking out of the box because we talk about entities,

how two entities relates to each other etc Think about the business in broader sense by using the Business

Analysts / subjects. Most of the companies follow the top down approach and get

validate based on the scenarios.

Bottom Up Based on the output we will create the model Kind of reverse engineering (some times) Easy to normalize, but we may miss out on bigger picture.

www.ilearnonline.co.in

Page 43: IBM Data Model1

Mutually exclusive relationship

order

customer

Internaldepartment

In this scenario, an order can be placed only by customer or by internal department

So its mutually exclusive (Solid lines exactly one)

The other example is, payment table in telecom billing system can have credit_card,Cash or check. In one payment you will accept either one of these three, so itsmutually exculsive

)

Page 44: IBM Data Model1

Mutually inclusive relationship

Advertisement

Radio

Televison

In this scenario, an advertisement can be placed in different media

So its mutually inclusive (dotted lines one or more)

)

Internet

Page 45: IBM Data Model1

Claim Processing system

claims

contract

Contracttype

claimstatus

category

product

city

state

region

Agent

customer

Cust_type

www.aroha.co.in www.ilearnonline.co.in

branches

Claimcategory

Claim transaction

Page 46: IBM Data Model1

Modeling time and historyPrinciples

Don’t change the stored data, if you do you loose the history

It becomes as is reporting. Add new records when there is a change This way you track the history of changes to that record

and when it got changed. If they can correct the changes, then capture the

correction date also, so that you have all the information to track back what did happen.

Have audit columns in place to capture the time and history.

Page 47: IBM Data Model1

Logical / Physical Modeling

Logical Data Model Physical Data Model

Business info and rules Model in the database

Entity, Attribute Tables, Columns

Primary Key, Alternate key Primary Key constraint, unique index

Inversion Key Non unique index (for performance)

Rule Check constraint / triggers

Relationship PK /FK

Definition of the entity Comments

www.aroha.co.in www.ilearnonline.co.in

Page 48: IBM Data Model1

Physical Model Implementation All most same as Logical Model, with little variation.

Taking the advantage of the database what you are dealing with.

Make sure all the PK and FK are numbers. (Try a POC before we say this)

Example: Should I use partitions or materialized views or external tables etc.

Should we use columnar Database (Vertica, Opensource db) for certain data marts.

Designing rolling window based on partitions to improve the performance.

Creating views to simulate the scenario rather than increasing the database size (especially small tables)

Page 49: IBM Data Model1

Checklist for Physical Model Logical model as the input. Factors of the DB on which we deploy the tables. (Oracle,

SQL Server, DB2, MySQL etc)

Data Velocity factors (Frequency of activities in the business) Initial Load Data & Incremental load planning Data Volume calculation (Capacity Planning) Identifying importance of tables in terms of joins and

table data volume Deciding partitions, indexes, views, materialized views Implementing back up recovery mechanism for the

system.

Page 50: IBM Data Model1

Channel Partner Payment system hp sells the products through channel partners. The scope of the project is related to only sales. Through presales system hp generates the quotes and provide the

same to channel partners, channel partners sells the products to end customers based on the quote (This is outside of this system)

Certain channel partners can sell only certain products of hp. End customer who buy hp products through channel partner can be

an individual or a company (don’t want to consider end customers as part of the scope).

Based on the sales made by the channel partner, hp have to raise the invoices to channel partner.

System should have the ability to store the grade of the channel partner.

hp provides the commission to channel partners based on the sales they made automatically once in three months.

System should have the ability to store the addresses, different kinds of contacts and bank information of channel partners.

Channel partner payment to our invoices and the payment to channel partner commission are tracked in this system only

Page 51: IBM Data Model1

De-normalization Is a process where we increase the query

performance Especially used for reporting, not used to

increase the transaction processing To get best of both worlds, is to create the

normalized model for faster transaction processing and take the advantage of oracle’s materialized views concept to get your reports run faster

We are making an another copy of the data, but system takes care of it. This way we don’t implement new bugs.

Page 52: IBM Data Model1

Insurance Business Company sells insurance to various customers through agents We sell different kind of insurance policies like Risk, kids education,

endowment, pension etc Nominations must exists in each one of the policy. Max of 2

nominations can exists of every policy. We can surrender the policy if we paid the minimum number of

premiums. This number varies between policies Customers can get loan against the policy they have. Agent commission should be processed by the system. Based on the policy as well as the premium paid, the commission to

agent differs. When we surrender the policy, the charges are applicable. When we pay the premium late the interest should be added to the

payment.

Build a conceptual model and logical model for the above mentioned business.

Page 53: IBM Data Model1

Telephone Billing Customer comes and buys the telephone Customer can have multiple phones. Customers can be either corporate or individuals Billing flexibility should be available based on certain phone

numbers so that I can send the consolidated bill to different groups with in the corporation

Billing cycle can be decided by customer Must maintain different addresses of the customer Always we have to mail the bill to Billing Address, should have

flexibility to send it to an electronic address also. Customer can subscribe for multiple service like wireless,

internet…. Customer can select a specific plan (rate)

Page 54: IBM Data Model1

Retail Case Study

Assume, we want to create a data model which takes care of a retail chain.

We want to create the model which stores the supplier information, order management through which we place orders to all the suppliers.

We have one warehouse from which point, we distribute to various stores in the city and store the point of sale.

Customer can return the goods with in 7 days. Some of the products cannot be taken back.

One employee can work multiple stores.

Page 55: IBM Data Model1

Star Schemas for General Insurance

Training program

AGENT

product time

Trgfact

Enrollfact

customer

Premium fact

Exp revfact

Comm. fact

Paymenttype

policy

channel

claims fact

Coveragetype

location

workflowfact

policystatus

Page 56: IBM Data Model1

Org_cat_idDesc

Org_idOrg_cat_id

DescManager_id

Emp_idEname

JoindateSal

CommOrg_id

Cust_idName

Emp_idAddressPhoneEmailState

Prod_idName

Start_dateEnd_date

Cust_idPolicy_idProd_idStart_dtEnd_dtnotes

Cust_idPolicy_idTrans_id

DateTran_type_id

amt

Tran_type_iddesc

organization

Org_cat

product

Employees

CustomersTransaction_type

Transactions

Cust Policies

Insurance Model - Normalized model

Dpet_idDept_name

Org_id

department

Page 57: IBM Data Model1

Snow flake - Insurance model

Org_idNameCat_id

Manager_nm

Emp_idEname

Join dateSal

Dept_id

Cust_idName

AddressPhone

State_id

Product_idName

Start_dateEnd_date

Cal_date_idDate

Week_id

Org_idEmp_idCust_id

Product_idPolicy_id

Cal_date_idPayment_amtClaims_amt

org_dim

emp_dim

txn_fact

Cal_dim

cust_dim

policy_dim

Cat_idCat_desc

Cat_lookup

Dept_idDept_name

dept_lookup

Week_idCal_weekMonth_id

Week_lookup

State_idState_desc

State_lookup

Month_idCal_monthQuarter_id

month_lookupQuarter_idQuarterYear_id

Quarter_lookup

Year_idyear

Year_lookup

Page 58: IBM Data Model1

Normalized Model (Electronic AtoZ Store)

Call Center

In store POS

Online Sales

Page 59: IBM Data Model1

Thank you

Questions