kushal data warehousing ppt

53
BY BY Kushal Singh Kushal Singh Acute Informatics Pvt Acute Informatics Pvt

Upload: kushal-singh

Post on 09-Jul-2015

441 views

Category:

Education


0 download

DESCRIPTION

In computing, a data warehouse (DW, DWH), or an enterprise data warehouse (EDW), is a database used for reporting (1) and data analysis (2). Integrating data from one or more disparate sources creates a central repository of data, a data warehouse (DW). Data warehouses store current and historical data and are used for creating trending reports for senior management reporting such as annual and quarterly comparisons.

TRANSCRIPT

Page 1: Kushal Data Warehousing PPT

BY BY Kushal SinghKushal Singh Acute Informatics Pvt Acute Informatics Pvt

Page 2: Kushal Data Warehousing PPT

What is Business Intell igence?

BI is an abbreviation of the two words     Business Intelligence, bringing the right information at the right time to the right 

people in the right format.

Page 3: Kushal Data Warehousing PPT

What is Data What is Data Warehousing?Warehousing?

Data Warehouse is a Data Warehouse is a subject-oriented, subject-oriented, integrated, nonvolati leintegrated, nonvolati le and and time-time-variantvariant collection of data in support of collection of data in support of management’s decisions.management’s decisions.

Page 4: Kushal Data Warehousing PPT
Page 5: Kushal Data Warehousing PPT

What is Business Intelligence?

Page 6: Kushal Data Warehousing PPT

Operational

data source1

 The architecture 

Query Manage

Warehouse Manager

DBMS

Operational

data source 2

Meta-dataHigh

summarized data

Detailed data

Lightly

summarized

data

Operational

data store (ods)

Operational

data source n

Archive/backup

data

Load Manager

Data mining

OLAP(online analytical processing) tools

Reporting, query,

application development,

and EIS(executive information system) tools

End-user

access toolsTypical architecture of a data warehouse

Operational data store (ODS)

Page 7: Kushal Data Warehousing PPT

The benefits of data The benefits of data warehousingwarehousing

• The potential benefits of data warehousing The potential benefits of data warehousing are high returns on investment..are high returns on investment..

• substantial competitive advantage..substantial competitive advantage..• increased productivity of corporate increased productivity of corporate

decision-makers..decision-makers..

Page 8: Kushal Data Warehousing PPT

8

Data Warehouse Data Warehouse CharacteristicsCharacteristics

Key Characteristics of a Data WarehouseKey Characteristics of a Data Warehouse

Subject-orientedSubject-oriented IntegratedIntegratedTime-variantTime-variantNon-volatileNon-volatile

Page 9: Kushal Data Warehousing PPT

9

Subject OrientedSubject Oriented• Example for an insurance company :

PolicyPolicyCustomerCustomer

Data

LossesLosses PremiumPremium

Commercial and Life

Insurance Systems

Commercial and Life

Insurance Systems

Auto and Fire Policy

Processing Systems

Auto and Fire Policy

Processing Systems

Data

Accounting System

Accounting System

Claims Processing

System

Claims Processing

SystemBilling System

Billing System

Applications Area Data Warehouse

Page 10: Kushal Data Warehousing PPT

10

IntegratedIntegrated• Data is stored once in a single integrated location

(e.g. insurance company)

Data WarehouseDatabase

Subject = Customer

Auto PolicyProcessing

System

Auto PolicyProcessing

System

Customer data stored in severaldatabases

Fire PolicyProcessing

System

Fire PolicyProcessing

System

FACTS, LIFECommercial, Accounting

Applications

FACTS, LIFECommercial, Accounting

Applications

Page 11: Kushal Data Warehousing PPT

11

Time - VariantTime - Variant

Data is tagged with some element of time -  creation date, as of date, etc.

Data is available on-line for long periods of time for trend analysis and forecasting. For example, five or more years

Data Warehouse Data

Time Data

  {Key

• Data is stored as a series of snapshots or views which record how it is collected across time.

Page 12: Kushal Data Warehousing PPT

12

Non-Volatile

• Existing data in the warehouse is not overwritten or

updated. External Sources

• Read-Only

DataWarehouseDatabaseData

WarehouseEnvironment

Data Warehouse

Environment

ProductionDatabases

ProductionApplications

ProductionApplications

• Update• Insert• Delete

• Load

Page 13: Kushal Data Warehousing PPT

Comparision of OLTP systems and data Comparision of OLTP systems and data warehousing systemwarehousing system

OLTP systemsOLTP systems Data warehousing systemsData warehousing systems

Hold current dataHold current dataStores detailed dataStores detailed data

Data is dynamicData is dynamicRepetitive processingRepetitive processing

High level of transaction throughputHigh level of transaction throughputPredictable pattern of usagePredictable pattern of usage

Transaction-drivenTransaction-drivenApplication-orentedApplication-orented

Supports day-to-day decisionsSupports day-to-day decisionsServes large number of clerical/operation usersServes large number of clerical/operation users

Holds historical dataHolds historical dataStores detailed, lightly, and highly summarized Stores detailed, lightly, and highly summarized datadataData is largely staticData is largely staticAd hoc, unstructured, and heuristic processing Ad hoc, unstructured, and heuristic processing Medium to how level of transaction throughputMedium to how level of transaction throughputUnpredictable pattern of usageUnpredictable pattern of usageAnalysis drivenAnalysis drivenSubject-orientedSubject-orientedsupports strategic decisionssupports strategic decisionsServes relatively how number of managerial Serves relatively how number of managerial users users

Page 14: Kushal Data Warehousing PPT

OLTPOLTP

Online Transaction Online Transaction ProcessingProcessing

Page 15: Kushal Data Warehousing PPT

OOn n LLine ine TTransaction ransaction PProcessingrocessing

• What is a Transaction ? What is a Transaction ?

– AA Logical unit of workLogical unit of work

– Examples:Examples:– Drawing Money from a bank accountDrawing Money from a bank account– Booking a seat on an airlineBooking a seat on an airline

Page 16: Kushal Data Warehousing PPT

TransactionsTransactions• It is a unit of program execution that It is a unit of program execution that

accesses & possibly updates various data accesses & possibly updates various data items.items.

• A transaction is a A transaction is a logical unit of worklogical unit of work that that performs some useful function for a user. performs some useful function for a user.

• In end of the transaction the system must In end of the transaction the system must be: be: – in the prior state (if the transaction fails) or in the prior state (if the transaction fails) or – the status of the system should reflect the the status of the system should reflect the

successful completion (if the transaction successful completion (if the transaction succeeded). succeeded).

• May take a database from one consistent May take a database from one consistent state to another.state to another.

Page 17: Kushal Data Warehousing PPT

Characteristics of TransactionsCharacteristics of Transactions

AA tomicitytomicityCC onsistencyonsistencyII solationsolationDD urabilityurability

Page 18: Kushal Data Warehousing PPT

OLAPOLAP

Online Analytical ProcessingOnline Analytical Processing

Page 19: Kushal Data Warehousing PPT

Types of OLAP

• ROLAP (Relational Online Analytical ROLAP (Relational Online Analytical Processing)Processing)

• MOLAP (Multidimensional Online MOLAP (Multidimensional Online Analytical Processing)Analytical Processing)

• HOLAP (Hybrid Online Analytical HOLAP (Hybrid Online Analytical Processing)Processing)

Page 20: Kushal Data Warehousing PPT

ROLAPROLAP

• ROLAP (Relational online analytical Processing)

• Used for reporting • Tools: Report studio

Page 21: Kushal Data Warehousing PPT

MOLAPMOLAP

• MOLAP (Multidimensional online Analytical processing)

• Used to build cubes• Tools: Powerplay, Transformer

Page 22: Kushal Data Warehousing PPT

HOLAPHOLAP

• HOLAP (Hybrid online analytical Processing)

• Used for Data modeling• This will support both MOLAP and ROLAP • Tools: Framework manager, Query Studio.

Page 23: Kushal Data Warehousing PPT

DimensionsDimensions

• It’s descriptive information about a It’s descriptive information about a measures like product, location, customer measures like product, location, customer etc.etc.

Page 24: Kushal Data Warehousing PPT

Types of DimensionsTypes of Dimensions

• Confirmed DimensionsConfirmed Dimensions• Degenerated DimensionsDegenerated Dimensions• Junk DimensionsJunk Dimensions

Page 25: Kushal Data Warehousing PPT

FactsFacts

• Fact is containing measures and IDs.Fact is containing measures and IDs.• Ex; Revenue, Cost, Amount etcEx; Revenue, Cost, Amount etc

Page 26: Kushal Data Warehousing PPT

Measure TypesMeasure Types

• Additive Measures:Additive Measures: Which can be added Which can be added across all the dimensionsacross all the dimensions

• Non Additive Measures:Non Additive Measures: Which can not be Which can not be added across all the dimensionsadded across all the dimensions

• Semi Additive Measures:Semi Additive Measures: Which can be Which can be added across some dimensions and which added across some dimensions and which can not be added across some other can not be added across some other dimensionsdimensions

Page 27: Kushal Data Warehousing PPT

Schema’s In Data warehousingSchema’s In Data warehousing

• STAR SHEMASTAR SHEMA• SNOW-FLAKE SCHEMASNOW-FLAKE SCHEMA• STAR-FLAKE SCHEMASTAR-FLAKE SCHEMA

Page 28: Kushal Data Warehousing PPT

28

Star SchemaStar Schema

Region_Dimension_Tableregion _id

NENWSESW

region _doc

NortheastNorthwestSoutheastSouthwest

account _id

100000110000120000130000140000

account _doc

ABC ElectronicsMidway ElectricVictor ComponentsWashburn, Inc.Zerox

Account_Dimension_Table

Product_Dimension_Tableprod_grp_id

102030

prod_id

100140220

prod_grp_desc

Fewer devicesCircuit boardsComponents

prod_desc

Power supplyMotherboardCo-processor

month

01-199602-199603-1996

mo_in_fiscal_yr

456

month_name

JanuaryFebruaryMarch

prod_id

100140220

region_id

SWNESW

account_id

100000110000100000

vend_id

100200300

net-sales

30,00023,00032,000

gross_sales

50,00042,00049,000

Monthly_Sales_Summary_Table

Time_Dimension_Table

Fact Table

Dimension Tables

Vendor_Dimension_Tablevend_id

100200300

vendor_desc

PowerAge, Inc.Advanced Micro DevicesFarad Incorporated

month

01-199602-199603-1996

Page 29: Kushal Data Warehousing PPT

SNOW-FLAKE SCHEMASNOW-FLAKE SCHEMA

Page 30: Kushal Data Warehousing PPT

Factless Fact TableFactless Fact Table• It’s just a bridge between table where we used to join It’s just a bridge between table where we used to join

tables.tables.

• In this scenario we can only track the eventIn this scenario we can only track the event..

Page 31: Kushal Data Warehousing PPT

SCDSCD(Slowly Changing Dimensions)(Slowly Changing Dimensions)

• TYPE 0TYPE 0• TYPE 1TYPE 1• TYPE 2TYPE 2• TYPE 3TYPE 3

Page 32: Kushal Data Warehousing PPT

ETLETL (Extract, Transform and Loading) (Extract, Transform and Loading)

INFORMATICAINFORMATICA

Page 33: Kushal Data Warehousing PPT
Page 34: Kushal Data Warehousing PPT
Page 35: Kushal Data Warehousing PPT
Page 36: Kushal Data Warehousing PPT
Page 37: Kushal Data Warehousing PPT
Page 38: Kushal Data Warehousing PPT
Page 39: Kushal Data Warehousing PPT
Page 40: Kushal Data Warehousing PPT

DesigningDesigning

FRAMEWORK FRAMEWORK MANAGERMANAGER

Relational DatabaseRelational Database & &

DMRDMR

Page 41: Kushal Data Warehousing PPT
Page 42: Kushal Data Warehousing PPT
Page 43: Kushal Data Warehousing PPT
Page 44: Kushal Data Warehousing PPT

REPORTINGREPORTING

IBM COGNOSIBM COGNOS

Page 45: Kushal Data Warehousing PPT
Page 46: Kushal Data Warehousing PPT
Page 47: Kushal Data Warehousing PPT
Page 48: Kushal Data Warehousing PPT
Page 49: Kushal Data Warehousing PPT
Page 50: Kushal Data Warehousing PPT
Page 51: Kushal Data Warehousing PPT
Page 52: Kushal Data Warehousing PPT
Page 53: Kushal Data Warehousing PPT