introduction to data warehousing

21
INTRODUCTION TO DATA WAREHOUSING PRESENTED BY GANESH DHARESHWAR ABHIJAN GHOSH

Upload: girish-dhareshwar

Post on 08-Jun-2015

497 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Introduction to data warehousing

INTRODUCTION TO DATA

WAREHOUSING

PRESENTED BY

GANESH DHARESHWAR ABHIJAN GHOSH

Page 2: Introduction to data warehousing

What is data warehousing?

data warehouse is a database used for reporting and analysis Integrated collection of ENTERPRISE-WIDE DATA, oriented to decision

making Provides strategic informationPerforming Information analysis that could not

done by operating system

Page 3: Introduction to data warehousing

Need for data warehousing

Maintain data historyEven if the source transaction systems do not.

Integrate data from multiple source systems, Improve data quality by providing consistent

codes and descriptionsProvides a flexible, conducive and interactive

source of strategic informationPerforming Information analysis that could not

done by operating system

Page 4: Introduction to data warehousing

Data Rich, but Information Poor

• Data is stored, not explored : by its volume and complexity it represents a burden, not a support

• Data overload results in uninformed decisions, contradictory information, higher overhead, wrong decisions, increased costs

• Data is not designed and is not structured for successful management

decision making

Page 5: Introduction to data warehousing

Improving Decision Making

5

Data

Information

Decisions

Data Warehouse

Page 6: Introduction to data warehousing

Operational data storesData focuses on transaction functions such as bank card withdrawals and depositsIt is organised by application It contains the current valuesIt supports day-to-day operational decision

supports information it is detailed , nonredundant and updateable

ODS

Page 7: Introduction to data warehousing

Informational data stores It is organised around subject

such as customer, productIt is summarized, archived,

derivedData is static until refreshedData is nonupdateable

Page 8: Introduction to data warehousing

Difference between operational &informational data stores

OperationalData

Informational data

Data content

Data organizationData stabilityData structure

Access frequencyAccess type

Response time

Current value

By application Dynamic Optimized for transaction

HighRead/update/deleteField by fieldSubsecond(<1s) to2-3s

Summarized, archived, derived

By subjectStatic until refreshedOptimized for complexQueriesMedium to lowRead/aggregateAdded toSeveral second to minute

Page 9: Introduction to data warehousing

Data warehousing is defined as

A data warehouse is a subject-oriented, integrated,

time-variant, non-volatile collection of data in support of management decision

A data warehouse is designed for easy access by users to large amounts of information, and data access is typically supported by specialized analytical tools and applications.

Page 10: Introduction to data warehousing
Page 11: Introduction to data warehousing

Data Warehouse Characteristics

It is database designed for analytical tasks, using data from multiple application

It supports a relatively small numbers of users with relatively long interaction

Its content is periodically updatedIt contains current and historical data to

provide a historical perspective of informationIt contains a few large tables

Page 12: Introduction to data warehousing

Integrated

12

• Data is stored once in a single integrated location(e.g. insurance company)

Data WarehouseDatabase

Subject = Customer

Auto PolicyProcessing

System

Auto PolicyProcessing

System

Customer data stored in severaldatabases

Fire PolicyProcessing

System

Fire PolicyProcessing

System

FACTS, LIFECommercial, Accounting

Applications

FACTS, LIFECommercial, Accounting

Applications

Page 13: Introduction to data warehousing

Time - Variant

13

Data is tagged with some element of time - creation date, as of date, etc.

Data is available on-line for long periods of time for trend analysis and forecasting. For example, five or more years

Data Warehouse Data

Time Data

{Key

• Data is stored as a series of snapshots or views which record how it is collected across time.

Page 14: Introduction to data warehousing

Non-Volatile

14

• Existing data in the warehouse is not overwritten or

updated. External Sources

• Read-Only

DataWarehouseDatabaseData

WarehouseEnvironment

Data Warehouse

Environment

ProductionDatabases

ProductionApplications

ProductionApplications

• Update• Insert• Delete

• Load

Page 15: Introduction to data warehousing

Subject Oriented

15

• Example for an insurance company :

PolicyPolicyCustomerCustomer

Data

LossesLosses PremiumPremium

Commercial and Life

Insurance Systems

Commercial and Life

Insurance Systems

Auto and Fire Policy

Processing Systems

Auto and Fire Policy

Processing Systems

Data

Accounting System

Accounting System

Claims Processing

System

Claims Processing

SystemBilling System

Billing System

Applications Area Data Warehouse

Page 16: Introduction to data warehousing

Data Warehouse Architecture

It is based on a relational database management system server that function as the central repository for informational data

Page 17: Introduction to data warehousing

17

Conversion & Interface OLAP

Cubes

Ad-hocReporting

CannedReports

Data MartsStaging AreaODS

Operational System Data Warehouse

Page 18: Introduction to data warehousing

Data Warehouse Architecture

The source data for it is operational applicationDuring processing data is transformed into an

integrated structure and formatThe transformation process may involve

conversion, summarization, filtering and condensation of data

Page 19: Introduction to data warehousing
Page 20: Introduction to data warehousing

References:

Introduction to data warehousing.wikipedia.org/wiki/Data_warehousewww.slideshare.net/datacleaners11/datawarehousingppt

www.4shared.com/office/pLEWhceH/Data_Warehousing.html

www.cse.iitb.ac.in/dbms/Data/Talks/krithi-talk-impact.ppt

Page 21: Introduction to data warehousing