introduction to data warehousing
Post on 08-Jun-2015
497 Views
Preview:
TRANSCRIPT
INTRODUCTION TO DATA
WAREHOUSING
PRESENTED BY
GANESH DHARESHWAR ABHIJAN GHOSH
What is data warehousing?
data warehouse is a database used for reporting and analysis Integrated collection of ENTERPRISE-WIDE DATA, oriented to decision
making Provides strategic informationPerforming Information analysis that could not
done by operating system
Need for data warehousing
Maintain data historyEven if the source transaction systems do not.
Integrate data from multiple source systems, Improve data quality by providing consistent
codes and descriptionsProvides a flexible, conducive and interactive
source of strategic informationPerforming Information analysis that could not
done by operating system
Data Rich, but Information Poor
• Data is stored, not explored : by its volume and complexity it represents a burden, not a support
• Data overload results in uninformed decisions, contradictory information, higher overhead, wrong decisions, increased costs
• Data is not designed and is not structured for successful management
decision making
Improving Decision Making
5
Data
Information
Decisions
Data Warehouse
Operational data storesData focuses on transaction functions such as bank card withdrawals and depositsIt is organised by application It contains the current valuesIt supports day-to-day operational decision
supports information it is detailed , nonredundant and updateable
ODS
Informational data stores It is organised around subject
such as customer, productIt is summarized, archived,
derivedData is static until refreshedData is nonupdateable
Difference between operational &informational data stores
OperationalData
Informational data
Data content
Data organizationData stabilityData structure
Access frequencyAccess type
Response time
Current value
By application Dynamic Optimized for transaction
HighRead/update/deleteField by fieldSubsecond(<1s) to2-3s
Summarized, archived, derived
By subjectStatic until refreshedOptimized for complexQueriesMedium to lowRead/aggregateAdded toSeveral second to minute
Data warehousing is defined as
A data warehouse is a subject-oriented, integrated,
time-variant, non-volatile collection of data in support of management decision
A data warehouse is designed for easy access by users to large amounts of information, and data access is typically supported by specialized analytical tools and applications.
Data Warehouse Characteristics
It is database designed for analytical tasks, using data from multiple application
It supports a relatively small numbers of users with relatively long interaction
Its content is periodically updatedIt contains current and historical data to
provide a historical perspective of informationIt contains a few large tables
Integrated
12
• Data is stored once in a single integrated location(e.g. insurance company)
Data WarehouseDatabase
Subject = Customer
Auto PolicyProcessing
System
Auto PolicyProcessing
System
Customer data stored in severaldatabases
Fire PolicyProcessing
System
Fire PolicyProcessing
System
FACTS, LIFECommercial, Accounting
Applications
FACTS, LIFECommercial, Accounting
Applications
Time - Variant
13
Data is tagged with some element of time - creation date, as of date, etc.
Data is available on-line for long periods of time for trend analysis and forecasting. For example, five or more years
Data Warehouse Data
Time Data
{Key
• Data is stored as a series of snapshots or views which record how it is collected across time.
Non-Volatile
14
• Existing data in the warehouse is not overwritten or
updated. External Sources
• Read-Only
DataWarehouseDatabaseData
WarehouseEnvironment
Data Warehouse
Environment
ProductionDatabases
ProductionApplications
ProductionApplications
• Update• Insert• Delete
• Load
Subject Oriented
15
• Example for an insurance company :
PolicyPolicyCustomerCustomer
Data
LossesLosses PremiumPremium
Commercial and Life
Insurance Systems
Commercial and Life
Insurance Systems
Auto and Fire Policy
Processing Systems
Auto and Fire Policy
Processing Systems
Data
Accounting System
Accounting System
Claims Processing
System
Claims Processing
SystemBilling System
Billing System
Applications Area Data Warehouse
Data Warehouse Architecture
It is based on a relational database management system server that function as the central repository for informational data
17
Conversion & Interface OLAP
Cubes
Ad-hocReporting
CannedReports
Data MartsStaging AreaODS
Operational System Data Warehouse
Data Warehouse Architecture
The source data for it is operational applicationDuring processing data is transformed into an
integrated structure and formatThe transformation process may involve
conversion, summarization, filtering and condensation of data
References:
Introduction to data warehousing.wikipedia.org/wiki/Data_warehousewww.slideshare.net/datacleaners11/datawarehousingppt
www.4shared.com/office/pLEWhceH/Data_Warehousing.html
www.cse.iitb.ac.in/dbms/Data/Talks/krithi-talk-impact.ppt
top related