Data warehousing
A producer wants to know
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
Which customers are most likely to go to the competition?
What impact will new products/services have on revenue and margins?
What product promotions have the biggest impact on revenue?
What is the most effective distribution channel?
Data, Data everywhere
• An MNC can’t find the data it needs
o data is scattered over the network
o many versions, subtle differences
• MNC can’t get the data it needs
o need an expert to get the data
• MNC can’t understand the data found
o available data poorly documented
• MNC can’t use the data found
o results are unexpected
o data needs to be transformed from one form to other
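A minimal sketch of what "many versions, subtle differences" and "transformed from one form to other" can look like in practice. The two source systems, their field names, and the sample record are hypothetical, chosen only for illustration.

```python
from datetime import datetime

# Hypothetical records: the same customer as stored by two source systems.
crm_rows = [{"cust_id": "C001", "name": "ACME Corp ", "joined": "2001-03-15"}]
billing_rows = [{"customer": "c001", "cust_name": "Acme Corp", "since": "15/03/2001"}]

def from_crm(row):
    """Map a (hypothetical) CRM record to one common form."""
    return {
        "customer_id": row["cust_id"].strip().upper(),
        "name": row["name"].strip().title(),
        "joined": datetime.strptime(row["joined"], "%Y-%m-%d").date(),
    }

def from_billing(row):
    """Map a (hypothetical) billing record to the same common form."""
    return {
        "customer_id": row["customer"].strip().upper(),
        "name": row["cust_name"].strip().title(),
        "joined": datetime.strptime(row["since"], "%d/%m/%Y").date(),
    }

def integrate(*batches):
    """Keep one record per customer_id; the first source seen wins."""
    merged = {}
    for batch in batches:
        for rec in batch:
            merged.setdefault(rec["customer_id"], rec)
    return merged

customers = integrate(map(from_crm, crm_rows), map(from_billing, billing_rows))
print(customers)
```

Once both versions are mapped to one form, the two records collapse into a single customer entry.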
What are the users saying...
• Data should be integrated across the enterprise
• Summary data has a real value to the organization
• Historical data holds the key to understanding data over time
• What-if capabilities are required
What is a Data Warehouse?
A single, complete and consistent store of current and historic data, obtained from a variety of different sources, and made available to end users in a way they can understand and use in business context.
What is Data Warehousing?
A process of transforming data into information and making it available to users in a timely enough manner to make a difference.
A technique for assembling and managing data from various sources for the purpose of answering business questions, thus enabling decisions that were not previously possible.
Data → Information
Warehouses are very Large Data Bases
• Terabytes -- 10^12 bytes: Walmart -- 24 Terabytes
• Petabytes -- 10^15 bytes: Geographic Info. Systems
• Exabytes -- 10^18 bytes: National Medical Records
• Zettabytes -- 10^21 bytes: Weather images
• Yottabytes -- 10^24 bytes: Intelligence Agency Videos
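The scale of these units can be checked with a few lines of arithmetic. This is only a sketch: the table and helper names are made up for illustration, and the 10^24 unit is given its standard SI name, yottabyte.

```python
# Decimal (SI) storage units as powers of ten, matching the list above.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

def to_bytes(amount, unit):
    """Convert an amount in the given unit to bytes."""
    return amount * UNITS[unit]

# The Walmart figure quoted on the slide: 24 terabytes.
walmart_bytes = to_bytes(24, "terabyte")
print(walmart_bytes)

# Each step up the list is a factor of 1000.
print(to_bytes(1, "petabyte") // to_bytes(1, "terabyte"))
```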
Data Warehouse Properties
o Subject Oriented
o Used to analyze business
o Summarized and refined
o Snapshot data
o Integrated Data
o Ad-hoc access
o Knowledge User (Manager)
o Thousands of Users
Components of the Warehouse
• Data Extraction and Loading
• The Warehouse
• Analyze and Query -- OLAP Tools
• Metadata
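The four components above can be sketched end to end with Python's built-in sqlite3 module standing in for the warehouse. The table layout, sample rows, and metadata descriptions are assumptions made for illustration; a real warehouse would use dedicated loading and OLAP tools.

```python
import sqlite3

# Hypothetical source extract: raw sales rows from an operational system.
source_rows = [
    ("2024-01-05", "north", "widget", "120.50"),
    ("2024-01-05", "NORTH", "gadget", "80.00"),
    ("2024-01-06", "south", "widget", " 60.25 "),
]

def extract_and_clean(rows):
    """Extraction and cleansing: normalize case, trim text, parse amounts."""
    for day, region, product, amount in rows:
        yield day, region.strip().lower(), product.strip().lower(), float(amount)

# The warehouse: an in-memory SQLite database stands in for the real store.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (day TEXT, region TEXT, product TEXT, amount REAL)")
warehouse.execute("CREATE TABLE metadata (table_name TEXT, column_name TEXT, description TEXT)")

# Loading: insert the cleansed rows.
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", extract_and_clean(source_rows))

# Metadata: describe what each column means (descriptions are assumed).
warehouse.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [
        ("sales", "region", "Sales region, lower-cased during cleansing"),
        ("sales", "amount", "Sale value in local currency"),
    ],
)
warehouse.commit()

# Analyze and query: a summary question a manager might ask.
revenue_by_region = dict(
    warehouse.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
print(revenue_by_region)
```

Without the cleansing step, "north" and "NORTH" would be reported as two separate regions.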
Data Warehouse Architecture
[Architecture diagram. Labels: Relational Databases, Legacy Data, Purchased Data, ERP Systems (sources); Optimized Loader; Extraction, Cleansing; Data Warehouse Engine; Analyze, Query; Metadata Repository]
Data Warehouse and Data Marts
OLAP
Data Mart: lightly summarized, departmentally structured
Data Warehouse Data: organizationally structured, atomic, detailed
From the Data Warehouse to Data Marts
[Diagram. Levels: Data Warehouse (Organizationally Structured), Departmentally Structured, Individually Structured; scale from More to Less: History, Normalized, Detailed; scale from Data to Information]
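One way to picture the step from warehouse to data mart is as a filter plus a light summary: detailed, organization-wide rows go in, and a smaller departmental view comes out. The sketch below assumes a hypothetical row layout (department, month, product, amount) and is not a prescribed design.

```python
from collections import defaultdict

# Hypothetical atomic warehouse rows: (department, month, product, amount).
warehouse_rows = [
    ("sales", "2024-01", "widget", 100.0),
    ("sales", "2024-01", "gadget", 50.0),
    ("sales", "2024-02", "widget", 75.0),
    ("marketing", "2024-01", "campaign", 30.0),
]

def build_mart(rows, department):
    """Keep one department's rows and summarize amounts by month."""
    totals = defaultdict(float)
    for dept, month, _product, amount in rows:
        if dept == department:
            totals[month] += amount
    return dict(totals)

sales_mart = build_mart(warehouse_rows, "sales")
print(sales_mart)
```

The mart holds less detail than the warehouse (product-level rows are gone), which matches the "less detailed, departmentally structured" end of the diagram.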
Application Areas
Industry                  Application
Finance                   Credit Card Analysis
Insurance                 Claims, Fraud Analysis
Telecommunication         Call record analysis
Transport                 Logistics management
Consumer goods            promotion analysis
Data Service providers    Value added data
Utilities                 Power usage analysis