datawarehouse & bi introduction
DESCRIPTION
Data warehouse and Business Intelligence IntroductionTRANSCRIPT
Dataware Housing &
Business IntelligenceAn Overview
By Shivmohan Purohit
Agenda
• Introduction• Data Warehousing• Online Analytical Processing• Data Mining• Q & A
2
What a firm/ Organization want to know….
3
Which are our lowest/highest margin
customers ?
Which are our lowest/highest margin
customers ?
Who are my customers and what products are they buying?
Who are my customers and what products are they buying?
Which customers are most likely to go to the competition ?
Which customers are most likely to go to the competition ?
What impact will new products/services
have on revenue and margins?
What impact will new products/services
have on revenue and margins?
What product prom--otions have the biggest
impact on revenue?
What product prom--otions have the biggest
impact on revenue?
What is the most effective distribution
channel?
What is the most effective distribution
channel?
What is a Data Warehouse?
A single, complete and consistent store of data obtained from a variety of different sources made available to end users in a what they can understand and use in a business context.
4
What is Data Warehousing?
A process of transforming data into information and making it available to users in a timely enough manner to make a difference
5
Data
Information
Data Warehousing -- It is a process
• Technique for assembling and managing data from various sources for the purpose of answering business questions. Thus making decisions that were not previous possible
• A decision support database maintained separately from the organization’s operational database
6
Data Warehousing• A data warehouse is a
– subject-oriented– integrated– time-varying– non-volatile
collection of data that is used primarily in organizational decision making.
7
• A data warehouse is organized around the major subjects of the organization such as customer, supplier, product, sales, etc..,
• Data warehouse provides a simple and concise view around a particular subject by excluding data that are not useful to the decision support process.
8
Data Warehousing
Type of DW Users
9
Explorers: Seek out the unknown and previously unsuspected rewards hiding in the detailed data
Farmers: Harvest information from known access paths
Tourists: Browse information
Application-Orientation vs. Subject-Orientation
10
Application-Orientation
Operational Database
LoansCredit Card
Trust
Savings
Subject-Orientation
DataWarehouse
Customer
VendorProduct
Activity
Functioning of Data warehousing
11
Data Source Cleaning Transformation
Data Warehouse
New Update
Data Warehouse Architecture
12
Data Warehouse Engine
Optimized Loader
ExtractionCleansing
AnalyzeQuery
Metadata Repository
RelationalDatabases
LegacyData
Purchased Data
ERPSystems
Star Schema
• A single fact table and for each dimension one dimension table
• Does not capture hierarchies directly
13
T ime
prod
cust
city
fact
date, custno, prodno, cityname, ...
Snowflake schema
• Represent dimensional hierarchy directly by normalizing tables.
• Easy to maintain and saves storage
14
T ime
prod
cust
city
fact
date, custno, prodno, cityname, ...
region
OLAP(Online analytical processing)
• A data warehouse stores data , but OLAP transform the data warehouse data into specific meaningful information.
• Therefore OLAP provides a user friendly environment for interactive data analysis.
15
OLAP OPERATION on the Multidimensional data
–Roll-up(GROUP)–Drill down(Less)–Slice and Dice(Pie)–Pivot(rotate)
16
Multi-dimensional Data
• “Hey…I sold $100M worth of goods”
17
MonthMonth1 1 22 3 3 4 4 776 6 5 5
Pro
du
ctP
rod
uct
Toothpaste Toothpaste
JuiceJuiceColaColaMilk Milk
CreamCream
Soap Soap
Regio
n
Regio
n
WWS S
N N
Dimensions: Dimensions: Product, Region, TimeProduct, Region, TimeHierarchical summarization pathsHierarchical summarization paths
Product Product Region Region TimeTimeIndustry Country YearIndustry Country Year
Category Region Quarter Category Region Quarter
Product City Month WeekProduct City Month Week
Office DayOffice Day
“Slicing and Dicing”
18
Product
Sales Channel
Regio
ns
Retail Direct Special
Household
Telecomm
Video
Audio IndiaFar East
Europe
The Telecomm Slice
Roll-up and Drill Down
• Sales Channel• Region• Country• State • Location Address• Sales Representative
19
Roll
Up
Higher Level ofAggregation
Low-levelDetails
Drill-D
ow
n
Nature of OLAP Analysis• Aggregation -- (total sales,
percent-to-total)• Comparison -- Budget vs.
Expenses• Ranking -- Top 10, quartile
analysis• Access to detailed and aggregate
data• Complex criteria specification• Visualization
20
Data Mining
• Data mining is sorting through data to identify patterns and establish relationships.
21
Data Mining (cont.)
22
Data Mining works with Warehouse Data
• Data Warehousing provides the Enterprise with a memory
23
• Data Mining provides the Enterprise with intelligence
24
Cleaning and Integration Databases
Data Warehouse
Flat Files
PatternsKnowledge
Selection and transformation
Data Mining
Data Mining Process
Thanks
Shivmohan Purohit
Q &A Discussion