concepts and components chapters 1, 2, 7, 14, 15 1
TRANSCRIPT
![Page 1: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/1.jpg)
1
DATA WAREHOUSINGConcepts and Components
Chapters 1, 2, 7, 14, 15
![Page 2: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/2.jpg)
2
Agenda
Background Data Warehouse vs Operational Data
Store Characteristics of a Data Warehouse Improvements in Data Warehousing Relationship to Business Intelligence
![Page 3: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/3.jpg)
3
Evolution of Decision Support Technologies Business people need information to make plans,
decisions, and assess results 60's
Batch reports 70’s
DSSs 80’s
Info Centers 90’s
Early DWs 2000's
Business Intelligence Issues:
Dependency on IT resources Based on OLTP or extracts Functionality often pre-programmed "Big Data" Analytics
![Page 4: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/4.jpg)
4
DW vs. Business Intelligence Short:
DW = populating structures with data BI = using DW data
Long: DW = body of historical data, separate from the
operations of the organization, used to create BI BI = the delivery of timely, accurate, and
useful information to decision-makers Broad:
BI = a broad category of applications, technologies, and organizational processes for gathering, storing, accessing, and analyzing data to help business users make better decisions
![Page 5: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/5.jpg)
5
Need for Decision-Optimized Data Storage Business people need information to
make plans, decisions, and assess results What were sales volumes by region and
product category for the last 3 years? Which of two new medications will result in
the best outcomes (higher recovery rate and shorter hospital stay)?
Data captured by complex operational systems (OLTPs) optimized to support well-defined transaction requirements
Difficult to get needed information from data grounded in OLTPs
![Page 6: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/6.jpg)
6
Operational vs. Informational Data Operational (ie,
OLTP)Informational (ie, OLAP)
Data Content Current Values Historical, derived, summarized
Data Structure
Optimized for transactions
Optimized for complex queries
Data Volume MB/GB of data GB/TB/PB… of data
Access Frequency
High Medium to low
Access Type Read, update, delete Read-only
Usage Predictable, repetitive Ad hoc, random, heuristic
Response Time
Sub-seconds Several seconds to minutes
Users Large number; operational & data workers
Relatively smaller number; data & knowledge workers
![Page 7: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/7.jpg)
7
Data Warehouse“… a subject-oriented, integrated, nonvolatile,
and time variant collection of data in support of management decisions.” Managing the Data Warehouse, W. H. Inmon, John
Wiley & Sons, December, 1996.
“… a copy of transaction data specifically structured for query and analysis.” The Data Warehouse Toolkit, R. Kimball, John Wiley
& Sons, February, 1996.
Enterprise data, transformed, integrated, accumulated over time, optimized for decision-making, and accessible via analytical tools
![Page 8: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/8.jpg)
8
Characteristics of a DW (ala Inmon) Subject-Oriented
As opposed to business-process oriented Integrated
Multiple sources, internal and external Critical part of DW implementation
Time-Variant History, time periods important
Non-Volatile DW data not changed once stored
![Page 9: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/9.jpg)
9
Subject Orientation Data organized based on:
How users refer to it Subject areas of interest to users Areas important for tracking success, performance
Often based on transactions
Graphic Source: http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=22&ved=0CHUQFjALOAo&url=http%3A%2F%2Fwww.csun.edu%2F~hcmgt004%2FDWPresSp05.ppt&ei=U1S9VMuOJsTCgwSes4HADg&usg=AFQjCNGvjoMayU2-79jot8f2zb4J_IbZGQ&bvm=bv.83829542,d.eXY&cad=rja
![Page 10: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/10.jpg)
10
Characteristics of a DW, cont… Subject-Oriented
Needs are business subject-focused Integrated
Multiple sources, internal and external Time-Variant
History, time periods important Non-Volatile
DW data not changed once stored Data Granularity
![Page 11: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/11.jpg)
11
Data Granularity*
Level of detail stored in database Operational focus Analytical focus
Examples: Life Insurance Policy vs. Life Insurance
Coverage Product Category vs. Product Sales
High granularity (eg, transactional grain) is most flexible
* See http://www.kimballgroup.com/2007/07/keep-to-the-grain-in-dimensional-modeling/ for excellent (and brief) description of granularity
![Page 12: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/12.jpg)
12
Challenges in Early DW Implementation1. Improper or infeasible architectures,
approaches2. Insufficient attention to
organizational strategy and culture3. Early information delivery tools too
complex for business users4. Storage technology made it difficult
to store much detail or history, and slow to process
![Page 13: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/13.jpg)
13
Improved Technology User-friendly tools for analysis,
visualization Excel Tableau Reporting Services, …
Improved technology for accessing, aggregating, partitioning data
Advances in processing technology Parallel processing In-Memory data warehouses
Advances in storage technology RAID Solid State
![Page 14: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/14.jpg)
14
Improved Architectures
Based on Data Marts Conformed dimensions BI-emphasis
![Page 15: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/15.jpg)
15
Data Warehouse vs. Data Marts
Enterprise Data Warehouse Information about ALL subjects
important to the organization
Pros Cons
Requires a corporate effort Requires a corporate effort
Single, central storage Longer (costlier) to implement
Centralized control, architecture
Higher risk of failure
![Page 16: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/16.jpg)
16
Data Warehouse vs. Data Marts, cont…
Pros Cons
Faster implementation Can introduce redundant data
Earlier return on investment
Can make data mart integration more complex
Less risk of failure
Gives project team time to learn, grow
Data Marts Subsets of data warehouse that focus on
a selected subject area; typically departmental in nature
![Page 17: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/17.jpg)
17
Data Warehouse Architecture: Basic
![Page 18: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/18.jpg)
18
Data Warehouse Architecture: Types
![Page 19: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/19.jpg)
19
BI Architecture
Source: Chaudhuri et. al. , An Overview of Business Intelligence Technology, Communications of the ACM, 54(8), August 2011, pp. 88-98.
![Page 20: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/20.jpg)
20
BI Architecture, cont…
Source: Oracle Corporation. Information Management and Big Data: A Reference Architecture, Oracle White Paper, February 2013, p. 12.
![Page 21: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/21.jpg)
21
BI Architecture as “LDW”
Source: http://skylandtech.net/2014/09/22/a-modern-data-warehouse-architecture-part-1-add-a-data-lake/
![Page 22: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/22.jpg)
22
Architecture Components
1. Data Sources2. Data Staging (Movement)3. Data Storage (Warehouse) 4. Data Analysis/Discovery (Mid-tier)5. Information Delivery (Front-end
Presentation)
![Page 23: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/23.jpg)
23
1. Data Sources
Identifying required business data from Production Internal, Personal Archived External
![Page 24: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/24.jpg)
24
2. Data Staging Extract
From source systems Transform
Cleanse Supplement Convert Combine…
Load Populate data warehouse/mart tables
![Page 25: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/25.jpg)
25
3. Data Storage
Data Warehouse / Data Mart Relational database for structured data Non-relational (e.g., Hadoop) data store
for "loosely-structured" data Metadata
Relational database Catalog Extended properties Custom tables
External products/tools Spreadsheets…
![Page 26: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/26.jpg)
26
4. Data Analysis: Supporting Knowledge Discovery
Forms of Discovery
Know What Info to Look For
Know How to Explore Info
Layout-Led Discovery
X X
Data-Led Discovery
X
Model-Led Discovery
Layout-Led Discovery Pre-Designed Reports
Data-Led Discovery OLAP Analysis
Model-Led Discovery Data Mining
![Page 27: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/27.jpg)
27
Pre-Defined Reports
Information pushed to user Content and layout pre-determined Can be parameter-driven Can support some drill-down May also include basic report
development
![Page 28: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/28.jpg)
28
OLAP Online Analytical Processing
Providing On-Line Analytical Processing to User Analysts, E. F. Codd, Codd & Date, Inc 1993.
Short Definition: Class of applications or tools that support
ad-hoc analysis of multidimensional data Longer Definition:
“…technology that enables [users]… to gain insight into data through…fast, consistent, interactive access [to]…information that has been transformed…to reflect the real dimensionality of the enterprise…” OLAP Council (www.olapcouncil.org)
![Page 29: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/29.jpg)
29
Data Mining
Search for patterns in large amounts of data Making connections/associations with
data Predicting future outcomes
OLAP vs. Data Mining “Report on the past” vs. “Predict the
future”
Part of Knowledge Discovery…
![Page 30: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/30.jpg)
30
5. Information Delivery
![Page 31: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/31.jpg)
31
Examples of Uses of BI/DW
Industry Use
Retail Customer Loyalty
Customer Service Effectiveness
Financial Fraud Detection
Profitability by product/LOB
Airlines Route Profitability
Customer Profitability
Manufacturing Cost Reduction Opportunities
Product Shipments
Non Profits Giving Campaign Effectiveness
Salvation Army Bell Ringer Effectiveness
Government Manpower Planning
School Academic Performance
Medical Patient Risk for Disease
![Page 32: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/32.jpg)
32
More Examples…
Industry Use
Marketing Churn Reduction
Brand/Product/Company Perception
Sales Cross-Selling Opportunities
Technology/Consulting
Estimate future engagements
![Page 33: Concepts and Components Chapters 1, 2, 7, 14, 15 1](https://reader036.vdocuments.us/reader036/viewer/2022062314/56649d945503460f94a7b36e/html5/thumbnails/33.jpg)
33
Next Time…
Data Warehouse Design (Dimensional Modeling)