the data warehousethe data warehousesmiertsc/4397cis/roiger_chapter06.pdf · the data warehousethe...
TRANSCRIPT
![Page 1: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/1.jpg)
The Data WarehouseThe Data Warehouse
Chapter 6
![Page 2: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/2.jpg)
6.1 Operational Databasesp
![Page 3: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/3.jpg)
Data Modeling and Normalizationg
O O R l i hi• One-to-One Relationships• One-to-Many Relationships• Many-to-Many Relationships
![Page 4: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/4.jpg)
Data Modeling and Normalizationg
• First Normal FormFirst Normal Form• Second Normal Form
Thi d N l F• Third Normal Form
![Page 5: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/5.jpg)
Type IDYear
Make
Income Range
Customer ID
Vehicle - Type Customer
Figure 6.1 A simple entity-relationship diagram
![Page 6: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/6.jpg)
The Relational ModelThe Relational Model
![Page 7: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/7.jpg)
Table 6.1a • Relational Table for Vehicle-Type
Type ID Make Year
4371 Chevrolet 19956940 Cadillac 20004595 Chevrolet 20014595 Chevrolet 20012390 Cadillac 1997
![Page 8: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/8.jpg)
Table 6.1b • Relational Table for Customer
C t ICustomer IncomeID Range ($) Type ID
0001 70–90K 23900002 30–50K 43710003 70 90K 69400003 70–90K 69400004 30–50K 45950005 70–90K 2390
![Page 9: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/9.jpg)
Table 6.2 • Join of Tables 6.1a and 6.1b
C t ICustomer IncomeID Range ($) Type ID Make Year
0001 70–90K 2390 Cadillac 19970002 30–50K 4371 Chevrolet 19950003 70 90K 6940 Cadillac 20000003 70–90K 6940 Cadillac 20000004 30–50K 4595 Chevrolet 20010005 70–90K 2390 Cadillac 1997
![Page 10: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/10.jpg)
6.2 Data Warehouse Designg
![Page 11: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/11.jpg)
The Data Warehouse
“A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile g , ,collection of data in support of management’s decision making process g g p(W.H. Inmon).”
![Page 12: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/12.jpg)
Granularity
Granularity is a term used to describe the level of detail of stored information.
![Page 13: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/13.jpg)
Dependent
ExternalData
pData Mart
Extract/Summarize Data
OperationalDatabase(s)
Decision Support SystemDataWarehouse
ETL Routine(Extract/Transform/Load)
IndependentData Mart
ReportData Mart
Figure 6.2 A data warehouse process model
![Page 14: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/14.jpg)
Entering Data into the WarehouseEntering Data into the Warehouse
• Independent Data Mart• ETL (Extract, Transform, Load Routine)ETL (Extract, Transform, Load Routine)• Metadata
![Page 15: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/15.jpg)
Structuring the Data Warehouse:Structuring the Data Warehouse: Two Methods
• Structure the warehouse model using the star schema
• Structure the warehouse model as a multidimensional arraymultidimensional array
![Page 16: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/16.jpg)
The Star SchemaThe Star Schema
• Fact Table• Dimension TablesDimension Tables• Slowly Changing Dimensions
![Page 17: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/17.jpg)
Purchase Key Category1 Supermarket2 Travel & Entertainment
Purchase Dimension
3 A t & V hi l Time Dimension
.
...
3 Auto & Vehicle4 Retail5 Restarurant6 Miscellaneous
Time Key Month10 Jan
Time DimensionYearQuarterDay
15 2002...
.
.
.
.
.
.
.
.
.
.
.
.. .
Cardholder Key Purchase Key1 2
Fact TableAmountTime KeyLocation Key
101 14.5015 4 115 8.251 2 103 22.40...
.
.
.
.
.
.
.
.
.
.
.
.
Location Key Street10 425 Church St
Location DimensionRegionStateCity
SCCharleston 3..
.
...
.
...
GenderMale
.Female
Income Range50 - 70,000
.70 - 90,000
Cardholder Key Name1 John Doe
. .2 Sara Smith
Cardholder Dimension
Figure 6.3 A star schema for credit card purchases
. . .. ...
.
...
.
.
![Page 18: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/18.jpg)
The Multidimensionality of the St S hStar Schema
![Page 19: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/19.jpg)
Cardholder Ci
1 2,10)
PurchaseKey A(C i,1
,2
Time Key
Location Key
Figure 6.4 Dimensions of the fact table shown in Figure 6.3
![Page 20: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/20.jpg)
Additional Relational SchemasAdditional Relational Schemas
• Snowflake Schema• Constellation Schema
![Page 21: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/21.jpg)
Time Key Month5 Dec
Time DimensionYearQuarterDay
431 20018 Jan 13 2002
10 J 15 2002Promotion Key Description
Promotion DimensionCost
1 t h 15 25...
.
.
.
.
.
.
.
.
.
.
.
.
10 Jan 15 2002...
.
.
.
.
.
.
1 watch promo 15.25
Purchase DimensionPurchase Key Category
1 Supermarket2 Travel & Entertainment
Purchase Dimension
3 Auto & Vehicle4 Retail5 Restarurant
Cardholder Key Purchase Key1 2
Purchase Fact TableAmountTime KeyLocation Key
101 14.5015 4 115 8.25
6 Miscellaneous
Cardholder Key Promotion Key1 1
Promotion Fact TableResponseTime Key
5 Yes2 1 5 No
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1 2 103 22.40...
.
.
.
.
.
.
.
.
.
Location Key Street5 425 Church St
Location DimensionRegionStateCity
SCCharleston 3...
.
.
.
.
.
.
.
.
.
.
.
.
GenderMale
.
.
Female
Income Range50 - 70,000
.
.
70 - 90,000
Cardholder Key Name1 John Doe
.
...
2 Sara Smith
Cardholder Dimension
Figure 6.5 A constellation schema for credit card purchases and promotions
. .. .
![Page 22: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/22.jpg)
Decision Support: Analyzing theDecision Support: Analyzing the Warehouse Data
• Reporting Data• Analyzing Data• Knowledge Discoveryg y
![Page 23: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/23.jpg)
6 3 On-line Analytical Processing6.3 On-line Analytical Processing
![Page 24: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/24.jpg)
OLAP O iOLAP Operations
• Slice – A single dimension operation• Dice – A multidimensional operationDice A multidimensional operation• Roll-up – A higher level of generalization
ill d A l l f d il• Drill-down – A greater level of detail• Rotation – View data from a new perspective
![Page 25: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/25.jpg)
Month = Dec.
Region = TwoCategory = Vehicle
Count = 110Amount = 6,720Region Two
Dec.
Sep.
Oct.
Nov.
May
Jun.
Jul.
Aug.
pM
onth
Mar.
Feb.
Apr.
y
Jan.
FourThreeTwo
Supe
rmar
ket
Mis
cella
neou
s
Res
taur
ant
Trav
el
Ret
ail
Vehi
cle
Category
RegionOne
wo
Figure 6.6 A multidimensional cube for credit card purchases
Category
![Page 26: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/26.jpg)
Concept Hierarchy
A mapping that allows attributes to beA mapping that allows attributes to beviewed from varying levels of detail.
![Page 27: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/27.jpg)
Region
State
CityCity
Street Address
Figure 6.7 A concept hierarchy for location
![Page 28: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/28.jpg)
Month = Oct./Nov/Dec.Category = SupermarketRegion = OneCategory = Supermarket
Q4
Q2
Q3
Tim
e
On
FourThreeTwoQ1
Supe
rmar
ket
isce
llane
ous
Res
taur
ant
Trav
el
Ret
ail
Vehi
cle
RegionOne
S
Mi
Category
Figure 6.8 Rolling up from months to quarters
![Page 29: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/29.jpg)
6.4 Excel Pivot Tables for Data AnalysisAnalysis
![Page 30: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/30.jpg)
Creating a Simple Pivot Table
![Page 31: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/31.jpg)
Figure 6.9 A pivot table template
![Page 32: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/32.jpg)
Figure 6.10 A summary report for income range
![Page 33: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/33.jpg)
Figure 6.11 A pie chart for income range
![Page 34: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/34.jpg)
i bl f h iPivot Tables for HypothesisTestingTesting
![Page 35: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/35.jpg)
Figure 6.12 A pivot table showing age and credit card insurance choice
![Page 36: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/36.jpg)
Figure 6.13 Grouping the credit card promotion data by age
![Page 37: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/37.jpg)
Figure 6.14 PivotTable Layout Wizard
![Page 38: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/38.jpg)
Creating a Multidimensional Pivot Table
![Page 39: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/39.jpg)
Watch Promo = No
Magazine Promo = YesLife Insurance Promo = Yes
No
ch P
rom
o
YesWat
c
No
agazineo
No
Yes
Yes
Life Insurance Promo
MagaPromo
Figure 6.15 A credit card promotion cube
![Page 40: The Data WarehouseThe Data Warehousesmiertsc/4397cis/Roiger_Chapter06.pdf · The Data WarehouseThe Data Warehouse Chapter 6. 6.1 Operational Databases. ... Data Warehouse ETL Routine](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e763f71bb1011641e2f60e3/html5/thumbnails/40.jpg)
Figure 6.16 A pivot table with page variables for credit card promotions