ensemble modeling and data vault 2014
DESCRIPTION
Data Modeling for the Agile Data Warehouse means breaking down business concepts into lower level tables. This process of Unified Decomposition results in concept Ensembles - like the popular Data Vault modeling approach.TRANSCRIPT
Data Vault Modeling DW2.0 & Unstructured Data Big Data Agile DW Ensemble Modeling
Ensemble Modeling
& Data Vault
© 2014 Genesee Academy, LLC USA +1 303 526 0340 Sweden 072 736 8700 [email protected] www.GeneseeAcademy.com
2014
© 2014 Genesee Academy, LLC
Ensemble Modeling & Data Vault
A G E N D A • Ensemble Modeling • &Unified Decomposition • Data Vault Ensemble • Colors of Data Vault • Data Vault Hubs, Links and
Satellites • More Information
2
Author, Advisor, Speaker & Industry Analyst; President Genesee Academy LLC, Principal at
About Hans Hultgren:
Book available on Amazon.com
gohansgo
© 2014 Genesee Academy, LLC
A Saga of Data Warehousing
Once upon a time data warehousing was becoming more popular and everyone was eager to build their own. But whenever they tried they failed. They called upon their best to fix this but they just couldn’t solve the problem.
They discovered that meeting the needs of the data warehouse meant that the tables got too big and too hard to work with. They just could not handle changes over time. If the smallest thing changed it always meant they had to change the entire table. When just a single attribute was updated they had to insert a record for all of the attributes. All seemed lost.
But around the world there were rebels who questioned the conventional wisdom. And their voices were finally heard: Why not separate the things that change from the things that don’t change?
3
© 2014 Genesee Academy, LLC
Ensemble Modeling™
4
All the parts of a thing taken together, so that each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole – an Ensemble.
• With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts.
© 2014 Genesee Academy, LLC
Based on Unified Decomposition™
5
• With the EDW, we break things out into parts for flexibility, agility, and generally to facilitate the capture of things that are either interpreted in different ways or changing independently of each other.
• At the same time a core premise of data warehousing is integration and moving to a common standard view of unified concepts. So we also want to tie things together – Unify.
© 2014 Genesee Academy, LLC
THE DATA VAULT ENSEMBLE: APPLYING THE ENSEMBLE
6
© 2014 Genesee Academy, LLC
The Data Vault Ensemble
7
• The Data Vault Ensemble conforms to a single key – embodied in the Hub construct.
• The component parts for the Data Vault Ensemble include: – Hub The Natural Business Key – Link The Natural Business Relationships – Satellite All Context, Descriptive Data and History
© 2014 Genesee Academy, LLC
The Data Vault modeling approach
HUB
LINK
SAT
SAT
3NF Data Vault Dimensional
8
Details / Context
Core Concept Business Keys
Associations / Relationships
Entity Dim
© 2014 Genesee Academy, LLC
Modeling Comparison
Facts contain all three types of data… Dimensions can also contain all types
*** Requires complex loading routines for key dependencies…
Region
Sale Fact
Customer
Product Vendor Employee
Store
Details
Business Keys
Associations
Start Schema and Snow Flake Models:
9
© 2014 Genesee Academy, LLC
Modeling Comparison
Region Store
Sale
Sale LI
Customer
Product
Vendor
Employee
3rd Normal Form has the same issue: each construct – or Entity – typically contains a business key, one or more associations and also details (context, descriptive data)…
10
© 2014 Genesee Academy, LLC
Link
Colors of the Data Vault
Region
Vendor
Link Store
Sat
Product
Link
Customer
Employee
Sale
Sat
Sat Sat Sat Sat Sat Sat Sat
Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat
Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat Sat
Sat Sat Sat
Sat Sat Sat Sat Sat Sat
Link
11
Sat Sat Sat Sat
Sat Sat
Sat Sat
Sat Sat
Sat Sat Sat
Sat Sat Sat Sat Sat Sat
© 2014 Genesee Academy, LLC
Data Vault means thinking differently
12
Customer
Customer
• The minimal construct then for an “entity” such as “Customer” is now a
Hub with a set of Satellites
© 2014 Genesee Academy, LLC
Data Vault Modeling Process
• The Modeling Process for creating a Data Vault model includes three primary steps:
1) Identify and Model your Core Business Concepts • Business Interviews is at the heart of this step
What do you do? What are the main things you work with?
• Also find best/target Natural Business Key 2) Identify and Model your Natural Business Relationships
• Specific Unique Relationships • Be considerate of the Unit of Work and Grain
3) Analyze and Design your Context Satellites • Consider Rate of Change, Type of Data
and also the Sources of your data during design process
13
© 2014 Genesee Academy, LLC
Hubs
– A Hub Construct in Data Vault • contains Business Key • only the Business Key • contains No Context • is always 1:1 with EWBK
– A Hub Table contains only • Business Key • Surrogate Key (Data Warehouse) • Load Date / Time Stamp • Record Source
14
Record source
Date/Time Stamp
Business Key
H_Customer_SID
H_Customer
© 2014 Genesee Academy, LLC
Links
– A Link Construct in Data Vault • contains Relationship • only a Relationship • contains No Context • is always 1:1 with Relationship
– A Link Table contains only • 2-n FKs for the Relationship • Surrogate Key (Data Warehouse) • Load Date / Time Stamp • Record Source
15
L_Cust_Class_SID
H_Customer_SID
H_Sequence2_SID
Date/Time Stamp
Record source
L_Cust_Class Record source
Date/Time Stamp
Business Key
H_Customer_SID
H_Customer
– Unique – Specific – Natural
Business Relationship
© 2014 Genesee Academy, LLC
Satellites
– A Satellite Construct in Data Vault • contains Context only • has no FKs (no relationships) • Designed by * Rate of Change
* Type of Data * System…
– A Satellite Table contains only • Business Key FK + • Load Date / Time Stamp • Context Data… • Record Source
16
Context A Context B Context C
H_Customer_SID
Record source Context D
Date/Time Stamp
S_Customer
Record source
Date/Time Stamp
Business Key
H_Customer_SID
H_Customer
© 2014 Genesee Academy, LLC
About Data Vault Ensemble
17
Estimated 800 Data Vault based Data Warehouses around the world
© 2014 Genesee Academy, LLC
Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com [email protected]
gohansgo
Book DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
18
Online video-lesson training
DataVaultAcademy.com
DataVaultAcademy