introduction to data vault - dama oregon 2012

70
Introduction Data Vault Model & Methodology © Dan Linstedt, 2011-2012 all rights reserved Prepared for: DAMA Oregon, July 2012 1

Upload: empowered-holdings-llc

Post on 19-Jan-2015

5.591 views

Category:

Business


5 download

DESCRIPTION

DAMA, Oregon Chapter, 2012 presentation - an introduction to Data Vault modeling. I will be covering parts of the methodology, comparison and contrast of issues in general for the EDW space. Followed by a brief technical introduction of the Data Vault modeling method.After the presentation i I will be providing a demonstration of the ETL loading layers, LIVE!You can find more on-line training at: http://LearnDataVault.com/training

TRANSCRIPT

Page 1: Introduction To Data Vault - DAMA Oregon 2012

1

IntroductionData Vault Model &

Methodology© Dan Linstedt, 2011-2012 all rights reserved

Prepared for: DAMA Oregon, July 2012

Page 2: Introduction To Data Vault - DAMA Oregon 2012

2

Who’s Using It?

Page 3: Introduction To Data Vault - DAMA Oregon 2012

3

The Experts Say…“The Data Vault is the optimal choice for modeling the EDW in the DW 2.0 framework.” Bill Inmon

“The Data Vault is foundationally strong and exceptionally scalable architecture.”

Stephen Brobst

“The Data Vault is a technique which some industry experts have predicted may spark a revolution as the next big thing in data modeling for enterprise warehousing....” Doug Laney

Page 4: Introduction To Data Vault - DAMA Oregon 2012

4

More Notables…

“This enables organizations to take control of their data warehousing destiny, supporting better and more relevant data warehouses in less time than before.”

Howard Dresner

“[The Data Vault] captures a practical body of knowledge for data warehouse development which both agile and traditional practitioners will benefit

from..”Scott Ambler

Page 5: Introduction To Data Vault - DAMA Oregon 2012

5

Agenda• Introduce Yourselves…• What is a Data Vault? Where does it come from?• Pros & Cons of Data Modeling for EDW• Current EDW Issues & Pains• Consequences of Implementing the Pains…• How do we “Fix” This?• Keys to Success• When “NOT” to use a Data Vault• Ontologies and Data Vault• A Working Example• Query Performance (PIT & Bridge)• Conclusion (break)• Live Demo

Page 6: Introduction To Data Vault - DAMA Oregon 2012

6

Introduce Yourselves• Your Expectations?• Your Questions?• Your Background?• Areas of Interest?• What are the top 3 pains your

EDW/BI solution is experiencing?

• About Me…o http://www.LinkedIn.com/dlinstedt

• Learn More Data Vault on-line at:o http://LearnDataVault.com/training

Page 7: Introduction To Data Vault - DAMA Oregon 2012

7

Where did it come from? What is it?

Defining the Data Vault Space

Page 8: Introduction To Data Vault - DAMA Oregon 2012

8

Data Warehousing Time Line

The Data Vault Model & Methodology took 10 years of R&D to become consistent, flexible, and scalable.

Page 9: Introduction To Data Vault - DAMA Oregon 2012

9

What IS a Data Vault? (Business

Definition)

• Data Vault Modelo Detail orientedo Historical traceabilityo Uniquely linked set of

normalized tableso Supports one or more

functional areas of business

• Data Vault Methodology– CMMI, Project Plan– Risk, Governance, Versioning– Peer Reviews, Release Cycles– Repeatable, Consistent,

Optimized– Complete with Best Practices

for BI/DW

• Data Vault Architecture– 3 Tier Architecture (for

including Batch & Unstructured Data)

– 2 Tier Architecture (for Real-Time only)

Page 10: Introduction To Data Vault - DAMA Oregon 2012

10

The Data Vault Model

Customer

Sat

Sat

Sat

F(x)

Customer

Product

Sat

Sat

Sat

F(x)

Product

Order

Sat

Sat

Sat

F(x)

Order

Elements:•Hub•Link•Satellite

Link

F(x)

Sat

Records a history of the interaction

Hub = List of Unique Business KeysLink = List of Relationships, AssociationsSatellites = Descriptive Data

Page 11: Introduction To Data Vault - DAMA Oregon 2012

11

Data Vault Methodology

1

2

3

4

5

Process unpredictable and poorly controlled

Risk assessments / analysis, managed processes, basic alignment efforts

Defined Business Processes, Defined Goals, Defined Objectives

Metrics, Estimates vs Actuals, Function Point Analysis, Identification of broken processes

Optimized business processes, repeatable, scalable, fault-tolerant. Automatable (generatable)

Follows: SEI/CMMI Level 5, PMP, Six Sigma, TQM, and Agile elements

Page 12: Introduction To Data Vault - DAMA Oregon 2012

12

Data Vault Architecture

Sales

Finance

Contracts

StagingEDW

(Data Vault)

StarSchemas

ErrorMarts

ReportCollections

Enterprise BI SolutionSOA

(real-time)

(batch)

(batch)

ComplexBusiness

Rules

The business rules are moved closer to the business, improving IT reaction time, reducing cost and minimizing

impacts to the enterprise data warehouse (EDW)

• Repeatable• Consistent• Fault-tolerant• Supports phased release

• Scalable• Auditable

FUNDAMENTAL GOALS

UnstructuredData

(Hadoop NoSQL)

Page 13: Introduction To Data Vault - DAMA Oregon 2012

13

Star Schemas, 3NF, Data Vault:

Pros & ConsDefining the Data Vault Space

Why NOT use Star Schemas as an EDW?Why NOT use 3NF as an EDW?

Why NOT use Data Vault as a Data Delivery Model?

Page 14: Introduction To Data Vault - DAMA Oregon 2012

14

Star Schema Pros/Cons as an EDW

PROS• Good for multi-dimensional

analysis• Subject oriented answers• Excellent for aggregation points• Rapid development /

deployment• Great for some historical

storage

CONS• Not cross-business functional• Use of junk / helper tables• Trouble with VLDW• Unable to provide integrated

enterprise information• Can’t handle ODS or

exploration warehouse requirements

• Trouble with data explosion in near-real-time environments

• Trouble with updates to type 2 dimension primary keys

• Trouble with late arriving data in dimensions to support real-time arriving transactions

• Not granular enough information to support real-time data integration

Page 15: Introduction To Data Vault - DAMA Oregon 2012

15

3nf Pros/Cons as an EDWPROS• Many to many linkages• Handle lots of information• Tightly integrated information• Highly structured• Conducive to near-real time

loads• Relatively easy to extend

CONS• Time driven PK issues• Parent-child complexities• Cascading change impacts• Difficult to load• Not conducive to BI tools• Not conducive to drill-down• Difficult to architect for an

enterprise• Not conducive to spiral/scope

controlled implementation• Physical design usually

doesn’t follow business processes

Page 16: Introduction To Data Vault - DAMA Oregon 2012

16

Data Vault Pros/Cons as an EDW

PROS• Supports near-real time and

batch feeds• Supports functional business

linking• Extensible / flexible• Provides rapid build / delivery

of star schema’s• Supports VLDB / VLDW• Designed for EDW• Supports data mining and AI• Provides granular detail• Incrementally built

CONS• Not conducive to OLAP

processing• Requires business

analysis to be firm• Introduces many join

operations

Page 17: Introduction To Data Vault - DAMA Oregon 2012

17

Analogy: The Porsche, the SUV and the Big Rig

• Which would you use to win a race?• Which would you use to move a house?• Would you adapt the truck and enter a race with Porches and expect to

win?

Page 18: Introduction To Data Vault - DAMA Oregon 2012

18

Current EDW Issues and Pains

Business Rule Processing, Lack of Agility, and Future proofing your new solution

Page 19: Introduction To Data Vault - DAMA Oregon 2012

19

Current EDW Project IssuesThis is NOT what you want happening to your project!

THE GAP!!

Page 20: Introduction To Data Vault - DAMA Oregon 2012

20

2 Tier EDW Architecture

• Quality routines• Cross-system dependencies• Source data filtering• In-process data manipulation

• High risk of incorrect data aggregation• Larger system = increased impact• Often re-engineered at the SOURCE• History can be destroyed (completely re-computed)

Sales

Finance

Contracts

Staging(EDW)

StarSchemas

Enterprise BI Solution

(batch)

Conformed DimensionsJunk Tables

Helper TablesFactless Facts

ComplexBusiness

Rules+Dependencies

Complex Business Rules #2

Staging + History

Page 21: Introduction To Data Vault - DAMA Oregon 2012

21

#1 Cause of BI Initiative Failure

Re-EngineeringFor

Every Change!Let’s take a look at one example…

Page 22: Introduction To Data Vault - DAMA Oregon 2012

22

Re-Engineering

Customer

CustomerTransactions

Sales

Finance

Current Sources

Source

Join

BusinessRules

Data Flow (Mapping)

CustomerPurchases

** NEW SYSTEM**

IMPACT!!

Page 23: Introduction To Data Vault - DAMA Oregon 2012

23

Federated Star Schema Inhibiting

Agility

Time

Effort& Cost

High

Low

Start MaintenanceCycle Begins

Changing and Adjusting conformed dimensions causes an exponential rise in the cost curve over time

RESULT: Business builds their own Data Marts!

Data Mart 1

Data Mart 2

Data Mart 3

The main driver for this is the maintenance costs, and re-engineering of the existing system which occurs for each new “federated/conformed” effort. This increases delivery time, difficulty, and maintenance costs.

Page 24: Introduction To Data Vault - DAMA Oregon 2012

24

What are the ROOT Causes?

The root causes of RE-ENGINEERING are:

1. Putting business rules up-stream of the EDW

2. Conforming Dimensions

Page 25: Introduction To Data Vault - DAMA Oregon 2012

25

Consequences of Implementing the

Pains…Business rules up-stream of your EDW and Conforming

Dimensions to store ALL history

Page 26: Introduction To Data Vault - DAMA Oregon 2012

26

Deformed Dimensions• Deformity: The URGE to continue “slamming data” into an existing

conformed dimension until it simply cannot sustain any further changes, the result: a deformed dimension and a HUGE re-engineering cost / nightmare.

Re-Engineering the Load Processes EACH

TIME!

…………………………………… ………………… ………………… ………………… ………………… ………………… …………………

V1Comple

xLoad

90 days, $125k

Business Change

………………………………………………………………………………………………………………………………………………………………………………………………………………………………

V2

Complex

Load

120 days, $200k

Business Change

………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ………………… ……………………………………

V3

Complex

Load

180 days, $275k

Business Change

Business Wants a Change!Business said: Just add that to the existing Dimension, it will be easy right?

Page 27: Introduction To Data Vault - DAMA Oregon 2012

27

Dimension-itis• DimensionItis: Incurable Disease, the symptoms are the creation

of new dimensions because the cost and time to conform existing dimensions with new attributes rises beyond the business ability to pay…

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………... …………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………... …………………...

…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...…………………...

…………………...…………………...…………………...…………………...…………………...…………………...…………………...

Business Says: Avoid the re-engineering costs, just “copy” the dimensions and create a new one for OUR department…

What happens when we (IT) give in to this? …

Page 28: Introduction To Data Vault - DAMA Oregon 2012

28

Result: Silo Data Junkyards!

• Business Says: Take the dimension you have, copy it, and change it… This should be cheap, and easy right?

Customer_IDCustomer_NameCustomer_AddrCustomer_Addr1Customer_CityCustomer_StateCustomer_ZipCustomer_PhoneCustomer_TagCustomer_ScoreCustomer_RegionCustomer_StatsCustomer_PhoneCustomer_Type

First Star

Customer_IDCustomer_NameCustomer_AddrCustomer_Addr1Customer_CityCustomer_StateCustomer_ZipCustomer_PhoneFact_ABCFact_DEFFact_PDQFact_MYFACT

Customer_IDCustomer_NameCustomer_AddrCustomer_Addr1Customer_CityCustomer_StateCustomer_ZipCustomer_PhoneCustomer_TagCustomer_ScoreCustomer_RegionCustomer_StatsCustomer_PhoneCustomer_Type

Customer_IDCustomer_NameCustomer_AddrCustomer_Addr1Customer_CityCustomer_StateCustomer_ZipCustomer_PhoneCustomer_TagCustomer_ScoreCustomer_RegionCustomer_StatsCustomer_PhoneCustomer_Type

Customer_IDCustomer_NameCustomer_AddrCustomer_Addr1Customer_CityCustomer_StateCustomer_ZipCustomer_PhoneCustomer_TagCustomer_ScoreCustomer_RegionCustomer_StatsCustomer_PhoneCustomer_Type

We built our own because IT costs too much…

SALES

We built our own because IT took too long…

FINANCE

We built our own because we needed customized dimension data…

MARKETING

Business ChangeTo Modify Existing Star = 180 days, $275k

Page 29: Introduction To Data Vault - DAMA Oregon 2012

29

Accountability In Question?Corporate Fraud Accountability Title XI consists of seven sections. Section 1101 recommends a name for this title as “Corporate Fraud Accountability Act of 2002”. It identifies corporate fraud and records tampering as criminal offenses and joins those offenses to specific penalties. It also revises sentencing guidelines and strengthens their penalties. This enables the SEC to temporarily freeze large or unusual payments.

Source 1

Source 2

Source 3

Business RulesChangeData!

Staging

HR Mart

Sales Mart

Finance Mart

Are changes to data ON THE WAY IN to the EDW equivalent to records tampering?

Page 30: Introduction To Data Vault - DAMA Oregon 2012

30

How do we “fix” this?Answer: Move the business rules downstream, AND no-

longer be forced to conform dimensions.

Page 31: Introduction To Data Vault - DAMA Oregon 2012

31

It’s Not Just a Data Model…

SUCCESS!

Model Methodology

Page 32: Introduction To Data Vault - DAMA Oregon 2012

32

Move the Business Rules

Downstream• No “Conforming” of Dimensions on the way in to the EDW• Hold on… We do distinguish between HARD and SOFT

business rules…

Page 33: Introduction To Data Vault - DAMA Oregon 2012

33

Hard & Soft Business RulesHard Business Rules Soft Business Rules

• Data Domain Alignment (Data Type Matching)

• Normalization (where necessary)

• System Column Computation

• Any requirement the business user states, that, when applied, CHANGES the data or CHANGES the meaning of the data (the grain or interpretation)

• Simple example that will knock the socks off your feet!

Page 34: Introduction To Data Vault - DAMA Oregon 2012

34

Progressive Agility and Responsiveness of

IT

Time

Effort& Cost

High

Low

Start MaintenanceCycle Begins

Foundational Base Built

New Functional Areas AddedInitial DV Build Out

Re-Engineering does NOT occur with a Data Vault Model. This keeps costs down, and maintenance easy. It also reduces complexity of the existing architecture.

Page 35: Introduction To Data Vault - DAMA Oregon 2012

35

NO Re-Engineering

Customer

CustomerTransactions

Sales

Finance

Current Sources

StageCopy

StageCopy

HubCustome

r

HubAcct

HubProduc

t

Link Transacti

on

Data Vault

CustomerPurchases

** NEW SYSTEM**

StageCopy

IMPACT!!

NO IMPACT!!!NO RE-ENGINEERING!

Page 36: Introduction To Data Vault - DAMA Oregon 2012

36

Keys to SuccessBringing the Data Vault to Your Project

Page 37: Introduction To Data Vault - DAMA Oregon 2012

37

Key: Flexibility

Adding new components to the EDW has NEAR ZERO impact to:• Existing Loading Processes• Existing Data Model• Existing Reporting & BI Functions• Existing Source Systems• Existing Star Schemas and Data Marts

No Re-

Engineeri

ng!

Page 38: Introduction To Data Vault - DAMA Oregon 2012

38

Case In Point:

Result of flexibility of the Data Vault Model allowed them to merge 3 companies in 90 days – that is ALL systems, ALL DATA!

Page 39: Introduction To Data Vault - DAMA Oregon 2012

39

Key: Scalability in Architecture

Scaling is easy, its based on the following principles• Hub and spoke design• MPP Shared-Nothing Architecture• Scale Free Networks

Page 40: Introduction To Data Vault - DAMA Oregon 2012

40

Case In Point:

Result of scalability was to produce a Data Vault model that scaled to 3 Petabytes in size, and is still growing today!

Page 41: Introduction To Data Vault - DAMA Oregon 2012

41

Key: Scalability in Team Size

You should be able to SCALE your TEAM as well!With the Data Vault methodology, you can:

Scale your team when desired, at different points in the project!

Page 42: Introduction To Data Vault - DAMA Oregon 2012

42

Case In Point:(Dutch Tax Authority)

Result of scalability was to increase ETL developers for each new source system, and reassign them when the system was completely loaded to the Data Vault

Page 43: Introduction To Data Vault - DAMA Oregon 2012

43

Key: Productivity

Increasing Productivity requires a reduction in complexity.The Data Vault Model simplifies all of the following:• ETL Loading Routines• Real-Time Ingestion of Data• Data Modeling for the EDW• Enhancing and Adapting for Change to the Model• Ease of Monitoring, managing and optimizing

processes

Page 44: Introduction To Data Vault - DAMA Oregon 2012

44

Case in Point:Result of Productivity was: 2 people in 2 weeks merged 3 systems, built a full Data Vault EDW, 5 star schemas and 3 reports.

These individuals generated:• 90% of the ETL code for moving the data

set• 100% of the Staging Data Model• 75% of the finished EDW data Model• 75% of the star schema data model

Page 45: Introduction To Data Vault - DAMA Oregon 2012

45

The Competing Bid?

The competition bid this with 15 people and 3 months to completion, at a cost of $250k! (they bid a Very complex system)

Our total cost? $30k and 2 weeks!

Page 46: Introduction To Data Vault - DAMA Oregon 2012

46

Results?

Changing the direction of the river takes less effort than stopping the flow

of water

Page 47: Introduction To Data Vault - DAMA Oregon 2012

47

Now you know WHY to use a Data Vault….

But are there times when you shouldn’t use a Data Vault?

Page 48: Introduction To Data Vault - DAMA Oregon 2012

48

When NOT to use the Data Vault

A review of some reasons why not to use a Data Vault Model

Page 49: Introduction To Data Vault - DAMA Oregon 2012

49

When NOT to Use the Data Vault

• You have:o a small set of point solution requirementso a very short time-frame for deliveryo To use the data one-time, then throw it awayo a single source system, single source applicationo A single business analyst in the entire company

• You do NOT have:o audit requirements forcing you to keep historyo multiple data center consolidation effortso near-real-time to worry abouto massive batch data to integrateo External data feeds outside your controlo Requirements to do trend analysis of all your datao Pain – that forces you to reengineer every time you ask for a

change to your current data warehousing systems

Page 50: Introduction To Data Vault - DAMA Oregon 2012

50

Ontologies & Data VaultHub, Link, Satellite - Definitions

Page 51: Introduction To Data Vault - DAMA Oregon 2012

51

Business Keys = Ontology

Business Keys should be arranged in an ontology

In order to learn the dependencies of the data

set

Drug Label Code

Product Number

Firm Name

NDA Application #

Drug Listing

Patent Use Code

Patent Number

Dose Form Code

NOTE: Different Ontologies represent different business

views of the data!

Page 52: Introduction To Data Vault - DAMA Oregon 2012

52

Associations = Ontological Hooks

Business Keys are associated by many linking factors, these links comprise the associations in the

hierarchy.

Product Number

Firm Name

NDA Application #

Drug ListingFirms Generate

Product Listings

Firms Manufacture

Products

Listings for Products are in NDA Applications

Page 53: Introduction To Data Vault - DAMA Oregon 2012

53

Descriptors = Context

Descriptors provide the context at a specific point in time – they are the warehousing portion of

the Data Vault

Firm Name Firm Locations

Drug ListingFirms Generate

Product Listings

Listing Formulatio

n

Product NumberFirms

Manufacture Products

ProductIngredientsStart & End of

manufacturing

Page 54: Introduction To Data Vault - DAMA Oregon 2012

54

A working ExampleNational Drug Codes + Orange Book of Drug Patent

Applications

http://www.accessdata.fda.gov/scripts/cder/ndc/default.cfm http://www.fda.gov/Drugs/InformationOnDrugs/ucm129662.htm

Page 55: Introduction To Data Vault - DAMA Oregon 2012

55

Hub Table Structures

SQN = Sequence (insertion order)LDTS = Load Date (when the Warehouse first sees the data)

RSRC = Record Source (System + App where the data ORIGINATED)

Page 56: Introduction To Data Vault - DAMA Oregon 2012

56

Link Table Structures

Note: A Link is really no different than a factless fact!

Page 57: Introduction To Data Vault - DAMA Oregon 2012

57

Satellite Table Structures

SQN = Sequence (parent identity number)LDTS = Load Date (when the Warehouse first sees the data)

LEDTS = End of lifecycle for superseded recordRSRC = Record Source (System + App where the data ORIGINATED)

Page 58: Introduction To Data Vault - DAMA Oregon 2012

58

In Review…• Data Vault is…

o A Data Warehouse Model & Methodologyo Hub and Spoke Designo Simple, Easy, Repeatable Structureso Comprised of Standards, Rules & Procedureso Made up of Ontological Metadatao AUTOMATABLE!!!

• Hubs = Business Keys• Links = Associations / Transactions• Satellites = Descriptors

Page 59: Introduction To Data Vault - DAMA Oregon 2012

59

Why do we build Links this way?

Page 60: Introduction To Data Vault - DAMA Oregon 2012

History Teaches Us…If we model for ONE relationship in the EDW, we BREAK the

others!

60

Portfolio

Customer

M

M

5 yearsFrom now X

Portfolio

Customer

M

1

10 Years ago X

Portfolio

Customer

1

MToday:

Hub Portfolio

Hub Customer

1

M

The EDW is designed to handle TODAY’S relationship, as soon as history is loaded, it breaks the model!

This situation forces re-engineering of the model, load routines, and queries!

Page 61: Introduction To Data Vault - DAMA Oregon 2012

61

History Teaches Us…If we model with a LINK table, we can handle ALL the

requirements!

Portfolio

Customer

M

M

5 years from now

Portfolio

Customer

1

MToday:

Portfolio

Customer

M

1

10 Years ago This design is flexible, handles past, present, and future relationship changes with NO RE-ENGINEERING!

Hub Portfolio

Hub Customer

1

M

LNKCust-Port

M

1

Page 62: Introduction To Data Vault - DAMA Oregon 2012

Base EDW Created in CorporateFinancials in USA

HubHub

SatSatSatSat

HubHub

SatSatSatSat

LinkLink

SatSatSatSat

Applying the Data Vault to Global DW

HubHub

SatSatSatSatLinkLink

Manufacturing EDW in China

HubHub

SatSatSatSat

Planning in Brazil

LinkLink

HubHub

SatSatSatSatLinkLink

62

Page 63: Introduction To Data Vault - DAMA Oregon 2012

63

Query PerformancePoint-in-time and Bridge Tables, overcoming query issues

Page 64: Introduction To Data Vault - DAMA Oregon 2012

64

PIT Table Architecture

Hub Custome

r

HubOrder

Hub Product

Link Line Item

SatelliteLine Item

Sat 1

Sat 2

Sat 3

Sat 4

PIT Sat

Sat 1

Sat 2

Sat 3

Sat 4

PIT Sat

Sat 1

Sat 2

PARENT SEQUENCELOAD DATE{Satellite 1 Load Date}{Satellite 2 Load Date}{Satellite 3 Load Date}{…}{Satellite N Load Date}

Satellite: Point In Time

PrimaryKey

Page 65: Introduction To Data Vault - DAMA Oregon 2012

65

PIT Table Example

SQN LOAD_DTS NAME1 10-14-2000 Dan L1 11-01-2000 Dan Linedt1 12-31-2000 Dan Linstedt

SAT_CUST_CONTACT_NAMESQN LOAD_DTS CELL1 10-14-2000 999-555-12121 10-15-2000 999-111-12341 10-16-2000 999-252-28341 10-17-2000 999.257-28371 10-18-2000 999-273-5555

SAT_CUST_CONTACT_CELLSQN LOAD_DTS ADDR1 08-01-200026 Prospect1 09-29-2000 26 Prosp St.1 12-17-2000 28 November1 01-01-2001 26 Prospect St

SAT_CUST_CONTACT_ADDR

SQN LOAD_DTS SAT_NAME_LDTS SAT_CELL_LDTSSAT_ADDR_LDTS1 08-01-2000 NULL NULL 08-01-20001 09-01-2000 NULL NULL 08-01-20001 10-01-2000 NULL NULL 09-29-20001 11-01-2000 11-01-2000 10-18-2000 09-29-20001 12-01-2000 11-01-2000 10-18-2000 09-29-20001 01-01-2001 12-31-2000 10-18-2000 01-01-2001

Snapshot Date

Page 66: Introduction To Data Vault - DAMA Oregon 2012

66

BridgeTable Architecture

Hub Seller

Hub Product

Link

Satellite

Sat 1

Sat 2

Sat 3

Sat 4

Bridge

Hub Parts

Link

Satellite

UNIQUE SEQUENCELOAD DATE{Hub 1 Sequence #}{Hub 2 Sequence #}{Hub 3 Sequence #}{Link 1 Sequence #}{Link 2 Sequence #}{…}{Link N Sequence #}{Hub 1 Business Key}{Hub 2 Business Key}{…}{Hub N Business Key}

Satellite: BridgePrimary

Key

Page 67: Introduction To Data Vault - DAMA Oregon 2012

67

Bridge Table Data Example

SQN LOAD_DTS SELL_SQN SELL_ID PROD_SQN PROD_NUMPART_SQN PART_NUM1 08-01-2000 15 NY*1 2756 ABC-123-9K 525 JK*2*42 09-01-2000 16 CO*24 2654 DEF-847-0L 324 MN*5-23 10-01-2000 16 CO*24 82374 PPA-252-2A 9938 DD*2*34 11-01-2000 24 AZ*25 25222 UIF-525-88 7 UF*9*05 12-01-2000 99 NM*5 81 DAN-347-7F 16 KI*9-26 01-01-2001 99 NM*5 81 DAN-347-7F 24 DL*0-5

Snapshot Date

Bridge Table: Seller by Product by Part

Page 68: Introduction To Data Vault - DAMA Oregon 2012

68

Conclusions

Page 69: Introduction To Data Vault - DAMA Oregon 2012

69

Where To Learn More• The Technical Modeling Book:

http://LearnDataVault.com/

• On-Line Training direct from me:http://LearnDataVault.com/training

• The Discussion Forums: & eventshttp://LinkedIn.com – Data Vault Discussions

• Contact me:http://DanLinstedt.com - web [email protected] - email

Page 70: Introduction To Data Vault - DAMA Oregon 2012

70

LIVEDEMONSTRATION

Physical Demonstration, Loading Processes and Execution