the final frontier

Post on 13-Jan-2015

Agile Data Warehouse

The Final Frontier

Terry Bunio

@tbunio

bornagainagilist.wordpress.com

www.protegra.com

1. Data Warehouse Architecture

2. Spectre of the Agility

3. The Prime Directive

4. Captain, we need more visualization!

5. The United Federation of Planets

Data Warehouse

• Definition

– “a database used for reporting and data analysis. It is a central repository of data which is created by integrating data from multiple disparate sources. Data warehouses store current as well as historical data and are commonly used for creating trending reports for senior management reporting such as annual and quarterly comparisons.” – Wikipedia.org

Data Warehouse

• Can refer to:

– Reporting Databases

– Operational Data Stores

– Data Marts

– Enterprise Data Warehouse

– Cubes

– Excel?

– Others

Relational

• Relational Analysis

– Third Normal Form

– OLTP

– Normalized tables optimized for modification

Dimensional

• Dimensional Analysis

– Star Schema

– OLAP

– Facts and Dimensions optimized for retrieval

• Facts – Business events – Transactions

• Dimensions – context for Transactions

– Accounts

– Products

– Date
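A minimal sketch of the star schema idea above, using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made up for illustration; only the Account, Product, and Date dimensions come from the slide.

```python
import sqlite3

# In-memory database; every table name, column, and row is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_account (account_key INTEGER PRIMARY KEY, account_name TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- The fact table records one business event (a sale) per row and
-- points at the dimensions that give the event its context.
CREATE TABLE fact_sale (
    account_key INTEGER REFERENCES dim_account(account_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL
);

INSERT INTO dim_account VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date    VALUES (20140101, 2014, 1), (20140401, 2014, 2);
INSERT INTO fact_sale   VALUES
    (1, 1, 20140101, 100.0),
    (1, 2, 20140401, 50.0),
    (2, 1, 20140401, 75.0);
""")


def sales_by_quarter(connection):
    """Aggregate the fact table along the Date dimension."""
    return connection.execute("""
        SELECT d.year, d.quarter, SUM(f.amount)
        FROM fact_sale AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY d.year, d.quarter
        ORDER BY d.year, d.quarter
    """).fetchall()


print(sales_by_quarter(conn))
```

The same fact table could be grouped by account or product by joining a different dimension, which is the sense in which the model is optimized for retrieval.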

Relational

Dimensional

Kimball-lytes

• Bottom-up - incremental

– Operational systems feed the Data Warehouse

– Data Warehouse is a corporate dimensional model that Data Marts are sourced from

– Data Warehouse is the consolidation of Data Marts

– Sometimes the Data Warehouse is generated from Subject area Data Marts

Inmon-ians

• Top-down

– Corporate Information Factory

– Operational systems feed the Data Warehouse

– Enterprise Data Warehouse is a corporate relational model that Data Marts are sourced from

– Enterprise Data Warehouse is the source of Data Marts

The gist…

• Kimball’s approach is easier to implement as you are dealing with separate subject areas, but can be a nightmare to integrate

• Inmon’s approach has more upfront effort to avoid these consistency problems, but takes longer to implement.

Spectre of the Agility

Incremental - Kimball

• In Segments

• Detailed Analysis

• Development

• Deploy

• Long Feedback loop

• Considerable changes

• Rework

• Defects

Waterfall - Inmon

• Detailed Analysis

• Large Development

• Large Deploy

• Long Feedback loop

• Extensive changes

• Many Defects

Data Warehouse Project

Popular Agile Data Warehouse Pattern

• Analyze data requirements department by department

• Create Reports and Facts and Dimensions for each

• Integrate when you do subsequent departments

The problem

• Conforming Dimensions

– A Dimension conforms when it is equivalent in structure and content

– Is an account defined by Marketing the same as Finance?

• Probably not

– If the Dimensions do not conform, this severely hampers the Data Warehouse
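One way to make the conformance question concrete is to compare two departmental versions of the same dimension for structure (columns) and content (values for the same key). The sketch below is only an illustration: the function name, the column names, and the Marketing and Finance rows are all hypothetical.

```python
def conformance_issues(rows_a, rows_b, key):
    """List the differences in structure and content between two dimension extracts.

    Each extract is a list of dicts; `key` is the business key present in every row.
    """
    issues = []
    cols_a = {col for row in rows_a for col in row}
    cols_b = {col for row in rows_b for col in row}

    # Structure: columns that exist on one side only.
    for col in sorted(cols_a ^ cols_b):
        issues.append(f"column '{col}' is missing from one side")

    # Content: the same key must carry the same values in the shared columns.
    shared = (cols_a & cols_b) - {key}
    by_key_b = {row[key]: row for row in rows_b}
    for row in rows_a:
        other = by_key_b.get(row[key])
        if other is None:
            issues.append(f"{key}={row[key]} exists only in the first dimension")
            continue
        for col in sorted(shared):
            if row.get(col) != other.get(col):
                issues.append(
                    f"{key}={row[key]}: '{col}' differs "
                    f"({row.get(col)!r} vs {other.get(col)!r})"
                )

    keys_a = {row[key] for row in rows_a}
    for row in rows_b:
        if row[key] not in keys_a:
            issues.append(f"{key}={row[key]} exists only in the second dimension")
    return issues


# Hypothetical Account dimension as two departments might define it.
marketing = [
    {"account_id": 1, "name": "Acme", "segment": "Retail"},
    {"account_id": 2, "name": "Globex", "segment": "Wholesale"},
]
finance = [
    {"account_id": 1, "name": "Acme Corp", "credit_limit": 5000},
]

for issue in conformance_issues(marketing, finance, "account_id"):
    print(issue)
```

An empty list would mean the two extracts conform on the columns and keys checked; it says nothing about whether the departments mean the same thing by "account", which still needs a conversation.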

Where is she?

Where is the true Agility?

• Iterations not Increments

• Brutal Visibility/Visualization

• Short Feedback loops

• Just enough requirements

• Working on enterprise priorities – not just for an individual department

Our Mission

• “Data... the Final Frontier. These are the continuing voyages of the starship Agile. Her on-going mission: to explore strange new projects, to seek out new value and new clients, to iteratively go where no projects have gone before.”

The Prime Directive

The Prime Directive

• Is a vision or philosophy that binds the actions of Starfleet

• Can a Data Warehouse project truly be Agile without a Vision of either the Business Domain or Data Domain?

– Essentially it is then just an Ad Hoc Data Warehouse. Separate components that may fit together.

– How do we ensure we are working on the right priorities for the entire enterprise?

A new model

Agile Enterprise Data Model

• Confirms the major entities and the relationships between them

– 30-50 entities

• Confirms the Business and Data Domains

• Starts the definition of a Data Model that will be refined over time

– Completed in 1 – 4 weeks

An Agile Enterprise Data Model

• Is just enough to understand the domain so that the iterations can proceed

• Is not mapping all the attributes

• Is not BDUF

• Is a User Story Map for the Data Domain

• Contains placeholders for refinement

Agile Enterprise Data Model is a Data Map

Agile Enterprise Data Model

• Is

– Our vision

– Our User Story Map for the Data Domain

– Guides our solution

– Our Prime Directive

– A Data Model

Kimball or Inmon?

Spock

• Hybrid approach

– It is only logical

– Needs of the many outweigh the needs of the few – or the one

Spock Approach

Agile Enterprise Data Model

Spock Approach

• Agile Enterprise Data Model

• Operational Data Store

• Dimensional Data Warehouse

• Reporting can then be done from:

– Operational Data Store

– Dimensional Data Warehouse

– New Data Marts

Benefits of Spock Approach

• Agile Enterprise Data Model

– Validates knowledge of Data Domain

– Ensures later increments don’t uncover data that was previously unknown and hard to integrate

• Minimizes rework

– True iterations

• Confirm at high level and then refine

Benefits of Spock Approach

• Operational Data Store

– Model data relationally to provide enterprise level operational reports

– Consolidate and cleanse data before it is visible to end-users

– Is used to refine the Agile Enterprise Data Model

Benefits of Spock Approach

• Dimensional Data Warehouse

– Model data dimensionally to validate domain

– Able to provide data to analytical reports

– Able to provide full historical data and context for reports

– Able to provide clients with the ability to generate their own reports easily

How do we work iteratively on a Data Warehouse?

Increments versus iterations

• Increments

– Series by series – department by department

• Iterations

– Story by story – episode by episode

• Enterprise prioritization

– Work on the highest priority for the enterprise

– Not just within each series/department
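The difference between the two orderings can be sketched with a small backlog. The stories, departments, and priority numbers below are hypothetical; the point is only that an incremental plan finishes one department before starting the next, while an iterative plan always takes the next highest enterprise priority.

```python
# Hypothetical backlog: priority 1 is the most important story for the enterprise.
backlog = [
    {"story": "Payments by account", "department": "Finance", "priority": 1},
    {"story": "Campaign response by product", "department": "Marketing", "priority": 2},
    {"story": "Leads by date", "department": "Marketing", "priority": 3},
    {"story": "Invoices by date", "department": "Finance", "priority": 4},
]


def increment_order(stories, department_sequence):
    """Department by department: finish one department before starting the next."""
    return sorted(
        stories,
        key=lambda s: (department_sequence.index(s["department"]), s["priority"]),
    )


def iteration_order(stories):
    """Story by story: always pick the next highest enterprise priority."""
    return sorted(stories, key=lambda s: s["priority"])


print([s["priority"] for s in increment_order(backlog, ["Marketing", "Finance"])])
print([s["priority"] for s in iteration_order(backlog)])
```

With Marketing scheduled first, the incremental plan delivers the top enterprise priority third; the iterative plan delivers it first.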

Captain, we need more Visualization!

His pattern indicates two-dimensional thinking

Data visualization

Data Visualization

• Is required to:

– Provide a visual data backlog

– Provide a visualization of the data requirements across dimensions

– Plan iterations

– Lead into the creation of FACT tables and Dimensions in a Data Warehouse

• For an Agile Data Warehouse we must think and visualize in more dimensions

• We must create a User Story Map for Data Requirements that is both intuitive and informative

– Primarily for clients!

– They should be able to look at it and understand what is currently being worked on

We need a bigger metaphor

[Diagram: hexes labelled Invoices, Sales, Operations, Master Data, Payments, Bills, Transactions]

Data Hexes

Bill Payment Hex

• Can have up to six dimensions of how payments are sliced, diced, and aggregated

• Concentric hexes allow for the planning of iterations
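A Data Hex can also be held as a small data structure, which may help when the backlog has to be tracked outside a drawing. This is a sketch under stated assumptions: the class and method names are invented, ring 1 is assumed to be the innermost hex and the first iteration, and the Channel, Region, and Currency dimensions and all scheduled items are hypothetical.

```python
from dataclasses import dataclass, field

MAX_SIDES = 6  # a hexagon has six sides, one per dimension


@dataclass
class DataHex:
    subject: str
    dimensions: list = field(default_factory=list)
    # (dimension, ring) -> items planned for that side of that concentric hex
    cells: dict = field(default_factory=dict)

    def add_dimension(self, name):
        """Attach a dimension to one side of the hex, up to six sides."""
        if name in self.dimensions:
            return
        if len(self.dimensions) >= MAX_SIDES:
            raise ValueError(f"{self.subject} hex already has {MAX_SIDES} dimensions")
        self.dimensions.append(name)

    def schedule(self, dimension, ring, item):
        """Plan an item for a dimension in a given ring (assumed: ring = iteration)."""
        if dimension not in self.dimensions:
            raise KeyError(dimension)
        self.cells.setdefault((dimension, ring), []).append(item)

    def ring_scope(self, ring):
        """Everything planned for one ring, grouped by dimension."""
        return {
            d: self.cells[(d, ring)]
            for d in self.dimensions
            if (d, ring) in self.cells
        }


bill_payment = DataHex("Bill Payment")
for name in ["Account", "Date", "Product", "Channel", "Region", "Currency"]:
    bill_payment.add_dimension(name)

bill_payment.schedule("Account", 1, "payments per account")
bill_payment.schedule("Date", 1, "payments per month")
bill_payment.schedule("Region", 2, "payments per region")

print(bill_payment.ring_scope(1))
```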

Cardassian Union

Be careful how you spell that…

Data Modeling Union

• For too long the Data Modelers have not been integrated with Software Developers

• Data Modelers have been like the Cardassian Union, not integrated with the Federation

Issues

• This has led to:

– Holy wars

– Each side expecting the other to follow their schedule

– Lack of communication and collaboration

• Data Modelers need to join the ‘United Federation of Projects’

Tools of the trade

Version Control

Version Control

• If you don’t control versions, they will control you

• Data Models must become integrated with the source control of the project

– In the same repository as the project trunk and branches

Our Version Experience

• We are using Subversion

• We are using Oracle Data Modeler as our Modeling tool.

– It has very good integration with Subversion

– Our DBMS is SQL Server

• Unlike with other modeling tools, the data model could be integrated into Subversion with the rest of the project

Shameless plug

• Free

• Subversion Integration

• Supports Logical, Relational, and Dimensional data models

• Since it is free, the data models can be shared and refined by all 60+ members of the development team

• Currently on version 889

Adaptability

Change Tolerant Data Model

• Only add tables and columns when they are absolutely required

• Leverage Data Domains so that attributes are created consistently and can be changed in unison
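The idea behind Data Domains can be illustrated outside any modeling tool: define each attribute type once, refer to it by name from every column, and generate the column definitions from that single place. The sketch below is plain Python rather than the modeling tool's own domain feature, and the domain names, SQL types, tables, and columns are all hypothetical.

```python
# Each domain is defined once; every column refers to a domain by name.
DOMAINS = {
    "identifier": "INTEGER",
    "money": "DECIMAL(18,2)",
    "short_name": "VARCHAR(50)",
}

# Hypothetical tables: column name -> domain name.
TABLES = {
    "invoice": {"invoice_id": "identifier", "total_amount": "money"},
    "payment": {
        "payment_id": "identifier",
        "paid_amount": "money",
        "payer_name": "short_name",
    },
}


def create_table_sql(table, columns, domains):
    """Render a CREATE TABLE statement, resolving each column's domain to a type."""
    cols = ",\n".join(
        f"    {name} {domains[domain]}" for name, domain in columns.items()
    )
    return f"CREATE TABLE {table} (\n{cols}\n);"


for table, columns in TABLES.items():
    print(create_table_sql(table, columns, DOMAINS))

# Changing the domain once changes every column that uses it.
wider_money = dict(DOMAINS, money="DECIMAL(19,4)")
for table, columns in TABLES.items():
    print(create_table_sql(table, columns, wider_money))
```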

Change Tolerant Data Model

• Don’t model the data according to the application’s Object Model

• Don’t model the data according to source systems

• These items will change more frequently than the actual data structure

• Your Data Model and Object Model should be different!

Re-Factoring – Read It

Create the plan for how you will re-factor

Plan for:

• Versioning – Major and minor

• Adaptability – How the data model will adapt to major changes

• Refinement – How will iterations be planned and executed

• Re-Factoring – How will the data design be re-factored

Assimilate

Assimilate

• Assimilate Version Control, Adaptability, Refinement, and Re-Factoring into core project activities

– Stand ups

– Continuous Integration

– Check outs and Check Ins

• Make them part of the standard activities – not something on the side

Our experience

Current Stardate

• We are reaching the end of Operational Data Store ETL Development

– ODS has been refined as we progress – no major changes

– Data Warehouse Dimensional Model is also complete

– Initial Reports analysis is complete and work on the report backlog will start soon

Summary

• Use an Agile Enterprise Data Model to provide the vision

• Strive for Iterations over Increments – Spock Approach assists in this

• Use Data Hexes to provide brutal visibility and Iteration planning

• Plan and Integrate processes for Versioning, Adaptability, Refinement, and Re-Factoring

What doesn’t change?

Leadership

Leadership

• “If you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.” ~ Antoine de Saint-Exupéry

Leadership

• “[A goalie's] job is to stop pucks, ... Well, yeah, that's part of it. But you know what else it is? ... You're trying to deliver a message to your team that things are OK back here. This end of the ice is pretty well cared for. You take it now and go. Go! Feel the freedom you need in order to be that dynamic, creative, offensive player and go out and score. ... That was my job. And it was to try to deliver a feeling.” ~ Ken Dryden

