a walk through the kimball etl subsystems with oracle data integration

Post on 09-Jan-2017

1.397 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Michael Rainey | Oracle OpenWorld 2015

A Walk Through the Kimball ETL Subsystems with Oracle Data Integration Solutions

1

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Introduction

2

• Michael Rainey- Data Integration Practice Lead - America - Oracle Data Integration expert

• GoldenGate and Oracle Data Integrator • BI Applications 11g

- Blogger, instructor, speaker - Oracle ACE

@mRainey

info@rittmanmead.com www.rittmanmead.com @rittmanmead

About Rittman Mead

3

•World’s leading specialist partner for technical excellence, solutions delivery and innovation in Oracle Data Integration, Business Intelligence, Analytics and Big Data

•Providing our customers targeted expertise; we are a company that doesn’t try to do everything… only what we excel at

•70+ consultants worldwide including 1 Oracle ACE Director and 3 Oracle ACEs

•Founded on the values of collaboration, learning, integrity and getting things done

Optimizing your investment in Oracle Data Integration

•Comprehensive service portfolio designed to support the full lifecycle of any analytics solution

info@rittmanmead.com www.rittmanmead.com @rittmanmead 4

Visual Redesign Business User Training

Ongoing SupportEngagement Toolkit

Average user adoption for BI platforms is below 25%

Rittman Mead’s User Engagement Service can help

info@rittmanmead.com www.rittmanmead.com @rittmanmead

What’s Most Important for YOU in Data Integration?

5

• Big data?

info@rittmanmead.com www.rittmanmead.com @rittmanmead

What’s Most Important for YOU in Data Integration?

5

• Big data?• Cloud?

info@rittmanmead.com www.rittmanmead.com @rittmanmead

What’s Most Important for YOU in Data Integration?

5

• Big data?• Cloud?

• Financial Reporting on “one version of the truth”?

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Let’s take a walk…and talk about ETL

6

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Wait! What are Kimball ETL Subsystems?Do you all know of Ralph Kimball?

7

www.kimballgroup.com

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Wait! What are Kimball ETL Subsystems?Do you all know of Ralph Kimball?

7

www.kimballgroup.com

Ralph Kimball founded the Kimball Group. Since the mid-1980s, he has been the DW/BI industry’s thought leader on the dimensional approach and trained more than 20,000 students. Prior to working at Metaphor and founding Red Brick Systems, Ralph co-invented the first commercially-available workstation with a graphical user interface at Xerox’s Palo Alto Research Center (PARC). Ralph has his Ph.D. in Electrical Engineering from Stanford University.

info@rittmanmead.com www.rittmanmead.com @rittmanmead

The Kimball GroupDo you all know of Ralph Kimball?

8

info@rittmanmead.com www.rittmanmead.com @rittmanmead

The Kimball 34 Subsystems of ETL

9

• Extracting Data - Data Profiling - Change Data Capture System - Extract System

info@rittmanmead.com www.rittmanmead.com @rittmanmead

The Kimball 34 Subsystems of ETL

10

• Cleaning and Conforming Data - Data Cleansing System - Error Event Schema - Audit Dimension Assembler - Deduplication System - Conforming System

info@rittmanmead.com www.rittmanmead.com @rittmanmead

The Kimball 34 Subsystems of ETL

11

• Delivering Data for Presentation - Slowly Changing Dimension

Manager - Surrogate Key Generator - Hierarchy Manager - Special Dimensions Manager - Fact Table Builders - Surrogate Key Pipeline - Late Arriving Data Handler

- Multi-Valued Dimension Bridge Table Builder

- Dimension Manager System - Fact Provider System - Aggregate Builder - OLAP Cube Builder - Data Propagation Manager

info@rittmanmead.com www.rittmanmead.com @rittmanmead

The Kimball 34 Subsystems of ETL

12

• Managing the ETL Environment - Job Scheduler - Backup System - Recovery and Restart System - Version Control System - Version Migration System - Workflow Monitor - Sorting System

- Lineage & Dependency Analyzer

- Problem Escalation System - Parallelizing / Pipelining System - Security System - Compliance Manager - Metadata Repository Manager

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Oracle Data Integration Solutions

13

Copyright*©*2015,*Oracle*and/or*its*affiliates.*All*rights*reserved.**|* Oracle*Open*World*2015* 1*

NoETL*Engine*100%*NaEve*Data*TransformaEon*

Data$Integrator$

Big$Data$Prepara/on$

GoldenGate$

Data$Quality$

Data$Service$Integrator$

Metadata$Management$

NonIinvasive*CDC,*RealEme*streaming*

data*delivery*

Profile,*Cleanse,*Match,*and*

Remediate*Data*

Prepare,*Secure,*Enrich*and*Publish*Unstructured*Data*

Catalog,*Trace*and*View*Models*across*

the*Enterprise*

Federate*Data*Across*DBs,*Services*and*ApplicaEons*

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Now let’s take a walk through the ETL Subsystems

14

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Who’s coming with us?

15

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Data model - where we’re going

16

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Data model - where we’re going

16

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Data model - where we’re going

16

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data

17

• Data Profiling- Oracle Enterprise Data Quality • Change Data Capture System• Extract System- Oracle Data Integrator - Oracle GoldenGate

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data

17

• Data Profiling- Oracle Enterprise Data Quality • Change Data Capture System• Extract System- Oracle Data Integrator - Oracle GoldenGate

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

18

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

18

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

18

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

18

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

18

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

19

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

19

• Small dataset due to sampling percentage

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

19

• Small dataset due to sampling percentage

• _projectid looks like a primary key

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

19

• Small dataset due to sampling percentage

• _projectid looks like a primary key

• Investigate school_district blanks

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Data Profiling with EDQ

19

• Small dataset due to sampling percentage

• _projectid looks like a primary key

• Investigate school_district blanks

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

20

• Extract from many different systems? Yes!

- Multiple technologies OOTB - Custom technologies can be added • Data Server - connection to the

data source- Physical Schema - Logical Schema

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

20

• Extract from many different systems? Yes!

- Multiple technologies OOTB - Custom technologies can be added • Data Server - connection to the

data source- Physical Schema - Logical Schema

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

21

• Models - Based on a single data

source • Datastores- Logically represent a

table, file, XML, etc - Reverse engineer or

build manually

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

21

• Models - Based on a single data

source • Datastores- Logically represent a

table, file, XML, etc - Reverse engineer or

build manually

info@rittmanmead.com www.rittmanmead.com @rittmanmead

City

Extracting Data - Oracle Data Integrator

21

• Models - Based on a single data

source • Datastores- Logically represent a

table, file, XML, etc - Reverse engineer or

build manually

info@rittmanmead.com www.rittmanmead.com @rittmanmead

City

Extracting Data - Oracle Data Integrator

21

• Models - Based on a single data

source • Datastores- Logically represent a

table, file, XML, etc - Reverse engineer or

build manually

State

info@rittmanmead.com www.rittmanmead.com @rittmanmead

City

Extracting Data - Oracle Data Integrator

21

• Models - Based on a single data

source • Datastores- Logically represent a

table, file, XML, etc - Reverse engineer or

build manually

StateZip Code

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

22

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

22

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Oracle Data Integrator

22

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Changed Data Only

23

• Change Data Capture- Extract only the changed data since the last ETL extract • Methods- Audit columns - Timed extract - Full “diff compare” - Database log scraping

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - Changed Data Only

23

• Change Data Capture- Extract only the changed data since the last ETL extract • Methods- Audit columns - Timed extract - Full “diff compare” - Database log scraping

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - CDC with Oracle GoldenGate

24

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - CDC with Oracle GoldenGate

25

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - CDC with Oracle GoldenGate

25

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - CDC with Oracle GoldenGate

25

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data - CDC with Oracle GoldenGate

25

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data

26

• Data Profiling- Oracle Enterprise Data Quality • Change Data Capture System• Extract System- Oracle Data Integrator - Oracle GoldenGate

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Extracting Data

26

• Data Profiling- Oracle Enterprise Data Quality • Change Data Capture System• Extract System- Oracle Data Integrator - Oracle GoldenGate

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming Data

27

• Data Cleansing System- ODI & EDQ • Error Event Schema- Built on ODI E$ tables • Audit Dimension Assembler• Deduplication System- EDQ • Conforming System

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Data Cleansing System

28

• ODI - Check Knowledge Module- Check logical constraints - “Bad” data moves to error table • EDQ- Data cleansing audit processors

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

29

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

29

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

29

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

29

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - ODI Constraints

30

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - EDQ Data Cleansing

31

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - EDQ Data Cleansing

31

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - EDQ Data Cleansing

31

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - EDQ Data Cleansing

31

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Error Event Schema

32

Image From: Ralph Kimball. “The Data Warehouse Lifecycle Toolkit.” iBooks. https://itun.es/us/lon6z.l

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Error Event Schema

32

SNP_LP_RUN

SNP_CONDSNP_KEYSNP_JOIN

SNP_CHECK_TAB

Image From: Ralph Kimball. “The Data Warehouse Lifecycle Toolkit.” iBooks. https://itun.es/us/lon6z.l

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Error Event Schema

32

SNP_LP_RUN

SNP_CONDSNP_KEYSNP_JOIN

SNP_CHECK_TAB

E$ Tables

Image From: Ralph Kimball. “The Data Warehouse Lifecycle Toolkit.” iBooks. https://itun.es/us/lon6z.l

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Deduplication System

33

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming - Deduplication System

33

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Cleaning and Conforming Data

34

• Data Cleansing System- ODI & EDQ • Error Event Schema- Built on ODI E$ tables • Audit Dimension Assembler• Deduplication System- EDQ • Conforming System

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data

35

• Slowly Changing Dimension Manager

• Surrogate Key Generator• Hierarchy Manager• Special Dimensions Manager• Fact Table Builders• Surrogate Key Pipeline• Late Arriving Data Handler

•Multi-Valued Dimension Bridge Table Builder•Dimension Manager System•Fact Provider System•Aggregate Builder•OLAP Cube Builder•Data Propagation Manager

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data

36

• Slowly Changing Dimension Manager- ODI Integration Knowledge Module - Set SCD behavior type for each

target column • Surrogate Key Generator- Database Sequence objects and ODI Sequences • Fact Table Builder- Lookups in ODI

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - Slowly Changing Dimension in ODI

37

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - Slowly Changing Dimension in ODI

37

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - Slowly Changing Dimension in ODI

37

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - SCD in ODI - Surrogate Keys

38

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - SCD in ODI - Surrogate Keys

38

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - SCD in ODI - Surrogate Keys

38

Additional audit columns

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - SCD in ODI - Surrogate Keys

38

Additional audit columns

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - Fact Table Builder

39

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data - Fact Table Builder

39

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Delivering Data

40

• Slowly Changing Dimension Manager- ODI Integration Knowledge Module - Set SCD behavior type for each

target column • Surrogate Key Generator- Database Sequence objects and ODI Sequences • Fact Table Builder- Lookups in ODI

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment

41

• Job Scheduler• Backup System• Recovery and Restart System• Version Control System• Version Migration System• Workflow Monitor• Sorting System

• Lineage & Dependency Analyzer

• Problem Escalation System• Parallelizing / Pipelining

System• Security System• Compliance Manager• Metadata Repository Manager

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Job Scheduler

42

• Create ODI schedule on execution object

- Tied to an agent and context

• Limited flexibility- Custom Fiscal Month end,

for example

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Job Scheduler

43

• Alternative to ODI scheduler - external scheduling tool- ODI Scenarios and Load Plans can be executed via command

line script or web service

./startloadplan.sh LOAD_EDW GLOBAL 6 -AGENT_URL=http://localhost:20910/oraclediagent

info@rittmanmead.com www.rittmanmead.com @rittmanmead

12.2.1

Managing the ETL Environment - Version Control/Migration

44

• ODI 12.2.1 Lifecycle Management- Integrated with Subversion - Deployment Archives for code

migration between environments

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Workflow Monitor

45

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Workflow Monitor

45

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Workflow Monitor

45

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment - Workflow Monitor

45

Drilldown from ODI Session to SQL detailed activity report

Obtain real-time and historical agent statistics

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Managing the ETL Environment

46

• Job Scheduler• Backup System• Recovery and Restart System• Version Control System• Version Migration System• Workflow Monitor• Sorting System

• Lineage & Dependency Analyzer

• Problem Escalation System• Parallelizing / Pipelining

System• Security System• Compliance Manager• Metadata Repository Manager

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Where did we end up?

47

• The Kimball ETL Subsystems will guide your data warehouse program

• Oracle Data Integration can help you fully implement the ETL Subsystems

- Extract, Load, Transform with ODI and GoldenGate

- Profile and cleanse data with Enterprise Data Quality

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Where did we end up? One version of the truth…

48

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Questions?

49

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Questions?

50

• Websites:- www.kimballgroup.com - www.rittmanmead.com/blog • Contact:- info@rittmanmead.com - michael.rainey@rittmanmead.com • Twitter- @rittmanmead - @mRainey

info@rittmanmead.com www.rittmanmead.com @rittmanmead

Rittman Mead Sessions

51

No Big Data Hacking—Time for a Complete ETL Solution with Oracle Data Integrator 12c [UGF5827] Jérôme Françoisse | Sunday, Oct 25, 8:00am | Moscone South 301

Empowering Users: Oracle Business Intelligence Enterprise Edition 12c Visual Analyzer [UGF5481] Edelweiss Kammermann | Sunday, Oct 25, 10:00am | Moscone West 3011

A Walk Through the Kimball ETL Subsystems with Oracle Data Integration Solutions [UGF6311] Michael Rainey | Sunday, Oct 25, 12:00pm | Moscone South 301

Oracle Business Intelligence Cloud Service—Moving Your Complete BI Platform to the Cloud [UGF4906]

Mark Rittman | Sunday, Oct 25, 2:30pm | Moscone South 301

Oracle Data Integration Product Family: a Cornerstone for Big Data [CON9609]

Mark Rittman | Wednesday, Oct 28, 12:15pm | Moscone West 2022

Developer Best Practices for Oracle Data Integrator Lifecycle Management [CON9611]

Jérôme Françoisse | Thursday, Oct 29, 2:30 pm | Moscone West 2022

top related