ETL Implementation Strategy

8/9/2019 ETL Implementation Strategy http://slidepdf.com/reader/full/etl-implementation-strategy 1/16  ETL Implementation Strategy

Upload: satyajc

Post on 01-Jun-2018




Contents

Buy or Build ETL

Major Factors Involved in Evaluating an ETL

An Ideal ETL Tool

ETL Implementation

ETL Process Example

Required Feature in ETL tool


Buying ETL Tool

Advantages

Reduced development time

A wide range of features available

Reusable across future phases involving data transformations within the project

Disadvantages

Time needed to learn the product

Training costs

May not do everything we need (to be supplemented with in-house development)


Building ETL Tool

Advantages

No up-front purchasing costs

No training costs

Specifically designed for the purpose of the project

Disadvantages

Time needed for design, development, testing and documentation

May not have all the features of an off-the-shelf product

High maintenance

“Recommended to buy a tool rather than build one for the project”


Factors Involved in Evaluating an ETL

Ease of use

Database Connectivity

Update Capabilities

Surrogate Key Support

Change Data Capture

Intelligent Queries

Multi-source Joins

Aggregate Capabilities

Tool Integration

Metadata Support

Customization Methods

Logging

Scheduling Features

Quality Assessment

Tool Architecture

Price
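
One common way to compare candidate tools against factors like these is a weighted score. The sketch below is only an illustration: the weights, the 1 to 5 ratings, and the names `weighted_score`, `tool_a` and `tool_b` are hypothetical and do not come from the slides.

```python
def weighted_score(scores, weights):
    """Weighted average of 1-5 ratings over the factors listed in `weights`."""
    total_weight = sum(weights.values())
    return sum(scores[factor] * w for factor, w in weights.items()) / total_weight

# Hypothetical weights (importance) and ratings; replace with project values.
weights = {"Ease of use": 3, "Change Data Capture": 2, "Price": 1}
tool_a = {"Ease of use": 4, "Change Data Capture": 2, "Price": 5}
tool_b = {"Ease of use": 3, "Change Data Capture": 5, "Price": 3}

for name, ratings in (("tool_a", tool_a), ("tool_b", tool_b)):
    print(name, round(weighted_score(ratings, weights), 2))
```

Keeping the weights separate from the ratings lets the project team debate importance once and then rate each tool independently.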


Criteria



ETL Environment

Creating an ETL environment requires six basic infrastructure components:

A Network Environment to connect source data systems to the warehouse platform

An RDBMS for the warehouse

A Sort/Merge utility to integrate data from the various source systems

A method to perform calculations

A Utility to Schedule and Run ETL batch cycles based on events or timelines

A Change Management utility to manage updates and version control of programs and scripts


Implementation Methodology

Define business requirements for the warehouse project

Analyse the source systems

Develop physical data model

Design ETL processes


Design Stages

The design process follows a systematic, staged approach. Together the stages create a single processing method for initial and successive loads to the data warehouse. It involves five stages, which provide a modular and adjustable transformation process for the target table that can adapt easily to changes in the source systems or the warehouse model design.

Stage 1: Source verification

Performs the access and extraction of data from the source system and builds a temporal view of the data at the time of extraction.

Stage 2: Source alteration

Performs a variety of transformations unique to the source, depending on business requirements.

Stage 3: Common interchange

Applies business rules and/or transformation logic that is common across multiple target tables.

Stage 4: Target load determination

Performs final formatting of data to produce load-ready files for the target table; identifies and segregates rows to be inserted vs. updated (if applicable); applies remaining technical metadata tagging; and processes data into the RDBMS.

Stage 5: Aggregation

The final stage uses the load-ready files from Stage 4 to build aggregation tables needed to improve query performance against the warehouse.
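
The staged design can be sketched as a chain of small functions, one per stage. This is a minimal illustration in Python, not the deck's implementation: the function names, the row fields (`org_id`, `name`, `region`) and the specific rules applied in each stage are assumptions chosen only to show how the stages hand data to each other.

```python
from datetime import datetime, timezone

def source_verification(source_rows):
    """Stage 1: extract rows and stamp them with the extraction time."""
    extracted_at = datetime.now(timezone.utc).isoformat()
    return [dict(row, extracted_at=extracted_at) for row in source_rows]

def source_alteration(rows):
    """Stage 2: source-specific cleanup (assumed rule: trim names)."""
    return [dict(row, name=row["name"].strip()) for row in rows]

def common_interchange(rows):
    """Stage 3: rule shared across targets (assumed rule: upper-case regions)."""
    return [dict(row, region=row["region"].upper()) for row in rows]

def target_load_determination(rows, existing_keys):
    """Stage 4: split load-ready rows into inserts and updates."""
    inserts = [r for r in rows if r["org_id"] not in existing_keys]
    updates = [r for r in rows if r["org_id"] in existing_keys]
    return inserts, updates

def aggregation(rows):
    """Stage 5: build a small aggregate (row count per region)."""
    counts = {}
    for row in rows:
        counts[row["region"]] = counts.get(row["region"], 0) + 1
    return counts

def run_pipeline(source_rows, existing_keys):
    """Run the five stages in order and return inserts, updates, aggregate."""
    rows = common_interchange(source_alteration(source_verification(source_rows)))
    inserts, updates = target_load_determination(rows, existing_keys)
    return inserts, updates, aggregation(inserts + updates)
```

Because each stage only consumes the previous stage's output, a change in a source system should touch Stages 1 and 2 only, which is the modularity the design aims for.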


ETL Process Example

Stage 1: Source Verification

The source system is a human resources (HR) ERP system.

The target is an organization dimension table that happens to use type 2 slowly changing dimensions.

FIGURE 1. Source verification, alteration, and common interchange stages. (Diagram labels: working files; new organization records.)
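
A minimal sketch of the extraction step follows, using an in-memory SQLite database as a stand-in for the HR ERP source. The table name `hr_organization`, its columns, the sample rows and the `extract_organizations` helper are all hypothetical; the point is only that every extracted row carries the time of extraction, which gives the temporal view.

```python
import sqlite3

def extract_organizations(conn, extracted_at):
    """Read every organization row and tag it with the extraction time."""
    cursor = conn.execute(
        "SELECT org_id, org_name, region_id, manager_id "
        "FROM hr_organization ORDER BY org_id"
    )
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row), extracted_at=extracted_at) for row in cursor]

# Stand-in for the HR ERP source (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hr_organization "
    "(org_id INTEGER, org_name TEXT, region_id INTEGER, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO hr_organization VALUES (?, ?, ?, ?)",
    [(10, "Sales", 1, 501), (20, "Finance", 2, 502)],
)

# The working file: one dict per source row, stamped with the extraction time.
working_file = extract_organizations(conn, "2019-08-09T00:00:00")
```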


ETL Process Example

Stage 2: Source Alteration

We append data from secondary sources, in this case the HR ERP Region table, to the primary organizational extract file.

(Diagram labels: working file; new organization records.)
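
Appending the secondary source amounts to a lookup join on a shared key. The sketch below assumes both extracts are lists of dictionaries keyed by a hypothetical `region_id` column; the helper name `append_region` is also an assumption.

```python
def append_region(org_rows, region_rows):
    """Add region_name to each organization row via a region_id lookup.

    Rows whose region_id has no match get region_name=None so they can be
    reviewed later instead of being silently dropped.
    """
    region_by_id = {r["region_id"]: r["region_name"] for r in region_rows}
    return [
        dict(org, region_name=region_by_id.get(org["region_id"]))
        for org in org_rows
    ]
```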


ETL Process Example

Stage 3: Common Interchange

We find that the region name values stored in the HR ERP system do not conform to the established enterprise definitions. We therefore need to use the merge infrastructure utility to update the organization record region names to reflect the enterprise versions.
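
Conforming the region names can be sketched as a mapping from source values to enterprise values. The mapping contents and the function name `conform_region_names` below are hypothetical; in practice the mapping would come from the enterprise reference data, and the deck's merge utility would perform the update.

```python
# Hypothetical source-to-enterprise mapping for region names.
ENTERPRISE_REGION_NAMES = {
    "N. America": "North America",
    "EMEA region": "EMEA",
}

def conform_region_names(rows, mapping):
    """Replace source region names with enterprise names.

    Returns the conformed rows plus the set of names that are neither
    mapped nor already an enterprise value, so they can be investigated.
    """
    enterprise_values = set(mapping.values())
    conformed, unmapped = [], set()
    for row in rows:
        name = row["region_name"]
        if name in mapping:
            row = dict(row, region_name=mapping[name])
        elif name not in enterprise_values:
            unmapped.add(name)
        conformed.append(row)
    return conformed, unmapped
```

Reporting unmapped names rather than guessing keeps nonconforming values visible to whoever maintains the enterprise definitions.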


ETL Process Example

Stage 4: Target Load Determination

We compare the current load of organization records against those previously loaded in earlier batch cycles.
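
The comparison can be sketched as a three-way classification of the current load against the previously loaded rows. The tracked columns (`region_name`, `manager_id`), the key `org_id` and the helper name `classify_rows` are assumptions made for illustration.

```python
# Columns whose change should trigger a new load action (assumed).
TRACKED_COLUMNS = ("region_name", "manager_id")

def classify_rows(current_rows, previous_by_id, tracked=TRACKED_COLUMNS):
    """Split the current load into new, changed and unchanged rows.

    previous_by_id maps org_id to the row loaded in an earlier batch cycle.
    """
    new, changed, unchanged = [], [], []
    for row in current_rows:
        previous = previous_by_id.get(row["org_id"])
        if previous is None:
            new.append(row)
        elif any(row[col] != previous[col] for col in tracked):
            changed.append(row)
        else:
            unchanged.append(row)
    return new, changed, unchanged
```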


ETL Process Example

Stage 5: Aggregation

We flag new rows for insertion: current load-cycle records whose relevant columns (for example, new region names or manager IDs) do not match their corresponding organization dimension table rows.
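
For a dimension that keeps history by row versioning (the type 2 style of slowly changing dimension), flagging a changed record for insertion usually means closing the current version and adding a new one. The sketch below assumes hypothetical columns `org_key` (surrogate key), `valid_from`, `valid_to` and `is_current`; none of these names come from the slides.

```python
def apply_type2(dimension, changed_rows, new_rows, load_date):
    """Return a new dimension list with row versioning applied.

    Current versions of changed organizations are closed, then a new
    version is inserted for every changed or brand-new organization.
    """
    changed_ids = {r["org_id"] for r in changed_rows}
    result = []
    for dim_row in dimension:
        if dim_row["is_current"] and dim_row["org_id"] in changed_ids:
            # Close the old version instead of overwriting it.
            dim_row = dict(dim_row, is_current=False, valid_to=load_date)
        result.append(dim_row)
    # Surrogate keys continue from the highest key already in the dimension.
    next_key = max((r["org_key"] for r in dimension), default=0) + 1
    for row in list(changed_rows) + list(new_rows):
        result.append(dict(row, org_key=next_key, valid_from=load_date,
                           valid_to=None, is_current=True))
        next_key += 1
    return result
```

The function returns a new list rather than mutating its input, which makes a failed batch cycle easy to rerun from the previous state.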


Required features in ETL tool

Architecture: hub-and-spoke or client-server; scalable and extensible technology that scales up as data grows

Client platform support: Windows 2K/95/98, etc.

Server platform support: Sun Solaris, HP-UX, AIX, etc.

Support for ERP sources

Support for parallelism

Code generator

Data transformation method

Support for managing and building aggregates

Support for various industry-standard data types

Data quality functionality

Exception handling capability

ETL process management

Backup and recovery feature

Metadata capture support

Viewing metadata

Security of metadata

Web integration support

Support for versioning

Installation procedure

Support for sharable repository

Support for designing data marts

Support for importing data models from modeling tools

Support for different kinds of transformations

Adaptability

Support for growth

Ability to handle various source types

Support for external loader