sql server etl framework

16
SQL Server ETL-framework Generate and execute SSIS packages Jeroen Nijs Senior DWH/BI Consultant Eindhoven The Netherlands [email protected]

Upload: nijs

Post on 10-Aug-2015

86 views

Category:

Software


5 download

TRANSCRIPT

Page 1: Sql server etl framework

SQL Server ETL-framework

Generate and execute SSIS packages

Jeroen NijsSenior DWH/BI ConsultantEindhovenThe [email protected]

Page 2: Sql server etl framework

2

Requirements

• Generate SSIS Packages (fixed patterns)• Meta data• Project Deployment Model• Use SQL Server Agent to execute SSIS packages• Control• Logging• Restart possibility• Simple• Understandable

Page 3: Sql server etl framework

3

Improvements

• Controlled Parallel Execution of SSIS packages• Dependencies between SSIS packages• Possibility to set execution order based on runtime• Masterpackage in Control Domain• Use of environment variables

Page 4: Sql server etl framework

4

Domains

Logging

SSIS PackagesControl

Page 5: Sql server etl framework

SSIS packages

Types:• Source to Staging area• Staging area to Historical data layer (Persistent Staging Area)• Historical data layer to Dimensions• Historical data layer and Dimensions to Facts• Starmodel to dedicated Starmodel

5

Page 6: Sql server etl framework

SSIS packages

Meta data• Sourcetable / File• Staging table• CreateStagingTable.sql

• MD5• MD5Hash_Formula.sql

• Historical data layer table• CreateHistoricTable.sql

6

Page 7: Sql server etl framework

SSIS packages

Source to Staging• SSIS package with DELTA-load• SRC2STG_MEDICAT_RECDEEL

• SSIS package without DELTA-load• SRC2STG_MEDICAT_MEDICIJN

• BIML Script• SRC2STG.biml

7

Page 8: Sql server etl framework

SSIS packages

MetaData• SQL Server• Database: Control• Schema: Meta

8

Page 9: Sql server etl framework

SSIS packages

Staging to Historical Data Layer• SSIS package• STG2HIS_MEDICAT_RECDEEL

• BIML Script• STG2HIS.biml

9

Page 10: Sql server etl framework

Logging

• What• When• Result• Conditions• Levels:• Job• Step• Details

• Number of records processed• SQL Server• Database: Control• Schema: Log

10

StatusDescriptionEErrorIInfoNNumbersPParameterRRunningSSuccessfulTTableWWarning

Page 11: Sql server etl framework

Control

• What• Sequence• Dependencies• Conditions• Levels:• Application• Package• Parameters

• SQL Server• Database: Control• Schema: Control

11

LoadStrategyDELTADELTA_KEYSFULLKEYS

Page 12: Sql server etl framework

Control

Storage Procedures and Functions:

• Schema Control• Add, Set, Delete and Get• ControlMaintenance.sql• LastStartDateOfSuccessfulTableLoad• UpdateParameterTypeSelectDateFrom• UpdateParameterTypeSelectDateFrom2 (dependent on other package)

Disable/Enable:• Application• ApplicationPackage• PackageParameter

12

Page 13: Sql server etl framework

Control

Storage Procedures: Execute

• Control.GetEnvironmentReferenceID• Control.GetEnabledApplicationPackages• Control.ExecuteCatalogPackage

SQL Server Agent:• Job• Steps• Parameters

13

Page 14: Sql server etl framework

Master Package

14

• Status of application• Packages to execute• Environment variable• Parallel execution• Number of parallel tasks (0..n)• Dependencies between packages

Page 15: Sql server etl framework

Future improvements/extensions

15

• Generate staging tables• Generate tables for Historical Data Layer• Improve the generation of SSIS packages with BIML Script by using more MetaData• Documentation (inclusive the sequence of execution)• Reports: status Control domain, Logging• Convert SQL into SP: determine the sequence of package execution based on runtime• When the source is cleaned up , the no longer existing resource records in the historical

data layer should not be regarded as deleted .

Page 16: Sql server etl framework

SQL Server ETL-framework

16

Jeroen NijsSenior DWH/BI ConsultantEindhovenThe [email protected]