automate data warehouse etl testing and migration testing the agile way

17
Data Warehouse/ETL Testing Migration Testing The Agile Way ETL Testing & Monitoring Platform

Upload: sandesh-gawande

Post on 10-Jan-2017

648 views

Category:

Technology


4 download

TRANSCRIPT

Page 1: Automate data warehouse etl testing and migration testing the agile way

Fast forward Data

Warehouse/ETL Testing Migration

TestingThe Agile

Way

ETL Testing & Monitoring Platform

Page 2: Automate data warehouse etl testing and migration testing the agile way

Author: Sandesh GawandeCTO -iCEDQTorana, Inc.Email: [email protected]: 203 666 4442Twitter: @sandesh_gawandeSkype: Sandesh.Gawande.ToranaIncLinkedIn: https://www.linkedin.com/in/sandesh-gawande-1a25757

About: Torana, Inc. Year established 2005 Stamford, CT Developers of Big Data Integration &

Data Migration, Data warehouse/ETL Testing Software

Fortune 500 customers in Banking, Insurance, Healthcare, e-commerce, Manufacturing

Page 3: Automate data warehouse etl testing and migration testing the agile way

Development & QA Testing

Customer Finds It Post Release

Leads to Litigation, Fines, Reputation Loss

1x

100x

10,000xFind the issues

earlyor the costs are too High

Page 4: Automate data warehouse etl testing and migration testing the agile way

Finding Issues in QA Stage is the Best, but QA is…

Not Agile With waterfall approch its too late...

Not Automated Manual data checks Wasted Time No repeatability or consistency No way to test millions of rows Wrong focus on creating scripts rather than the business

problem Cannot reconcile data across systems (e.g. Files vs. Database)

Not Collobrative QA teams work in isolation No feedabck to developers or business users Disoragnized

No Transparancy Late Discovery of Issues Project Failure or High Costs

Not Agile

Manual & Slow

Not Collaborative Or Feedback

No Transparency or Compliance

Page 5: Automate data warehouse etl testing and migration testing the agile way

But Why is it so Difficult to Automate ETL Testing?

ETL Processes don’t have screens Conventional QA Automation product were designed for

screen based testing New Concepts

Source Data + Transformation = Target Data Quality of an ETL Process = Expected Data vs. Actual Data Most developers are from traditional software development New to concepts such as data reconciliation for ETL Testing Mix up of QA/QC concept with Data Quality

High Volume of Data (Millions of rows) Since the source data and target data could be in two

different systems reconciliation is difficult

Page 6: Automate data warehouse etl testing and migration testing the agile way

Introducing iCEDQ…

Automated & Fast

Agile

Collaborative

Feedback & Transparency

An automation platform for:A.Data Warehouse

Testing• ETL Testing• MDM Testing• Data Integration Testing

B.Data Migration TestingC.Data Monitoring

Page 7: Automate data warehouse etl testing and migration testing the agile way

iCEDQ has in-Memory Rules Engine

It Tests ETL Transformations by Validating the output data generated by ETL Validation Rule

Reconciliation Rule

It Tests ETL Transformations by Reconciling Source Data Vs. Target Data …

ETL

Page 8: Automate data warehouse etl testing and migration testing the agile way

Data Warehouse/ETL Test AutomationData Sources ETL Data Warehouse

Tech Validation Test Biz Validation Test

Biz Reconciliation Test

Tech Reconciliation Test

Validation Tests Reconciliation Tests Technical Validation Rule Business Validate Rules Technical Reconciliation Rules Business Reconciliation Rules

Validate incoming data before processing. Test for… • Data format• Nulls• Data types• more

Business rules based validation will indicate if there is an data issue becuase of ETL processes, Source data or wrong requirements...• Check if Net Amount =? Gross

Amout – (Taxes + Fees + Commisions)

These rules test specific to an ETL process which is doing transformation…• An ETL processes calculating end

of day balances from daily transactions can be tested. Sum of todays transactions =? Today’s End of Day balance – Yesterday’s End of Day balance

These tests are designed to test the overall system independent of the ETL processes, Source data or business requirements

A

Page 9: Automate data warehouse etl testing and migration testing the agile way

Data Migration Test AutomationLegacy System ETL New System

Initial Reconciliation Test

Post Reconciliation Test Initial Migration Testing Post Migration Testing

1st Create the data structures in the target system. Ex. Table, columns. 2nd copy the initial data from the legacy system to the new database• iCEDQ can validate the tables, columns, data types &

precision• Reconcile the legacy vs. target data to make sure they have

the same initial state

Once the initial state is populated & tested. The post migration phase involves. Feeding the same data or triggering of same business processes in legacy system and the new system.• iCEDQ can reconcile the data to make sure the after running

the business processes the data generated same• Because regardless of the system change, unless there is a

business rule change the net output from business point of view must be same

B

Page 10: Automate data warehouse etl testing and migration testing the agile way

Production Data Monitoring AutomationSource Stage Stage Data

WarehouseData Warehouse Data

MartsData Marts Reports /

Extracts

ProcessLoad StageCustomer

ProcessLoad Stage

Policy

ProcessLoad Stage

Claims

Process Load Dim Customer

ProcessLoad Daily

Claims

ProcessLoad Month

Policy

ProcessLoad P&L

Process Load Dim Customer

Process Load Month

Claims

Start

Stop

Monitoring in Series Monitoring in ParallelEmbed iCEDQ Rules in the batch process• If Audit Fails the users are notified and the process can be

stopped automatically

The Audit Rules are run in parallel to the batch process• If Audit Fails the users are notified but the process is not

stopped automatically

iCEDQ

C

Page 11: Automate data warehouse etl testing and migration testing the agile way

User

Sto

ry

Tech Requirements Mapping Document ETL Process

Audit Requirements Test Case

iCEDQ Rule 1

iCEDQ Rule 2

Test processes in parallel to the development pipeline

No reasons to wait!

iCEDQ is Agile

Development Pipeline

QA Pipeline

Page 12: Automate data warehouse etl testing and migration testing the agile way

iCEDQ-Central Repository & Collaborative

Centralized Repository for Rules Library

An collaborative environment to work together

Work together regardless of the Location time Role

Page 13: Automate data warehouse etl testing and migration testing the agile way

iCEDQ-Feedback & Transparency

Dashboard Fails & Custom Reports Integration with ALM & Issue

management Auto Notification Ability to drill down to an defect Audit Logs & execution histroy…

Page 14: Automate data warehouse etl testing and migration testing the agile way

iCEDQ-What changed?

Before AfterNO Reconcile Across Files & database YES

Very Complicated SQL NO SQL or Simple SQLTest millions of rows

CostTest Coverage

NO Repeatability & Consistency YESNO Scheduling YES

Desktop Based Test Execution Server BasedNO Transparency & Reporting YES

Cost of DefectNO Regression Testing & Audit YESNO Production Monitoring YES

1000… Millions…100% 60%

High Low

HighLow

Page 15: Automate data warehouse etl testing and migration testing the agile way

Who uses iCEDQ?

Stock Exchange Banks Insurance Manufacturing Healthcare E-Commerce Manufacturing …

Page 16: Automate data warehouse etl testing and migration testing the agile way

iCEDQ Healthcare Client

iCEDQ Usage iCEDQ was used for Migration Testing

Test provider data migration from Mainframe to MDM

iCEDQ Enterprise Data Warehouse Testing Test Members Data, Enrolment data, Plans

Data, Claims Data load from Legacy to (Enterprise Data warehouse)EDW & Health Rules to EDW

iCEDQ to Validate External Feeds Test data feeds to State of Maryland, CMS

(Centers for Medicare & Medicaid Services)

iCEDQ Feedback Helped Finalize Requirements

It found anomalies in the requirements and mapping documents and provided feedback

Helped Test Automation It was able to automatically reconcile feeds

from legacy as well as new system. This was impossible to test manually

Transparency to management It linked with defect management system

and auto generated status

Page 17: Automate data warehouse etl testing and migration testing the agile way

Fast forward Data Warehouse

& Migration Testing

ETL Testing & Monitoring Platform