table29 data validation 95

18
1

Upload: franky-lao

Post on 05-Dec-2014

1.915 views

Category:

Documents


0 download

DESCRIPTION

Informatica DVO

TRANSCRIPT

Page 1: Table29 Data Validation 95

1

Page 2: Table29 Data Validation 95

2

Data Validation Overview

Page 3: Table29 Data Validation 95

3

What is Data Validation?

Identifying errors in data sets that have

been Moved or Transformed to ensure

they are Complete and Accurate and

meet Expectations or Requirements.

Page 4: Table29 Data Validation 95

4

How most companies test today

• Many companies perform testing manually by writing SQL scripts, using Excel, or hand coding testing logic into their integration processes

• Reconciliation is done manually (if done at all)− basic SQL scripts (row counts, aggregates) or manually written mappings/logic

• Customers estimate data testing SHOULD take 25-30% of all hours spent on Data Integration− Most customers admit they do not do enough data validation, resulting in

poorer data quality and higher project risk

• PowerCenter upgrades can take up to weeks or months to complete due to manual testing effort− It takes one day to upgrade the ETL software

Page 5: Table29 Data Validation 95

5

Problems with Manual Testing• Takes a long time and is expensive

− Time is spent writing queries and waiting for them to run and then searching through the results

• Error-prone manual process− “Stare and compare”

• Cannot perform thorough testing− Time/Cost pressure leads to “try it here and there” approach− Testing ends when the deadline is reached, done or not

• Usual problems associated with writing custom code− No audit trail− No reuse− No methodology

Page 6: Table29 Data Validation 95

6

Current Approach: Like a Photo Hunt

Page 7: Table29 Data Validation 95

7

Current Approach: Stare and Compare

Data Set #1 Data Set #2

Page 8: Table29 Data Validation 95

8

What is the Data Validation Option?

DVO is a independent (black box)

testing solution that provides

automation, repeatability and

auditability to virtually any data

testing or reconciliation process.

Page 9: Table29 Data Validation 95

9

Some DVO use cases Data Being Transformed

• ETL Reconciliation• Data Masking• ETL Testing• Application Migration

Data is Identical• ETL version upgrade• ETL Migration• Database migration• Application Retirement

Page 10: Table29 Data Validation 95

10

Two Value Propositions for DVO

Production Reconciliation

Protect the integrity of data that is loaded into

production systems.

Erroneous data due to failed loads, faulty logic or operational issues is caught in a proactiveautomated manner and can be

addressed as needed

Development & Test

Provide automation for unit and regression testing of integration logic.

Ensure that data produced by DI code meets requirements and

expectations

Ensure the integrity of data as it moves throughthe IT environment.

Page 11: Table29 Data Validation 95

11

Two Value Propositions for DVOEnsure the integrity of data as it moves through

the IT environment.

Page 12: Table29 Data Validation 95

12

How DVO works with PowerCenter

Data Validation OptionRepository

& WarehouseDVO ClientsReportsDatabase

Views

Id: namename: stringPrice: integerDate in: dateDate out: dateSalary: float

V_Summary

Id: namename: stringPrice: integerDate in: dateDate out: dateSalary: float

V_Tests

Id: namename: stringPrice: integerDate in: dateDate out: dateSalary: float

V_Results

Enterprise Data

Repository and Integration Services

Repository

PowerCenter

Define Tests

Execute TestsData

Accessed

Results

Page 13: Table29 Data Validation 95

13

Key Features of DVO• Broad data connectivity

• DBMS (Oracle, SQL Server, DB2, Sybase, Teradata, Netezza)• Mainframe (DB2 z/OS, DB2 AS/400, IMS, Adabas, MF Flat files, VSAM)• SalesForce.com , SAP transparent tables, SAS, ODBC and Flat files

• Numerous built-in tests• COUNT, COUNT_DISTINCT, COUNT_ROWS, MIN, MAX, AVG, SUM• SET AinB, SET BinA, SET AeqB• VALUE, OUTER VALUE, Expressions

• Model ETL constructs• LOOKUPs, Arbitrary SQL Relationships

• Other• Run from GUI or CLI (DVOCmd)• Built-in reporting

Page 14: Table29 Data Validation 95

14

Comparing DVO with Manual Testing

Page 15: Table29 Data Validation 95

15

• Have created a test suite of over 1000 Tests

• Testers can manage the testing environment

• Can test large volumes of data• Testing time reduced from 1

week to 1 day (80% less)• Spend “free time” on higher

level tasks

SAAS provider of Sales Compensation and analytics• Data absolutely has to be correct as it affects peoples’

paychecks• Very high visibility of the data with users• Trust in the data is key

KEY BUSINESS IMPERATIVE AND IT INITIATIVE

INFORMATICA ADVANTAGE RESULTS/BENEFITS• New release every ~1 month• 1 Full week of data testing

by QA team per release• Developers wrote SQL for

testing the data• Testers would execute the

SQL, track errors and work with Developers to resolve

• And who was testing the SQL to make sure it was correct?

THE CHALLENGE

Technology CompanyReduced data testing time by 80%

with Data Validation Option

• With DVO they are able to test 100,000s rows of data in regression tests

• Developers no longer required to write SQL

• Testers are now empowered and independent of developers

Informatica Confidential – Under NDA

Development and Test

Page 16: Table29 Data Validation 95

16

• DVO found where data was missing

• Found thousands of missing records due to bad coding, & improperly rerun failed jobs

• Reloaded all missing data in two weeks

• They are looking to implement ongoing incremental validation for all new data loaded into tables

Good data is essential to good business decisions. Their calculations of portfolio risk and value must be correct.

• Spends “hundreds of millions” purchasing troubled debt in the USA• The data and risk calculations on those assets must be correct.• Bad data could cost them “millions” and put them out of business.

KEY BUSINESS IMPERATIVE AND IT INITIATIVE

INFORMATICA ADVANTAGE RESULTS/BENEFITS• Business users were

complaining about missing data in the systems.

• Data errors can lead to very costly bad business decisions.

• They were doing manual testing via developer-written mappings and PL/SQL

• Other products available today could not meet their requirements

THE CHALLENGE

Financial Services CompanyEnsures DW is Complete and Accurate

with Data Validation Option

• With DVO they are able to perform detailed reconciliations across source and target systems.

• With DVO, they have a complete audit trail.

Informatica Confidential – Under NDA

Production Reconciliation

Page 17: Table29 Data Validation 95

17

• Identified errors due to faulty DI logic, and error handling process

• Ensured incorrect records no longer being use in marketing campaigns

• Bad customer data no longer reaching customer in portal

Customer and contact hub is pivotal to efficient business operationsMillions of records processed across various systemsEnsure BAs, line managers and customers had access to accurate and complete data based on their needs

KEY BUSINESS IMPERATIVE AND IT INITIATIVE

INFORMATICA ADVANTAGE RESULTS/BENEFITS• No easy way to reconcile

data in systems to identify bad data or identify extent of errors

• Incorrectly augmented data in systems

• Gold record data didn’t always match across systems

• Faulty records propagated downstream.

THE CHALLENGE

Mid-size Technology CompanyReconciling MDM data using

Data Validation Option

• DVO reconciled data across systems (e.g. SalesForce and Hub) and found:

• 1000s of missing records between systems

• Incorrectly augmented D&B data

• Improperly coded golden records

Informatica Confidential – Under NDA

Production Reconciliation

Page 18: Table29 Data Validation 95

18