Data Quality Management in SAP NetWeaver BI - Webinar PowerPoint
TRANSCRIPT
© SAP AG 2006, Data Quality Management in SAP NetWeaver BI / 2
First-Hand Information
In this presentation, we will examine the following topics:
Definition of Data Quality Management with respect to data warehousing in SAP NetWeaver BI
Overview of the tools SAP NetWeaver BI provides to support data quality initiatives
Presentation of new features in the area of Data Quality Management in SAP NetWeaver 2004s
Agenda
Overview Data Quality Management
Data Validation
Error Handling
Error Resolution
Definition of Data Quality
Integrity: between objects / systems
Timeliness: is data up to date?
Relevance: comprehensibility, meaningfulness
Completeness: between objects / systems, on record level
Consistency: plausibility, redundancy
Accuracy: uniformity, uniqueness, certainty, correctness
Data Quality criteria:
  have to be defined by business and IT
  have to be regarded on technical and semantic level
Data Quality Process in SAP NetWeaver BI
Data Validation
  Define data quality and define necessary checks
  Use automated checks in SAP NetWeaver BI and implement customer checks where necessary
  Consider data quality in your Enterprise Data Warehouse design
Error Handling
  Define reaction on errors during data loading
  Retain invalid data for manual or automated correction and subsequent updating to InfoProviders
  Get detailed information on type of error and place where it occurred
Error Resolution
  Resolve data quality issues after data loading
  This can involve deletion, repairing and reloading of data
  This should also include periodic analysis of the data in your BI objects including repair options
Data Validation - Scope
Data validation in SAP NetWeaver BI answers the following questions:
Check what?
  Technical quality
  Semantic quality (business rules)
Check where and when?
  During data loading: in source system, in data transfer, in transformation, in InfoProvider update
  On persistent data: on PSA data, on InfoProvider data
Check how?
  Built-in in data transfer (automatic checks)
  Implemented in transformation (business rule based)
  Scheduled
Data Validation – Checks in SAP NetWeaver BI
Check matrix (columns: Technical, Semantic; legend: Built-in, To be implemented)
Field level
  Data type & conversion exits
  Codepage
  Technical consistency in respect to BI technology (RSRV)
  Plausibility (empty field, plausible value, ...)
  Technical consistency of data contained in BI Objects (RSRV)
  Reconciliation
  Referential integrity & master data check (SID)
Record level
  Plausibility on correlation between characteristics and key figures
Table level
  Records sent = records updated, no aggregation allowed
  Plausibility on aggregation and calculation of multiple data records
  Duplicate records in PSA
  Duplicate master data records
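Two of the checks listed above, duplicate records and referential integrity against master data, can be illustrated with a small Python sketch. This is not SAP code; the record layout, the field names and the `known_customers` set are assumptions made only for illustration.

```python
from collections import Counter

def find_duplicates(records, key_fields):
    """Return the keys that occur more than once in a list of dict records."""
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)

def find_missing_master_data(records, field, master_values):
    """Return the values of `field` that have no matching master data entry."""
    return sorted({rec[field] for rec in records} - set(master_values))

# Hypothetical data package and master data table.
package = [
    {"order": "719", "customer": "C1"},
    {"order": "720", "customer": "C2"},
    {"order": "720", "customer": "C9"},
]
known_customers = {"C1", "C2"}

duplicates = find_duplicates(package, ["order"])
missing = find_missing_master_data(package, "customer", known_customers)
```

Here `duplicates` holds the order key that appears twice and `missing` holds the customer value without master data.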
Data Validation – Built-In Checks
[Figure: Data flow in SAP NetWeaver BI from the source system through the DataSource (BI Service API, others) to the InfoProvider. SAP BW 3.x: Transfer Rules, InfoSource, Update Rules. SAP NetWeaver 2004s: Transformation, InfoSource, Transformation. Built-in checks shown along the flow, grouped into technical checks and semantic checks (based on business rules): data type & conversion exits, codepages, completeness, duplicates in transaction data, duplicates on master data, referential integrity. Individual checks are flagged "Enhanced!" or "New!".]
Focus area: Checking Data Types and Conversion Exits
In SAP NetWeaver 2004s, checks on data type and conversion exits can be enabled per field of the DataSource. The checks cover:
  plausible date fields
  plausible time fields
  character values in data type NUMC fields
  compliance with the ALPHA conversion routine
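The four field-level checks can be pictured with the following Python sketch. It only mimics the idea and is not the SAP implementation: the function names are hypothetical, dates are assumed to be YYYYMMDD strings, times HHMMSS strings, and the ALPHA routine is simplified to "purely numeric values are right-aligned with leading zeros".

```python
from datetime import datetime

def is_numc(value: str) -> bool:
    """NUMC fields may contain digits only."""
    return value.isascii() and value.isdigit()

def is_plausible_date(value: str) -> bool:
    """True if value is an 8-digit YYYYMMDD string naming a real calendar date."""
    if len(value) != 8 or not is_numc(value):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

def is_plausible_time(value: str) -> bool:
    """True if value is a 6-digit HHMMSS string within 00:00:00 to 23:59:59."""
    if len(value) != 6 or not is_numc(value):
        return False
    hours, minutes, seconds = int(value[0:2]), int(value[2:4]), int(value[4:6])
    return hours < 24 and minutes < 60 and seconds < 60

def alpha_input(value: str, length: int) -> str:
    """Simplified ALPHA conversion: pad purely numeric values with leading zeros."""
    stripped = value.strip()
    if is_numc(stripped):
        return stripped.zfill(length)
    return stripped

def is_alpha_compliant(value: str, length: int) -> bool:
    """A value complies if the conversion would leave it unchanged."""
    return value == alpha_input(value, length)
```

For example, `is_alpha_compliant("4711", 10)` is false under these assumptions because the converted form would be `"0000004711"`.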
Data Validation – Checks to be implemented
[Figure: Data flow in SAP NetWeaver BI from the source system through the DataSource (BI Service API, others). SAP BW 3.x: Transfer Rules, InfoSource, Update Rules. SAP NetWeaver 2004s: Transformation, InfoSource, Transformation. Layers shown: PSA, Data Warehouse (Data Acquisition Layer), Data Warehouse (Integration Layer), Data Marts, Operational Reporting. Checks to be implemented along the flow, grouped into technical checks and semantic checks (based on business rules): custom check points, data integrity on PSA, reconciliation & audit trace, customer-defined checks.]
Focus area: Implement semantic checks I
Implement semantic checks in customer transformations (SAP BW 3.x: in the update or transfer rules). These checks can call the monitor and therefore the error handling.
From update rules (BW 3.x): append to table MONITOR. See “How To… Create monitor entries from an update routine”.
From transfer rules (BW 3.x): append to table G_T_ERRORLOG; process single record using field RECORD.
From transformations: append to table MONITOR (for monitoring only) and / or raise an exception (for storage in the error stack).
  Process single record using field RECORD.
  To skip records from processing, raise exception cx_rsrout_skip_record. You can abort the whole data package by raising exception cx_rsrout_abort.
Check on master data completeness
  Master data completeness (on attributes and / or texts) can be checked using Transformations / DTPs (SAP BW 3.x: Export DataSources).
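The control flow of a semantic check in a transformation routine (write a monitor entry, skip a single record, or abort the whole package) can be sketched in Python as follows. Real routines are written in ABAP; the classes `SkipRecord` and `AbortPackage` are hypothetical stand-ins for cx_rsrout_skip_record and cx_rsrout_abort, and the business rules on `quantity` and `unit` are invented for illustration.

```python
class SkipRecord(Exception):
    """Hypothetical stand-in for cx_rsrout_skip_record."""

class AbortPackage(Exception):
    """Hypothetical stand-in for cx_rsrout_abort."""

def check_record(recno, record, monitor):
    """Apply illustrative business rules to one record."""
    if record.get("unit") is None:
        # Assumed rule: a missing unit makes the whole package unusable.
        raise AbortPackage(f"record {recno}: unit missing")
    if record["quantity"] < 0:
        monitor.append(f"record {recno}: negative quantity")
        raise SkipRecord
    if record["quantity"] == 0:
        # Monitoring only: the record is still processed.
        monitor.append(f"record {recno}: zero quantity (warning only)")

def process_package(package):
    """Split a package into valid records and error-stack records."""
    valid, error_stack, monitor = [], [], []
    for recno, record in enumerate(package, start=1):
        try:
            check_record(recno, record, monitor)
        except SkipRecord:
            error_stack.append(record)
        else:
            valid.append(record)
    return valid, error_stack, monitor

package = [
    {"order": "719", "quantity": 12, "unit": "KG"},
    {"order": "720", "quantity": -8, "unit": "KG"},
    {"order": "721", "quantity": 0, "unit": "KG"},
]
valid, error_stack, monitor = process_package(package)
```

In this sketch order 720 ends up in the error stack, order 721 only produces a monitor entry, and an `AbortPackage` raised by any record propagates to the caller and stops the package.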
Focus area: Implement semantic checks II
Use custom check points during extraction
  Write check point data to a separate table and use this for validation. See “How To... Perform Data Load Consistency Checks in BW”.
Build audit trace in data modeling
  Audit traces (source, timestamp, …) allow for trace back to the source system. You use them only on objects in the Data Warehouse Layer, for simplified integrity checks against source systems.
Data integrity checks on data packages in PSA
  Can be achieved using available APIs on PSA.
Build reconciliation procedure
  Depending on the scenario, this can include checking the technical data transfer only, or an additional check on semantics (aggregation, calculation in extraction). See “How To…Validate Data In the InfoCube By Comparing to Data In the PSA” and “How To…Reconcile data between SAP source systems and SAP NetWeaver BI”.
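A check point can be as simple as a record count and a control total captured at extraction time and compared after the load. The sketch below assumes this minimal form; the function names and the `quantity` key figure are hypothetical, and a real check point table would be persisted rather than held in memory.

```python
def check_point(rows, key_figure):
    """Summarize a set of rows as record count and control total."""
    return {"records": len(rows), "total": sum(row[key_figure] for row in rows)}

def compare_check_points(extracted, loaded):
    """Return the names of the check point values that differ."""
    return [name for name in extracted if extracted[name] != loaded[name]]

extracted_rows = [
    {"order": "719", "quantity": 12},
    {"order": "720", "quantity": 8},
    {"order": "721", "quantity": 5},
]
loaded_rows = extracted_rows[:2]  # simulate a record lost during the load

source_cp = check_point(extracted_rows, "quantity")
target_cp = check_point(loaded_rows, "quantity")
differences = compare_check_points(source_cp, target_cp)
```

An empty `differences` list would indicate that count and total match; here both values differ because one record is missing.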
Focus area: Data Reconciliation in SAP NetWeaver BI
[Figure: SAP source system with original DataSource and reconciliation DataSource; SAP NetWeaver BI with DataStore Object, VirtualProvider and MultiProvider.]
Reconciliation DataSource options:
  Generic DataSource
  Original DataSource with direct access
  Reconciliation DataSource in Business Content
First DataSources have been shipped in Business Content for SAP NetWeaver 2004s BI (new in SAP NetWeaver 2004s!)
No customer-defined transformations!
* Can be enhanced by customer-specific exceptions sending proactive alerts when a data quality issue occurs
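Conceptually, the reconciliation compares the loaded data with the data read directly from the source, key by key, and reports deviations. A minimal Python sketch of that comparison follows; the row layout, the `tolerance` parameter and the function name are assumptions, and in the system itself the comparison would be modeled in a query on the MultiProvider rather than coded.

```python
def reconcile(loaded_rows, source_rows, key_field, key_figure, tolerance=0):
    """Compare totals per key; return {key: (loaded, source)} for mismatches."""
    def totals(rows):
        result = {}
        for row in rows:
            result[row[key_field]] = result.get(row[key_field], 0) + row[key_figure]
        return result

    loaded, source = totals(loaded_rows), totals(source_rows)
    mismatches = {}
    for key in sorted(set(loaded) | set(source)):
        loaded_value, source_value = loaded.get(key, 0), source.get(key, 0)
        if abs(loaded_value - source_value) > tolerance:
            mismatches[key] = (loaded_value, source_value)
    return mismatches

# Hypothetical rows: stored data versus data read via direct access.
stored_rows = [
    {"order": "719", "quantity": 12},
    {"order": "720", "quantity": 10},
    {"order": "721", "quantity": 5},
]
direct_access_rows = [
    {"order": "719", "quantity": 12},
    {"order": "720", "quantity": 8},
    {"order": "721", "quantity": 5},
]
alerts = reconcile(stored_rows, direct_access_rows, "order", "quantity")
```

Each entry in `alerts` could then feed an exception or alert; keys missing on one side are treated as zero in this sketch.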
Error Handling in SAP NetWeaver BI
Error Handling in SAP NetWeaver BI consists of:
What to do in case of errors?
  Abort data loading or continue data loading on valid data
  Report or do not report on valid data
  Configure up to which threshold invalid data is acceptable (termination after a certain number of errors)
How to monitor invalid data?
  Show error status in original requests in PSA and in separate error stack
How to correct the invalid data?
  Manual correction or automated correction (to be implemented) of invalid data in error stack (in SAP BW 3.x: error request)
  More complex error resolution scenario involving the source system or PSA or other objects in your Data Warehouse (see more details in next chapter “Error Resolution”)
Recommendation: Correct errors as early as possible in the data flow!
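The threshold idea (continue loading valid data, but terminate once too many invalid records have been seen) can be sketched as follows. The names `load`, `TooManyErrors` and `max_errors` are hypothetical, and the validity rule is an invented example.

```python
class TooManyErrors(Exception):
    """Raised when the number of invalid records exceeds the threshold."""

def load(records, is_valid, max_errors):
    """Separate valid from invalid records; terminate above the threshold."""
    valid, invalid = [], []
    for record in records:
        if is_valid(record):
            valid.append(record)
            continue
        invalid.append(record)
        if len(invalid) > max_errors:
            raise TooManyErrors(
                f"{len(invalid)} invalid records, limit is {max_errors}"
            )
    return valid, invalid

def quantity_not_negative(record):
    """Illustrative validity rule."""
    return record["quantity"] >= 0

records = [{"quantity": 12}, {"quantity": -1}, {"quantity": 5}, {"quantity": -3}]
valid, invalid = load(records, quantity_not_negative, max_errors=2)
```

With `max_errors=2` the load completes with two valid and two invalid records; with `max_errors=1` the same data would terminate the load.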
Error Handling in SAP BW 3.x
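[Figure: An InfoPackage loads data from the source system through the DataSource into the PSA (original request) and on through Transfer Rules, InfoSource and Update Rules into the InfoProvider. Valid records are updated; invalid records go to an error request, where they are corrected manually or automatically and then updated as corrected records.]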
Error Handling in SAP NetWeaver 2004s - I
[Figure: The Data Transfer Process (DTP) reads all records from the DataSource and passes them through Transformation, InfoSource and Transformation. Valid records are updated to the InfoProvider; invalid records are written to the error stack. After manual or automated correction, an Error DTP updates the corrected records.]
No error handling available in InfoPackages
Invalid data can be written to the error stack
Termination after a certain number of errors can be configured (like in SAP BW 3.x)
Error Handling in SAP NetWeaver 2004s - II
[Figure: Data Transfer Process from DataSource to DataStore Object, with error stack, additional key and Error DTP.]
Maintenance of error stack key in Data Transfer Process
  A semantic key can be defined
  More key fields = potentially fewer records in the error stack
  New records with the same key will be filtered out, in the same request and in subsequent requests
  Once a request is deleted in the InfoProvider, the related data records in the error stack are automatically deleted
  If DataStore Objects are connected, their key is taken as initial default
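The effect of the semantic key can be illustrated with a short Python sketch: once a record with a given key is in the error stack, later records with the same key are routed there as well, so a finer key tends to hold back fewer records. The function name, record layout and validity rule are assumptions for illustration only.

```python
def split_by_error_stack(records, key_fields, is_valid, error_keys=None):
    """Route invalid records, and later records sharing their key, to the error stack."""
    error_keys = set(error_keys or ())
    valid, stack = [], []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in error_keys or not is_valid(rec):
            error_keys.add(key)
            stack.append(rec)
        else:
            valid.append(rec)
    # error_keys can be passed to the next request to keep filtering the same keys.
    return valid, stack, error_keys

def quantity_not_negative(rec):
    """Illustrative validity rule."""
    return rec["quantity"] >= 0

records = [
    {"order": "720", "item": "10", "quantity": -8},
    {"order": "720", "item": "20", "quantity": 3},
    {"order": "721", "item": "10", "quantity": 5},
    {"order": "720", "item": "10", "quantity": 8},
]
coarse = split_by_error_stack(records, ["order"], quantity_not_negative)
fine = split_by_error_stack(records, ["order", "item"], quantity_not_negative)
```

With the key `order` alone, three records land in the error stack; with `order` and `item`, only the two records for item 10 of order 720 do.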
Focus Area: Data Transfer Process and Temporary Storage
Temporary storage in the data transfer process for:
  Efficient restart of data transfer process in case of error
    Temporary storage is used for restart in case of complete abort of process
    Reloading of corrected records from the error stack is done using the Error DTP
  Easy monitoring of invalid data records
Activate temporary storage for each sub-step of the Data Transfer Process
Identify the detail level of temporary storage
Configure automatic deletion of temporary storage
Error Handling – Summary
Check the table for the impact of the error handling settings for erroneous records.

Error handling settings (columns):
(1) No error handling (InfoPackage)
(2) No update, no reporting
(3) Update valid records, no reporting
(4) Update valid records, reporting possible

                          (1)    (2)    (3)    (4)
Monitor entry             X      X      X      X
Abort of update           X      X
Update of valid records                 X      X
Marked in temp. storage          X      X      X
Update into error stack                 X      X
Color of request          red    red    red    green
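The three DTP settings in the summary table can be read as a small decision function. The sketch below encodes one reading of that table (abort without update for "no update, no reporting"; update plus error stack for the other two; a green request only when reporting is possible). The enum and function names are hypothetical.

```python
from enum import Enum

class ErrorHandling(Enum):
    NO_UPDATE_NO_REPORTING = 1
    UPDATE_VALID_NO_REPORTING = 2
    UPDATE_VALID_REPORTING = 3

def outcome(setting, valid, invalid):
    """Return (updated_records, error_stack, request_color) for one request."""
    if not invalid:
        # No erroneous records: everything is updated regardless of the setting.
        return list(valid), [], "green"
    if setting is ErrorHandling.NO_UPDATE_NO_REPORTING:
        # Update is aborted; nothing is written to the target or the error stack.
        return [], [], "red"
    color = "green" if setting is ErrorHandling.UPDATE_VALID_REPORTING else "red"
    return list(valid), list(invalid), color

valid_records = [{"order": "719"}, {"order": "721"}]
invalid_records = [{"order": "720"}]
result = outcome(ErrorHandling.UPDATE_VALID_NO_REPORTING, valid_records, invalid_records)
```

In this reading, `result` contains the two valid records as updated, the invalid record in the error stack, and a red request.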
Error Resolution in SAP NetWeaver BI
SAP NetWeaver BI offers the following error resolution options:
During data load (Error Handling):
  Automated correction of invalid data during data loading and transformation (to be implemented)
  Retain invalid data records in the error stack (error request) and correct them manually (or automatically), then transfer the corrected data records from the error stack to the InfoProvider using the Error DTP functionality (created automatically within the DTP maintenance)
After data load:
  Correction of invalid data without deletion
    by scheduling a (full) repair request against "overwriting" DataStore Objects of your Data Warehouse Layer
    by loading (additive) cancellation records to relevant InfoProviders
  (Selective) deletion of invalid data
    and reconstruct (corrected) data from source tables (full upload)
    and reconstruct data from the delta queue (repeat delta update)
    and reconstruct (corrected) data from PSA
    and reconstruct (corrected) data from your Data Warehouse Layer
  Analysis and repair of BI objects (RSRV)
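For additive targets, a correction without deletion means loading a record whose key figure cancels the difference. A minimal Python sketch, assuming a single additive key figure `quantity` and hypothetical function names:

```python
def cancellation_record(wrong, correct, key_figure):
    """Build a record that, added to the loaded value, yields the correct value."""
    record = dict(correct)
    record[key_figure] = correct[key_figure] - wrong[key_figure]
    return record

def additive_total(records, key_figure):
    """Total of an additive key figure over all loaded records."""
    return sum(rec[key_figure] for rec in records)

# Illustrative values: 10 KG was loaded, 8 KG is correct.
loaded = {"order": "720", "quantity": 10}
correct = {"order": "720", "quantity": 8}
delta = cancellation_record(loaded, correct, "quantity")
total = additive_total([loaded, delta], "quantity")
```

The cancellation record carries the difference only, so the original record stays in place and the aggregated value becomes correct.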
Focus area: Overwrite data in SAP NetWeaver 2004s
Data in source system:
Order  Status  Quantity
719    C       12 KG
720    C       8 KG
721    O       5 KG

Data in DataSource / PSA (InfoPackage, full update):
Order  Status  Quantity
720    C       8 KG

Data in DataStore Object (DTP, delta update, aggregation type = "Overwrite"):
Order  Status  Quantity
719    C       12 KG
720    C       10 KG   (invalid data gets overwritten!)
721    O       5 KG
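The overwrite behaviour in this example can be reproduced with a few lines of Python. The dictionary keyed by order number stands in for the DataStore Object and the list for the repaired record in the PSA; both structures and the function name are assumptions for illustration.

```python
def apply_overwrite(dso, records, key_field):
    """Return a copy of dso in which each incoming record replaces the entry with the same key."""
    result = dict(dso)
    for rec in records:
        result[rec[key_field]] = rec
    return result

dso = {
    "719": {"order": "719", "status": "C", "quantity": 12},
    "720": {"order": "720", "status": "C", "quantity": 10},  # invalid value
    "721": {"order": "721", "status": "O", "quantity": 5},
}
psa = [{"order": "720", "status": "C", "quantity": 8}]  # full repair request

repaired = apply_overwrite(dso, psa, "order")
```

Only order 720 is touched; the other entries keep their values because no record with their key arrives.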
Focus area: Analysis and Repair of BI Objects
Transaction RSRV
  Checks the consistency of data and objects stored in SAP NetWeaver BI
  Some tests are capable of repairing inconsistencies and errors
  Individual test packages can be created combining different elementary tests
  Test packages can be scheduled periodically using program RSRV_JOB_RUNNER in process chains
Examples for tests:
  Unauthorized characters in characteristic values
  Check characteristic values with conversion exit
  Consistency of the time dimension for an InfoCube
  PSA duplicate record check
  Analysis of texts of BEx objects for codepage errors
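The idea of a test package, several elementary tests run together with some of them able to repair what they find, can be sketched in Python. This is not how RSRV is implemented; the test names, the allowed character set and the repair rule are all assumptions.

```python
import re

ALLOWED = re.compile(r"[A-Z0-9_ ]*")  # assumed set of permitted characters

def find_invalid_characters(values):
    """Elementary test: values containing characters outside the allowed set."""
    return [v for v in values if not ALLOWED.fullmatch(v)]

def find_lowercase(values):
    """Elementary test: values that contain lowercase letters."""
    return [v for v in values if v != v.upper()]

def repair_lowercase(values):
    """Repair for the lowercase test: convert all values to uppercase."""
    return [v.upper() for v in values]

def run_test_package(tests, values, repair=False):
    """Run each (check, fix) pair in order; optionally apply fixes as findings occur."""
    findings = {}
    for name, (check, fix) in tests.items():
        findings[name] = check(values)
        if repair and fix is not None and findings[name]:
            values = fix(values)
    return findings, values

package = {
    "lowercase": (find_lowercase, repair_lowercase),
    "characters": (find_invalid_characters, None),  # analysis only, no repair
}
values = ["MAT_001", "mat_002", "MAT#3"]
report, unchanged = run_test_package(package, values)
repaired_report, repaired_values = run_test_package(package, values, repair=True)
```

Running with `repair=True` fixes the lowercase value before the character test runs, so that test then reports only the value with the unsupported `#`.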
Summary
Data Quality in Data Warehousing consists of several technical and semantic criteria. It has to be defined on a project basis by the business and IT departments.
SAP NetWeaver BI offers various tools for Data Validation & Error Handling that help you detect and correct invalid data easily.
Your Enterprise Data Warehouse design plays an important role in providing a reliable Data Warehouse Layer for efficient Data Quality Management.
Some situations require more complex error resolution scenarios. If so, choose the appropriate error resolution measure by correcting the error as early as possible in your data flow.
Outlook
Integrate the existing BI based solution capabilities by
  Addressing IQ management integrated into process platform based on the Enterprise Information Management approach to avoid and, if necessary, correct problems as early as possible in the information lifecycle.
  Providing one IQ design time enabling ESA integrated model driven approach and make IQ models shareable/reusable in ALL application domains (process platform, application development etc.). Consolidate this way all existing solutions in the platform in one framework.
Offer IQM as an integrated solution, supporting the complete information control loop by providing necessary:
  models
  processes
  technical capabilities (own / partners)
  tight application integration
Copyright 2006 SAP AG. All Rights Reserved