db vs etl test

9
 Difference between ETL and Database testing 0 inShare 0 ETL testing applies to Data Warehouses or Data integration projects while Database Testing applies to any database holding data (typically transaction systems). Here are the high level tests done in each: ETL Testing : Primary goal is to check if the data moved properly as expected Database Testing : Primary goal is check if the data is following the rules/standards defined in the Data Model ETL Testing :  Verify that the counts in the source and target are matching  Verify that the data is matching between source and target  Verify that the transformed data is as per expectation  Verify that the data is incrementally getting updated  Verify that the foreign - primary key relations are preserved during the ETL  Verify if there are any duplicates in the loaded data Database Testing :  Verify that the foreign - primary key relations are maintained and there are no orphan records  Verify that the data in the columns had valid values. eg. If there is a domain (encoded list) defined for a column, check if the values in the column meet that requirement  Verify that the data in columns is accurate. eg. If you have an age column, does it have values more than 100  Verify if data is missing in columns where required. eg. If is a column is expected to have data always, check if there are any null values  

Upload: 1raju1234

Post on 03-Jun-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 1/8

 

Difference between ETL and Database testing0

inShare

0

ETL testing applies to Data Warehouses or Data integration projects while Database

Testing applies to any database holding data (typically transaction systems). Here are the

high level tests done in each:

ETL Testing : Primary goal is to check if the data moved properly as expected

Database Testing : Primary goal is check if the data is following the rules/standardsdefined in the Data Model

ETL Testing :

  Verify that the counts in the source and target are matching

  Verify that the data is matching between source and target

  Verify that the transformed data is as per expectation

  Verify that the data is incrementally getting updated

  Verify that the foreign - primary key relations are preserved during the ETL

  Verify if there are any duplicates in the loaded data

Database Testing :

  Verify that the foreign - primary key relations are maintained and there are no orphan

records

  Verify that the data in the columns had valid values. eg. If there is a domain (encoded list)

defined for a column, check if the values in the column meet that requirement

  Verify that the data in columns is accurate. eg. If you have an age column, does it have

values more than 100

  Verify if data is missing in columns where required. eg. If is a column is expected to have

data always, check if there are any null values

 

Page 2: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 2/8

Nadia A.

Nadia A. R.

Software Test Engg at Virtus-IT

ETL basically stands for Extract Transform Load - which simply implies the process where

you extract data from Source Tables, transform them in to the desired format based oncertain rules and finally load them onto Target tables. There are numerous tools that help

you with ETL process - Informatica, Control-M being a few notable ones.

So ETL Testing implies - Testing this entire process using a tool or at table level with the help

of test cases and Rules Mapping document.

In ETL Testing, the following are validated -

1) Data File loads from Source system on to Source Tables.

2) The ETL Job that is designed to extract data from Source tables and then move them to

staging tables. (Transform process)3) Data validation within the Staging tables to check all Mapping Rules / Transformation

Rules are followed.

4) Data Validation within Target tables to ensure data is present in required format and

there is no data loss from Source to Target tables.

 

 Siraj

Siraj K.

Module Lead

Both ETL and Database Testing are same..Whatever we achieve with Data Warehouse

can be achieved by Database also...We all know that data warehouse is nothing but the set

of databases which follows some characterstics like : subject oriented,Non

Volatile,Integrated and Time Varient.

The only difference is that1) The data in database is Normalized in Nature and the datra in DWH is Denormalized in

Nature.

2) Database Testing uses ER Model whereas DWH uses Dimensional Modeling technique.

3) Database Testing is done on OLTP system and DWH testing is done on OLAP systems.

4) Database uses only .sql files whereas ETL/DWH can use any type of file to import data.

Except that their is no other difference between them.

Arpita C.Business Technology Analyst at Deloitte U.S. India offices

Page 3: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 3/8

But people say ETL testing is more valued than Functional testing..Being a ETL tester

myself,I dont find any difference in terms of salary for the 2 types of testing.

In what perspective,we then say ETL testing is more valuable?

 Sandeep

Sandeep P.

Associate - Projects at Cognizant Technology Solutions

ETL testing has slight difference to database testing. In ETL testing, we test the

workflows (in informatica tool), the worklets and sessions. We run each session separately

and check the functionality of that session. When an ETL fails , we can backtrack/ gets the

logs and can find the error message. We can test the performance of ETL by running

multiple workflows holding huge data simultaneously (provided these ETL workflows source

the data from same point)

 Siraj

Siraj K.

Module Lead

Hey Sandeep, i think you are little bit confused or may be i am wrong. As per your

comment,we test workflows,worklet and session in ETL testing which is different thanDatabase testing. But those workflows,session are doing the same thing which database is

doing.In database testing also, we are testing the functionality as per the requirement given.

I don't think its a valid difference.

I think the only difference is :

* Database testing is done on OLTP systems and ETL testing is done on OLAP systems. We

use OLTP as a source in ETL whereas in Databases OLTP systems are target tables.

* Database testing is done on normalized system but ETL testing is done on Denormalized

systems

* No. of joins ,tables and Index are more in Database testing as compared to ETL

Testing.

* Most important, Data warehouse is used to separate transactions from reports and

databases is transactional and con't use for Reporting and analysis.

* Data is less in databases as comapred to Data warehouse because duplicates can exist in

Data warehouse because of its denormalized stuctures. So, we are testing on huge amount

of data in ETL as compared to databases.

* In database testing, we are testing on a model which is created based on ER modelling but

Page 4: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 4/8

ETL testing model is created based on Dimensional modelling.

Like this, we have so many difference but all based on normalized and denormalized

concepts. Other than this i don't thing their is any difference in Database and Data

warehouse. Incase, you want to know the more difference,than let me know will tell you

many such difference.

Plz correct me if i was wrong!

 Suresh Kumar

Suresh Kumar D.Technical Architect at Hexaware Technologies

In DWH, it deals with historic data (normally more than 2 years of data), which will be

given as in input to ETL process, where the data is being extracted from various source files,

cleansed, transformed and loaded in to target tables, with the help of ETL tools like

Informatica, Ab-initio etc..

As Sandeep said, It is very important to validate the workflow and check whether the

transformation logics are correclty applied on the historical data. Since DWH has huge

amount data, we need to identify Performance bottlenecks (volume testing, stress testing)

by running the workflows multiple times and monitoring the log files for errors/warnings.

In DB, there is no ETL logic involved to load the data in to DB. It deals with current data,

which is entered by the user via any UserInterface screen.

 Pawan

Pawan D.

QA Team Lead at NTT DATA Americas

I think there is a lot of confusion among the people in understanding the difference

between Database testing and ETL Testing . Let me explain with an example : Let us say you

walk into a super market and buy some groceries and when you come to the billing counter

, the person will generate a bill by scanning those items . The system he uses to generate

these bills are called as OLTP ( online transaction processing systems) ,similarly if you walk in

to a bank and try to open an account ,your account gets registered in these OLTP systems.

Now , similarly like you there will be many more customers for the bank and also super

market , for a higher management to make some decision on their products such as thesales of some particular product in a particular store , a bank high level management they

Page 5: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 5/8

may want to upgrade all their silver card customers to gold card customers, to achieve this

all this data from various sources needs to be integrated and dumped in a datawarehouse

which are called as OLAP systems .

If you are doing any testing on OLTP systems it is known as Database testing and if you are

doing any testing on OLAP systems it is known as ETL testing .

In database testing we generally check the following :

1) Whether the expected tables are created with expected fields as per the metadata

described in the data mapping document.

2) Insert , update and delete operations performed on the front end system and check on

the back end database whether the same is reflected

3) Ensure no data truncations.

In etl testing we check for the following :

1) Lets say there are 100 rows in your source , you will verify that whether all this 100 rows

has moved to the target ( ie Row count ) . This is called Data completeness

2) Verify that the data has been integrated properly for all these 100 rows . This is called

Data accuracy.

Apart from the above checks there are many other things which comes into picture like SCD

implementations, null handling etc.

You can contact me on [email protected] , if you need more information

 Ponnusamy

Ponnusamy S.

ETL Test Lead at Shaw Communications

I see people are giving very good information about ETL. Here is the information I know.In my view testing is application of testers knowledge, skills and experties with a goal of

finding defects in the system before going to production. For this we need to thoroughly

understand the system and its functions.

The data warehouse concept emerged for strategic decision making at the executive

management level. Business need to analyse and understand the trend for forecasting

about future and to lead the business in right direction. For this they need data collected

over the years in the past by way of many many transactions by/ for customers. These pile

of historical data is sitting in the operational or transactional databases or file systems of

many organisations. These systems processing day-to-day transaction and store the data innormalized database is called OLTP system.

Page 6: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 6/8

 

But this is not much useful to the business community as it is for strategic decision making.

These data might be scattered across many different geographical locations, over long

duration generated by historically different systems.

ETL process extracts these heterogeneous data from different sources, validates, cleans

using some business rules and transforms to homogeneous data and load it into the target

database i.e data warehouse. These data is denormalised for easy analytical processing and

reporting.

As a tester we need to validate the data flow between source to target by proving that all

the source data are transformed as per the business rule and available in target database

accurately and completely without any data loss or duplication etc.

 Amit

Amit K.

Test Automation Expert

For me..Database testing is a subset of ETL testing.As ETL process involves

Extraction,Transformation and Loading process to be tested.Hence data integrity is checked

between different systems.Different system could be a different database e.g. DB2 and

Oracle.Hence the testing of ETL requires appropriate knowledge of disparate system as well

as appropriate transformation logic involved for testing the same.However database testingis mostly limited to one database with or without transformation.Hence,database testing

can be considered as a part of ETL testing however scope of ETL testing is much beyond the

database testing.

 SirajSiraj K.

Module Lead

Amit, database testing is not limited to one database with or without transformation.

We can extract data from different systems also for doing database testing. The only

difference in Database and DWH is the type of data which it contains. Database contains

normalized and transactional data but once we create a Data warehouse, it contains only

denormalized and Analytical data. We use data warehouse to seperate the reports from the

transaction.Data is less in databases as comapred to Data warehouse because duplicates

can exist in Data warehouse because of its denormalized stuctures.

Page 7: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 7/8

 Amit

Amit K.Test Automation Expert

Hi Siraj..I feel there is some gap in understanding...in ETL if source and Target both

system uses relational databse then where are we dealing with denormalized data.My

understanding of ETL is, it requires two(+) system for ETL to work and two(+) system could

be anything.Hence,Normalization and Denormalization of relation should not be considered

as a factor to differentiate between Database and ETL Testing...

 Pawan

Pawan D.

QA Team Lead at NTT DATA Americas

Hi Siraj, when extraction happens from different systems it should be termed as ETL

Testing ( because the words extraction and etl are co-related) , because if you would like to

extract data from different systems and load into another database or datawarehouse you

need to use etl tools. Instead if you meant we are integrating the data from different

systems to a consolidated system by using some procedures with out the use ofintermediate tools like Datastage or Informatica then it is called data migration and testing

you do on the system is still called database testing .

 Rajeev

Rajeev G.Sr. SVT Member at QUINTILES

I hope this will draw a line between DW Testing & DB testing. I believe ETL Testing is a

subset of DB testing. Both are challenging and important to business. Requires loads of

technical knowledge.

DB Testing

========

-Data is normalized. ER Modelling

-Done on Transactional Data OLTP

-Frequency of Update/Delete/Insert is more. CRUD Operations-SCD are not there

Page 8: DB VS ETL Test

8/13/2019 DB VS ETL Test

http://slidepdf.com/reader/full/db-vs-etl-test 8/8

-History is not checked & maintained

-Data is not used for BI

-Usually used to test data at the source instead of testing using the GUI

-Homogeneous Data

-Smaller scale DB

-Testing approach is different than ETL.

ETL Testing

=========

-Data is de-normalised. Dimensional modelling.

-Done on OLAP. Data is supplied from OLTP.

-Frequency of Update/Delete/Insert is Less. Read Only Operations

-SCD are there.

-History is checked & maintained. More value for Historical data.

-Data is used for BI. Used by business users for decision support.

-Extraction, Transformation, Loading are the major players.-Heterogeneous Data. (Oracle,Flat files, etc etc)

-Large Scale Volume of data.

-Testing approach is different than DBtesting

-ETL tester should know DW concepts.

-Done on OLAP/ODS. OLTP is the feeder database.

-SCD's are there

-Data is mainly used for for BI. Data is used by business users for decision support.

Every Data Warehouse is a Database but every Database is not a Data Warehouse.

Cheers...!!!

 Siraj

Siraj K.

Module Lead

Amit, yes it requires two(+) system for ETL to work and two(+) system could be

anything.

You had asked me where are we dealing with denormalized data :

We dealing with denormalized data to seperate the reports from transactions. Data

warehouse contains analytical data.