db vs etl test
TRANSCRIPT
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 1/8
Difference between ETL and Database testing0
inShare
0
ETL testing applies to Data Warehouses or Data integration projects while Database
Testing applies to any database holding data (typically transaction systems). Here are the
high level tests done in each:
ETL Testing : Primary goal is to check if the data moved properly as expected
Database Testing : Primary goal is check if the data is following the rules/standardsdefined in the Data Model
ETL Testing :
Verify that the counts in the source and target are matching
Verify that the data is matching between source and target
Verify that the transformed data is as per expectation
Verify that the data is incrementally getting updated
Verify that the foreign - primary key relations are preserved during the ETL
Verify if there are any duplicates in the loaded data
Database Testing :
Verify that the foreign - primary key relations are maintained and there are no orphan
records
Verify that the data in the columns had valid values. eg. If there is a domain (encoded list)
defined for a column, check if the values in the column meet that requirement
Verify that the data in columns is accurate. eg. If you have an age column, does it have
values more than 100
Verify if data is missing in columns where required. eg. If is a column is expected to have
data always, check if there are any null values
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 2/8
Nadia A.
Nadia A. R.
Software Test Engg at Virtus-IT
ETL basically stands for Extract Transform Load - which simply implies the process where
you extract data from Source Tables, transform them in to the desired format based oncertain rules and finally load them onto Target tables. There are numerous tools that help
you with ETL process - Informatica, Control-M being a few notable ones.
So ETL Testing implies - Testing this entire process using a tool or at table level with the help
of test cases and Rules Mapping document.
In ETL Testing, the following are validated -
1) Data File loads from Source system on to Source Tables.
2) The ETL Job that is designed to extract data from Source tables and then move them to
staging tables. (Transform process)3) Data validation within the Staging tables to check all Mapping Rules / Transformation
Rules are followed.
4) Data Validation within Target tables to ensure data is present in required format and
there is no data loss from Source to Target tables.
Siraj
Siraj K.
Module Lead
Both ETL and Database Testing are same..Whatever we achieve with Data Warehouse
can be achieved by Database also...We all know that data warehouse is nothing but the set
of databases which follows some characterstics like : subject oriented,Non
Volatile,Integrated and Time Varient.
The only difference is that1) The data in database is Normalized in Nature and the datra in DWH is Denormalized in
Nature.
2) Database Testing uses ER Model whereas DWH uses Dimensional Modeling technique.
3) Database Testing is done on OLTP system and DWH testing is done on OLAP systems.
4) Database uses only .sql files whereas ETL/DWH can use any type of file to import data.
Except that their is no other difference between them.
Arpita C.Business Technology Analyst at Deloitte U.S. India offices
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 3/8
But people say ETL testing is more valued than Functional testing..Being a ETL tester
myself,I dont find any difference in terms of salary for the 2 types of testing.
In what perspective,we then say ETL testing is more valuable?
Sandeep
Sandeep P.
Associate - Projects at Cognizant Technology Solutions
ETL testing has slight difference to database testing. In ETL testing, we test the
workflows (in informatica tool), the worklets and sessions. We run each session separately
and check the functionality of that session. When an ETL fails , we can backtrack/ gets the
logs and can find the error message. We can test the performance of ETL by running
multiple workflows holding huge data simultaneously (provided these ETL workflows source
the data from same point)
Siraj
Siraj K.
Module Lead
Hey Sandeep, i think you are little bit confused or may be i am wrong. As per your
comment,we test workflows,worklet and session in ETL testing which is different thanDatabase testing. But those workflows,session are doing the same thing which database is
doing.In database testing also, we are testing the functionality as per the requirement given.
I don't think its a valid difference.
I think the only difference is :
* Database testing is done on OLTP systems and ETL testing is done on OLAP systems. We
use OLTP as a source in ETL whereas in Databases OLTP systems are target tables.
* Database testing is done on normalized system but ETL testing is done on Denormalized
systems
* No. of joins ,tables and Index are more in Database testing as compared to ETL
Testing.
* Most important, Data warehouse is used to separate transactions from reports and
databases is transactional and con't use for Reporting and analysis.
* Data is less in databases as comapred to Data warehouse because duplicates can exist in
Data warehouse because of its denormalized stuctures. So, we are testing on huge amount
of data in ETL as compared to databases.
* In database testing, we are testing on a model which is created based on ER modelling but
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 4/8
ETL testing model is created based on Dimensional modelling.
Like this, we have so many difference but all based on normalized and denormalized
concepts. Other than this i don't thing their is any difference in Database and Data
warehouse. Incase, you want to know the more difference,than let me know will tell you
many such difference.
Plz correct me if i was wrong!
Suresh Kumar
Suresh Kumar D.Technical Architect at Hexaware Technologies
In DWH, it deals with historic data (normally more than 2 years of data), which will be
given as in input to ETL process, where the data is being extracted from various source files,
cleansed, transformed and loaded in to target tables, with the help of ETL tools like
Informatica, Ab-initio etc..
As Sandeep said, It is very important to validate the workflow and check whether the
transformation logics are correclty applied on the historical data. Since DWH has huge
amount data, we need to identify Performance bottlenecks (volume testing, stress testing)
by running the workflows multiple times and monitoring the log files for errors/warnings.
In DB, there is no ETL logic involved to load the data in to DB. It deals with current data,
which is entered by the user via any UserInterface screen.
Pawan
Pawan D.
QA Team Lead at NTT DATA Americas
I think there is a lot of confusion among the people in understanding the difference
between Database testing and ETL Testing . Let me explain with an example : Let us say you
walk into a super market and buy some groceries and when you come to the billing counter
, the person will generate a bill by scanning those items . The system he uses to generate
these bills are called as OLTP ( online transaction processing systems) ,similarly if you walk in
to a bank and try to open an account ,your account gets registered in these OLTP systems.
Now , similarly like you there will be many more customers for the bank and also super
market , for a higher management to make some decision on their products such as thesales of some particular product in a particular store , a bank high level management they
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 5/8
may want to upgrade all their silver card customers to gold card customers, to achieve this
all this data from various sources needs to be integrated and dumped in a datawarehouse
which are called as OLAP systems .
If you are doing any testing on OLTP systems it is known as Database testing and if you are
doing any testing on OLAP systems it is known as ETL testing .
In database testing we generally check the following :
1) Whether the expected tables are created with expected fields as per the metadata
described in the data mapping document.
2) Insert , update and delete operations performed on the front end system and check on
the back end database whether the same is reflected
3) Ensure no data truncations.
In etl testing we check for the following :
1) Lets say there are 100 rows in your source , you will verify that whether all this 100 rows
has moved to the target ( ie Row count ) . This is called Data completeness
2) Verify that the data has been integrated properly for all these 100 rows . This is called
Data accuracy.
Apart from the above checks there are many other things which comes into picture like SCD
implementations, null handling etc.
You can contact me on [email protected] , if you need more information
Ponnusamy
Ponnusamy S.
ETL Test Lead at Shaw Communications
I see people are giving very good information about ETL. Here is the information I know.In my view testing is application of testers knowledge, skills and experties with a goal of
finding defects in the system before going to production. For this we need to thoroughly
understand the system and its functions.
The data warehouse concept emerged for strategic decision making at the executive
management level. Business need to analyse and understand the trend for forecasting
about future and to lead the business in right direction. For this they need data collected
over the years in the past by way of many many transactions by/ for customers. These pile
of historical data is sitting in the operational or transactional databases or file systems of
many organisations. These systems processing day-to-day transaction and store the data innormalized database is called OLTP system.
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 6/8
But this is not much useful to the business community as it is for strategic decision making.
These data might be scattered across many different geographical locations, over long
duration generated by historically different systems.
ETL process extracts these heterogeneous data from different sources, validates, cleans
using some business rules and transforms to homogeneous data and load it into the target
database i.e data warehouse. These data is denormalised for easy analytical processing and
reporting.
As a tester we need to validate the data flow between source to target by proving that all
the source data are transformed as per the business rule and available in target database
accurately and completely without any data loss or duplication etc.
Amit
Amit K.
Test Automation Expert
For me..Database testing is a subset of ETL testing.As ETL process involves
Extraction,Transformation and Loading process to be tested.Hence data integrity is checked
between different systems.Different system could be a different database e.g. DB2 and
Oracle.Hence the testing of ETL requires appropriate knowledge of disparate system as well
as appropriate transformation logic involved for testing the same.However database testingis mostly limited to one database with or without transformation.Hence,database testing
can be considered as a part of ETL testing however scope of ETL testing is much beyond the
database testing.
SirajSiraj K.
Module Lead
Amit, database testing is not limited to one database with or without transformation.
We can extract data from different systems also for doing database testing. The only
difference in Database and DWH is the type of data which it contains. Database contains
normalized and transactional data but once we create a Data warehouse, it contains only
denormalized and Analytical data. We use data warehouse to seperate the reports from the
transaction.Data is less in databases as comapred to Data warehouse because duplicates
can exist in Data warehouse because of its denormalized stuctures.
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 7/8
Amit
Amit K.Test Automation Expert
Hi Siraj..I feel there is some gap in understanding...in ETL if source and Target both
system uses relational databse then where are we dealing with denormalized data.My
understanding of ETL is, it requires two(+) system for ETL to work and two(+) system could
be anything.Hence,Normalization and Denormalization of relation should not be considered
as a factor to differentiate between Database and ETL Testing...
Pawan
Pawan D.
QA Team Lead at NTT DATA Americas
Hi Siraj, when extraction happens from different systems it should be termed as ETL
Testing ( because the words extraction and etl are co-related) , because if you would like to
extract data from different systems and load into another database or datawarehouse you
need to use etl tools. Instead if you meant we are integrating the data from different
systems to a consolidated system by using some procedures with out the use ofintermediate tools like Datastage or Informatica then it is called data migration and testing
you do on the system is still called database testing .
Rajeev
Rajeev G.Sr. SVT Member at QUINTILES
I hope this will draw a line between DW Testing & DB testing. I believe ETL Testing is a
subset of DB testing. Both are challenging and important to business. Requires loads of
technical knowledge.
DB Testing
========
-Data is normalized. ER Modelling
-Done on Transactional Data OLTP
-Frequency of Update/Delete/Insert is more. CRUD Operations-SCD are not there
8/13/2019 DB VS ETL Test
http://slidepdf.com/reader/full/db-vs-etl-test 8/8
-History is not checked & maintained
-Data is not used for BI
-Usually used to test data at the source instead of testing using the GUI
-Homogeneous Data
-Smaller scale DB
-Testing approach is different than ETL.
ETL Testing
=========
-Data is de-normalised. Dimensional modelling.
-Done on OLAP. Data is supplied from OLTP.
-Frequency of Update/Delete/Insert is Less. Read Only Operations
-SCD are there.
-History is checked & maintained. More value for Historical data.
-Data is used for BI. Used by business users for decision support.
-Extraction, Transformation, Loading are the major players.-Heterogeneous Data. (Oracle,Flat files, etc etc)
-Large Scale Volume of data.
-Testing approach is different than DBtesting
-ETL tester should know DW concepts.
-Done on OLAP/ODS. OLTP is the feeder database.
-SCD's are there
-Data is mainly used for for BI. Data is used by business users for decision support.
Every Data Warehouse is a Database but every Database is not a Data Warehouse.
Cheers...!!!
Siraj
Siraj K.
Module Lead
Amit, yes it requires two(+) system for ETL to work and two(+) system could be
anything.
You had asked me where are we dealing with denormalized data :
We dealing with denormalized data to seperate the reports from transactions. Data
warehouse contains analytical data.