Data Analytics – Testing Spectrum

Kokila Rudresh Shalini Saini Data Analytics – Testing Spectrum VodQA 2016

Post on 16-Apr-2017

Kokila Rudresh

Shalini Saini

Data Analytics – Testing Spectrum

VodQA 2016

Data Analytics: An Introduction

Collection

Processing

Modelling

Inference

Visualization

Data Analytics: Use Cases

Business Intelligence

Social Networks

Astronomy and Astrophysics

Finance and Stock Market

Medical Imaging

Computer Graphics

Computer Vision

Energy Exploration

Maps

Retail

Data Analytics: Why Testing is Important

Volume

Domain

Complexity

Variety

Computations

Testing

Data Analytics: Testing Challenges

Data Validation

Model Implementation

Business Perspective

Data Analytics: Typical System Implementation

Raw Data (Source Data)

ETL Process (Extract, Transform, Load)

Modelling

Aggregation

Visualization

Data Analytics Testing - Approach

Pre-ETL Validations

Post-ETL Tests

Model Validations

Aggregation Validations

Visualization Validations

Data Analytics - Testing

Pre-ETL Validations

Format

Consistency

Completeness

Pre ETL Testing
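
The three pre-ETL validations (format, consistency, completeness) can be sketched as checks on a single source record before it enters the ETL process. This is a minimal illustration, not the presenters' implementation: the field names, the date format and the `total = quantity * unit_price` rule are all assumptions, and every value is assumed to arrive as a string (for example from a CSV export).

```python
from datetime import datetime

# Hypothetical schema: field names and rules are illustrative assumptions.
REQUIRED_FIELDS = ("order_id", "order_date", "quantity", "unit_price", "total")


def validate_record(record):
    """Return a list of validation problems for one source record."""
    problems = []

    # Completeness: every required field is present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"completeness: missing {field}")
    if problems:
        return problems

    # Format: dates parse as YYYY-MM-DD, numeric fields parse as numbers.
    try:
        datetime.strptime(record["order_date"], "%Y-%m-%d")
    except ValueError:
        problems.append("format: order_date is not YYYY-MM-DD")
    try:
        quantity = int(record["quantity"])
        unit_price = float(record["unit_price"])
        total = float(record["total"])
    except ValueError:
        problems.append("format: non-numeric quantity, unit_price or total")
        return problems

    # Consistency: related fields agree with each other.
    if abs(quantity * unit_price - total) > 0.01:
        problems.append("consistency: total != quantity * unit_price")
    return problems
```

Records that return an empty list pass all three checks; anything else can be rejected or reported before the load starts.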

Data Analytics - Testing

Post-ETL Tests

Meta-data

Data transformation

Data quality checks

Business-specific validations

Post ETL Testing
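
The four post-ETL test areas can be sketched as small functions that compare loaded rows with their source. Everything specific here is an assumption for illustration: the column names, the transformation rule (name concatenation and cents to dollars) and the business rule (no negative amounts) are hypothetical.

```python
def check_metadata(target_rows, expected_columns):
    """Meta-data: every loaded row has exactly the expected columns."""
    return all(set(row) == set(expected_columns) for row in target_rows)


def transform(source_row):
    """Reference version of a hypothetical transformation rule."""
    name = source_row["first_name"].strip() + " " + source_row["last_name"].strip()
    return {"customer": name, "amount_usd": round(source_row["amount_cents"] / 100, 2)}


def check_transformation(source_rows, target_rows):
    """Data transformation: each target row equals the reference transform."""
    return [transform(row) for row in source_rows] == list(target_rows)


def check_quality(target_rows):
    """Data quality: no duplicate rows and no null values."""
    keys = [tuple(sorted(row.items())) for row in target_rows]
    no_duplicates = len(keys) == len(set(keys))
    no_nulls = all(value is not None for row in target_rows for value in row.values())
    return no_duplicates and no_nulls


def check_business_rules(target_rows):
    """Business-specific validation (assumed rule): amounts are never negative."""
    return all(row["amount_usd"] >= 0 for row in target_rows)
```

Keeping each area in its own function makes a failure point directly at metadata, transformation, quality or a business rule.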

Data Analytics - Testing

Model Validations

Implementation

Computation

Model Implementation Testing

Sales = a(Seasonality) + b(Trend) + c(Promotions) + d(Sales Channel) + other factors
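
One way to test a model implementation such as the sales formula above is to compare it with an independently written alternate implementation. The sketch below assumes made-up coefficients a, b, c, d and leaves out the "other factors" term; `predict_sales` is only a stand-in for the real implementation under test.

```python
import math

# Hypothetical coefficients a, b, c, d; real values come from the fitted model.
COEFFS = {"seasonality": 1.5, "trend": 0.8, "promotions": 2.0, "sales_channel": 0.5}


def predict_sales(factors, coeffs=COEFFS):
    """Stand-in for the implementation under test."""
    return sum(coeffs[name] * value for name, value in factors.items())


def predict_sales_reference(factors, coeffs=COEFFS):
    """Independent re-implementation, written term by term from the formula."""
    a, b = coeffs["seasonality"], coeffs["trend"]
    c, d = coeffs["promotions"], coeffs["sales_channel"]
    return (a * factors["seasonality"] + b * factors["trend"]
            + c * factors["promotions"] + d * factors["sales_channel"])


def implementations_agree(cases, tolerance=1e-9):
    """True when both implementations give the same result for every case."""
    return all(
        math.isclose(predict_sales(f), predict_sales_reference(f), abs_tol=tolerance)
        for f in cases
    )
```

Comparing with a tolerance rather than exact equality avoids false failures caused by floating-point rounding differences between the two implementations.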

Data Analytics - Testing

Aggregation Validations

Data Hierarchy

Data Scope

Summarized Values
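
Aggregation validations can be sketched by recomputing each summarized value from the detail rows and comparing it with what the system reports. The row layout (`region`, `city`, `sales`) and the tolerance are assumptions for illustration.

```python
from collections import defaultdict


def recompute_totals(detail_rows, level):
    """Sum sales per value of one hierarchy level (for example region or city)."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[row[level]] += row["sales"]
    return dict(totals)


def validate_summary(detail_rows, reported, level, tolerance=0.01):
    """Return {group: (expected, reported)} for every group that does not match.

    Groups missing from either side are reported too, which covers data scope.
    """
    expected = recompute_totals(detail_rows, level)
    mismatches = {}
    for key in set(expected) | set(reported):
        exp, rep = expected.get(key), reported.get(key)
        if exp is None or rep is None or abs(exp - rep) > tolerance:
            mismatches[key] = (exp, rep)
    return mismatches
```

Running the same check at each level of the hierarchy (city, then region, and so on) verifies that the roll-ups stay consistent with the detail data.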

Data Analytics - Testing

Visualization Validations

Information Representation

Data Format

Result Intuitiveness

Visualization Testing
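
The data format part of visualization validation can be sketched as a comparison between the labels shown on a chart and the underlying values formatted with the agreed display rule. The rule used here (one decimal place plus a percent sign) and the way labels are collected from the chart are assumptions.

```python
def format_label(value, unit="%"):
    """Hypothetical display rule: one decimal place followed by the unit."""
    return f"{value:.1f}{unit}"


def mismatched_labels(underlying, displayed):
    """Return chart categories whose displayed label does not match the data."""
    return sorted(
        category
        for category, value in underlying.items()
        if displayed.get(category) != format_label(value)
    )
```

A check like this covers format only; information representation and result intuitiveness still need review by someone who knows the domain.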

Learnings

ANALYSE

CODE

TEST

Initial Data Flow
• Predefined data template
• Pre-ETL data validations

Domain Knowledge
• KT sessions involving SMEs
• Core computations

Business Involvement
• Test data closer to real-time data
• User flows prioritization

Learnings

Implementation
• Alternate implementation
• SME validation

Computation
• Addressing the right problem
• Computational factors

Learnings

Testing Process
• Step-wise data validation
• Defect investigation

Test Automation
• Data combinations
• XML test data

Test Execution
• CI test execution
• Execution frequency

Testing Tools
• SpreadsheetGear
• Excel macros
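
The "data combinations" and "XML test data" points under Test Automation can be sketched together: generate every combination of input values and serialize the cases to XML for a test runner to consume. The dimension names, their values and the XML element names are hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical input dimensions; real ones come from the domain.
DIMENSIONS = {
    "channel": ["online", "store"],
    "promotion": ["none", "discount"],
    "season": ["peak", "off-peak"],
}


def data_combinations(dimensions):
    """Yield one dict per combination of dimension values."""
    names = list(dimensions)
    for values in itertools.product(*(dimensions[name] for name in names)):
        yield dict(zip(names, values))


def to_xml(cases):
    """Serialize test cases to an XML string (element names are illustrative)."""
    root = ET.Element("testcases")
    for case in cases:
        ET.SubElement(root, "case", attrib=case)
    return ET.tostring(root, encoding="unicode")
```

The number of cases grows multiplicatively with each dimension, so larger input spaces usually call for pruning to the combinations the business considers most important.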

Summary

Domain Context

Integrating Business Use-cases

Design and Testing Challenges

Testing Approach

Learnings
