3 reasons data virtualization matters in your portfolio

DATA VIRTUALIZATION PACKED LUNCH WEBINAR SERIES: Sessions Covering Key Data Integration Challenges Solved with Data Virtualization

Upload: denodo

Post on 23-Jan-2018

Next session

3 Reasons Data Virtualization Matters in Your Portfolio

Thursday, November 16th, 2017 | 11:00am PT | 2:00pm ET

Alberto Pan

Denodo’s CTO

Pablo Alvarez

Denodo’s Director of Product Management

Paul Moxon

Denodo’s Data Architectures & Chief Evangelist

The Challenges with Modern Data Architectures

Data Integration – “The Way We Were…”

Operational Data Stores | Staging Area | Data Warehouse | Data Marts | Analytics and Reporting

ETL | ETL | ETL

Data Integration – A Modern Data Ecosystem

The Data Integration Challenge

Manually access different systems

IT responds with point-to-point data integration

Takes too long to get answers to business users

Consumers: Marketing, Sales, Executive, Support

Sources: Database, Apps, Warehouse, Cloud, Big Data, Documents, Apps, NoSQL

“Data bottlenecks create business bottlenecks.”

– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015

The Solution – A Data Abstraction Layer

Abstracts access to disparate data sources

Acts as a single repository (virtual)

Makes data available in real-time to consumers

DATA ABSTRACTION LAYER

“Enterprise architects must revise their data architecture to meet the demand for fast data.”

– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester Research, Dec 16, 2015
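
The idea of a data abstraction layer can be sketched in a few lines of Python. This is a minimal illustration only, not the Denodo Platform API: the `AbstractionLayer` class, the view names, and the two toy sources (an in-memory SQLite table and a plain list standing in for an application) are all assumptions made for the example. Consumers ask the layer for a named view, and the layer fetches rows from the underlying source on demand instead of copying them into a repository.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class AbstractionLayer:
    """Hypothetical virtual layer: maps view names to on-demand fetch functions."""

    def __init__(self) -> None:
        self._views: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, fetch: Callable[[], Iterable[Row]]) -> None:
        # Store only the way to reach the data, never a copy of the data.
        self._views[name] = fetch

    def query(self, name: str) -> List[Row]:
        # Rows are read from the source at query time.
        return list(self._views[name]())


# Source 1 (assumed): a relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
db.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])


def customers_from_db() -> Iterable[Row]:
    for cust_id, name in db.execute("SELECT id, name FROM customer ORDER BY id"):
        yield {"id": cust_id, "name": name}


# Source 2 (assumed): an application exposing support tickets.
TICKETS: List[Row] = [{"customer_id": 1, "status": "open"}]

layer = AbstractionLayer()
layer.register("customer", customers_from_db)
layer.register("ticket", lambda: TICKETS)

print(layer.query("customer"))
print(layer.query("ticket"))
```

Because the layer holds fetch functions rather than data, a row added to either source is visible on the next query without any reload step.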

Denodo Data Virtualization Architecture

Data Virtualization Reference Architecture

Summary

• Modern Data Architectures are much more complex than the architectures of just 10 years ago

• Replicating (copying) data into a central repository doesn’t work at this scale or complexity

• Data Virtualization can provide access to all of your data, in real-time, and supporting self-service with a common data model (in the context of the business users)

• Let’s find out how…

Logical Data Warehouse

“The Logical Data Warehouse (LDW) is a new data management architecture for analytics combining the strengths of traditional repository warehouses with alternative data management and access strategy.”

Gartner Hype Cycle for Enterprise Information Management, 2012

The State and Future of Data Integration. Gartner, 25 May 2016

Physical data movement architectures that aren’t designed to support the dynamic nature of business change, volatile requirements and massive data volume are increasingly being replaced by data virtualization.

Evolving approaches (such as the use of LDW architectures) include implementations beyond repository-centric techniques.

DW + Cloud dimensional data

Time Dimension | Fact table (sales) | Product Dimension

Customer Dimension

CRM

SFDC Customer

EDW

Multiple DW integration

Time Dimension

Sales fact

Product Dimension

Region

Finance EDW

City

Marketing EDW

Customer Fidelity facts

Product Dimension

*Real Examples: Nationwide POC, IBM tests

Store

DW Historical offloading

Horizontal partitioning

Time Dimension | Fact table (sales) | Product Dimension

Retailer Dimension

Current Sales | Historical Sales

EDW
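
Horizontal partitioning of the sales fact table can be illustrated with a small sketch. Everything here is an assumption made for the example: the cutoff date, the table layout, and the two in-memory SQLite databases standing in for the EDW (current sales) and an offload store (historical sales). The point is that a single logical query is routed only to the partitions whose date range it overlaps, and the partial results are combined.

```python
import sqlite3
from datetime import date
from typing import List, Tuple

# Assumed boundary: rows dated before CUTOFF live in the historical store.
CUTOFF = date(2017, 1, 1)

edw = sqlite3.connect(":memory:")      # stand-in for the EDW (current sales)
archive = sqlite3.connect(":memory:")  # stand-in for the offload store (historical sales)
for conn in (edw, archive):
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
edw.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2017-03-01", 100.0), ("2017-06-15", 50.0)],
)
archive.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2015-02-01", 70.0), ("2016-11-30", 30.0)],
)


def total_sales(start: date, end: date) -> Tuple[float, List[str]]:
    """Total sales for sale_date in [start, end), plus the partitions queried."""
    targets = []
    if start < CUTOFF:   # the range reaches into historical data
        targets.append(("archive", archive))
    if end > CUTOFF:     # the range reaches into current data
        targets.append(("edw", edw))
    total = 0.0
    for _, conn in targets:
        (part,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales "
            "WHERE sale_date >= ? AND sale_date < ?",
            (start.isoformat(), end.isoformat()),
        ).fetchone()
        total += part
    return total, [name for name, _ in targets]


print(total_sales(date(2017, 1, 1), date(2018, 1, 1)))  # touches only the EDW
print(total_sales(date(2016, 1, 1), date(2018, 1, 1)))  # touches both partitions
```

A query limited to current data never reaches the historical store, which is what makes offloading old rows transparent to the consumer.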

Summary

▪ “The LDW is an evolution and augmentation of DW practices, not a replacement”

▪ “A repository-only style DW contains a single ontology/taxonomy, whereas in the LDW a semantic layer can contain many combinations of use cases, many business definitions of the same information”

▪ “The LDW permits an IT organization to make a large number of datasets available for analysis via query tools and applications.”

Query Optimization in the Logical Data Warehouse

Gartner, Magic Quadrant for Data Integration, 2017

The Denodo Platform ... incorporates dynamic query optimization as a key value point. This capability includes support for cost-based optimization specifically for high data volume and complexity;... it has also added an in-memory data grid with Massively Parallel Processing (MPP) architecture to its platform.

Query Optimization: Example (1)

Obtain Total Sales By Customer Country in the Last Two Years

Naive Strategy (BI Tools, BDI Tools, Simple federation engines):

Operators: join, union, group by

Sources: Customers (3M), Sales this year (290M), Sales previous years (3B)

Rows retrieved: 290M rows (sales this year), 300M rows (sales previous year), 3M rows (customers)

593M rows through the network

Query Optimization: Example (2)

Obtain Total Sales By Customer Country in the Last Two Years

Denodo Strategy

Operators: join, union, group by

Pushed down to each sales source: group by customer

Sources: Customers (3M), Sales this year (290M), Sales previous years (3B)

Rows retrieved: 3M rows (sales by customer this year), 3M rows (sales by customer previous year), 3M rows (customers)

9M rows through the network
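
The difference between the two strategies can be reproduced on toy data. The sketch below is an illustration under stated assumptions, not Denodo's optimizer: the three in-memory collections stand in for the customer and sales sources, and "rows moved" is simply the number of rows each strategy would pull from them. Both strategies return the same totals by country, but the pushdown version moves at most one row per customer from each sales source. On the slide's figures, the same rewrite cuts the transfer from 593M rows to 9M rows.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy stand-ins for the three sources (sizes are illustrative only).
customers: Dict[int, str] = {1: "US", 2: "US", 3: "ES"}  # customer_id -> country
sales_this_year: List[Tuple[int, float]] = [(1, 10.0), (1, 5.0), (2, 7.0), (3, 1.0)]
sales_prev_years: List[Tuple[int, float]] = [(1, 2.0), (3, 4.0), (3, 6.0), (2, 8.0), (2, 1.0)]


def naive() -> Tuple[Dict[str, float], int]:
    """Pull every sales row, then union, join and group in the middle layer."""
    moved = len(sales_this_year) + len(sales_prev_years) + len(customers)
    totals: Dict[str, float] = defaultdict(float)
    for cust_id, amount in sales_this_year + sales_prev_years:
        totals[customers[cust_id]] += amount
    return dict(totals), moved


def group_by_customer(rows: List[Tuple[int, float]]) -> Dict[int, float]:
    """Work delegated to a sales source: one partial total per customer."""
    partial: Dict[int, float] = defaultdict(float)
    for cust_id, amount in rows:
        partial[cust_id] += amount
    return dict(partial)


def pushdown() -> Tuple[Dict[str, float], int]:
    """Each sales source pre-aggregates; only partial totals cross the network."""
    this_year = group_by_customer(sales_this_year)
    prev_years = group_by_customer(sales_prev_years)
    moved = len(this_year) + len(prev_years) + len(customers)
    totals: Dict[str, float] = defaultdict(float)
    for partial in (this_year, prev_years):      # union of the partial results
        for cust_id, amount in partial.items():  # join, then group by country
            totals[customers[cust_id]] += amount
    return dict(totals), moved


naive_totals, naive_moved = naive()
pushed_totals, pushed_moved = pushdown()
print(naive_totals, naive_moved)
print(pushed_totals, pushed_moved)
```

The rewrite is valid because a sum grouped by country can be computed from sums grouped by customer: every customer belongs to exactly one country, so the partial totals lose no information the final aggregation needs.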

Query Optimization: Example (3)

Aggregation pushdown: group by customer (at each sales source)

Integrated MPP processing: join, union, group by

Rows retrieved: 3M rows (sales by customer this year), 3M rows (sales by customer previous year), 3M rows (customers)

System        Execution Time  Optimization Technique
No Rewriting  20 min          None
Denodo 6      51 sec          Aggregation push-down
Denodo 7      13 sec          Aggregation push-down + MPP integration
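
To put the execution times on one scale, the short calculation below converts them to seconds and derives the speedup over the unoptimized run. It uses only the three timings reported on the slide; the dictionary and variable names are illustrative.

```python
# Execution times from the slide, converted to seconds.
timings_s = {"No Rewriting": 20 * 60, "Denodo 6": 51, "Denodo 7": 13}

baseline = timings_s["No Rewriting"]
speedup = {system: baseline / seconds for system, seconds in timings_s.items()}

for system, factor in speedup.items():
    print(f"{system}: {timings_s[system]} s, {factor:.1f}x vs. no rewriting")
```

On these figures, aggregation push-down alone is roughly 23.5 times faster than no rewriting, and adding MPP integration brings that to roughly 92 times.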

Query Optimization: Summary

▪ You can achieve excellent performance in Logical Analytics Architectures.

▪ Key techniques needed:

  ▪ Advanced Dynamic Optimization to minimize network traffic and leverage the power of data sources

  ▪ In-memory MPP processing to speed operations at the DV layer

  ▪ Advanced incremental caching for reusing commonly used data and complex calculations

Universal Semantic Layer

The Promise of Self-Service Initiatives

• Let business users access the data that they need and stop IT being a bottleneck

• That’s the vision as sold by many BI tool vendors

• i.e. give me the tools and access to the data and stand back ☺

Self-Service Issues…

• Tools are designed for data analysts (or power users)

• Users who are happy finding, wrangling, cleansing data

• Creating calculations, aggregations within the data

• What about the other business users?

• People who don’t want to spend hours fighting the spreadsheet…

• Will they use common definitions for key business entities and metrics?

• Or will they pick and choose their own?

• Ultimately, can you trust the numbers?

• Where did the data come from? How has it been manipulated?

Gartner predicts that by 2018 most business users will have access to self-service tools, but that only one in 10 initiatives will be sufficiently well-governed to avoid data inconsistencies that negatively impact the business.

– Rob van der Meulen, Gartner

Self-Service with Guardrails

• Don’t build just for the ‘data cowboys’

• Create a common and consistent semantic layer

• Everyone is using the same definitions and metrics

• Create pre-integrated, pre-calculated data services

• Saves the user having to do this themselves

• Ensures consistency of calculations, etc.

• But allow the cowboys to ‘roam and wrangle’

• Even the cowboys can only access ‘approved’ data sources
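
The guardrails above can be sketched as a tiny semantic layer. This is a hypothetical illustration, not a Denodo feature description: the `Metric` class, the `gross_margin` definition, and the sample rows are assumptions made for the example. Every user evaluates the same shared metric definition, and even ad hoc access goes through a list of approved sources.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    """A shared business definition: one name, one formula for everyone."""
    name: str
    description: str
    compute: Callable[[List[Row]], float]


def _gross_margin(rows: List[Row]) -> float:
    revenue = sum(r["revenue"] for r in rows)
    cost = sum(r["cost"] for r in rows)
    return (revenue - cost) / revenue


# Approved sources (sample rows are made up for the example).
APPROVED_SOURCES: Dict[str, List[Row]] = {
    "sales": [{"revenue": 120.0, "cost": 80.0}, {"revenue": 200.0, "cost": 150.0}],
}

METRICS: Dict[str, Metric] = {
    "gross_margin": Metric(
        name="gross_margin",
        description="(total revenue - total cost) / total revenue",
        compute=_gross_margin,
    ),
}


def read_source(name: str) -> List[Row]:
    """Ad hoc access is allowed, but only to approved sources."""
    if name not in APPROVED_SOURCES:
        raise PermissionError(f"source {name!r} is not approved")
    return APPROVED_SOURCES[name]


def evaluate(metric_name: str, source_name: str) -> float:
    """Pre-calculated service: users do not re-implement the formula."""
    return METRICS[metric_name].compute(read_source(source_name))


print(evaluate("gross_margin", "sales"))
```

Power users can still call `read_source` and explore the rows directly, while everyone else calls `evaluate` and gets the same number from the same definition.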

Self-Service Architecture

Indiana University – Decision Support Initiative

• Multi-campus public university system in the state of Indiana

• 110,000 students, 8,700 academic staff, 9 campuses statewide

• DSI Goal: To provide timely, relevant, and accurate data to decision makers within the University system

• Turning disparate data into actionable information

• DSI portal provides a ‘one stop shop’ for key data

• Prepackaged data sets available for users

• Role-based access

• Data provisioned through Denodo Platform

• http://dsi.iu.edu

Indiana University – Decision Support Initiative

Summary

The Benefits of Data Virtualization

Complete enterprise information, combining Web, cloud, streaming, and structured data

ROI realization within 6 months, with the flexibility to adjust to unforeseen changes

An 80% reduction in integration costs, in terms of resources and technology

Real-time integration and data access, enabling faster business decisions

“Get it Real-time and Get it Fast!”

Q&A

Next steps

Download Denodo Express: www.denodoexpress.com

Access Denodo Platform on AWS:

www.denodo.com/en/denodo-platform/denodo-platform-for-aws

Thank you!

© Copyright Denodo Technologies and Daman, Inc. All rights reserved.

Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies and Daman, Inc.