Webinar - Data Warehouse Augmentation: Cut Costs, Increase Power

Data Warehouse Augmentation: Cut Costs, Increase Power (October 26, 2016)

Upload: zaloni

Post on 21-Jan-2017


Category: Data & Analytics



TRANSCRIPT

Page 1: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Data Warehouse Augmentation: Cut Costs, Increase Power

October 26, 2016

Page 2: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

• Award-winning provider of enterprise data lake management solutions:
  • Integrated data lake management platform
  • Self-service catalog and data preparation

• Data Lake Design and Implementation Services: POC, Pilot, Production, Operations, Training

• Data Science Professional Services

Page 3: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

3 Zaloni Proprietary

About our speakers

Pradeep Varadan, Verizon Wireline, OSS Data Science Leader

Varadan is a data scientist and enterprise architect who specializes in data challenges within telecommunications. He is tasked with providing a competitive edge by using data analytics to drive effective decision-making. He is skilled in creating systems that help people understand and make better decisions about rapid technology shifts, customer lifestyle and behavior trends, and relevant changes that impact the Verizon Network.

Scott Gidley, Zaloni, VP Product Management

Gidley is responsible for the strategy and roadmap of existing and future products within the Zaloni portfolio. He is a nearly 20-year veteran of the data management software and services market. Prior to joining Zaloni, he served as senior director of product management at SAS and was previously CTO and cofounder of DataFlux Corporation.

Page 4: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Zaloni Confidential and Proprietary - Provided under NDA


Current state of a corporate data flow architecture

[Diagram labels: Data Generators, Machines, Data Channels, Data Stores, Warehouses, Marts, Repositories, BI/Reporting]


Page 5: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Key Challenges

Business Challenges:
• Increased processing time/reduced response
• Lack of data lineage/lack of visibility
• Constant CapEx for hardware upgrade
• Lack of access to history

IT Challenges:
• Multiple data transfers
• Multiple technology platforms with data copies
• Constant performance tuning for CPU
• Manual data offload for space management

Page 6: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


[Diagram: resource consumption across the data flow. Labels: Sources, ETL, Staging, Warehouse, Report Mart, ELT/Reporting/Mining, Data Discovery, Analytics, BI]


Page 7: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Typical utilization of RDBMS resources

We expend almost all CPU on low-business-value ETL

[Chart: workloads plotted by Business Value against CPU. *Size indicates frequency of use]

• ETL to Stage
• Auditing (Landing tables query)
• Data Mining (Staging query)
• Ad-hoc Analysis (Warehouse query)
• ETL to Warehouse
• ETL to Reporting
• Reporting (Presentation table query)


Page 8: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


~80% of system capacity used for batch processing (ELT)


Page 9: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Reduce cost of ELT/ETL by offloading to Hadoop
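
As a rough illustration of the offloading argument, the arithmetic can be sketched as follows. The ~80% figure is the one quoted in this deck; the fraction of ELT work actually moved to Hadoop is a hypothetical input, and the function name `warehouse_capacity_freed` is an assumption made for this sketch, not something presented in the webinar.

```python
def warehouse_capacity_freed(elt_share: float, offload_fraction: float) -> float:
    """Return the share of total warehouse capacity freed by offloading ELT.

    elt_share: fraction of system capacity spent on batch ELT (deck figure: ~0.80).
    offload_fraction: hypothetical fraction of that ELT work moved to Hadoop.
    """
    if not 0.0 <= elt_share <= 1.0 or not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("both arguments must be between 0 and 1")
    return elt_share * offload_fraction


# Hypothetical scenario: half of the ELT workload is offloaded.
freed = warehouse_capacity_freed(elt_share=0.80, offload_fraction=0.50)
print(f"Capacity freed for BI and ad-hoc queries: {freed:.0%}")
```

The sketch deliberately ignores the cost of moving results back into the warehouse, so it is an upper bound on what offloading alone could free.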


Page 10: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


The future of enterprise data flow

[Diagram: three rows labeled Legacy, Modern and Future]

Legacy: Transactional Systems, Structured Data, ETL, EDW + Sandbox, Data Marts, BI/Reporting

Modern: T-Systems, Machines, ETL, Sandbox, EDW, Data Marts, BI/Reporting/Analytics

Future: Transactional Systems, Machine logs/IoT, Structured/Unstructured, Data Lake, ETL, Sandbox, EDW, Data Marts, Operational Dashboards/EDA/Mining/Reporting/Analytics

Page 11: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Data lakes are central to the modern data architecture

• Increased Agility
• New Insights
• Improved Scalability

Page 12: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Data lake challenges

Managing:
• Ingestion
• Visibility and Quality
• Privacy and Compliance

Delivering:
• Timeliness
• Reliance on IT
• Reusability

Building:
• Rate of Change
• Skills Gap
• Complexity

Page 13: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Data Lake 360°: A holistic approach to actionable big data

1. Enable the lake
2. Govern the data
3. Engage the business

• Foster a data-driven business through self-service data discovery and preparation

• Safeguard sensitive data and enable regulatory compliance

• Improve data visibility, reliability and quality to reduce time-to-insight

• Leverage the full power of a scale-out architecture with an actionable, scalable data lake

Page 14: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Enable the lake

• Managed Ingestion
  • Ability to ingest vast amounts of data
  • Ability to handle a wide variety of formats (streaming, files, custom) and sources
  • Build in repeatability through automation to pick up incoming data and apply pre-defined processing

• Metadata Management
  • Capture and manage operational, technical and business metadata
  • Provides visibility and reliability – key to finding data in the lake
  • Reduced time to insight for analytics
  • File and record level watermarking provides data lineage, enables audit and traceability
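
The file- and record-level watermarking mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not Zaloni's implementation: the `WatermarkedRecord` type, the `ingest_file` function, the example file name, and the choice of a SHA-256 prefix as the file identifier are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WatermarkedRecord:
    file_id: str      # file-level watermark: identifies the source file
    record_no: int    # record-level watermark: position within that file
    ingested_at: str  # operational metadata: UTC ingestion timestamp
    payload: str      # the original record, unchanged


def ingest_file(source_name: str, lines: list[str]) -> list[WatermarkedRecord]:
    """Attach file- and record-level watermarks to every incoming record."""
    # Derive the file id from the name and content so that re-ingesting the
    # same file yields the same id (an assumption made for this sketch).
    digest = hashlib.sha256()
    digest.update(source_name.encode("utf-8"))
    for line in lines:
        digest.update(line.encode("utf-8"))
    file_id = digest.hexdigest()[:12]
    now = datetime.now(timezone.utc).isoformat()
    return [
        WatermarkedRecord(file_id, number, now, line)
        for number, line in enumerate(lines, start=1)
    ]


# Hypothetical input file with two records.
records = ingest_file("example_source.csv", ["a,1", "b,2"])
for record in records:
    print(record.file_id, record.record_no, record.payload)
```

Because every record carries its file id and position, a downstream result can be traced back to the exact source line, which is the lineage and audit property the slide describes.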

Page 15: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Govern the data

• Data Lineage
  • See how data moves and how it is consumed in the data lake.
  • Safeguard data and reduce risk, always knowing where data has come from, where it is, and how it is being used.

• Data Quality
  • Rules-based data validation
  • Integration with the Managed Data Pipeline
  • Stats and metrics for reporting and actions
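
Rules-based validation with per-rule statistics, as listed above, might be sketched like this. The rules, field names, and the `validate` function are hypothetical and chosen only for illustration; the slide does not describe how the platform expresses its rules.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# Hypothetical rules for an illustrative customer record.
RULES: dict[str, Rule] = {
    "customer_id is present": lambda r: bool(r.get("customer_id")),
    "age is between 0 and 120": lambda r: isinstance(r.get("age"), int)
    and 0 <= r["age"] <= 120,
}


def validate(records: list[dict]) -> tuple[list[dict], list[dict], dict[str, int]]:
    """Split records into passed and failed, and count failures per rule."""
    passed: list[dict] = []
    failed: list[dict] = []
    failures = {name: 0 for name in RULES}
    for record in records:
        broken = [name for name, rule in RULES.items() if not rule(record)]
        for name in broken:
            failures[name] += 1
        (failed if broken else passed).append(record)
    return passed, failed, failures


good, bad, stats = validate([
    {"customer_id": "C1", "age": 34},
    {"customer_id": "", "age": 34},
    {"customer_id": "C3", "age": 150},
])
print(len(good), len(bad), stats)
```

The per-rule counts are the kind of "stats and metrics" that could drive reporting or an action such as quarantining a batch when failures exceed a threshold.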

Page 16: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Govern the data

• Data Security and Privacy
  • Differing permissions require enhanced data security
  • Mask or tokenize data before it is published in the lake for consumption
  • Policy-based security

• Data lifecycle management across tiered storage environments
  • Hot -> Warm -> Cold on an entity level based on policies/SLAs
  • Across on-premise and cloud environments
  • Provide data management features to automate scheduling and orchestration of data movement between heterogeneous storage environments
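
To make the masking, tokenization, and Hot -> Warm -> Cold ideas above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the key, the 30-day and 365-day thresholds, and the function names `tokenize`, `mask`, and `storage_tier` are hypothetical and do not come from the webinar.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret; in practice it would come from a key management system.
TOKEN_KEY = b"example-key-not-for-production"

# Hypothetical policy thresholds, in days since last access.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 365


def tokenize(value: str) -> str:
    """Replace a sensitive value with a deterministic keyed token."""
    mac = hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a value."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def storage_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier for an entity from its last access date."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"
    if age_days <= WARM_MAX_DAYS:
        return "warm"
    return "cold"


today = date(2016, 10, 26)
print(mask("5551234567"))
print(tokenize("5551234567") == tokenize("5551234567"))
print(storage_tier(date(2016, 10, 1), today))
print(storage_tier(date(2016, 1, 1), today))
print(storage_tier(date(2014, 1, 1), today))
```

Deterministic tokens keep joins possible across datasets without exposing the raw value, while masking keeps a recognizable fragment for display. Which one applies to a given field would be a policy decision.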

Page 17: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Engage the business

• Data Catalog
  • See what data is available across your enterprise
  • Contribute valuable business information to improve search and usage
  • Use a shopping cart experience to create a sandbox for ad-hoc and exploratory analytics

• Self-service Data Preparation
  • Blend data in the lake without a costly IT project
  • Perform interactive data-driven transformations
  • Collaborate and share data assets and transformations with peers
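
The catalog workflow above (search, business-contributed context, and a "shopping cart" that becomes a sandbox) can be sketched in a few lines. The `Catalog` and `CatalogEntry` classes and the dataset names are hypothetical; this is a toy in-memory model, not the product's API.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    description: str
    tags: set[str] = field(default_factory=set)


class Catalog:
    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def add_tag(self, name: str, tag: str) -> None:
        """Let a business user contribute context that improves search."""
        self._entries[name].tags.add(tag.lower())

    def search(self, term: str) -> list[str]:
        """Return names of entries whose name, description, or tags mention the term."""
        needle = term.lower()
        return sorted(
            entry.name
            for entry in self._entries.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle in tag for tag in entry.tags)
        )

    def create_sandbox(self, cart: list[str]) -> dict[str, CatalogEntry]:
        """Turn a 'shopping cart' of entry names into a sandbox of entries."""
        return {name: self._entries[name] for name in cart}


catalog = Catalog()
catalog.register(CatalogEntry("network_events", "Raw device logs"))
catalog.register(CatalogEntry("customer_accounts", "Billing accounts"))
catalog.add_tag("network_events", "Outage")
hits = catalog.search("outage")
sandbox = catalog.create_sandbox(hits)
print(hits, list(sandbox))
```

Note that the search only finds `network_events` because a user tagged it; neither its name nor its description mentions outages, which is the point of letting the business contribute information.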

Page 18: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Data lake reference architecture

Sources:
• Sensors (or other time series data)
• Relational Data Stores (OLTP/ODS/DW)
• Logs (or other unstructured data)
• Social and shared data

Data Lake zones:

Transient Landing Zone
• Temporary store of source data
• Consumers are IT, Data Stewards
• Implemented in highly regulated industries

Raw Zone
• Original source data ready for consumption
• Consumers are ETL developers, data stewards, some data scientists
• Single source of truth with history

Trusted Zone
• Standardized on corporate governance/quality policies
• Consumers are anyone with appropriate role-based access
• Single version of truth

Refined Zone
• Data required for LOB specific views - transformed from existing certified data
• Consumers are anyone with appropriate role-based access

Sandbox
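
The zones and their consumers can be modeled as a simple role-based access check. This is a sketch under assumptions: the role names are hypothetical, the pairing of consumers with zones is read from the slide's notes as best they can be matched, and the Sandbox readers are a guess because the slide lists no consumers for it.

```python
from enum import Enum


class Zone(Enum):
    TRANSIENT_LANDING = "Transient Landing Zone"
    RAW = "Raw Zone"
    TRUSTED = "Trusted Zone"
    REFINED = "Refined Zone"
    SANDBOX = "Sandbox"


# Hypothetical role names. "Anyone with appropriate role-based access" is
# modeled here as an explicit grant list per zone; the Sandbox entry is a
# guess, since the slide names no consumers for it.
ZONE_READERS: dict[Zone, set[str]] = {
    Zone.TRANSIENT_LANDING: {"it", "data_steward"},
    Zone.RAW: {"etl_developer", "data_steward", "data_scientist"},
    Zone.TRUSTED: {"business_analyst", "data_scientist", "data_steward"},
    Zone.REFINED: {"business_analyst", "data_scientist"},
    Zone.SANDBOX: {"data_scientist"},
}


def can_read(role: str, zone: Zone) -> bool:
    """Return True if the role has been granted read access to the zone."""
    return role in ZONE_READERS.get(zone, set())


print(can_read("business_analyst", Zone.RAW))
print(can_read("business_analyst", Zone.TRUSTED))
print(can_read("data_steward", Zone.TRANSIENT_LANDING))
```

Keeping the grants in one table makes the narrowing of access toward the raw end of the lake easy to audit.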

Page 19: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Data lake reference architecture with Zaloni

[Diagram labels]

Sources: Sensors (or other time series data), Relational Data Stores (OLTP/ODS/DW), Logs (or other unstructured data), Social and shared data

Source System: File Data, DB Data, ETL Extracts, Streaming

Data Lake: Transient Landing Zone, Raw Zone, Refined Zone, Trusted Zone, Sandbox

Data Lake Management & Governance Platform: Metadata Management, Data Quality, Data Catalog, Security

Consumption Zone

Consumers: Business Analysts, Researchers, Data Scientists

Other labels: APIs, EDW, Data Marts

Page 20: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Four great reasons to augment with a data lake

• Save millions in storage costs
• Significantly speed up processing
• Maximize the data warehouse for BI
• Extract more value from all of your data

Page 21: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power


Centralized data, decentralized access

[Diagram labels, with the Data Lake between the two groups]

Business Users: Business Analyst, Business Manager, Data Scientist, Business SME, Operations Manager

Questions: What happened? What is happening? What will happen? What can we control? Can I see the data?

IT Team: IT Analyst, Programmer, DBA/Modeler, Data Scientist, Data Engineer

IT activities: Code Analysis, App Implementation, App Prototype, Data Model, Code Development

Page 22: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

Questions?

Page 23: Webinar -Data Warehouse Augmentation: Cut Costs, Increase Power

DATA LAKE MANAGEMENT AND GOVERNANCE PLATFORM

SELF-SERVICE DATA PREPARATION