Big Data Management: What's New, What's Different, and What You Need to Know

Upload: snaplogic

Post on 13-Jan-2017

Category: Data & Analytics


TRANSCRIPT

Page 1: Big Data Management: What's New, What's Different, and What You Need To Know

Big Data Management: What’s New, What’s Different and What You Need to Know

Page 2: Big Data Management: What's New, What's Different, and What You Need To Know

Today’s Featured Presenter

Matt Aslett
Research Director, Data Platforms and Analytics
451 Research

As Research Director, Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality, data management, analytics, and advanced analytics. Matt's own primary area of focus includes data management, reporting and analytics, and exploring how the various data platform and analytics technology sectors are converging in the form of next-generation data platform

Page 3: Big Data Management: What's New, What's Different, and What You Need To Know

Agenda

• Big Data Management
  – Matt Aslett, 451 Research

• SnapLogic Overview

• SnapLogic Demonstration
  – Ravi Dharnikota, Head of SnapLogic Enterprise Architecture

• Q&A

Page 4: Big Data Management: What's New, What's Different, and What You Need To Know

Copyright (C) 2016 451 Research LLC

Big Data Management

Matt Aslett, Research Director

Page 5: Big Data Management: What's New, What's Different, and What You Need To Know

451 Research is a leading IT research & advisory company

Founded in 2000

250+ employees, including over 100 analysts

1,000+ clients: technology & service providers, corporate advisory, finance, professional services, and IT decision makers

50,000+ IT professionals, business users and consumers in our research community

Over 52 million data points published each quarter and 4,500+ reports published each year

2,000+ technology & service providers under coverage

451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group

Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia

Research & Data

Advisory

Events

Go 2 Market

Page 8: Big Data Management: What's New, What's Different, and What You Need To Know

Big data and beyond

• V is for various things… but does not define big data

• To understand the trends driving ‘big data’, 451 Research looked beyond the nature of the data to what enterprises wanted to do with it:

• Totality – storing and processing all data (or as much as is economically viable)
• Exploration – schema-free approaches to analyzing data to identify new patterns
• Frequency – more frequent analysis of data to enable real-time decision making

Page 11: Big Data Management: What's New, What's Different, and What You Need To Know

‘Big data’ is primarily driven by economics, not data

“Big data is what happened when the cost of keeping information became less than the cost of throwing it away.”

George Dyson

• ‘Big Data’ is the realization of competitive advantage based on the fact that it is now more economically feasible to store and process data that was previously ignored due to the cost and functional limitations of traditional data management technologies to handle its volume, velocity and variety

• Moved from storing 1% of data for 60 days in EDW @ $100,000/TB
• To 100% of data for a year in Hadoop @ $900/TB
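The two storage figures above can be compared with a little arithmetic. The sketch below is illustrative only: the prices, fractions and retention periods come from the slide, while the daily data volume (`DAILY_TB`), the 365-day year, and the reading of $/TB as a price per stored terabyte are assumptions made for the example.

```python
# Per-terabyte prices and retention figures quoted on the slide.
EDW = {"cost_per_tb": 100_000, "fraction_kept": 0.01, "retention_days": 60}
HADOOP = {"cost_per_tb": 900, "fraction_kept": 1.00, "retention_days": 365}

# Assumption (not from the slide): the business generates 1 TB of data per day.
DAILY_TB = 1.0


def stored_tb(daily_tb, fraction_kept, retention_days):
    """Terabytes held at any one time once the retention window is full."""
    return daily_tb * fraction_kept * retention_days


def storage_cost(platform, daily_tb=DAILY_TB):
    """Cost of the retained data, treating $/TB as a price per stored terabyte."""
    tb = stored_tb(daily_tb, platform["fraction_kept"], platform["retention_days"])
    return tb * platform["cost_per_tb"]


edw_tb = stored_tb(DAILY_TB, EDW["fraction_kept"], EDW["retention_days"])
hadoop_tb = stored_tb(DAILY_TB, HADOOP["fraction_kept"], HADOOP["retention_days"])

print(f"EDW:    {edw_tb:7.1f} TB retained, ${storage_cost(EDW):,.0f}")
print(f"Hadoop: {hadoop_tb:7.1f} TB retained, ${storage_cost(HADOOP):,.0f}")
print(f"Price per TB ratio:  {EDW['cost_per_tb'] / HADOOP['cost_per_tb']:.0f}x")
print(f"Data retained ratio: {hadoop_tb / edw_tb:.0f}x")
```

Under these assumptions the Hadoop option retains roughly 600 times as much data for about five and a half times the spend, which is the economic shift the slide describes.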

Page 13: Big Data Management: What's New, What's Different, and What You Need To Know

The evolution of enterprise analytics

[Figure: the evolution of enterprise analytics. Source: 451 Research, Total Data Analytics 2016]

• Stages: REPORTING (What happened), ANALYSIS (Why did it happen?), DESCRIPTIVE (What is happening?), PREDICTIVE (What will happen?), PRESCRIPTIVE (Influence what happens)
• Techniques: VISUALIZATION, STATISTICAL MODELING, MACHINE LEARNING
• Axis labels: Complexity; IT-driven, User-driven, Automated
• Data sources: Structured, RDBMS, historical
• Data sources: Multi-structured (RDBMS, Hadoop, NoSQL, stream processing, historical and real-time)

Page 14: Big Data Management: What's New, What's Different, and What You Need To Know

EDW vs Hadoop (Schema-on-write vs schema-on-read)

Source: https://www.flickr.com/photos/wbaiv/16510090506/
Source: https://www.flickr.com/photos/notbrucelee/5696238930/

Page 15: Big Data Management: What's New, What's Different, and What You Need To Know

Schema-on-write

Source: https://www.flickr.com/photos/wbaiv/16510090506/

• Pre-prepared

• Single-purpose

• Some assembly required

• Inflexible

Page 16: Big Data Management: What's New, What's Different, and What You Need To Know

Schema-on-read

Source: https://www.flickr.com/photos/notbrucelee/5696238930/

• Flexible

• Reusable

• Some imagination required*

• Multi-purpose

*Instructions available if desired
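The contrast between the two approaches can be sketched in a few lines of Python. This is a toy illustration, not how an EDW or Hadoop actually stores data: the event records, field names and helper functions (`load_schema_on_write`, `read_with_schema`) are all hypothetical.

```python
import json

# Raw events as they might arrive from a source system (hypothetical data).
RAW_EVENTS = [
    '{"user": "ann", "amount": "12.50", "channel": "web"}',
    '{"user": "bob", "amount": "7.00"}',
    '{"user": "cy", "amount": "3.25", "device": "ios"}',
]

# Schema-on-write: the schema is fixed before loading, and records that
# do not match it are rejected at load time.
WRITE_SCHEMA = ("user", "amount", "channel")


def load_schema_on_write(lines):
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        if set(record) != set(WRITE_SCHEMA):
            rejected.append(record)
            continue
        table.append((record["user"], float(record["amount"]), record["channel"]))
    return table, rejected


# Schema-on-read: the raw lines are kept untouched, and each reader
# applies its own schema (fields plus defaults) when it queries them.
def read_with_schema(lines, fields, defaults=None):
    defaults = defaults or {}
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name, defaults.get(name)) for name in fields}


table, rejected = load_schema_on_write(RAW_EVENTS)
revenue_view = list(read_with_schema(RAW_EVENTS, ["user", "amount"]))
device_view = list(
    read_with_schema(RAW_EVENTS, ["user", "device"], {"device": "unknown"})
)

print(len(table), "rows loaded,", len(rejected), "rejected on write")
print(len(revenue_view), "rows in revenue view,", len(device_view), "in device view")
```

The write-time loader is pre-prepared and single-purpose: only records that fit the fixed schema get in. The read-time views reuse the same raw data for different purposes, at the cost of each reader having to decide what the data means.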

Page 17: Big Data Management: What's New, What's Different, and What You Need To Know

Hadoop-based data lakes

• The concept of the data lake has taken off in recent years, with the Apache Hadoop data-processing framework serving as the unified repository into which raw data is landed from multiple sources and made available to multiple users for multiple purposes.

Photo: Myrabella / Wikimedia Commons, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=11263585

Page 18: Big Data Management: What's New, What's Different, and What You Need To Know

Hadoop-based data lakes

• Beware the data swamp

https://www.flickr.com/photos/lofink/4501610335/

Page 19: Big Data Management: What's New, What's Different, and What You Need To Know

Data governance, data preparation and the data lake

• Data needs to be filtered, processed, treated and managed to make it suitable for multiple analytics use cases.

• Data governance
  • Data catalog
  • Data security
  • Data lineage
  • Data inventory
  • Data quality
  • Data pipelines

• Data preparation
  • Data discovery
  • Data cleansing
  • Data harmonization
  • Data enrichment
  • Data matching
  • Collaboration
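As a rough illustration of how a few of these functions fit together, the sketch below runs a cleansing step and a harmonization step over some records and records each step as lineage on a catalog entry. Everything in it is hypothetical: the `CatalogEntry` class, the field names, the country mapping and the sample rows are invented for the example and do not represent any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """A minimal data catalog record that also tracks lineage."""
    name: str
    source: str
    lineage: list = field(default_factory=list)


def cleanse(rows):
    """Data cleansing: drop rows with no customer id and trim whitespace."""
    return [
        {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        for row in rows
        if row.get("customer_id")
    ]


# Hypothetical mapping from free-text country names to one code set.
COUNTRY_CODES = {"uk": "GB", "united kingdom": "GB", "usa": "US", "united states": "US"}


def harmonize(rows):
    """Data harmonization: map country names onto a single code set."""
    return [
        {**row, "country": COUNTRY_CODES.get(row["country"].lower(), row["country"])}
        for row in rows
    ]


def prepare(rows, entry):
    """Run each preparation step and record it in the entry's lineage."""
    for step in (cleanse, harmonize):
        rows = step(rows)
        entry.lineage.append(step.__name__)
    return rows


raw = [
    {"customer_id": "c1", "country": " UK "},
    {"customer_id": "", "country": "USA"},
    {"customer_id": "c2", "country": "united states"},
]
entry = CatalogEntry(name="customers_clean", source="crm_export")
prepared = prepare(raw, entry)

print(prepared)
print(entry.lineage)
```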

Page 20: Big Data Management: What's New, What's Different, and What You Need To Know

Data governance, data preparation and the data lake

[Figure: the data lake with its surrounding functions and users]

• Center: DATA LAKE
• DATA GOVERNANCE: Data lineage, Data inventory, Data catalog, Data security, Data quality, Data pipelines
• SELF-SERVICE DATA PREPARATION: Data cleansing, Data harmonization, Data discovery, Collaboration, Data matching, Data enrichment
• Other functions: DATA-AS-A-SERVICE, ADVANCED ANALYTICS, SELF-SERVICE ANALYTICS
• Roles: IT, DATA STEWARDS, DATA SCIENTISTS, SENIOR EXECUTIVES, BUSINESS ANALYSTS, DATA ANALYSTS
• Other labels: PARTNERS, SUPPLIERS, APPLICATIONS

Page 21: Big Data Management: What's New, What's Different, and What You Need To Know

Hadoop and other animals

Page 22: Big Data Management: What's New, What's Different, and What You Need To Know

Recommendations

• Enterprises should seriously consider the data governance and management requirements before embarking on data lake projects to ensure that the functionality is available to turn the concept into reality.

• For flexibility and agility, employ data management approaches and technologies that abstract data processing pipelines from the execution environment.

• Look for data integration and transformation technologies that execute natively, taking advantage of the underlying engine (e.g. Spark, YARN).

• Seek out data management and integration technologies that enable consumption and transformation of large volumes of structured and unstructured data.
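The second recommendation, abstracting data processing pipelines from the execution environment, can be sketched as follows. This is a minimal illustration under stated assumptions: `Pipeline`, `LocalExecutor` and `BatchedExecutor` are hypothetical names, and `BatchedExecutor` is only a stand-in for a cluster engine such as Spark, not an integration with one.

```python
class LocalExecutor:
    """Runs the pipeline in-process, one record at a time."""

    def run(self, steps, records):
        out = []
        for record in records:
            for step in steps:
                record = step(record)
            out.append(record)
        return out


class BatchedExecutor:
    """Stand-in for a cluster engine: applies each step per partition."""

    def __init__(self, partition_size=2):
        self.partition_size = partition_size

    def run(self, steps, records):
        records = list(records)
        out = []
        for start in range(0, len(records), self.partition_size):
            partition = records[start:start + self.partition_size]
            for step in steps:
                partition = [step(record) for record in partition]
            out.extend(partition)
        return out


class Pipeline:
    """Declares what to do; the executor decides where and how it runs."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, records, executor):
        return executor.run(self.steps, records)


def add_total(record):
    return {**record, "total": record["qty"] * record["price"]}


def mark_checked(record):
    return {**record, "checked": True}


orders = [
    {"qty": 2, "price": 5},
    {"qty": 1, "price": 3},
    {"qty": 4, "price": 2},
]
pipeline = Pipeline([add_total, mark_checked])

local_result = pipeline.run(orders, LocalExecutor())
batched_result = pipeline.run(orders, BatchedExecutor(partition_size=2))
print(local_result == batched_result)
```

Because the pipeline definition never mentions an engine, the same steps can be moved to a different execution environment by swapping the executor, which is the flexibility the recommendation is after.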

Page 24: Big Data Management: What's New, What's Different, and What You Need To Know

SnapLogic Elastic Integration

Accelerate Your Integration. Accelerate Your Business

“We can do more in two hours with SnapLogic than we could in two days with traditional solutions.”

Page 25: Big Data Management: What's New, What's Different, and What You Need To Know

Big Data and hybrid cloud environments are making yesterday’s approaches to integration obsolete

Page 26: Big Data Management: What's New, What's Different, and What You Need To Know

SnapLogic: Unified Platform for Data and Application Integration

• Anything: apps | data | APIs | things
• Anytime: batch | streaming | real-time
• Anywhere: on prem | cloud | hybrid

Page 27: Big Data Management: What's New, What's Different, and What You Need To Know

SnapLogic in the Modern Data Fabric: Ingest, Transform, Deliver

[Figure: SnapLogic in the modern data fabric]

• Layers: Source, Store & Process, Consume
• Source: On Prem Applications, Relational Databases, Cloud Applications, NoSQL Databases, Web Logs, Internet of Things
• Store & Process: Data Warehouses & Data Marts, Big Data and Data Lakes
• Other labels: HANA, Data Integration and Transformation, INGEST, DELIVER

Page 28: Big Data Management: What's New, What's Different, and What You Need To Know

Modern Architecture: Hybrid and Elastic Execution

• Streams: No data is stored/cached
• Secure: 100% standards-based
• Elastic: Scales out & handles data and app integration use cases

[Figure labels: Cloud-Based Designer, Manager, Dashboard; Metadata; Data; Firewall; Execution (shown three times); Databases; On Prem Apps; Big Data; Cloud Apps and Data]

SnapLogic “respects data’s gravity.”

Page 29: Big Data Management: What's New, What's Different, and What You Need To Know

SnapLogic Demonstration

Page 30: Big Data Management: What's New, What's Different, and What You Need To Know

Discussion

Matt Aslett
Research Director, Data Platforms and Analytics
451 Research

Ravi Dharnikota
Head of Enterprise Architecture
SnapLogic

Page 31: Big Data Management: What's New, What's Different, and What You Need To Know

Integrate at the speed of modern business

+1 [email protected] | @SnapLogic | www.snaplogic.com