One Hadoop, Multiple Clouds - NYC Big Data Meetup

1 © Cloudera, Inc. All rights reserved. One Hadoop, Multiple Clouds Andrei Savu | Tech Lead, Cloudera Director

Upload: andrei-savu

Post on 19-Jan-2017

Category: Software


TRANSCRIPT

Page 1: One Hadoop, Multiple Clouds - NYC Big Data Meetup

One Hadoop, Multiple Clouds
Andrei Savu | Tech Lead, Cloudera Director

Page 2: One Hadoop, Multiple Clouds - NYC Big Data Meetup

About me

Tech Lead on Cloudera Director

Previously founder of axemblr.com

Contributed to Apache Whirr (PMC) & jclouds.

Twitter: https://twitter.com/andreisavu

LinkedIn: https://www.linkedin.com/in/sandrei

Page 3: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Cloudera Director
cloudera.com/director

Deploy and manage enterprise-grade Hadoop in the cloud

• AWS & Google Cloud
• Extensible via plugins

Page 4: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Journey to the Cloud

Page 5: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Do you use a public or private cloud?

How do you run and manage Hadoop?

Page 6: One Hadoop, Multiple Clouds - NYC Big Data Meetup

What is this talk about?

• State of the World
• Architectural Patterns
• Imagine the Future

Page 7: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Gartner's 2015 Hype Cycle for Emerging Technologies (source)

• Advanced Analytics
• Hybrid Cloud
• Internet of Things

Page 8: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Hybrid Clouds

• Cloud Exchange
• Application Portability
• Private-Public
• Public-Public

Page 9: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Cloud Wars

• AWS
• Microsoft Azure
• Google Cloud
• VMware
• OpenStack
• etc.

Page 10: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Data has Mass and Gravity

Page 11: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Hadoop Environments: On-Premise versus Cloud

              | On-Premise                 | Cloud
Storage       | Direct Attached            | Direct Attached or Object Store
Data          | Not shared across clusters | Shared across multiple clusters
Sizing        | Fixed-size                 | Dynamic based on load
Usage Model   | All users share cluster    | Clusters created as needed for apps/users

On-Premise stack:
• Resource Management (YARN)
• HDFS
• Process, Discover, Model, Serve
• Industry Standard Servers (CPU, Memory, & Direct Attached Storage)

Cloud stack:
• Resource Management (YARN)
• HDFS
• Process, Discover, Model, Serve
• Industry Standard Servers (CPU & Memory)
• Object Storage

Page 12: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Cloud providers shipping distributions of Hadoop

• Integration
• Unlock Query Engines
• Migration workloads

Is that a sustainable advantage? Or just a temporary stopgap?

Page 13: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Maturity level

• On-prem vs. Cloud
• Monitoring
• Dev / Test / Prod
• Availability
• Durability

Page 14: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Common Architectural Patterns in the Cloud

Object Storage ( |WASB |SWIFT |BLOB)

Source Data | Seed Data | Backup/DR

ETL/MODELING (Spark, MapReduce)
• Short-running clusters
• Elastic workload
• No local storage necessary

BI/ANALYTICS (Impala, Solr)
• Long-running clusters
• Sized to demand
• Some local storage

APP DELIVERY (HBase, Kudu)
• Fixed clusters
• Periodic sync
• Default to local storage

Page 15: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Cluster lifecycle management

• Create / Terminate
• Discovery
• Metadata
• Monitoring
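
A minimal sketch of what a lifecycle catalog could look like, assuming an in-memory registry. `Cluster`, `ClusterRegistry`, the workload labels and the state names are all hypothetical and are not part of Cloudera Director; monitoring is left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    name: str
    workload: str                       # hypothetical labels, e.g. "etl", "bi", "app"
    metadata: Dict[str, str] = field(default_factory=dict)
    state: str = "RUNNING"


class ClusterRegistry:
    """Hypothetical in-memory catalog; a real one would call a provisioning API."""

    def __init__(self) -> None:
        self._clusters: Dict[str, Cluster] = {}

    def create(self, name: str, workload: str, **metadata: str) -> Cluster:
        """Register a new cluster together with its free-form metadata."""
        if name in self._clusters:
            raise ValueError(f"cluster {name!r} already exists")
        cluster = Cluster(name, workload, dict(metadata))
        self._clusters[name] = cluster
        return cluster

    def terminate(self, name: str) -> None:
        """Mark a cluster as terminated; its record is kept for auditing."""
        self._clusters[name].state = "TERMINATED"

    def discover(self, workload: str) -> List[str]:
        """Names of running clusters that serve the given workload."""
        return sorted(
            c.name
            for c in self._clusters.values()
            if c.workload == workload and c.state == "RUNNING"
        )
```

With short-running clusters, discovery matters because jobs cannot rely on a fixed cluster address; they ask the catalog which clusters currently serve their workload.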

Page 16: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Work Queue

• Workflows
• Dispatch
• Tracking
• Decoupled
• Fault Tolerant
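
The properties listed on this slide can be sketched as a small single-process work queue. This is an illustration only: `run_queue`, `MAX_ATTEMPTS` and the state names are hypothetical, and a production queue would be a separate durable service rather than an in-memory deque.

```python
from collections import deque
from typing import Callable, Dict, Tuple

MAX_ATTEMPTS = 3  # hypothetical retry budget per job


def run_queue(jobs: Dict[str, Callable[[], None]]) -> Dict[str, Tuple[str, int]]:
    """Dispatch every job, retry failures, and return {job_id: (state, attempts)}."""
    pending = deque(jobs)  # job ids waiting for dispatch
    tracking = {job_id: ("QUEUED", 0) for job_id in jobs}
    while pending:
        job_id = pending.popleft()
        attempts = tracking[job_id][1] + 1
        try:
            # Decoupled: the dispatcher only sees a callable, not the cluster behind it.
            jobs[job_id]()
            tracking[job_id] = ("DONE", attempts)
        except Exception:
            if attempts < MAX_ATTEMPTS:
                # Fault tolerant: put the job back at the end of the queue.
                tracking[job_id] = ("RETRYING", attempts)
                pending.append(job_id)
            else:
                tracking[job_id] = ("FAILED", attempts)
    return tracking
```

Because the submitter only talks to the queue, the cluster that runs a job can be created just before dispatch and terminated afterwards without the submitter noticing.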

Page 18: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Multi-user

• Secure
• Isolated
• Friendly

Page 19: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Elastic

• Grow or shrink
• Business hours
• Number of users
• Storage vs. Compute
• Cost efficient
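
One way to picture the elastic sizing described on this slide is a small policy function. This is a sketch under stated assumptions: every constant and function name below is hypothetical, and it assumes data lives in object storage so that only compute workers need to be resized.

```python
from datetime import datetime

# All names and numbers below are hypothetical, chosen only for illustration.
MIN_WORKERS = 3                # floor kept running outside business hours
MAX_WORKERS = 50               # cost ceiling
USERS_PER_WORKER = 5           # assumed capacity of one compute worker
BUSINESS_HOURS = range(8, 18)  # 08:00 to 17:59, Monday to Friday


def target_workers(now: datetime, active_users: int) -> int:
    """Return how many compute workers the cluster should have right now."""
    in_hours = now.weekday() < 5 and now.hour in BUSINESS_HOURS
    if not in_hours:
        return MIN_WORKERS
    needed = -(-active_users // USERS_PER_WORKER)  # ceiling division
    return max(MIN_WORKERS, min(MAX_WORKERS, needed))


def resize_delta(current: int, target: int) -> int:
    """Positive means grow by that many workers, negative means shrink."""
    return target - current
```

For example, `target_workers(datetime(2017, 1, 18, 10, 0), 42)` falls on a weekday morning and asks for 9 workers, while the same call on a Saturday falls back to the minimum of 3.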

Page 21: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Advanced Monitoring

• Latency
• Resource utilization
• Consistent performance

Page 22: One Hadoop, Multiple Clouds - NYC Big Data Meetup

High availability and failure domains

• Data durability
• Repair within SLA
• Host-to-instance

Page 23: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Backup and disaster recovery

• Object store centric
• Active-Standby

Page 24: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Imagine the Future

• Portable Experience
• Self-service
• Self-healing
• Granular Security
• Advanced Governance
• Complete Management

What’s your vision?

Page 26: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Thank you
[email protected]

Page 27: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Resources

Cloudera Director: http://www.cloudera.com/director

Interested in API level integration and scripting?

https://github.com/cloudera/director-sdk

https://github.com/cloudera/director-scripts

Interested in integration with another cloud platform?

https://github.com/cloudera/director-spi

https://github.com/cloudera/director-google-plugin

Page 29: One Hadoop, Multiple Clouds - NYC Big Data Meetup

Cloudera Director Screenshots
