one click hadoop clusters - anywhere...one click hadoop clusters - anywhere october, 2015 janos...

22
Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering

Upload: others

Post on 20-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

One click Hadoop clusters - anywhere

October, 2015

Janos Matyas, Senior Director of Engineering

Page 2: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Overview

•  Introduction

•  Goals and motivations

•  Technology stack

•  How it works

•  Results/achievements/future plans

•  Demo and Q&A

Page 3: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Goals and motivations

•  Full Hadoop stack provisioning – everywhere

•  Automate and unify the process

•  Zero-configuration approach

•  Same process through a cluster lifecycle (Dev, QA, UAT, Prod)

•  Provide tooling - UI, REST API and CLI/shell

•  Secure and multi-tenant

•  SLA policy based autoscaling

Page 4: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Technology stack

•  Docker

•  Swarm

•  Consul

•  Apache Ambari

Page 5: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Docker

•  Container based virtualization

•  Lightweight and portable

•  Build once, run anywhere

•  Ease of packaging applications

•  Automated and scripted

•  Isolated

Page 6: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Docker – How it works •  Containers are isolated, but share OS and bins/

libraries

•  No need to emulate hardware

Page 7: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Swarm

•  Native clustering for Docker

•  Distributed container orchestration

•  Same API as Docker

Page 8: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Swarm – How it works

•  Swarm managers/agents

•  Discovery services

•  Advanced scheduling

Page 9: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Consul

•  Service discovery/registry

•  Health checking

•  Key/Value store

•  DNS

•  Multi datacenter aware

Page 10: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Consul – How it works

•  Consul servers/agents

•  Consistency through a quorum (RAFT)

•  Scalability due to gossip based protocol (SWIM)

•  Decentralized and fault tolerant

•  Highly available

•  Consistency over availability (CP)

•  Multiple interfaces - HTTP and DNS

•  Support for watches

Page 11: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Apache Ambari

•  Easy Hadoop cluster provisioning

•  Management and monitoring

•  Key feature - Blueprints

•  REST API, CLI shell

•  Extensible •  Stacks •  Services •  Views

Page 12: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Apache Ambari – How it works

•  Ambari server/agents

•  Define a blueprint (blueprint.json)

•  Define a host mapping (hostmapping.json)

•  Post the cluster create

Page 13: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak

Cloudbreak is a cloud-agnostic Hadoop as a

Service API. Abstracts the provisioning and ease

management and monitoring of on-demand

clusters.

Cloudbreak is a powerful left surf that breaks over a coral reef, a mile off

southwest the island of Tavarua, Fiji.

Page 14: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak

•  Benefits •  Zero configuration •  Elastic •  Secure •  Infrastructure agnostic •  Heterogenous clusters •  Auto-scaling

•  Main REST resources •  /template – specify an instance group infrastructure

•  /stack – creates an infrastructure based on a template

•  /blueprint – describes a Hadoop cluster

•  /cluster – creates a Hadoop cluster

Page 15: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak – How it works

•  Start VMs - with a running Docker daemon

•  Cloudbreak Bootstrap •  Start Consul Cluster •  Start Swarm Cluster (Consul for discovery)

•  Start Ambari servers/agents - Swarm API

•  Ambari services registered in Consul (Registrator)

•  Post Blueprint

Page 16: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak - Features

•  Extensible – easy to implement Service Provider Interface

•  Cloudbreak “recipes” •  Automate host configuration •  Pre/post Ambari lifecycle hooks •  Services reconfiguration •  Automate/execute custom actions

•  Side – effects •  Ambari CLI/shell and Groovy based client •  Cloud Foundry’s UAA Dockerized •  Munchausen – bootstrap Swarm with Consul •  Dockerized full Hadoop stack (Apache Hadoop 60K+, Ambari 12K+, Spark 10K+ downloads)

Page 17: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak - Hadoop as a Service API

•  Public tech preview •  Microsoft Azure •  Amazon AWS •  Google Cloud Platform •  OpenStack

•  Private tech preview – R&D •  Bare metal •  Rackspace Managed Cloud •  HP Helion Public Cloud

*integration SPI is available

Page 18: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cloudbreak – SPI •  Cloud providers have very different API, though model is very similar •  Non – invasive implementation •  One interface to implement - CloudPlatformConnector Network Security Group Image

Subnet Subnet Rules Rules

Instance

Volume Volumes

Volume IP Address

UserData

Instance

Volume Volumes

Volume IP Address

Instance

Volume Volumes

Volume IP Address

Page 19: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Periscope

Periscope is a heuristic Hadoop scheduler

associated with a QoS profile. Built on

YARN schedulers, cloud and VM resource

management API's it allows to associate

SLA's to applications and customers.

Periscope is a powerful, fast, thick and top-to-bottom right-hander, eastward from

Sumbawa's famous west-coast.

Page 20: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Periscope

•  Benefits •  Zero configuration •  Metric and time based alarms •  SLA policy based autoscaling •  Secure •  Hostgroup specific

•  Main REST resources •  /clusters – specify a cluster to be monitored

•  /alerts– time and metric based

•  /policies – specify an SLA policy for a cluster based on an alarm

•  /applications – specify an SLA policy for an application (under development)

Page 21: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Periscope – How it works

•  Configures/monitors alarms in Ambari

•  Setup alarm, cooldown periods

•  Manages cluster sizes

•  Allow to associate SLA scaling policies to alarms

•  Orchestrates Cloudbreak to up/downscale the cluster

Page 22: One click Hadoop clusters - anywhere...One click Hadoop clusters - anywhere October, 2015 Janos Matyas, Senior Director of Engineering Page 2 ... • Full Hadoop stack provisioning

Page 22 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Demo and Q&A