streamline hadoop devops with apache ambari

53
Streamline Hadoop DevOps with Apache Ambari Jayush Luniya Hadoop Summit, Tokyo

Upload: jayush-luniya

Post on 13-Apr-2017

166 views

Category:

Technology


6 download

TRANSCRIPT

Page 1: Streamline Hadoop DevOps with Apache Ambari

Streamline Hadoop DevOps with Apache Ambari

Jayush LuniyaHadoop Summit, Tokyo

Page 2: Streamline Hadoop DevOps with Apache Ambari

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Speaker

Jayush LuniyaStaff Software Engineer @ HortonworksApache Ambari [email protected]

Page 3: Streamline Hadoop DevOps with Apache Ambari

3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Open-source platform to provision, manage and monitor Hadoop clusters

Page 4: Streamline Hadoop DevOps with Apache Ambari

4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Why Ambari?

Page 5: Streamline Hadoop DevOps with Apache Ambari

5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Why Ambari?

Page 6: Streamline Hadoop DevOps with Apache Ambari

6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Page 7: Streamline Hadoop DevOps with Apache Ambari

7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Ambari 2.0 Ambari 2.1 Ambari 2.2 Ambari 2.4April '15 Jul - Sept'15 Dec'15 - Feb'16 Aug'16 - Sept '16

0

500

1000

1500

2000

2500

3000

1690 1864

797

2251

277

206

34379

488#.#.2#.#.1GA

No.

of J

IRA

s

Ambari Releases

Page 8: Streamline Hadoop DevOps with Apache Ambari

8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Ambari Architecture

Page 9: Streamline Hadoop DevOps with Apache Ambari

9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Exciting Enterprise Features in Ambari 2.4

New Services: Log Search, Zeppelin, Hive LLAP

Role Based Access Control

Management Packs

Grafana UI for Ambari Metrics System

New Views: Zeppelin, Storm

Page 10: Streamline Hadoop DevOps with Apache Ambari

10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Deploy

Secure

Config

Upgrade

Monitor

Extend

Operations - Lifecycle

Ease-of-Use

Page 11: Streamline Hadoop DevOps with Apache Ambari

11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Deploy

Page 12: Streamline Hadoop DevOps with Apache Ambari

12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Deploy On Premise

Ambari handles all of these combinations and makes recommendations based on host specs.

Page 13: Streamline Hadoop DevOps with Apache Ambari

13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Deploy In The Cloud

Certified environmentsSysprepped VMsHundreds of similar clusters

Page 14: Streamline Hadoop DevOps with Apache Ambari

14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Deploy with Blueprints

Systematic way of defining a cluster

Export existing cluster into blueprint/api/v1/clusters/:clusterName?format=blueprint

Configs Topology Hosts Cluster

Page 15: Streamline Hadoop DevOps with Apache Ambari

15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Create Cluster with Blueprints{ "configurations" : [ { "hdfs-site" : {

"dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" }}

{ "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org"

} ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org"

}, { "fqdn" : "worker002.ambari.apache.org"

}, … { "fqdn" : "worker099.ambari.apache.org"

} ] } ]}

1. POST /api/v1/blueprints/my-blueprint

2. POST /api/v1/clusters/my-cluster

Page 16: Streamline Hadoop DevOps with Apache Ambari

16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Create Cluster with Blueprints{ "configurations" : [ { "hdfs-site" : {

"dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" }}

{ "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org"

} ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org"

}, { "fqdn" : "worker002.ambari.apache.org"

}, … { "fqdn" : "worker099.ambari.apache.org"

} ] } ]}

1. POST /api/v1/blueprints/my-blueprint

2. POST /api/v1/clusters/my-cluster

Page 17: Streamline Hadoop DevOps with Apache Ambari

17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Create Cluster with Blueprints{ "configurations" : [ { "hdfs-site" : {

"dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" }}

{ "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org"

} ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org"

}, { "fqdn" : "worker002.ambari.apache.org"

}, … { "fqdn" : "worker099.ambari.apache.org"

} ] } ]}

1. POST /api/v1/blueprints/my-blueprint

2. POST /api/v1/clusters/my-cluster

Page 18: Streamline Hadoop DevOps with Apache Ambari

18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Create Cluster with Blueprints{ "configurations" : [ { "hdfs-site" : {

"dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" }}

{ "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org"

} ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org"

}, { "fqdn" : "worker002.ambari.apache.org"

}, … { "fqdn" : "worker099.ambari.apache.org"

} ] } ]}

1. POST /api/v1/blueprints/my-blueprint

2. POST /api/v1/clusters/my-cluster

Page 19: Streamline Hadoop DevOps with Apache Ambari

19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Blueprints for Large Scale

Kerberos, secure out-of-the-box High Availability is setup initially for

NameNode, YARN, Hive, Oozie, etc Host Discovery allows Ambari to

automatically install services for a Host when it comes online

Stack Advisor recommendations

Page 20: Streamline Hadoop DevOps with Apache Ambari

20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

POST /api/v1/clusters/MyCluster/hosts

[ { "blueprint" : "single-node-hdfs-test2", "host_groups" :[ { "host_group" : "slave", "host_count" : 3, "host_predicate" : "Hosts/cpu_count>1” }, { "host_group" : "super-slave", "host_count" : 5, "host_predicate" : "Hosts/cpu_count>2& Hosts/total_mem>3000000" } ] }]

Blueprint Host Discovery

Page 21: Streamline Hadoop DevOps with Apache Ambari

21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Secure

Page 22: Streamline Hadoop DevOps with Apache Ambari

22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Comprehensive Security

LDAP/AD• User Auth.• Sync

Kerberos• MIT KDC• Keytab

Management

Atlas• Governance• Compliance• Data Classify• Lineage & History

Ranger• Security policies• Audit• Authorization

Knox• Perimeter Sec.• LDAP/AD• Sec. REST/HTTP• SSL

Page 23: Streamline Hadoop DevOps with Apache Ambari

23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Kerberos

Ambari manages Kerberos principals and keytabs

Works with existing MIT KDC or Active Directory

Once Kerberized, handles

Adding Services

Adding Hosts

Adding Host Components

Moving Host Components

Page 24: Streamline Hadoop DevOps with Apache Ambari

24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Role Based Access Control (RBAC)

As Ambari & organizations grow,so do security needs

Ambari integrates with external authentication systems & LDAP

Page 25: Streamline Hadoop DevOps with Apache Ambari

25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

RBAC Terms

Users belong to groups

A group has a role

Users can also have additional roles

Roles are applied to Resources. E.g.,Ambari, particular Cluster, particular View

Roles have permissionse.g., add services to cluster

Page 26: Streamline Hadoop DevOps with Apache Ambari

26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

New RBAC Roles

only view

↑, except change configs

↑, except alter cluster topologyor install components

Ambari Admin

Cluster Admin

Cluster Op

Service Admin

Service Op

Read-Only

↑, except add services, Kerberos,manage alerts & upgrades

↑, except manage permissions

all

Page 27: Streamline Hadoop DevOps with Apache Ambari

27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Config

Page 28: Streamline Hadoop DevOps with Apache Ambari

28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Config Management

Config Groups Different config settings for individual host components

Config Versioning Revert back to old configs

Smart Configs Highlight most important configs

Stack Advisor Recommend configurations

Page 29: Streamline Hadoop DevOps with Apache Ambari

29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Smart Configs

Widgets

- Sliders- Combos- Toggles- Spinners- Lists

Page 30: Streamline Hadoop DevOps with Apache Ambari

30 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Stack Advisor

KerberosHTTPSZookeeper ServersMemory Settings…High Availability

atlas.rest.address = http(s)://host:port

# Atlas Serversatlas.enabletTLS = true|falseatlas.server.http.port = 21000atlas.server.https.port = 21443

Example

Configurations

Page 31: Streamline Hadoop DevOps with Apache Ambari

31 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Upgrade

Page 32: Streamline Hadoop DevOps with Apache Ambari

32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Background: Upgrade Terminology

Manual Upgrade

The user follows instructions to upgrade the stack Incurs downtime

Page 33: Streamline Hadoop DevOps with Apache Ambari

33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Background: Upgrade Terminology

Manual Upgrade

The user follows instructions to upgrade the stack Incurs downtime

Rolling Upgrade

Automated Upgrades one component per host at a time Preserves cluster operation and minimizes service impact

Page 34: Streamline Hadoop DevOps with Apache Ambari

34 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Background: Upgrade Terminology

ExpressUpgrade

Automated Runs in parallel across hosts Incurs downtime

Manual Upgrade

The user follows instructions to upgrade the stack Incurs downtime

Rolling Upgrade

Automated Upgrades one component per host at a time Preserves cluster operation and minimizes service impact

Page 35: Streamline Hadoop DevOps with Apache Ambari

35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Check Prerequisites

Review the prereqs to confirm your cluster configs are ready

Prepare

Take backups of critical cluster metadata

Register + Install

Register the HDP repository and install the target HDP version on the cluster

Automated Upgrade: Rolling/Express

Perform Upgrade

Perform the HDP upgrade. The steps depend on upgrade method: Rolling or Express

Finalize

Finalize the upgrade, making the target version the current version

Page 36: Streamline Hadoop DevOps with Apache Ambari

36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Process: Rolling Upgrade

ZooKeeper

Ranger

Hive

Oozie

Falcon

Kafka

Knox

Storm

Slider

Flume

Finalize or Downgrade

Clients HDFS, YARN, MR, Tez, HBase, Pig. Hive, etc.

Core Masters

Core Slaves

HDFS

YARN

HBase

Page 37: Streamline Hadoop DevOps with Apache Ambari

37 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Monitor

Page 38: Streamline Hadoop DevOps with Apache Ambari

38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Alerting FrameworkAlert Type Description Thresholds (units)

WEB Connects to a Web URL. Alert status is based on the HTTP response code

Response Code (n/a)Connection Timeout (seconds)

PORT Connects to a port. Alert status is based on response time

Response (seconds)

METRIC Checks the value of a service metric. Units vary, based on the metric being checked

Metric Value (units vary)Connection Timeout (seconds)

AGGREGATE Aggregates the status for another alert % Affected (percentage)

SCRIPT Executes a script to handle the alert check Varies

SERVER Executes a server-side runnable class to handle the alert check

Varies

Page 39: Streamline Hadoop DevOps with Apache Ambari

39 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Alert UI

Page 40: Streamline Hadoop DevOps with Apache Ambari

40 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Grafana for Ambari Metrics

Grafana as a “Native UI” for Ambari Metrics

Pre-built DashboardsHost-level, Service-level

Supports HTTPS

System Home, Servers

HDFS Home, NameNodes, DataNodes

YARN Home, Applications, Job History Server

HBase Home, Performance

FEATURES DASHBOARDS

Page 41: Streamline Hadoop DevOps with Apache Ambari

41 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Grafana includes pre-built dashboards for visualizing the most important cluster metrics.

Page 42: Streamline Hadoop DevOps with Apache Ambari

42 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

The HDFS NameNodedashboard highlightsfile system activity.

Page 43: Streamline Hadoop DevOps with Apache Ambari

43 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Storm Monitoring View

Page 44: Streamline Hadoop DevOps with Apache Ambari

44 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Log SearchSearch and index Hadoop logs!

Capabilities• Rapid Search of all Hadoop component logs• Search across time ranges, log levels, and for

keywords

Solr

LogsearchAmbari

Page 45: Streamline Hadoop DevOps with Apache Ambari

45 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Log Search

W O R K E RN O D E

L O G F E E D E R

Solr

L O G S E A R C H

U I

Solr

Solr

A M B A R I

Java ProcessMulti-output SupportGrok filters

Solr CloudLocal Disk Storage

Page 46: Streamline Hadoop DevOps with Apache Ambari

46 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Extend

Page 47: Streamline Hadoop DevOps with Apache Ambari

47 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Management Packs Improved Release Management:

Decouple Ambari core from stacks releases

Support Add-ons:–Release vehicle for 3rd party services, views–Self-contained release artifacts–Stack is an overlay of multiple management

packs

Page 48: Streamline Hadoop DevOps with Apache Ambari

48 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Overlay of Management Packs

inclu

ded

by

includ

ed by

included byincluded byinherits from 2.3

inherits from 2.4

inherits from 2.5

Page 49: Streamline Hadoop DevOps with Apache Ambari

49 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Management Pack++

Short Term Goals (Ambari 2.4) Retrofit in Stack Processing Framework Enable 3rd party to ship add-on services

Future Goals Management Pack Framework Include Views

Page 50: Streamline Hadoop DevOps with Apache Ambari

50 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Service Level Extensions

Service Role Command Order Service Advisor Service Repos Service Upgrade Packs

Page 51: Streamline Hadoop DevOps with Apache Ambari

51 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Future

Page 52: Streamline Hadoop DevOps with Apache Ambari

52 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Future of Ambari Cloud Focus Multiple Service Instance (Two ZK quorums) Multiple Service Versions (Spark 1.6 & Spark 2.0) YARN Assemblies Granular Upgrades: Patch, Component, Service Ambari High Availability

Page 53: Streamline Hadoop DevOps with Apache Ambari

53 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Thank You