bright cluster managerinfo.brightcomputing.com/.../docs/isc14_hadoop_pub.pdf · 2017-10-09 ·...


Bright Cluster Manager A Unified Management Solution for HPC and Hadoop

Martijn de Vries CTO

Introduction

Bright Cluster Architecture

[Architecture diagram. Labels: CMDaemon; head node; node001, node002, node003; SOAP+SSL; SOAP/JSON API + SSL; Cluster Management GUI; Cluster Management Shell; Web-Based User Portal; Third-Party Applications]

Management Interface

Graphical User Interface (GUI)
• Offers the administrator full cluster control
• Standalone desktop application
• Manages multiple clusters simultaneously
• Runs natively on Linux, Windows and OS X

Cluster Management Shell (CMSH)
• All GUI functionality is also available through the Cluster Management Shell
• Interactive, and scriptable in batch mode
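To illustrate what "scriptable in batch mode" can look like, here is a minimal sketch that builds a non-interactive cmsh invocation from Python. It assumes that cmsh accepts a semicolon-separated command string through a `-c` option and that `device` followed by `list` is a valid command sequence; neither is stated on the slide, so check both against the administrator manual for your Bright version. The helper name `cmsh_batch_argv` is hypothetical.

```python
import shlex


def cmsh_batch_argv(commands):
    """Build an argv list that would run the given cmsh commands non-interactively.

    Assumption: cmsh takes a semicolon-separated command string via -c.
    """
    if not commands:
        raise ValueError("at least one command is required")
    return ["cmsh", "-c", "; ".join(commands)]


# Example: enter device mode, then list the devices (assumed command names).
argv = cmsh_batch_argv(["device", "list"])

# Print the shell-quoted command line instead of executing it, so the sketch
# runs on machines where cmsh is not installed.
print(shlex.join(argv))
```

On a head node, the same list could be passed to `subprocess.run(argv, check=True)` to execute it.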


Hadoop Integration

Managing Clusters

• Bright Cluster Manager can be used for several types of clusters:
  • HPC compute
  • Storage
  • Private cloud (OpenStack)
  • Server farms
  • Big Data (Hadoop)
• All types of clusters need to be:
  • Deployed
  • Configured
  • Provisioned
  • Managed
  • Monitored
  • Health-checked

Managing Hadoop Clusters

• Managing Hadoop clusters is just as difficult as managing other types of clusters
• Without proper infrastructure, Hadoop will not run and the cluster will not be usable for data processing
• Bright Cluster Manager provides a single pane of glass to manage and monitor all aspects of a Hadoop cluster
• Includes:
  • Hardware (setup, configuration, monitoring)
  • Operating system (provisioning, updates)
  • Hadoop distribution
  • Hadoop configuration
  • Users
• Bright Cluster Manager provides the infrastructure Hadoop needs to run on
• Hadoop distribution agnostic (switching distributions is easy)

Bright for Hadoop Cluster Management

Bright Cluster Manager 7.0 for Apache Hadoop
• Provides a single pane of glass for managing both the physical cluster and Hadoop
• Easy installation of Hadoop:
  • Apache Hadoop 1.2, 2.2, 2.3 & 2.4 (on Bright DVD)
  • Cloudera CDH 4 & 5
  • Hortonworks HDP 1.3 & 2.1
• Configuration, monitoring and health checking of Hadoop instances
• Graphical UI, command-line interface and API access


Hadoop Configuration

Hadoop configuration through roles
• Nodes can be configured to run certain Hadoop-related services by assigning roles
• Example roles: DataNode, JobTracker, TaskTracker, NameNode, SecondaryNameNode, YARNServer, YARNClient, HBaseServer, HBaseClient, ZooKeeper
• Assigning/unassigning a role will:
  • Write out configuration files based on role parameters
  • Start/stop/monitor relevant services
• The most important Hadoop configuration aspects can be changed from inside Bright
• Exotic Hadoop configuration parameters can be set directly in the (partially generated) configuration file
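The role mechanism described above can be pictured with a small toy model. This is a sketch of the idea only, not Bright's implementation: the role names come from the slide, while the service names, the `Node` class, the parameter names and the key=value config format are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from role name to the service that the role implies.
ROLE_SERVICES = {
    "NameNode": "hadoop-namenode",
    "SecondaryNameNode": "hadoop-secondarynamenode",
    "DataNode": "hadoop-datanode",
    "JobTracker": "hadoop-jobtracker",
    "TaskTracker": "hadoop-tasktracker",
    "ZooKeeper": "zookeeper",
}


@dataclass
class Node:
    name: str
    roles: dict = field(default_factory=dict)    # role name -> role parameters
    config: dict = field(default_factory=dict)   # role name -> rendered config text
    running: set = field(default_factory=set)    # services currently started


def render_config(params):
    """Render role parameters as sorted key=value lines (stand-in for real config files)."""
    return "\n".join(f"{key}={value}" for key, value in sorted(params.items()))


def assign_role(node, role, **params):
    """Assigning a role writes its configuration and starts its service."""
    node.roles[role] = params
    node.config[role] = render_config(params)
    node.running.add(ROLE_SERVICES[role])


def unassign_role(node, role):
    """Unassigning a role removes its configuration and stops its service."""
    node.roles.pop(role)
    node.config.pop(role)
    node.running.discard(ROLE_SERVICES[role])


node = Node("node001")
assign_role(node, "DataNode", data_dir="/data/hdfs")   # hypothetical parameter
assign_role(node, "TaskTracker", map_slots=4)          # hypothetical parameter
unassign_role(node, "TaskTracker")
print(sorted(node.running))
```

Keeping the parameters on the role, and deriving both the config file and the service state from them, is what lets a single assign or unassign action keep the node consistent.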


Hadoop Management Features

• Integrated user management and HDFS access control
• Ability to re-purpose nodes between Hadoop and e.g. HPC
• Multiple HDFS instances on the same cluster (different Hadoop distributions possible)
• Most Hadoop configuration aspects controlled through GUI and CLI
• Health checking and monitoring of Hadoop-related services
• Ability to use alternative filesystems to HDFS (e.g. Lustre)

Re-purposing nodes

• Node tasks are determined by the assignment of roles (e.g. Hadoop DataNode, Slurm)
• By default, a node runs in parallel all tasks for which it has been assigned roles (e.g. Hadoop + Slurm)
• Two methods to stop running Hadoop on a node:
  • Method 1 (temporary):
    • Property at category and device level: “Use exclusively for:”
      Values: <empty>, HPC, OpenStack, Hadoop, Ceph, Nothing
    • Setting “Use exclusively for” causes all other tasks to be stopped immediately
  • Method 2 (permanent):
    • Hadoop-related operations: decommission/recommission
    • Decommission: move data to other nodes to maintain the replication factor and stop using the node for jobs (could take a while)
    • Recommission: move data back to the node and use it for Hadoop jobs
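The two methods above can be contrasted with a toy model. This is a sketch under stated assumptions, not Bright's or Hadoop's actual logic: the "Use exclusively for" values are taken from the slide, while the role-to-workload mapping, the function names and the replica placement rule (first free node in sorted order) are hypothetical.

```python
# Hypothetical mapping from role name to workload type. The workload types
# match the "Use exclusively for" values; the role names are illustrative.
ROLE_KIND = {
    "DataNode": "Hadoop",
    "TaskTracker": "Hadoop",
    "Slurm": "HPC",
}


def active_roles(assigned, use_exclusively_for=""):
    """Method 1: return the roles whose tasks keep running under the setting."""
    if use_exclusively_for == "":
        return set(assigned)              # default: all assigned roles run in parallel
    if use_exclusively_for == "Nothing":
        return set()                      # every task is stopped
    return {r for r in assigned if ROLE_KIND.get(r) == use_exclusively_for}


def decommission(blocks, node, nodes):
    """Method 2: copy each block held by `node` to another node, then drop it.

    `blocks` maps a block id to the set of nodes holding a replica, so the
    replica count of every block is preserved when a spare node exists.
    """
    for holders in blocks.values():
        if node in holders:
            candidates = sorted(n for n in nodes if n != node and n not in holders)
            if candidates:
                holders.add(candidates[0])
            holders.discard(node)


assigned = {"DataNode", "Slurm"}
print(sorted(active_roles(assigned, "HPC")))

blocks = {"b1": {"node001", "node002"}, "b2": {"node002", "node003"}}
decommission(blocks, "node002", ["node001", "node002", "node003"])
print({block: sorted(holders) for block, holders in blocks.items()})
```

The contrast is that the first method only changes which tasks run and leaves HDFS data in place, while the second has to move data first, which is why it can take a while.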

Conclusion

• Bright provides a tried & tested method of cluster management
• Hundreds of clusters worldwide are being managed using Bright Cluster Manager
• Inclusion of Hadoop management capabilities provides a complete solution for setup, management & monitoring of Hadoop clusters
• Single pane of glass for cluster & Hadoop
• Especially well suited for clusters that must support both HPC compute and Hadoop jobs

Questions?
