Bright Cluster Manager A Unified Management Solution for HPC and Hadoop
Martijn de Vries CTO
Introduction
Bright Cluster Architecture
[Figure: architecture diagram. Labels: CMDaemon; head node; node001, node002, node003; SOAP+SSL; SOAP/JSON API + SSL; Cluster Management GUI; Cluster Management Shell; Web-Based User Portal; Third-Party Applications]
Management Interface
Graphical User Interface (GUI)
- Offers the administrator full cluster control
- Standalone desktop application
- Manages multiple clusters simultaneously
- Runs natively on Linux, Windows and OS X
Cluster Management Shell (CMSH)
- All GUI functionality is also available through the Cluster Management Shell
- Interactive, and scriptable in batch mode
[Figure: Cluster Management GUI and Cluster Management Shell]
Hadoop Integration
Managing Clusters
Bright Cluster Manager can be used for several types of clusters:
- HPC compute
- Storage
- Private cloud (OpenStack)
- Server farms
- Big Data (Hadoop)
All types of clusters need to be:
- Deployed
- Configured
- Provisioned
- Managed
- Monitored
- Health-checked
Managing Hadoop Clusters
- Managing Hadoop clusters is just as difficult as managing other types of clusters
- Without proper infrastructure, Hadoop will not run and the cluster will not be usable for data processing
- Bright Cluster Manager provides a single pane of glass to manage and monitor all aspects of a Hadoop cluster, including:
  - Hardware (setup, configuration, monitoring)
  - Operating system (provisioning, updates)
  - Hadoop distribution
  - Hadoop configuration
  - Users
- Bright Cluster Manager provides a perfect environment for Hadoop to run on
- Hadoop distribution agnostic (switching distributions is easy)
Bright for Hadoop Cluster Management
Bright Cluster Manager 7.0 for Apache Hadoop
- Provides a single pane of glass for managing both the physical cluster and Hadoop
- Easy installation of Hadoop:
  - Apache Hadoop 1.2, 2.2, 2.3 & 2.4 (on Bright DVD)
  - Cloudera CDH 4 & 5
  - Hortonworks HDP 1.3 & 2.1
- Configuration, monitoring and health checking of Hadoop instances
- Graphical UI, command-line interface and API access
Hadoop Configuration
- Hadoop is configured through roles
- Nodes can be configured to run certain Hadoop-related services by assigning roles
- Example roles: DataNode, JobTracker, TaskTracker, NameNode, SecondaryNameNode, YARNServer, YARNClient, HBaseServer, HBaseClient, ZooKeeper
- Assigning or unassigning a role will:
  - Write out configuration files based on the role parameters
  - Start, stop and monitor the relevant services
- The most important Hadoop configuration aspects can be changed from inside Bright
- Exotic Hadoop configuration parameters can be set directly in the (partially generated) configuration file
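The role mechanism described above can be illustrated with a small, self-contained Python model. This is only a sketch of the idea, not Bright's API: the `Node` class, the `assign_role` and `unassign_role` helpers, the `ROLE_SERVICES` mapping, the service names and the example path are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a role name to the service it controls.
ROLE_SERVICES = {
    "DataNode": "hadoop-datanode",
    "TaskTracker": "hadoop-tasktracker",
    "ZooKeeper": "zookeeper-server",
}


@dataclass
class Node:
    name: str
    roles: dict = field(default_factory=dict)         # role name -> parameters
    config_files: dict = field(default_factory=dict)  # role name -> rendered config text
    running: set = field(default_factory=set)         # names of running services


def render_config(params):
    """Render role parameters as sorted key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in sorted(params.items()))


def assign_role(node, role, params):
    """Record the role, write its configuration and start its service."""
    node.roles[role] = dict(params)
    node.config_files[role] = render_config(params)
    node.running.add(ROLE_SERVICES[role])


def unassign_role(node, role):
    """Drop the role, remove its configuration and stop its service."""
    node.roles.pop(role)
    node.config_files.pop(role)
    node.running.discard(ROLE_SERVICES[role])
```

In this model, calling `assign_role(Node("node001"), "DataNode", {"dfs.datanode.data.dir": "/data/hdfs"})` renders one configuration entry and marks the matching service as running; `unassign_role` reverses both steps.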
Hadoop Management Features
- Integrated user management and HDFS access control
- Ability to re-purpose nodes between Hadoop and, e.g., HPC
- Multiple HDFS instances on the same cluster (different Hadoop distributions possible)
- Most Hadoop configuration aspects controlled through the GUI and CLI
- Health checking and monitoring of Hadoop-related services
- Ability to use alternative filesystems to HDFS (e.g. Lustre)
Re-purposing nodes
- Node tasks are determined by the assignment of roles (e.g. Hadoop DataNode, Slurm)
- By default, a node runs all tasks for which it has been assigned roles in parallel (e.g. Hadoop + Slurm)
- Two methods to stop running Hadoop on a node:
  - Method 1 (temporary):
    - Property at category and device level: “Use exclusively for:”
    - Values: <empty>, HPC, OpenStack, Hadoop, Ceph, Nothing
    - Setting “Use exclusively for” causes all other tasks to be stopped immediately
  - Method 2 (permanent):
    - Hadoop-related operations: decommission/recommission
    - Decommission: move data to other nodes to maintain the replication factor, and stop using the node for jobs (could take a while)
    - Recommission: move data back to the node and use it for Hadoop jobs
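The two methods above can be sketched as a small Python model. It is an illustration under stated assumptions, not Bright's or Hadoop's implementation: the `Node` class, `active_roles`, `decommission` and the `ROLE_FAMILY` mapping are hypothetical names, and the rule that a device-level value overrides the category-level value is an assumption the slide does not state.

```python
from dataclasses import dataclass, field
from typing import Optional

# Values listed on the slide; "" stands for <empty> (no restriction).
EXCLUSIVE_VALUES = {"", "HPC", "OpenStack", "Hadoop", "Ceph", "Nothing"}

# Hypothetical mapping from a role to the task family it belongs to.
ROLE_FAMILY = {"DataNode": "Hadoop", "TaskTracker": "Hadoop", "Slurm": "HPC"}


@dataclass
class Node:
    name: str
    roles: set = field(default_factory=set)
    use_exclusively_for: Optional[str] = None  # None means: inherit from the category


def active_roles(node, category_value=""):
    """Method 1: return the roles that keep running on the node."""
    # Assumption: a value set on the device overrides the category value.
    value = category_value if node.use_exclusively_for is None else node.use_exclusively_for
    if value not in EXCLUSIVE_VALUES:
        raise ValueError(f"unknown value: {value!r}")
    if value == "":
        return set(node.roles)  # default: all assigned roles run in parallel
    if value == "Nothing":
        return set()
    return {role for role in node.roles if ROLE_FAMILY.get(role) == value}


def decommission(placement, node_name, all_nodes):
    """Method 2: move each block off node_name, keeping its replica count."""
    for block, holders in placement.items():
        if node_name in holders:
            candidates = sorted(set(all_nodes) - holders)
            if not candidates:
                raise RuntimeError(f"no spare node for block {block}")
            holders.add(candidates[0])   # copy the block to another node first
            holders.remove(node_name)    # then drop the replica on the old node
```

For a node holding the roles `DataNode` and `Slurm`, setting `use_exclusively_for = "HPC"` leaves only `Slurm` active in this model, while `decommission` keeps every block at the same number of replicas after the node is drained.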
Conclusion
- Bright provides a tried and tested method of cluster management
- Hundreds of clusters worldwide are being managed using Bright Cluster Manager
- Inclusion of Hadoop management capabilities provides a complete solution for setup, management and monitoring of Hadoop clusters
- Single pane of glass for cluster and Hadoop
- Especially well suited for clusters that must support both HPC compute and Hadoop jobs
Questions?