-
Managing Hadoop Cluster
-
Topology of a typical Hadoop cluster.
-
Installation Steps
Install Java
Install ssh and sshd
gunzip hadoop-0.18.0.tar.gz
Then tar xvf hadoop-0.18.0.tar
Set JAVA_HOME in conf/hadoop-env.sh
Modify conf/hadoop-site.xml
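Putting these steps together, a minimal sketch (assumes the tarball sits in the hadoop user's home directory; the JDK path is an assumption and varies by system):
$ gunzip hadoop-0.18.0.tar.gz
$ tar xvf hadoop-0.18.0.tar
$ cd hadoop-0.18.0
Then in conf/hadoop-env.sh set, for example:
export JAVA_HOME=/usr/lib/jvm/java-6-sun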
-
Hadoop Installation Flavors
Standalone
Pseudo-distributed
Fully distributed (clusters of multiple nodes)
-
Additional Configuration
conf/masters
contains the hostname of the SecondaryNameNode
It should be a fully-qualified domain name.
conf/slaves
contains the hostname of every machine in the cluster which
should start TaskTracker and DataNode daemons
Ex: slave01
slave02
slave03
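A matching sketch of conf/masters; running the SecondaryNameNode on the head node is an assumption, and the hostname reuses the one from the hadoop-site.xml slide later in this deck:
head.server.node.com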
-
Advanced Configuration
enable passwordless ssh
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
The ~/.ssh/id_dsa.pub and authorized_keys
files should be replicated on all machines in
the cluster.
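One way to do the replication, sketched with scp (assumes the hadoop user and that ~/.ssh already exists on each slave; the first copy will still prompt for a password):
$ for host in $(cat ${HADOOP_HOME}/conf/slaves); do
    scp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys hadoop@${host}:~/.ssh/
  done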
-
Advanced Configuration
Various directories should be created on each
node
The NameNode requires the NameNode metadata
directory
$ mkdir -p /home/hadoop/dfs/name
Every node needs the Hadoop tmp directory and
DataNode directory created
-
Advanced Configuration..
bin/slaves.sh allows a command to be
executed on all nodes in the slaves file.
$ mkdir -p /tmp/hadoop
$ export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
$ export HADOOP_SLAVES=${HADOOP_CONF_DIR}/slaves
$ ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /tmp/hadoop"
$ ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /home/hadoop/dfs/data"
Format HDFS
$ bin/hadoop namenode -format
Start the cluster:
$ bin/start-all.sh
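To confirm the daemons came up, jps (shipped with the JDK) is a quick check; which daemons appear depends on the node's role:
$ jps
Expect NameNode, JobTracker, and SecondaryNameNode on the master, and DataNode and TaskTracker on the workers.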
-
Selecting Machines
Hadoop is designed to take advantage of
whatever hardware is available
Hadoop jobs written in Java can consume
between 1 and 2 GB of RAM per core
If you use Hadoop Streaming to write your jobs
in a scripting language such as Python, more
memory may be advisable.
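For reference, a sketch of a Hadoop Streaming run with Python scripts; the jar path follows the 0.18-era contrib layout and the script and directory names are assumptions:
$ bin/hadoop jar contrib/streaming/hadoop-0.18.0-streaming.jar \
    -input input-dir -output output-dir \
    -mapper mapper.py -reducer reducer.py \
    -file mapper.py -file reducer.py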
-
Cluster Configurations
Small Clusters: 2-10 Nodes
Medium Clusters: 10-40 Nodes
Large Clusters: Multiple Racks
-
Small Clusters: 2-10 Nodes
In a two-node cluster,
one node runs the NameNode/JobTracker and a
DataNode/TaskTracker;
the other node runs a DataNode/TaskTracker.
Clusters of three or more machines typically
use a dedicated NameNode/JobTracker, and
all other nodes are workers.
-
Configuration in conf/hadoop-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>head.server.node.com:9001</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://head.server.node.com:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/dfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
-
Medium Clusters: 10-40 Nodes
The single point of failure in a Hadoop cluster
is the NameNode
Hence, back up the NameNode metadata.
One machine in the cluster should be designated
as the NameNode's backup.
It does not run the normal Hadoop daemons;
instead, it exposes a directory via NFS which is
mounted only on the NameNode.
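A sketch of the NFS arrangement; the export path, mount options, and hostnames are all assumptions:
On the backup machine, in /etc/exports:
/export/namenode-backup namenode.example.com(rw,sync,no_subtree_check)
On the NameNode:
$ mount backup.example.com:/export/namenode-backup /mnt/namenode-backup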
-
NameNode's Backup
The cluster's hadoop-site.xml file should then
instruct the NameNode to write to this
directory as well:
<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
  <final>true</final>
</property>
-
conf/hadoop-site.xml
Nodes must be decommissioned on a schedule that permits the blocks they store to be re-replicated elsewhere.
conf/hadoop-site.xml
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/excludes</value>
  <final>true</final>
</property>
<property>
  <name>mapred.hosts.exclude</name>
  <value>/home/hadoop/excludes</value>
  <final>true</final>
</property>
Create an empty file with this name:
$ touch /home/hadoop/excludes
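To later decommission a node, a sketch (the hostname is an assumption; dfsadmin -refreshNodes tells the NameNode to re-read the excludes file):
$ echo slave03 >> /home/hadoop/excludes
$ bin/hadoop dfsadmin -refreshNodes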
-
Replication Setting
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
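Note that dfs.replication only applies to files written after the change. A sketch for adjusting existing files and then checking block health:
$ bin/hadoop dfs -setrep -R 3 /
$ bin/hadoop fsck /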
-
Tutorial
Configure a Hadoop cluster on two nodes.
Tutorial-Installed Hadoop in Cluster.docx
-
Performance Monitoring
Ganglia
Nagios
-
Ganglia
performance monitoring framework for
distributed systems
collects metrics on individual machines and
forwards them to an aggregator
designed to be integrated into other
applications
-
Ganglia
Install and configure Ganglia
Create a file named hadoop-metrics.properties in
the $HADOOP_HOME/conf directory:
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=localhost:8649
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=localhost:8649
-
Nagios
a machine and service monitoring system
designed for large clusters
provides useful diagnostic information for
tuning your cluster, including network, disk,
and CPU utilization across machines.
-
Tutorial
Install Ganglia/Nagios and monitor Hadoop
Tutorial-MonitorHadoopWithGanglia.docx