February 9, 2006
Vikas Singhal, VECC
Cluster Building and Design
Vikas Singhal, VECC, Kolkata, India
General View of HPC
Clustering Concept
Requirement for clustering
Quattor Description
Working of Condor
Glimpse of Ganglia
Current status of our cluster
High Performance Computing
Branch of Computing that deals with extremely powerful computers and the applications that use them.
High computing power is required for data-intensive or compute-intensive applications, depending on the requirement.
E.g., a supercomputer is one answer to HPC needs.
A supercomputer is characterized by very high speed and very large memory.
Speed is measured in flops (floating-point operations per second).
Fastest computer in the world: BlueGene/L (built by IBM), 280 Tflops.
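As a rough illustration of what a flops figure means, theoretical peak performance is commonly approximated as processors x clock rate x floating-point operations per cycle. The sketch below uses hypothetical numbers (12 processors, 2 operations per cycle); it is an estimate under those assumptions, not a measured result.

```python
def peak_flops(processors, clock_hz, flops_per_cycle):
    """Theoretical peak: every processor completes flops_per_cycle on each clock tick."""
    return processors * clock_hz * flops_per_cycle


def format_flops(flops):
    """Render a flops value with the largest fitting unit."""
    for unit, scale in (("Tflops", 1e12), ("Gflops", 1e9), ("Mflops", 1e6)):
        if flops >= scale:
            return f"{flops / scale:.1f} {unit}"
    return f"{flops:.0f} flops"


# Hypothetical small cluster: 12 processors at 2.4 GHz,
# assuming 2 floating-point operations per cycle.
print(format_flops(peak_flops(12, 2.4e9, 2)))
```

Real applications reach only a fraction of this peak, since memory and network speed also limit throughput.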
Technologies for HPC
Traditional: Build Faster CPUs
  Special electronic technology for increasing clock speed
  Advanced CPU architecture (pipelining, vector processing, multiple functional units, etc.)
  E.g.: CRAY
  Very high clock speed
  Very high heat dissipation
  Advanced cooling techniques required (liquid Freon / liquid nitrogen)
  Expensive
  But easy for the user: no special programming required
Parallel Processing: Harness a large number of ordinary CPUs and divide the job between them
  Large number of conventional CPUs interconnected through a network
  Cost effective
  Program writing is difficult: the job has to be split into independently executable units
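Splitting a job into independently executable units can be sketched as follows. This is a minimal illustration, assuming a toy workload (a sum of squares) and using local worker threads as a stand-in for cluster nodes; the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Cut items into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def work_unit(chunk):
    """One independent unit: it needs no data from any other chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=4):
    """Run each chunk on its own worker, then combine the partial results."""
    chunks = split(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work_unit, chunks))
    return sum(partials)


numbers = list(range(1000))
print(parallel_sum_of_squares(numbers) == sum(x * x for x in numbers))
```

The difficulty mentioned above lies in finding a split where the units really are independent; here that holds because each partial sum depends only on its own chunk.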
Why Clustering
For high-performance and high-availability computing, building a cluster of computers is one of the best solutions.
Lower-cost technology than a supercomputer.
Faster than a supercomputer of the same hardware cost.
No technical or technological limitations.
Scalable and simple.
High Computing Power: Clustering of Computers
Application: Computing-Intensive Task
The main aim is High Performance Computing (HPC). (Most TOP500 computers are built by clustering; BlueGene/L has approximately 131,000 processors.)
Single user and a single number-crunching problem.
Communication between nodes must be very fast. (A high-end network card is required, which is costly.)
Programs should be written in a parallel language or within a parallel environment.
Parallel languages: LINDA, OCCAM, etc.
Parallel extensions to serial languages: High Performance Fortran (HPF)
Parallel APIs: OpenMP, MPI
High Computing Power: Clustering of Computers
Application: Data-Intensive Task
The main aim is not High Performance Computing (HPC) but high availability.
Multi-user, multi-job system.
It is part of a global grid such as EDG.
Security is the main concern.
7 collaborating institutes, more than 100 users (see Mr. S. K. Pal's talk).
Internet connectivity with high bandwidth is required. (We have installed a 4 Mbps leased line (1:4).)
How to Build a Cluster for Our Requirements
Hardware
  Processors
  Memory (RAM)
  Storage
  No need to purchase a high-end network card.
  Purchase according to requirement and budget.
Software
  Cluster building S/W
  Cluster monitoring S/W
  Job scheduling S/W
  User management S/W
  Choose according to requirement and open-source availability. The software area is very big.
The full cluster is not procured all at once; it is a step-by-step process.
Different hardware supports different software.
Our specific requirement
Procurement of HARDWARE
Procurement of SOFTWARE
Present Status of Tier2-Kol Cluster
Based on High Availability
[Network diagram. Labels: DMZ; two Gigabit switches; management nodes (HP ProLiant DL360 G3, dual-CPU Xeon 2.4 GHz); 192.168.x.x (standby); 125.20.3.11; computing nodes; 4 Mbps (1:4) link]
High Availability
Data-intensive and real-time, task-critical systems require high availability.
High availability is achieved through redundancy (eliminating single points of failure).
Each server has 2 NICs: eth0 and eth1.
2 Gigabit switches.
Based on the bonding concept.
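One way to check that a bonded pair still provides redundancy is to read the status text the Linux bonding driver exposes under /proc/net/bonding/. The sketch below assumes a bond named bond0 in active-backup mode; the sample text is abbreviated and illustrative, not output captured from this cluster.

```python
# Abbreviated, illustrative status text in the style of /proc/net/bonding/bond0.
SAMPLE_STATUS = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""


def parse_bonding_status(text):
    """Return (active_slave, {slave_name: link_status}) from bonding status text."""
    active, slaves, current = None, {}, None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # An "MII Status" line after a "Slave Interface" line belongs to that slave.
            slaves[current] = value
    return active, slaves


def is_redundant(slaves):
    """Redundancy holds only while at least two slave links are up."""
    return sum(1 for status in slaves.values() if status == "up") >= 2


active, slaves = parse_bonding_status(SAMPLE_STATUS)
print(active, slaves, is_redundant(slaves))
```

In the sample, eth1 is down, so traffic still flows through eth0 but the single point of failure has returned until the second link is repaired.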
Redundancy (cont.)
2 hard disks
  Each is a mirror of the other.
  Both are hot swappable.
  Implemented with hardware RAID-1 (mirroring).
  Both are synchronized every millisecond.
We are trying to mirror the management node using rsync.
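Mirroring a directory to a standby machine with rsync could look like the sketch below. The host name mgmt-standby and the path /srv/cluster-config are hypothetical; the options used are rsync's standard -a (archive), --delete (remove files that no longer exist on the source), and --dry-run (report only).

```python
import subprocess


def rsync_command(source_dir, standby_host, dest_dir, dry_run=True):
    """Build the rsync command line that mirrors source_dir onto the standby host."""
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        cmd.append("--dry-run")  # only report what would change
    # A trailing slash on the source copies the directory contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", f"{standby_host}:{dest_dir}"]
    return cmd


def mirror(source_dir, standby_host, dest_dir, dry_run=True):
    """Run the mirror; raises CalledProcessError if rsync fails."""
    return subprocess.run(rsync_command(source_dir, standby_host, dest_dir, dry_run), check=True)


# Hypothetical host and path, printed rather than executed.
print(" ".join(rsync_command("/srv/cluster-config", "mgmt-standby", "/srv/cluster-config")))
```

Running it first with dry_run=True is a cautious default, since --delete removes files on the standby that are absent from the source.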
Software Requirements for Building a Cluster
Software options for cluster building:
OSCAR: Free, but the number of client nodes it can harness is limited
SCALI: Not free S/W. Sold together with network cards (as at IMSc)
Red Hat Cluster Suite: Not very suitable
CPM (Central Processor Manager): IBM proprietary
Rocks: Not free software
Quattor: Free and best suited
To decide which one is "best" for our requirements, one has to gain experience with all of them.
Installing a Quattor Server and Client
Quattor is a large-scale management system for managing medium to very large (>1000 node) clusters.
Quattor is an administration toolkit for optimizing resources.
No specific hardware or software is required for building a Quattor cluster.
Requirements:
  It supports SLC or RH Linux 7.3
  Disk: 6.5 GB for the server, 2.5 GB per client OS
Site address: http://quattor.org
Package RPMs: http://quattorsw.web.cern.ch/quattorsw/software/quatttor
3 sets of Quattor RPMs are available:
  1. i386: for all Pentium or Xeon processors, or any processor with the IA-32 instruction set
  2. IA64: for 64-bit Intel Itanium machines
  3. x86_64: for 64-bit machines that also support the x86 instruction set, such as AMD Opteron
SPMA: Software Package Manager Agent, for software deployment
  Manages installation of the different software packages
  Handles multiple package formats
  Manages the Software Repository (SWRep)
CDB: Configuration Database
  Hierarchical, template-based structure
  Provides one common structure for different databases
  Contains cluster descriptions, networking parameters, etc.
NCM: Node Configuration Manager, for system configuration
  Framework in which service-specific plug-ins (components) make the necessary system changes
AII: Automated Installation Infrastructure
  Works on top of the native RH/SL installer (Anaconda / KickStart) using PXE
  DHCP server (IP address + kernel location)
  TFTP server (boot kernel)
  HTTP server (OS images + packages)
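The DHCP part of a PXE installation can be pictured as one host entry per node that hands out an IP address and points the node at the TFTP server and boot file. The sketch below generates such an entry in ISC dhcpd syntax; it is not Quattor's own code or template, and the node name, MAC address, IP addresses, and boot file name are hypothetical values.

```python
def dhcp_host_entry(name, mac, ip, tftp_server, boot_file="pxelinux.0"):
    """Return an ISC dhcpd host block that lets one node boot over PXE."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"    # identifies the node by its NIC
        f"  fixed-address {ip};\n"         # IP address handed to the node
        f"  next-server {tftp_server};\n"  # TFTP server holding the boot kernel
        f'  filename "{boot_file}";\n'     # boot loader fetched over TFTP
        f"}}\n"
    )


# Hypothetical values for one computing node on a private 192.168.x.x network.
print(dhcp_host_entry("node001", "00:11:22:33:44:55", "192.168.1.101", "192.168.1.1"))
```

After the boot loader and kernel arrive over TFTP, the installer would fetch the OS images and packages over HTTP, as listed above.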
Basic Requirements for Installing a Cluster Site
Cluster building: Quattor
Some basic steps after Quattor installation:
  C3 commands
  Bonding package, for high availability (if dual NIC)
  LDAP (Lightweight Directory Access Protocol)
  S/W firewall (make firewall rules)
Job scheduling: Condor
  Specialized workload management system.
  Provides a job queuing mechanism, scheduling policy, resource monitoring, and resource management.
  Can checkpoint a job and migrate it to a different machine.
Condor Daemons
Job Submission Steps
Condor Commands
condor_compile: Re-links source or object files with the Condor libraries. The Condor library provides checkpointing, migration, and remote system calls.
condor_submit: Takes a submit description file as input and produces a job ClassAd for further processing by the central manager.
condor_status: Shows the state of the machines in the Condor pool.
condor_q: Shows job status.
Submit description files
Directs queuing of jobs
Contains
  Executable location
  Command-line arguments to the job
  stdin, stderr, stdout
  Initial working directory
  should_transfer_files = <YES | NO | IF_NEEDED>. NO disables the Condor file transfer mechanism.
  when_to_transfer_output = <ON_EXIT | ON_EXIT_OR_EVICT>
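The items above map onto lines of a submit description file. The sketch below assembles one; it assumes the vanilla universe and hypothetical file names (job.out, job.err, job.log), and it leaves out when_to_transfer_output when file transfer is disabled.

```python
def submit_description(executable, arguments="", initialdir=".", stdin=None,
                       transfer="IF_NEEDED", when="ON_EXIT", count=1):
    """Return the text of a Condor submit description file."""
    if transfer not in ("YES", "NO", "IF_NEEDED"):
        raise ValueError("should_transfer_files must be YES, NO or IF_NEEDED")
    if when not in ("ON_EXIT", "ON_EXIT_OR_EVICT"):
        raise ValueError("when_to_transfer_output must be ON_EXIT or ON_EXIT_OR_EVICT")
    lines = ["universe = vanilla", f"executable = {executable}"]
    if arguments:
        lines.append(f"arguments = {arguments}")
    if stdin:
        lines.append(f"input = {stdin}")  # file fed to the job's stdin
    lines += [
        "output = job.out",               # hypothetical file name for stdout
        "error = job.err",                # hypothetical file name for stderr
        "log = job.log",                  # hypothetical file name for the job event log
        f"initialdir = {initialdir}",
        f"should_transfer_files = {transfer}",
    ]
    if transfer != "NO":
        lines.append(f"when_to_transfer_output = {when}")
    lines.append(f"queue {count}")        # number of job instances to queue
    return "\n".join(lines) + "\n"


print(submit_description("/bin/hostname"))
```

Saving the text to a file (for example job.sub, a hypothetical name) and passing it to condor_submit would queue the job; condor_q then shows its status.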
Cluster Monitoring & Job Throwing : Ganglia
Ganglia is a scalable distributed monitoring system for high-performance computing systems.
It relies on a multicast-based listen/announce protocol to monitor state.
It has very low per-node overhead and high concurrency.
It uses XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization.
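Because Ganglia represents cluster state as XML, the data is easy to read from a script. The sketch below parses an abbreviated, illustrative document using Ganglia's CLUSTER / HOST / METRIC element names; the cluster name, host names, and metric values are made up, and real output carries many more attributes.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative sample in the shape of Ganglia's XML output.
SAMPLE_XML = """\
<GANGLIA_XML SOURCE="gmond">
  <CLUSTER NAME="example-cluster">
    <HOST NAME="node001">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float"/>
      <METRIC NAME="mem_free" VAL="512000" TYPE="uint32" UNITS="KB"/>
    </HOST>
    <HOST NAME="node002">
      <METRIC NAME="load_one" VAL="1.75" TYPE="float"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""


def metrics_by_host(xml_text):
    """Return {host_name: {metric_name: value_string}} from Ganglia-style XML."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


for host, metrics in metrics_by_host(SAMPLE_XML).items():
    print(host, metrics)
```

On a running system the same function could be fed the XML that a Ganglia daemon serves over TCP (port 8649 by default), assuming the default configuration.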
Ganglia Monitoring Daemon (gmond)
  gmond is a multi-threaded daemon.
  It runs on each cluster node that we want to monitor.
Ganglia Meta Daemon (gmetad)
  Start it only on the management node.
Ganglia PHP Web Front-end
  Displays Ganglia data in a meaningful way.
A new era of Internet use has started
  We used the Internet / Web as an information / knowledge base.
  Now we can use HTTP for computing as well.
  Open the page, select an executable file, and submit it. The file will execute on a cluster client node.
Cluster to Grid
With EDG Grid connectivity: AliEn, EGEE, gLite, LCG-2 ???
To become part of global monitoring: MonALISA, Lemon.
VECC Cluster Machine Status
One interactive node:
  At this time we have only one interactive node; we will procure more in the near future.
  # ssh interactive001
Other computing nodes:
  There are 6 computing nodes (node001 to node006).
  One cannot log in to these nodes, only run compute jobs on them.
  They can be used in batch mode for computing, not in interactive mode.
Where We Have Landed Now
PC – Post Card
PC – Personal Computer
PC – Packed Cluster
Future Work
C++ and MPI (Message Passing Interface) will be the future for clusters.
For optimum use of the cluster, users have to learn MPI.
Questions?