clustering manual

7
CLUSTERING MANUAL Introduction: Parallel computing is a form of computation in which many calculations are carried out simultaneously operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel") What Is Cluster: In a computer system, a cluster is a group of servers and other resources that act like a single system and enable high availability and, in some cases, load balancing and parallel processing Beowulf Cluster: Beowulf is an approach to building a supercomputer as a cluster of commodity off- the-shelf personal computers, interconnected with a local area network technology like

Upload: md-mahedi-mahfuj

Post on 11-May-2015

559 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: Clustering manual

CLUSTERING MANUAL

Introduction:

Parallel computing is a form of computation in which many calculations are carried out simultaneously operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel")

What Is Cluster:

In a computer system, a cluster is a group of servers and other resources that act like a single system and enable high availability and, in some cases, load balancing and parallel processing

Beowulf Cluster:

Beowulf is an approach to building a supercomputer as a cluster of commodity off-the-shelf personal computers, interconnected with a local area network technology like Ethernet, and running programs written for parallel processing.

Cluster Prerequisites:

1. Atleast Two Computers with a Linux Distribution installed in it.Make sure that your system has GCC installed in it.

2. A network connection between them. If you have just two computers, you can connect them using an ethernet wire. Make sure that IP addresses are assigned to them. If you don’t have a router to assign IP, you can statically assign them IP addresses, assign static IP addresses.

Page 2: Clustering manual

Steps Of Building a Beowulf Cluster:

The following steps are to be done for every node.

Step 1:Add Nodes-

First add the nodes.We can add the nodes in /etc/hosts file.We can use the text editor to add node's IP address followed by its host name. For example,

node0 10.1.1.1

node1 10.1.1.2

we will list the nodes using their host names. When required, the program will use this hosts file to convert host names to corresponding I.Ps.Like

10.255.2.x hostname1

10.255.2.y hostname2

10.255.2.z hostname3

Step 2:Create a user-

We shpuld create a new user for all nodes.we can create new user through System->Administration->Users and Groups and give a passward.We need to make same user all nodes not need to make same passward to all nodes.

Step 3: Installing SSH-

Now download and install ssh-server in every node. Execute the command sudo apt-get install opensshserver. Now we need to use OpenSSH Server for enabling remote login. It should be mentioned that remote tasks can be performed through telnet as well. But it does not

Page 3: Clustering manual

provide security for data flow. OpenSSH is a safer method for remote login between PCs, hence it is adopted.

Step 4:Login to new user-

Now logout my current sesson and login as mpiuser.

Step 5: Generating SSH key-

After login as mpiuser open the terminal and type the ssh-keygen -t dsa. This command will generate a new ssh key. It will be saved in the file id_dsa and id_dsa.pub. The public key is saved in the file ~/.ssh/id_dsa.pub, while ~/.ssh/id_dsa is the private key. These files will be saved in /home/mpiuser/.ssh/.

Step 6: Authorizing Keys-

A folder called .ssh will be created in your home directory. Its a hidden folder. This folder will contain a file id_dsa.pub that contains your public key. Now copy this key to another file called authorized_keys in the same directory. Execute the command in the terminal cd /home/mpiuser/.ssh; cat id_dsa.pub >>authorized_keys.If we donot do this then we need to passward for every time for connect to others computer.

Step 7:Download MPICH1-

Now download the MPICH1 from the web. Parallel computing system MPICH is a robust and flexible implementation of the MPI (Message Passing Interface). MPI is often used with parallel or distributed computing projects. MPICH is a multi-platform, configurable system (development, execution, libraries, etc) for MPI. It can achieve

Page 4: Clustering manual

parallelism using networked machines or using multitasking on a single machine.

Step 8:Install MPICH1-

Make a folder in home folder then configure it and install it by executing the command :

mkdir /home/mpiuser/mpich1

./configure --prefix=/home/mpiuser/mpich1

make

make install

Step 9:Editing files-

Open the file .bashrc in your home directory. If file does not exist, create one. Copy the following code into that file

export PATH=/home/mpiuser/mpich1/bin:$PATH

export PATH

LD_LIBRARY_PATH="/home/mpiuser/mpich1/lib:$LD_LIBRARY_PATH"

export LD_LIBRARY_PATH

The two lines are used to append the address /home/mpiuser/mpich1/bin which contains necessary executables of MPICH to the PATH variable.

The last two lines are used to append the address /home/mpiuser/mpich1/lib which contains necessary libraries of MPICH to the shared library path.

Step 10:Define the path to MPICH for SSH-

Page 5: Clustering manual

Give this command to add the bin directory of mpich1 to environment:

sudo echo /home/mpiuser/mpich1/bin >> /etc/environment

Now logout and login back into the user mpiuser.

Step 11:Edit machines.Linux-

In the folder mpich1, within the sub-directory share or util/machines/ a file called machines.LINUX will be found. Open that file and add the hostnames of all nodes except the home node ie. If you're editing the machines.LINUX file of node0, then that file will contain host names of all nodes except node0. By default MPICH executes a copy of the program in the home node. The machines. sample contents of machines.LINUX file:

hostname1

hostname2

Now cluster is ready .We can run any program through cluster.