A Bioinformatics Introduction to Cluster Computing part I

By Andrew D. Boyd, MD
Research Fellow
Michigan Center for Biological Information and Department of Psychiatry,
University of Michigan Health System

and

Abhijit Bose, PhD
Associate Director
Michigan Grid Research and Infrastructure Development, and
Department of Electrical Engineering and Computer Science
University of Michigan
Introduction
• What is parallel computing?
• Why go parallel?
• What are some limits of parallel computing?
• Types of parallel computing
– Shared memory
– Distributed memory
What is parallel computing?

• Parallel computing: the use of multiple computers or processors working together on a common task.
– Each processor works on its section of the problem
– Processors are allowed to exchange information with other processors
[Diagram: grid of the problem to be solved (x–y plane), divided into four areas; CPU #1 through CPU #4 each work on their own area of the problem and exchange information across the shared boundaries.]
Why do parallel computing?

• Limits of serial computing
– Available memory
– Performance

• Parallel computing allows us to:
– Solve problems that don’t fit on a single CPU
– Solve problems that can’t be solved in a reasonable time

• We can run…
– Larger problems
– Faster
– More cases
Types of parallelism
• Data parallel
– Each processor performs the same task on different data
– Example: grid problems

• Task parallel
– Each processor performs a different task
– Example: signal processing
• Most applications fall somewhere on the continuum between these two extremes
Basics of Data Parallel Programming

• The same code will run on 2 CPUs

• The program has an array of data to be operated on by 2 CPUs, so the array is split into two parts.
program.f:
…
if CPU=a then
   low_limit=1
   upper_limit=50
elseif CPU=b then
   low_limit=51
   upper_limit=100
end if
do I = low_limit, upper_limit
   work on A(I)
end do
...
end program
CPU A CPU B
program.f:
…
low_limit=1
upper_limit=50
do I = low_limit, upper_limit
   work on A(I)
end do
…
end program

program.f:
…
low_limit=51
upper_limit=100
do I = low_limit, upper_limit
   work on A(I)
end do
…
end program
Typical Task Parallel Application
• Signal processing
• Use one processor for each task
• Can use more processors if one is overloaded
[Diagram: DATA → Normalize Task → FFT Task → Multiply Task → Inverse FFT Task]
Basics of Task Parallel Programming
• Program has 2 tasks (a and b) to be done by 2 CPUs
program.f:
…
initialize
...
if CPU=a then
   do task a
elseif CPU=b then
   do task b
end if
….
end program
CPU A CPU B
program.f:
…
initialize
…
do task a
…
end program

program.f:
…
initialize
…
do task b
…
end program
Limits of Parallel Computing
• Theoretical upper limits
– Amdahl’s Law

• Practical limits

• Other considerations
– Time to rewrite code
Theoretical upper limits
• All parallel programs contain:
– Parallel sections
– Serial sections
• Serial sections limit the parallel effectiveness
• Amdahl’s Law states this formally
Amdahl’s Law

• Amdahl’s Law places a strict limit on the speedup that can be realized by using multiple processors.

– Effect of multiple processors on run time:

      tN = (fs + fp/N) · t1

– Effect of multiple processors on speedup:

      S = 1 / (fs + fp/N)

– Where
• fs = serial fraction of code
• fp = parallel fraction of code (fs + fp = 1)
• N = number of processors
It takes only a small fraction of serial content in a code to degrade the parallel performance. It is essential to determine the scaling behaviour of your code before doing production runs using large numbers of processors.
[Figure: Illustration of Amdahl's Law — speedup vs. number of processors (0 to 250) for fp = 1.000, 0.999, 0.990, and 0.900.]
Amdahl’s Law provides a theoretical upper limit on parallel speedup, assuming that there are no costs for communications. In reality, communications will result in a further degradation of performance.
Amdahl’s Law Vs. Reality
[Figure: speedup vs. number of processors (0 to 250) for fp = 0.99 — the Amdahl's Law prediction plotted against measured reality; the measured curve falls below the theoretical one.]
Some other considerations

• Writing an effective parallel application is difficult
– Load balance is important
– Communication can limit parallel efficiency
– Serial time can dominate

• Is it worth your time to rewrite your application?
– Do the CPU requirements justify parallelization?
– Will the code be used just once?

• Super-linear speedup?
– Cache effects as the number of processors increases
– Randomised algorithms
Shared and Distributed memory
Shared memory – single address space. All processors have access to a pool of shared memory. (Example: CRAY T90)

Methods of memory access:
– Bus
– Crossbar

Distributed memory – each processor has its own local memory. Must do message passing to exchange data between processors. (Examples: CRAY T3E, IBM SP)
[Diagrams: distributed memory — processors (P), each with its own local memory (M), connected by a network; shared memory — processors (P) sharing a single memory over a bus.]
Pure Shared Memory Machines
• T90, C90, YMP, XMP, SV1
• SGI O2000 (sort of)
• HP-Exemplar (sort of)
• Vax 780
• Various Suns
• Various Wintel boxes
• BBN GP 1000 Butterfly
Programming methodologies

• Standard Fortran or C, and let the compiler do it for you

• Directives can give hints to the compiler (OpenMP)

• Libraries

• Thread-like methods
– Explicitly start multiple tasks
– Each given its own section of memory
– Use shared variables for communication
• Message passing can also be used but is not common
Example of Shared-Memory Programming

Program: Calculate the value of pi

program calc_pi
implicit none
integer n,i
double precision w,x,sum,pi,f,a
double precision start, finish, timef
f(a) = 4.0 / (1.0 + a*a)
n=100000000
start=timef()
w=1.0/n
sum=0.0
!$OMP PARALLEL PRIVATE(x,i), SHARED(w,n), &
!$OMP REDUCTION(+:sum)
!$OMP DO
do i=1,n
x = w * (i - 0.5)
sum = sum + f(x)
end do
!$OMP END DO
!$OMP END PARALLEL
pi = w * sum
finish=timef()
print*,"value of pi, time taken:"
print*,pi,finish-start
end
Shared-Memory Portion:
Distributed shared memory (NUMA)
• Consists of N processors and a global address space
– All processors can see all memory
– Each processor has some amount of local memory
– Access to the memory of other processors is slower
• Non-Uniform Memory Access
[Diagram: two bus-based nodes, each with four processors (P) sharing a local memory over a bus, linked together.]
Programming methodologies

• Same as shared memory

• Standard Fortran or C, and let the compiler do it for you

• Directives can give hints to the compiler (OpenMP)

• Libraries

• Thread-like methods
– Explicitly start multiple tasks
– Each given its own section of memory
– Use shared variables for communication
• Message passing can also be used
DSM NUMA Machines
• SGI O2000
• HP-Exemplar
Distributed Memory
• Each of N processors has its own memory
• Memory is not shared
• Communication occurs using messages
[Diagram: processors (P), each with its own memory (M), connected by a network.]
Communication networks

• Custom
– Many manufacturers offer custom interconnects
– Myrinet 2000

• Off the shelf
– Ethernet (Force 10, Extreme Networks)
– ATM
– HIPPI
– FIBER Channel
– FDDI
– INFINIBAND
Programming methodology
• Mostly message passing using MPI
• Data distribution languages
– Simulate global name space
– Examples:
• High Performance Fortran
• Split-C
• Co-array Fortran
Hybrid machines

• SMP nodes (clumps) with interconnect between clumps

• Machines
– Origin 2000
– Exemplar
– SV1
– IBM Nighthawk

• Programming
– SMP methods on clumps, or message passing
– Message passing between all processors
AN INTRODUCTION TO
MESSAGE PASSING INTERFACE
FOR CLUSTERS
A Brief Intro to MPI
• Background on MPI
• Documentation
• Hello world in MPI
• Basic communications
• Simple send and receive program
You can use MPI in both Clusters and Grid environments
Background on MPI
• MPI - Message Passing Interface
– Library standard defined by a committee of vendors, implementers, and parallel programmers
– Used to create parallel SPMD programs based on message passing

• Available on almost all parallel machines, in C and Fortran

• Over 100 advanced routines, but only 6 basic ones
Documentation

• MPI home page: http://www.mcs.anl.gov/mpi (contains the library standard)

• Books: http://www.epm.ornl.gov/~walker/mpi/books.html
– "MPI: The Complete Reference" by Snir, Otto, Huss-Lederman, Walker, and Dongarra, MIT Press (also in PostScript and HTML)
– "Using MPI" by Gropp, Lusk, and Skjellum, MIT Press
MPI Implementations

• Most parallel machine vendors have optimized versions

• Others:
– MPICH: http://www-unix.mcs.anl.gov/mpi/mpich/indexold.html
– GLOBUS: http://www.globus.org
– MPICH-G2: http://www3.niu.edu/mpi/
Key Concepts of MPI

• Used to create parallel SPMD programs based on message passing
– Normally the same program is running on several different nodes
– Nodes communicate using message passing

• Typical methodology:

start job on n processors
do i=1 to j
   each processor does some calculation
   pass messages between processors
end do
end job
Include files
• The MPI include file
– C: mpi.h
– Fortran: mpif.h (an f90 module is a good place for this)

• Defines many constants used within MPI programs
• In C, defines the interfaces for the functions
• Compilers know where to find the include files
Communicators

• Communicators
– A parameter for most MPI calls
– A collection of processors working on some part of a parallel job
– MPI_COMM_WORLD is defined in the MPI include file as all of the processors in your job
– Can create subsets of MPI_COMM_WORLD
– Processors within a communicator are assigned numbers 0 to n-1
Data types

• Data types
– When sending a message, it is given a data type
– Predefined types correspond to "normal" types
• MPI_REAL, MPI_FLOAT: Fortran and C real
• MPI_DOUBLE_PRECISION, MPI_DOUBLE: Fortran and C double
• MPI_INTEGER, MPI_INT: Fortran and C integer
– Can create user-defined types
Minimal MPI Program

• Every MPI program needs these…
– C version

#include <mpi.h>  /* the MPI include file */

/* Initialize MPI */
ierr = MPI_Init(&argc, &argv);
/* How many total PEs are there? */
ierr = MPI_Comm_size(MPI_COMM_WORLD, &nPEs);
/* What node am I (what is my rank?) */
ierr = MPI_Comm_rank(MPI_COMM_WORLD, &iam);
...
ierr = MPI_Finalize();
• In C MPI routines are functions and return an error value
Minimal MPI Program

• Every MPI program needs these…
– Fortran version

include 'mpif.h'   ! MPI include file
c Initialize MPI
call MPI_Init(ierr)
c Find total number of PEs
call MPI_Comm_size(MPI_COMM_WORLD, nPEs, ierr)
c Find the rank of this node
call MPI_Comm_rank(MPI_COMM_WORLD, iam, ierr)
...
call MPI_Finalize(ierr)

• In Fortran, MPI routines are subroutines, and the last parameter is an error value
Basic Communication

• Data values are transferred from one processor to another
– One process sends the data
– Another receives the data

• Synchronous
– Call does not return until the message is sent or received

• Asynchronous
– Call indicates the start of a send or receive, and another call is made to determine if it has finished
Synchronous Send

• C
– MPI_Send(&buffer, count, datatype, destination, tag, communicator);

• Fortran
– Call MPI_Send(buffer, count, datatype, destination, tag, communicator, ierr)

• Call blocks until the message is on the way
Synchronous Send

• MPI_Send: sends data to another processor

• Use MPI_Recv to "get" the data

• C
MPI_Send(&buffer, count, datatype, destination, tag, communicator);

• Fortran
Call MPI_Send(buffer, count, datatype, destination, tag, communicator, ierr)
Call MPI_Send(buffer, count, datatype, destination, tag, communicator, ierr)

• Buffer: the data
• Count: length of source array (in elements, 1 for scalars)
• Datatype: type of data, for example MPI_DOUBLE_PRECISION, MPI_INT, etc.
• Destination: processor number of the destination processor in the communicator
• Tag: message type (arbitrary integer)
• Communicator: your set of processors
• Ierr: error return (Fortran only)
Synchronous Receive

• C
– MPI_Recv(&buffer, count, datatype, source, tag, communicator, &status);

• Fortran
– Call MPI_Recv(buffer, count, datatype, source, tag, communicator, status, ierr)

• Call blocks until the message is in the buffer
• Status: contains information about the incoming message
– C
• MPI_Status status;
– Fortran
• Integer status(MPI_STATUS_SIZE)
Call MPI_Recv(buffer, count, datatype, source, tag, communicator, status, ierr)

• Buffer: the data
• Count: length of source array (in elements, 1 for scalars)
• Datatype: type of data, for example MPI_DOUBLE_PRECISION, MPI_INT, etc.
• Source: processor number of the source processor in the communicator
• Tag: message type (arbitrary integer)
• Communicator: your set of processors
• Status: information about the message
• Ierr: error return (Fortran only)
Six basic MPI calls

MPI_INIT        Initialize MPI
MPI_COMM_RANK   Get the processor rank
MPI_COMM_SIZE   Get the number of processors
MPI_Send        Send data to another processor
MPI_Recv        Get data from another processor
MPI_FINALIZE    Finish MPI
Send and Receive Program: Fortran

program send_receive
include "mpif.h"
integer myid, ierr, numprocs, tag, source, destination, count
integer buffer
integer status(MPI_STATUS_SIZE)

call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
tag=1234; source=0; destination=1; count=1
if(myid .eq. source)then
   buffer=5678
   call MPI_Send(buffer, count, MPI_INTEGER, destination, &
                 tag, MPI_COMM_WORLD, ierr)
   write(*,*)"processor ",myid," sent ",buffer
endif
if(myid .eq. destination)then
   call MPI_Recv(buffer, count, MPI_INTEGER, source, &
                 tag, MPI_COMM_WORLD, status, ierr)
   write(*,*)"processor ",myid," got ",buffer
endif
call MPI_FINALIZE(ierr)
stop
end
Send and Receive Program: C

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
   int myid, numprocs, tag, source, destination, count, buffer;
   MPI_Status status;

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &myid);
   tag = 1234; source = 0; destination = 1; count = 1;
   if (myid == source) {
      buffer = 5678;
      MPI_Send(&buffer, count, MPI_INT, destination, tag, MPI_COMM_WORLD);
      printf("processor %d sent %d\n", myid, buffer);
   }
   if (myid == destination) {
      MPI_Recv(&buffer, count, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
      printf("processor %d got %d\n", myid, buffer);
   }
   MPI_Finalize();
   return 0;
}
MPI Types
• MPI has many different predefined data types
• Can be used in any communication operation
Predefined types in C
C MPI Types
MPI_CHAR signed char
MPI_SHORT signed short int
MPI_INT signed int
MPI_LONG signed long int
MPI_UNSIGNED_CHAR unsigned char
MPI_UNSIGNED_SHORT unsigned short int
MPI_UNSIGNED unsigned int
MPI_UNSIGNED_LONG unsigned long int
MPI_FLOAT float
MPI_DOUBLE double
MPI_LONG_DOUBLE long double
MPI_BYTE -
MPI_PACKED -
Predefined types in Fortran
Fortran MPI Types
MPI_INTEGER INTEGER
MPI_REAL REAL
MPI_DOUBLE_PRECISION DOUBLE PRECISION
MPI_COMPLEX COMPLEX
MPI_LOGICAL LOGICAL
MPI_CHARACTER CHARACTER(1)
MPI_BYTE -
MPI_PACKED -
Wildcards

• Allow you to not specify a tag or source
• MPI_ANY_SOURCE and MPI_ANY_TAG are wildcards
• The status structure is used to get the wildcard values
• Example:

MPI_Status status;
int buffer[5];
int error;
error = MPI_Recv(&buffer, 5, MPI_INT, MPI_ANY_SOURCE,
                 MPI_ANY_TAG, MPI_COMM_WORLD, &status);
Status
• The status parameter returns additional information for some MPI routines
– Additional error status information
– Additional information with wildcard parameters

• C declaration: a predefined struct
– MPI_Status status;

• Fortran declaration: an array is used instead
– INTEGER STATUS(MPI_STATUS_SIZE)
Accessing status information
• The tag of a received message
– C: status.MPI_TAG
– Fortran: STATUS(MPI_TAG)

• The source of a received message
– C: status.MPI_SOURCE
– Fortran: STATUS(MPI_SOURCE)

• The error code of the MPI call
– C: status.MPI_ERROR
– Fortran: STATUS(MPI_ERROR)
• Other uses...
MPI_Probe
• MPI_Probe allows incoming messages to be checked without actually receiving them.
– The user can then decide how to receive the data.
– Useful when different action needs to be taken depending on the "who, what, and how much" information of the message.
MPI_Probe
• C – int MPI_Probe(source, tag, comm, &status)
• Fortran – MPI_PROBE(SOURCE, TAG, COMM, STATUS, IERROR)
• Parameters
– Source: source rank, or MPI_ANY_SOURCE
– Tag: tag value, or MPI_ANY_TAG
– Comm: communicator
– Status: status object
MPI_Probe example (part 1)
! How to use probe and get_count
! to find the size of an incoming message
program probe_it
include 'mpif.h'
integer myid, numprocs
integer status(MPI_STATUS_SIZE)
integer mytag, icount, ierr, iray(10)

call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
mytag=123; iray=0; icount=0
if(myid .eq. 0)then
! Process 0 sends a message of size 5
   icount=5
   iray(1:icount)=1
   call MPI_SEND(iray, icount, MPI_INTEGER, &
                 1, mytag, MPI_COMM_WORLD, ierr)
endif
MPI_Probe example (part 2)
if(myid .eq. 1)then
! Process 1 uses probe and get_count to find the size
   call mpi_probe(0, mytag, MPI_COMM_WORLD, status, ierr)
   call mpi_get_count(status, MPI_INTEGER, icount, ierr)
   write(*,*)"getting ", icount, " values"
   call mpi_recv(iray, icount, MPI_INTEGER, 0, &
                 mytag, MPI_COMM_WORLD, status, ierr)
endif
write(*,*)iray
call mpi_finalize(ierr)
stop
end
![Page 57: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/57.jpg)
MPI_BARRIER
• Blocks the caller until all members in the communicator have called it.
• Used as a synchronization tool. • C
– MPI_Barrier(comm )
• Fortran – Call MPI_BARRIER(COMM, IERROR)
• Parameter – Comm: communicator (e.g. MPI_COMM_WORLD)
![Page 58: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/58.jpg)
Asynchronous Communication
• Asynchronous send: send call returns immediately, send actually occurs later
• Asynchronous receive: receive call returns immediately. When received data is needed, call a wait subroutine
• Asynchronous communication used in attempt to overlap communication with computation
• Can help prevent deadlock (though relying on it for that is not advised)
![Page 59: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/59.jpg)
Asynchronous Send with MPI_Isend
• C – MPI_Request request – int MPI_Isend(&buffer, count, datatype, dest, tag, comm, &request)
• Fortran – Integer REQUEST– MPI_ISEND(BUFFER,COUNT,DATATYPE, DEST, TAG, COMM, REQUEST,IERROR)
• Request is a new output parameter
• Don't change the data until the communication is complete
![Page 60: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/60.jpg)
Asynchronous Receive with MPI_Irecv
• C – MPI_Request request; – int MPI_Irecv(&buf, count, datatype, source, tag, comm, &request)
• Fortran – Integer request– MPI_IRECV(BUF, COUNT, DATATYPE, SOURCE, TAG,COMM, REQUEST,IERROR)
• Parameter Changes– Request: communication request– Status parameter is missing
• Don't use data until communication is complete
![Page 61: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/61.jpg)
MPI_Wait used to complete communication
• Request from Isend or Irecv is input
• The completion of a send operation indicates that the sender is now free to update the data in the send buffer
• The completion of a receive operation indicates that the receive buffer contains the received message
• MPI_Wait blocks until message specified by "request" completes
![Page 62: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/62.jpg)
MPI_Wait used to complete communication
• C – MPI_Request request;– MPI_Status status;– MPI_Wait(&request, &status)
• Fortran– Integer request– Integer status(MPI_STATUS_SIZE)– MPI_WAIT(REQUEST, STATUS, IERROR)
• MPI_Wait blocks until message specified by "request" completes
![Page 63: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/63.jpg)
MPI_Test
• Similar to MPI_Wait, but does not block
• Value of flag signifies whether a message has been delivered
• C
– int flag;
– int MPI_Test(&request, &flag, &status)
• Fortran
– LOGICAL FLAG
– MPI_TEST(REQUEST, FLAG, STATUS, IER)
![Page 64: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/64.jpg)
Non blocking send example
```fortran
      call MPI_Isend(buffer,count,datatype,dest, &
                     tag,comm,request,ierr)
10    continue
      ! Do other work ...
      call MPI_Test(request,flag,status,ierr)
      if (.not. flag) goto 10
```
![Page 65: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/65.jpg)
MPI Broadcast call: MPI_Bcast
• All nodes call MPI_Bcast
• One node (root) sends a message; all others receive the message
• C
– MPI_Bcast(&buffer, count, datatype, root, communicator);
• Fortran – call MPI_Bcast(buffer, count, datatype, root, communicator, ierr)
• Root is node that sends the message
![Page 66: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/66.jpg)
Scatter Operation using MPI_Scatter
• Similar to Broadcast, but sends a different section of an array to each processor
A(0) A(1) A(2) . . . A(N-1)
P0 P1 P2 . . . Pn-1Goes to processors:
Data in an array on root node:
![Page 67: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/67.jpg)
MPI_Scatter
• C – int MPI_Scatter(&sendbuf, sendcnts, sendtype, &recvbuf, recvcnts, recvtype, root, comm );
• Fortran – MPI_Scatter(sendbuf,sendcnts,sendtype, recvbuf,recvcnts,recvtype,root,comm,ierror)
• Parameters
– Sendbuf is an array of size (number of processors*sendcnts)
– Sendcnts: number of elements sent to each processor
– Recvcnts: number of elements obtained from the root processor
– Recvbuf: elements obtained from the root processor; may be an array
![Page 68: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/68.jpg)
Scatter Operation using MPI_Scatter
• Scatter with Sendcnts = 2
Data in an array on root node:
A(0) A(1) | A(2) A(3) | A(4) A(5) | . . . | A(2N-2) A(2N-1)
Goes to processors:
P0: B(0)=A(0), B(1)=A(1)
P1: B(0)=A(2), B(1)=A(3)
. . .
Pn-1: B(0)=A(2N-2), B(1)=A(2N-1)
![Page 69: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/69.jpg)
Gather Operation using MPI_Gather
• Used to collect data from all processors to the root, inverse of scatter
• Data is collected into an array on root processor
Data (a single value A) on the various processors:
P0 P1 P2 . . . Pn-1
Goes to an array on the root node:
A(0) A(1) A(2) . . . A(N-1)
![Page 70: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/70.jpg)
MPI_Gather
• C – int MPI_Gather(&sendbuf,sendcnts, sendtype, &recvbuf, recvcnts,recvtype,root, comm );
• Fortran – MPI_Gather(sendbuf,sendcnts,sendtype, recvbuf,recvcnts,recvtype,root,comm,ierror)
• Parameters– Sendcnts # of elements sent from each processor– Sendbuf is an array of size sendcnts– Recvcnts # of elements obtained from each processor– Recvbuf of size Recvcnts*number of processors
![Page 71: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/71.jpg)
Reduction Operations
• Used to combine partial results from all processors
• Result returned to root processor
• Several types of operations available
• Works on 1-D or 2-D arrays
![Page 72: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/72.jpg)
MPI routine is MPI_Reduce
• C – int MPI_Reduce(&sendbuf, &recvbuf, count, datatype, operation,root, communicator)
• Fortran – call MPI_Reduce(sendbuf, recvbuf, count, datatype, operation,root, communicator, ierr)
• Parameters– Like MPI_Bcast, a root is specified. – Operation is a type of mathematical operation
![Page 73: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/73.jpg)
Operations for MPI_Reduce

MPI_MAX     Maximum
MPI_MIN     Minimum
MPI_PROD    Product
MPI_SUM     Sum
MPI_LAND    Logical and
MPI_LOR     Logical or
MPI_LXOR    Logical exclusive or
MPI_BAND    Bitwise and
MPI_BOR     Bitwise or
MPI_BXOR    Bitwise exclusive or
MPI_MAXLOC  Maximum value and location
MPI_MINLOC  Minimum value and location
![Page 74: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/74.jpg)
Global Sum with MPI_Reduce

C:
```c
double sum_partial, sum_global;
sum_partial = ...;
ierr = MPI_Reduce(&sum_partial, &sum_global, 1,
                  MPI_DOUBLE, MPI_SUM,
                  root, MPI_COMM_WORLD);
```

Fortran:
```fortran
double precision sum_partial, sum_global
sum_partial = ...
call MPI_Reduce(sum_partial, sum_global, 1, &
                MPI_DOUBLE_PRECISION, MPI_SUM, &
                root, MPI_COMM_WORLD, ierr)
```

Note that the C version uses MPI_DOUBLE; MPI_DOUBLE_PRECISION is the corresponding Fortran datatype.
![Page 75: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/75.jpg)
Global Sum with MPI_Reduce and 2d array
Before the call, each node holds its own X(0:2):

        X(0)  X(1)  X(2)
NODE 0:  A0    B0    C0
NODE 1:  A1    B1    C1
NODE 2:  A2    B2    C2

After MPI_Reduce with MPI_SUM, the root node holds the element-wise sums:

        X(0)       X(1)       X(2)
NODE 0: A0+A1+A2   B0+B1+B2   C0+C1+C2
![Page 76: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/76.jpg)
All Gather and All Reduce
• Gather and Reduce come in an "ALL" variation
• Results are returned to all processors
• The root parameter is missing from the call
• Similar to a gather or reduce followed by a broadcast
![Page 77: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/77.jpg)
Global Sum with MPI_AllReduce and 2d array
Before the call, each node holds its own X(0:2):

        X(0)  X(1)  X(2)
NODE 0:  A0    B0    C0
NODE 1:  A1    B1    C1
NODE 2:  A2    B2    C2

After MPI_Allreduce with MPI_SUM, every node holds the element-wise sums:

        X(0)       X(1)       X(2)
NODE 0: A0+A1+A2   B0+B1+B2   C0+C1+C2
NODE 1: A0+A1+A2   B0+B1+B2   C0+C1+C2
NODE 2: A0+A1+A2   B0+B1+B2   C0+C1+C2
![Page 78: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/78.jpg)
All to All communication with MPI_Alltoall
• Each processor sends and receives data to/from all others
• C
– int MPI_Alltoall(&sendbuf, sendcnts, sendtype, &recvbuf, recvcnts, recvtype, comm);
• Fortran
– MPI_Alltoall(sendbuf, sendcnts, sendtype, recvbuf, recvcnts, recvtype, comm, ierror)
• Note: unlike the collectives above, MPI_Alltoall has no root parameter; every processor both sends and receives
![Page 79: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/79.jpg)
All to All with MPI_Alltoall
• Parameters– Sendcnts # of elements sent to each processor– Sendbuf is an array of size sendcnts– Recvcnts # of elements obtained from each processor– Recvbuf of size Recvcnts*number of processors
• Note that both the send buffer and the receive buffer must be arrays sized for the full number of processors
![Page 80: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/80.jpg)
The dreaded "V" (variable) operations
• A collection of very powerful but difficult-to-set-up global communication routines
• MPI_Gatherv: gather different amounts of data from each processor to the root processor
• MPI_Alltoallv: send and receive different amounts of data from all processors
• MPI_Allgatherv: gather different amounts of data from each processor and send all data to each
• MPI_Scatterv: send different amounts of data to each processor from the root processor
![Page 81: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/81.jpg)
Summary
• MPI is used to create parallel programs based on message passing
• Usually the same program is run on multiple processors
• The 6 basic calls in MPI are:
– MPI_INIT( ierr )
– MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
– MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
– MPI_Send(buffer, count, MPI_INTEGER, destination, tag, MPI_COMM_WORLD, ierr)
– MPI_Recv(buffer, count, MPI_INTEGER, source, tag, MPI_COMM_WORLD, status, ierr)
– call MPI_FINALIZE(ierr)
![Page 82: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/82.jpg)
Job Management inHPC Clusters
• Interactive and Batch Jobs
• Common Resource Management Systems: PBS, LSF, Condor
• Queues and Job Specification
• Sample PBS and LSF scripts
![Page 83: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/83.jpg)
Job Management inHPC Clusters
• Interactive and Batch Jobs
Interactive Mode is useful for debugging, performance tuning and profiling of applications
Batch Mode is for production runs
Most clusters are configured with a number of Queues for submitting and running batch jobs
Batch jobs are submitted via scripts specific to the resource management system deployed
![Page 84: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/84.jpg)
Example of Batch Queues

```shell
abose@nyx:~> qstat -q
server: nyx.engin.umich.edu

Queue    Memory  CPU Time  Walltime  Node  Run  Que  Lm  State
short    --      --        24:00:00  --      0  122  --  E R
vs       --      --        --        --      0    0  --  E R
staff    --      --        --        --      0    0  --  E R
atlas    --      --        30:00:00  4       0    0  --  E R
powell   --      --        --        --      3    0  --  E R
long     --      --        336:00:0  --     47   29  --  E R
landau   --      --        --        --     14    6  --  E R
debug    --      --        --        --      0    0  --  E R
route    --      --        --        --      0    0  --  E R
coe1     --      --        --        --     18    2  --  E R
```

• Can assign a maximum number of CPUs, maximum wall time, etc. to specific queues
![Page 85: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/85.jpg)
An Example PBS Script

```shell
#PBS -N pbstrial                   # job name
#PBS -l nodes=2,walltime=24:00:00  # number of CPUs, walltime
#PBS -S /bin/sh
#PBS -q npaci                      # queue name
#PBS -M [email protected]
#PBS -m abe                        # notification level request
#PBS -j oe
#
export GMPICONF=/home/abose/.gmpi/$PBS_JOBID
echo "I ran on `hostname`"
#
# cd to your execution directory first
cd ~
scp [email protected]:/home/abose/emin.cpp em
#
# use mpirun to run my MPI binary with 4 CPUs
mpirun -np 4 ~/a.out               # run executable
```
![Page 86: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/86.jpg)
An Example LSF Script

```shell
#!/bin/ksh
#
# LSF batch script to run the test MPI code
#
#BSUB -P 03152005          # Project Number
#BSUB -a mpich_gm          # select the mpich-gm elim
#BSUB -x                   # exclusive use of node (not_shared)
#BSUB -n 2                 # number of total tasks
#BSUB -R "span[ptile=1]"   # run 1 task per node
#BSUB -J mpilsf.test       # job name
#BSUB -o mpilsf.out        # output filename
#BSUB -e mpilsf.err        # error filename
#BSUB -q regular           # queue

# Fortran example
mpif90 -o mpi_samp_f mpisamp.f
mpirun.lsf ./mpi_samp_f

# C example
mpicc -o mpi_samp_c mpisamp.c
mpirun.lsf ./mpi_samp_c

# C++ example
mpicxx -o mpi_samp_cc mpisamp.cc
mpirun.lsf ./mpi_samp_cc   # run executable
```
![Page 87: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/87.jpg)
Application Checkpointing
• The ability of a code to restart from the point of last execution or interruption
• Very useful for timeshared systems such as clusters and Grids
• High-end systems provide checkpointing in the OS and hardware, but on clusters/desktops we have to checkpoint on our own
• You should checkpoint when:
(1) you have long-running jobs and usually run out of allocated queue time;
(2) you can restart the job from the last saved global state (not always possible, but quite common for time-dependent simulations);
(3) you want to protect yourself from hardware faults/lost CPU cycles
• How often should one checkpoint? Totally depends on the application and the user
![Page 88: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/88.jpg)
Application Checkpointing
• Find out how much time is left in a queue: use ctime/etime/dtime-type timer routines in the code periodically to check elapsed time. If close to the queue limit, write data to disk from all processes.
• An efficient way of writing checkpointed data: keep two different files, A and B. File A is the checkpointed data from the last write. First, write to file B. If the write is successful, then replace A with B.
• If you use the same file to write over and over again, a system crash during a write will lose all checkpointed data so far. You can also combine files from multiple processes into a single file; in that case, name process files with the process id attached (simplifies management). We will talk about this in the MPI presentation.
• How do you decide on a common checkpointed state among all processes?
![Page 89: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/89.jpg)
Application Checkpointing
• Distributed checkpointing: in the distributed/message-passing model, issue a broadcast to all processes to checkpoint the data at specific points in the code.
• Or use a distributed checkpointing algorithm when the program logic does not lend itself to the above. See, e.g., "A Survey of Checkpointing Algorithms for Parallel and Distributed Computers" by Kalaiselvi and Rajaraman (findable via a web search).
• Simpler "quick-n-dirty" methods based on token passing also work, although they may not be very robust (they work most of the time).
• Contact me at [email protected] if you wish to know more.
![Page 90: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/90.jpg)
A Bioinformatics Introduction to Cluster Computing part II
By Andrew D. Boyd, MDResearch Fellow
Michigan Center for Biological InformationDepartment of Psychiatry,
University of Michigan Health SystemAnd
Abhijit Bose, PhDAssociate Director
Michigan Grid Research and Infrastructure Development
![Page 91: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/91.jpg)
Computational Challenges
• Most scientific computations have both parallel and serial aspects
• Clusters tend to speed up the parallel aspect the most
• Serial computations limit the speedup achievable on a cluster
![Page 92: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/92.jpg)
Parallelize the computation of a large data set
• One can take the data set and break it up into smaller pieces
• Example: SETI@HOME
– Gave everyone a small piece of the radio frequency data to compute on a home computer
![Page 93: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/93.jpg)
Serial Computations
• Clusters are less cost-efficient for large interrelated problems that are hard to uncouple
• Less cost-efficient when the cost of communicating between the nodes exceeds the savings from distributing the computation load
• Example: Molecular Dynamics
– CHARMM and XPLOR: software programs developed for serial machines
– NAMD: designed to run efficiently on parallel machines for simulating large molecules; used Charm++, a parallel C++ library, to parallelize the code
![Page 94: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/94.jpg)
Large Genome vs Genome searches
• BLAT, BLAST, FASTA, SSAHA
– All perform sequence matching
– All differ in performance speed and sensitivity
– Some have higher memory requirements than others
• However, all are relatively easy to parallelize
– Divide up the list of sequences you are searching into smaller pieces
![Page 95: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/95.jpg)
A simple Bioinformatics Example
• Re-examine the labels of the Affymetrix probe set
• Desire to know matches with other genes within the organism
• Especially as UniGene changes every few months
– 500,000 probes to search against the database
![Page 96: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/96.jpg)
Time Savings of a cluster
• On a single processor, at 40 seconds per sequence it would take 231 days to process all 500,000 probes (or 57 days at 10 seconds per sequence)
• On biosys1 (an old cluster) the job took approximately 2 weeks
• On morpheus (a new cluster) the job took approximately 4 days
![Page 97: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/97.jpg)
One Cluster Computing Method
• Take the .5 million Affymetrix probe sequences and divide them into 90 files
• Submit multiple single BLAST executions, each with 5,500 sequences, to the queue
• Allow the scheduler to dynamically allocate the jobs to the unused nodes
• Easy to code in Perl
– Will walk through the code later
![Page 98: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/98.jpg)
Another cluster computing method for sequences: mpiBLAST
• One concern is the memory of the computers compared to the size of the database
• One could break the database into smaller pieces and have the .5 million Affymetrix probes search against those smaller pieces on the nodes of a cluster
![Page 99: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/99.jpg)
mpiBLAST
• Uses the MPI libraries to pass messages between the nodes
• One master node and many slave nodes
• Each slave node is assigned a part of the database
• If there are more database parts than nodes, the master node dynamically allocates the database pieces
• At the end, the statistics are recalculated for the complete database
![Page 100: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/100.jpg)
mpiBLAST
• After initial testing, the program did not scale beyond 10 nodes
• Poor reliability in completion of BLAST runs
• Early versions had very little error checking
– If one node had an error, the complete run was lost
– If a message between nodes was lost, the complete run was lost
– Poor error exits: if an error came up, mpiBLAST would never finish and had to be manually killed
![Page 101: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/101.jpg)
Scalability of mpiBLAST
[Chart: run time in seconds (0 to 25,000) versus number of processors (0 to 30).]
![Page 102: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/102.jpg)
Another Parallel bioinformatics example
• InterProScan
– Developed by the European Bioinformatics Institute (EBI)
– Performs 11 protein domain comparisons for an amino acid sequence and creates a single report
– The code is all developed in Perl
– Like BLAST, sequences can be submitted via the EBI web site, but for larger numbers of sequences the application software can be downloaded
![Page 103: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/103.jpg)
InterProScan
• Their approach to the solution was to take the input sequence and divide it into chunks.
• Then each of the 11 programs has a script configured to run on a specific chunk
• A separate script maintains how many processes are being submitted to the cluster and as one finishes another is submitted
![Page 104: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/104.jpg)
InterProScan
• Assumptions built into the model
– The number of nodes you will be running on will be the same and dedicated to you for the whole run
– InterProScan takes over the job of the cluster scheduler
![Page 105: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/105.jpg)
Sample Code to divide an input file and execute multiple instances of BLAST
• 3 Perl scripts
– Hgdivide.pl
– demomaker.pl
– multiqsub.pl
• Could be a single script if truly desired
• Could build more error checking into the programs as well
![Page 106: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/106.jpg)
Hgdivide.pl

```perl
#!/usr/bin/perl -w
# divide the affy probe set into files of 100 lines per file

open(IN, "<demoinput") || die("can't open input file");   # open input file
$j = 1;                                   # index for file number
open(OUT, ">demoin".$j."") || die("can't open out file");
$k = 101;                                 # one more than the lines per file
$i = 1;                                   # line number
while ($line = <IN>) {                    # read a line of the input file
    if ($i < $k) { print OUT $line }
    # print the line if we haven't reached the end of the new input file
    if ($i >= $k) { $i = 1; ++$j; open(OUT, ">demoin".$j.""); print OUT $line }
    # at the end of a file: reset the index, bump the file number,
    # open a new file, and write the current line to it
    ++$i;
}
close(OUT);
close(IN);
```
![Page 107: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/107.jpg)
demomaker.pl• @num = (1..3); #number of scripts you want to make• foreach $chrom (@num) {• open(OUT, ">run".$chrom.".sh") || die "can't open file";• # script file name• print OUT ("#!/bin/tcsh\n");• # chose shell to run• print OUT ("cd affy\n");• # change directory to working directory• $input_file = "~/affy/input/demoin".$chrom."";• # input file and location• $input = "demoin".$chrom."";• # name of input file• $output_file = "~/affy/output/demo".$chrom.".out";• # output file for results of blast• $output2_file = "~/affy/output/demo".$chrom.".out2";• # maintence of output file• $scratch_dir = "/scratch";• # name of scratch directory on cluster• print OUT ("date > ".$output2_file."\n");• # print date to maintence file• print OUT ("hostname >> ".$output2_file."\n");• # print cluster node number to node file• print OUT ("/bin/cp -f Hs.seq.all.* ".$scratch_dir."\n");• # copy database to local node• print OUT ("echo 'db copied into scratch' >> ".$output2_file."\n");• # print status of copy to maintence file
![Page 108: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/108.jpg)
demomaker.pl part 2

```perl
    print OUT ("/bin/cp -f ".$input_file." ".$scratch_dir."\n");            # copy the input file to the local node
    print OUT ("echo 'input copied into scratch' >> ".$output2_file."\n");  # log the copy status
    print OUT ("date >> ".$output2_file."\n");       # log the time copying finished
    print OUT ("cd ".$scratch_dir."\n");             # move into the node's scratch directory
    $cmd_line = "~/affy/blastall -i ".$input." -o ".$output_file." -p blastn -d Hs.seq.all -e 0.0001 -a 2 -U T";  # command line to execute
    print OUT ("echo '$cmd_line' >> ".$output2_file."\n");  # log the command line to be executed
    print OUT ($cmd_line."\n");                      # run the command line
    print OUT ("echo Blastall finished >> ".$output2_file."\n");            # log that the BLAST run finished
    print OUT ("/bin/rm -f Hs.seq.all.*\n");         # remove the database from the node
    print OUT ("echo 'db removed from scratch' >> ".$output2_file."\n");    # log the database removal
    print OUT ("/bin/rm -f ".$input."\n");           # remove the input file from the node
    print OUT ("echo 'input removed from scratch' >> ".$output2_file."\n"); # log the input removal
    print OUT ("date >> ".$output2_file."\n");       # log the finish time
}
close(OUT);
```
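Tracing through that loop, the first script demomaker.pl generates (run1.sh) comes out like this. This listing is reconstructed from the print statements above, not shown in the original deck; the paths and blastall options are exactly as in the slide.

```tcsh
#!/bin/tcsh
cd affy
date > ~/affy/output/demo1.out2
hostname >> ~/affy/output/demo1.out2
/bin/cp -f Hs.seq.all.* /scratch
echo 'db copied into scratch' >> ~/affy/output/demo1.out2
/bin/cp -f ~/affy/input/demoin1 /scratch
echo 'input copied into scratch' >> ~/affy/output/demo1.out2
date >> ~/affy/output/demo1.out2
cd /scratch
echo '~/affy/blastall -i demoin1 -o ~/affy/output/demo1.out -p blastn -d Hs.seq.all -e 0.0001 -a 2 -U T' >> ~/affy/output/demo1.out2
~/affy/blastall -i demoin1 -o ~/affy/output/demo1.out -p blastn -d Hs.seq.all -e 0.0001 -a 2 -U T
echo Blastall finished >> ~/affy/output/demo1.out2
/bin/rm -f Hs.seq.all.*
echo 'db removed from scratch' >> ~/affy/output/demo1.out2
/bin/rm -f demoin1
echo 'input removed from scratch' >> ~/affy/output/demo1.out2
date >> ~/affy/output/demo1.out2
```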
![Page 109: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/109.jpg)
multiqsub.pl
```perl
#!/usr/bin/perl -w
# system calls to submit each script to the cluster

$i = 1;                             # index
while ($i < 4) {
    system("qsub run".$i.".sh");    # system call for each script
    ++$i;
}
```
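The same submission loop can be sketched in Python with a check on each submission's exit status, which the Perl version skips. This is a hypothetical variant, not from the deck; qsub is the submit command used in the slides, so substitute your scheduler's command if it differs.

```python
import subprocess

# Submit run1.sh .. runN.sh to the scheduler, stopping at the first
# failed submission instead of silently continuing.

def submit_jobs(n_jobs, submit_cmd="qsub"):
    submitted = []
    for i in range(1, n_jobs + 1):
        script = "run%d.sh" % i
        result = subprocess.run([submit_cmd, script])
        if result.returncode != 0:          # submission failed: stop early
            print("submission of %s failed" % script)
            break
        submitted.append(script)
    return submitted
```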
![Page 110: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/110.jpg)
The joy of cluster computing
• This is the stone wall you may encounter while working on a cluster
• This is a “feature”, not necessarily a bug
![Page 111: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/111.jpg)
How software is maintained on a cluster
• Many clusters will allow you to install any software you desire
• Some clusters pre-install common software such as BLAST or InterProScan, and even databases
– If multiple people are using the same database, it is a waste of storage to have it installed multiple times
![Page 112: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/112.jpg)
Bring your own Software model
• Find out what libraries and compilers are supported on the cluster
– Not everyone uses gcc
– The latest version of the compiler you need may not be installed
• Find out the name of the scratch space on the nodes
• Scratch space is the area of a node you can copy files to
• Usually labeled /scratch
![Page 113: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/113.jpg)
Software previously installed
• Find out what directory the software is located in
• Find out what version is being run
• Find out if the software is on a drive that can be mounted on the compute nodes
– Warning! Not all drives and directories visible from the head node are mountable on the compute nodes
![Page 114: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/114.jpg)
The importance of benchmarking
• Before submitting jobs to a cluster
– First, run a fraction of the job on the head node and try to estimate how long the complete job will run
– The program will probably not scale linearly with each node added
– However, if the individual executions will take longer than 24 hours, you will need to write a module to checkpoint your program
• Assume a direct relationship between how long a program will run and the likelihood of the cluster failing during the run
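A rough version of that estimate can be sketched in Python. The helper names are hypothetical, the 24-hour threshold is the one mentioned above, and linear scaling with input size is only an assumption; BLAST runtimes, for example, vary with sequence content.

```python
# Time a small fraction of the work, then extrapolate to the full job,
# assuming the runtime scales roughly linearly with the number of items.

def estimate_total_hours(sample_seconds, sample_items, total_items):
    per_item = sample_seconds / sample_items
    return per_item * total_items / 3600.0

def needs_checkpointing(total_hours, limit_hours=24.0):
    # The longer a job runs, the more likely the cluster fails mid-run.
    return total_hours > limit_hours

# e.g. 100 probes took 90 seconds; the full set is 100,000 probes,
# giving roughly 25 hours, so checkpointing would be needed
hours = estimate_total_hours(90.0, 100, 100000)
```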
![Page 115: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/115.jpg)
The importance of troubleshooting

• Some steps to take if the program does not work on a cluster
– First, run a small job on the head node
– Second, if allowed, consider running in interactive mode to see if errors are generated
• Interactive mode allows you to type command-line instructions on the individual node where the program is executed
• You can also read the error output directly
• Just because a program runs on the head node does not mean it will run on a compute node
• If it is not performing normally, contact your friendly system administrator
![Page 116: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/116.jpg)
The importance of troubleshooting part II
– Third, submit one job to the scheduler
• The scheduler should not modify the code, but if the commands to the scheduler are not correct the job will not execute properly
• Just because a program runs on the head node and on a compute node does not mean it will run after being assigned by the scheduler
• If it is not performing normally, contact your friendly system administrator
– Fourth, submit multiple jobs to the scheduler
• Just because a program runs on the head node, on a compute node, and through the scheduler does not mean it will run on every single node of the cluster
• While every node is supposed to be identical, sometimes they are different
![Page 117: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/117.jpg)
Information to bring to a System Administrator when problems arise

• What program/script were you trying to use?
• What command-line parameters did you use?
• When did this happen?
• What node/computer did this happen on?
• What error message, if any, did you receive?
• If you have thoughts or clues about the problem, pass them on to the sysadmin
![Page 118: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/118.jpg)
Errors that I have seen on clusters
• Software not properly installed by the system administrators on drives that can be mounted on the nodes
– Result: no output from the program
– Troubleshooting: called the system administrator; the error was being reported to a node that was inaccessible to the user
• Scratch space permissions were changed by a previous user, and the node would not function
– Result: since that node finished first, all subsequent jobs were sent to it and errored out; found out Monday morning that only half of the output had been generated
– Troubleshooting: traced all of the uncompleted jobs to a single node through the error output file
![Page 119: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/119.jpg)
Errors that I have seen on clusters part II
• Compiler version supported on the cluster was not the version the software was written for
– Result: the program failed to compile
– Troubleshooting: emailed the author of the program and tried to install the appropriate compiler; ended up using binaries built for the operating system
• Home storage disk space full; bioinformatics applications tend to take up more disk space than other applications
– Result: output from the last twenty jobs was missing
– Troubleshooting: checked whether the failed runs came from a single node, then looked at disk space allocation; other users were complaining as well
• Programs errored out
– Result: output from jobs stopped midway through the run
– Troubleshooting: checked how long the program had been running (5 days); a brief calculation on the input file showed a single job would have taken 30 days to finish; no benchmarking had been done, so the experiment was re-evaluated
![Page 120: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/120.jpg)
Additional Information
• Vi short command reference:– http://linux.dbw.org/vi_short_ref.html
• BLAST– http://www.ncbi.nlm.nih.gov/BLAST/
• Sun Grid Engine sites– http://pbpl.physics.ucla.edu/Computing/
Beowulf_Cluster/Sun_Grid_Engine/
![Page 121: A Bioinformatics Introduction to Cluster Computing part I By Andrew D. Boyd, MD Research Fellow Michigan Center for Biological Information Department of](https://reader035.vdocuments.us/reader035/viewer/2022062516/56649e6b5503460f94b6a0f9/html5/thumbnails/121.jpg)
Acknowledgements

• Brian Athey, Associate Professor
• Abhijit Bose, Associate Director MGRID
• Georgi Kostov, System Administrator
• Chris Bliton, MCBI project manager
• Fan Meng, Assistant Research Scientist
• Joe Landman, Scalable Informatics
• Jeff Ogden, IT project manager
• Paul Trombley, graphic artist
• Tom Hacker, Associate Director, Indiana Univ.
• NPACI and SDSC