Introduction to MPI

High Performance Computing @ Louisiana State University - http://www.hpc.lsu.edu/
Information Technology Services
Alan Scheinine, IT Consultant, HPC Tutorial, Sept. 2008

Page 1

Introduction to MPI

Alan L. Scheinine
IT Consultant
HPC @ LSU

September 17, 2008

Page 2

Repository for this tutorial: http://www.cct.lsu.edu/~scheinin/MPI_Tutorial/

Index of material in repository: index.php
These slides: HPC_MPI_Fall2008.pdf
Detailed information: MPIverbose-v1.0.pdf
Examples: Exercises/

Page 3

Outline

Overview of Message Passing
Point-to-point Communication
Collective Communication
Exercises

Page 4

What is message passing?

Parallel programming can be grouped into two categories: shared memory and message passing. Shared memory is limited to the processes on one board. Message passing (distributed memory) can be used with hundreds to thousands of processes. To communicate, both sender and receiver must call the appropriate routine, though MPI-2 also implements one-sided communication.

Page 5

Overview of MPI

A widely-used standard.
Coarse-grained parallelism. MIMD/SPMD.
Initially, copies of the same executable are started.
A common practice is for each executable to flow through the same code, doing a test on the process rank in a few places.

Page 6

SIMD, MIMD and SPMD

The most common distinction between programming models is Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data (MIMD). SIMD is used for some specialized attached processors and for vector computers. Though MPI is MIMD, most often all processes run the same program applied to different subsets of the data, hence Single Program Multiple Data (SPMD).

Page 7

Sources of Documentation and Software

http://www.mpi-forum.org/docs/docs.html, specifications for MPI 1 & 2
http://www.redbooks.ibm.com/redbooks/pdfs/sg245380.pdf, RS/6000 SP: Practical MPI Programming, Aoyama and Nakano
http://www-unix.mcs.anl.gov/mpi/, links to tutorials & papers
http://www-unix.mcs.anl.gov/mpi/tutorial/index.html, list of tutorials
Using MPI, by W. Gropp, E. Lusk and A. Skjellum
Using MPI-2, by W. Gropp, E. Lusk and R. Thakur
MPI: The Complete Reference, The MPI-2 Extensions, Gropp et al.
Parallel Programming With MPI, by Peter S. Pacheco
MPICH2, http://www.mcs.anl.gov/research/projects/mpich2/
OpenMPI, http://www.open-mpi.org/software/

Page 8

Minimal Program

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
   int rank;
   MPI_Init( &argc, &argv );
   MPI_Comm_rank( MPI_COMM_WORLD, &rank );
   if( rank == 0 ) fprintf(stdout, "Starting program\n");
   fprintf(stdout, "Process %d running\n", rank );
   MPI_Finalize();
   return 0;
}

Page 9

Send and Recv must be paired

Page 10

Blocking Send and Receive

int MPI_Send(void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
int MPI_Recv(void *buf, int count, MPI_Datatype type,
             int source, int tag, MPI_Comm comm,
             MPI_Status *stat)

The program can change buf when the call returns. The count in recv must be at least as large as the count in send. The actual message size can be found using

int MPI_Get_count(MPI_Status *stat, MPI_Datatype type,
                  int *count)

Wildcards: MPI_ANY_SOURCE and MPI_ANY_TAG

Page 11

Example send-recv pair in C

int itag = 1;
int imsg[100];
if (my_id == 0) {
   ierr = MPI_Send(imsg, 100, MPI_INT, 1, itag,
                   MPI_COMM_WORLD);
}
else if (my_id == 1) {
   ierr = MPI_Recv(imsg, 100, MPI_INT, 0, itag,
                   MPI_COMM_WORLD, &status);
}

Page 12

Buffered Send

A buffered send returns as soon as a copy is made in a local buffer. Buffer space is allocated with

int size = 10000;
int stat;
char* buf = (char*) malloc(size);
stat = MPI_Buffer_attach(buf, size);
// When no longer needed
stat = MPI_Buffer_detach(&buf, &size);
free(buf);

Page 13

Non-blocking Send and Recv

int MPI_Isend(void *buf, int count,
  MPI_Datatype type, int dest, int tag,
  MPI_Comm comm, MPI_Request *request)
int MPI_Ibsend(  ibid  )
int MPI_Irecv(void *buf, int count,
  MPI_Datatype type, int source, int tag,
  MPI_Comm comm, MPI_Request *request)
int MPI_Test(MPI_Request *request, int *flag,
  MPI_Status *status)
int MPI_Wait(MPI_Request *request,
  MPI_Status *status)

Page 14

Collective Communication

Communication is done between members of a communicator group, described in Part 2 of this tutorial. The size of the message must be the same on sender and receiver. The buffer can be reused when the call returns, but return does not imply that globally the data exchange is complete. The special process of a scatter, gather or reduce is called the "root".

Page 15

Collective Communication Routines

• Barrier sync blocks until all group members have called it.
• Broadcast same array from one to all.
• Gather arrays from all to one.
• Scatter different arrays from one to all.
• Alltoall, different array of data for each possible exchange.
• Global reduction and scan.

Page 16

Collective Communication Details

int MPI_Barrier(MPI_Comm comm)

blocks until all members of comm have called it.

int MPI_Bcast(void *buf, int count,
              MPI_Datatype type,
              int root, MPI_Comm comm)

count elements of array buf at root go to buf of all other members.

Page 17

Collective Communication Details, continued

MPI_Gather(void *sendbuf, int sendcnt, MPI_Datatype,
           void *recvbuf, int recvcnt, MPI_Datatype,
           int root, MPI_Comm comm)
MPI_Gatherv(void *sendbuf, int sendcnt, MPI_Datatype,
            void *recvbuf, int *recvcnts, MPI_Datatype,
            int *displ, int root, MPI_Comm comm)

Page 18

Collective Communication Details, continued

MPI_Scatter(void *sendbuf, int sendcnt, MPI_Datatype,
            void *recvbuf, int recvcnt, MPI_Datatype,
            int root, MPI_Comm comm)
MPI_Scatterv(void *sendbuf, int *sendcnts,
             int *displ, MPI_Datatype,
             void *recvbuf, int recvcnt, MPI_Datatype,
             int root, MPI_Comm comm)

Page 19

Collective Communication Details, continued

MPI_Alltoall(void *sendbuf, int sendcnt, MPI_Datatype,
             void *recvbuf, int recvcnt, MPI_Datatype,
             MPI_Comm comm)
MPI_Alltoallv(void *sendbuf, int *sendcnts,
              int *send_displs, MPI_Datatype,
              void *recvbuf, int *recvcnts,
              int *recv_displs, MPI_Datatype,
              MPI_Comm comm)

A segment of the send buffer goes from each process to each process.

Page 20

Collective Communication Details, continued

The predefined reduction operations over all processes:

MPI_MAX      maximum
MPI_MIN      minimum
MPI_SUM      sum
MPI_PROD     product
MPI_LAND     logical and
MPI_BAND     bit-wise and
MPI_LOR      logical or
MPI_BOR      bit-wise or
MPI_LXOR     logical xor
MPI_BXOR     bit-wise xor
MPI_MAXLOC   max value and location
MPI_MINLOC   min value and location

Page 21

Exercises

http://www.cct.lsu.edu/~scheinin/MPI_Tutorial/Exercises

To get started, see the subdirectory PI3, which has a Makefile for compiling and scripts to submit a job on Linux systems such as the HPC machine tezpur or the LONI machines. (See http://www.loni.org/systems/ for a list.)

Page 22

MPI, Part 2

Alan L. Scheinine
IT Consultant
HPC @ LSU

September 24, 2008

Page 23

Repository for this tutorial: http://www.cct.lsu.edu/~scheinin/MPI_Tutorial/

Index of material in repository: index.php
These slides: HPC_MPI_Fall2008.pdf
Detailed information: MPIverbose-v1.0.pdf
Examples (tested on tezpur): Exercises/

Page 24

Outline

Groups and Communicators
One-sided Communications

Page 25

Groups and Communicators

• A subset of processes can be assigned to a group.
• From the perspective of the group, the processes have ranks 0 to (group size - 1).
• group -> communicator
• Default communicator: MPI_COMM_WORLD

Page 26

Groups and Communicators, continued

There are many ways to create groups; a simple way is the following. Get the group from the world communicator, then the array ranks selects the processes to include.

int MPI_Comm_group(MPI_Comm comm,
                   MPI_Group *group)
int MPI_Group_incl(MPI_Group group, int n,
                   int *ranks, MPI_Group *newgroup)

Page 27

Groups and Communicators, continued

Groups can be constructed from other groups using MPI_Group_union, MPI_Group_intersection and MPI_Group_difference. A communicator is derived using

int MPI_Comm_create(MPI_Comm comm,
   MPI_Group group, MPI_Comm *newcomm)

where group has been derived from the group associated with comm.

Page 28

Groups and Communicators, continued

Point-to-point operations and collective operations using a user-defined communicator will not interfere with operations using a different communicator.

Page 29

One-Sided Communications

Standard send/receive communication requires matching operations by sender and receiver. In order to issue the matching operations, an application needs to distribute the transfer parameters. In contrast, the use of Remote Memory Access (RMA) communication mechanisms avoids the need for global computations or explicit polling.

Page 30

One-Sided Communications, continued

RMA allows one process to specify all communication parameters for origin and target; however, the target is involved in synchronization for "active target" communication but not for "passive target" communication. RMA does not use a shared memory address space and does not require matching send and recv.

Page 31

One-Sided Communications, continued

Buffer space is assigned to a window using create.
Communication uses put, get and accumulate.
Synchronization for active target uses fence, start, complete, post and wait.
Passive target sync uses lock and unlock.
Communication calls are nonblocking.
A synchronization call is needed to ensure that a set of communication calls (during an epoch) completes.

Page 32

One-Sided Communications, continued

int MPI_Win_create(void *base, MPI_Aint size,
                   int disp_unit, MPI_Info info,
                   MPI_Comm comm, MPI_Win *win)

The argument win is a collection of windows, zero or one on each process in comm.  Communication calls made at the origin use any local buffer but use these predefined windows on the target.

Page 33

One-Sided Communications, continued

Communication routines:

int MPI_Put(void *origin_addr,
  int origin_count, MPI_Datatype origin_type,
  int target_rank, MPI_Aint target_disp,
  int target_count, MPI_Datatype target_type,
  MPI_Win win)
int MPI_Get( ibid )
int MPI_Accumulate( ibid plus MPI_Op op )

Page 34

One-Sided Communications, continued

An array origin_addr on the process that makes the call (the origin) moves origin_count elements to the window on the target defined by the pair target_rank, win. The target does not call a routine. On the target, the number of elements moved is target_count. The target window initial position, target_addr, is given by the window base address + (disp_unit x target_disp), where disp_unit was defined in the create routine. Allowed accumulate operations are the built-in operations available for MPI_Reduce.

Page 35

One-Sided Communications, continued

For active target, the synchronization routines are

int MPI_Win_fence(int assert, MPI_Win win)
int MPI_Win_start(MPI_Group group,
                  int assert, MPI_Win win)
int MPI_Win_complete(MPI_Win win)
int MPI_Win_post(MPI_Group group,
                 int assert, MPI_Win win)
int MPI_Win_wait(MPI_Win win)
int MPI_Win_test(MPI_Win win, int *flag)

Page 36

One-Sided Communications, continued

The target does not call communication routines but it does call synchronization routines: post to start an exposure epoch and wait to close an exposure epoch, the latter returning only after all communication has completed. The origin calls start before calling a set of communication routines, starting an access epoch. The origin finishes an access epoch by calling complete. The group in start and post is a subset of the window and can be different on each process. At an origin, the group specifies the targets; and at a target, the group specifies the origins.

Page 37

One-Sided Communications, continued

The routine fence combines the four synchronization commands: no subset group is specified, and fence both starts and ends an epoch.

The test command returns immediately with flag = false if wait would have blocked waiting for communication to complete.

The argument “assert” is implementation dependent and can be set to zero to indicate that no guarantees are made.

Page 38

One-Sided Communications, continued

int MPI_Win_lock(int lock_type, int rank,
                 int assert, MPI_Win win)
int MPI_Win_unlock(int rank, MPI_Win win)

These routines are used for passive target synchronization and are called by the origin. The target "rank" does not need to make any synchronization calls. If two processes acting as RMA origins are sharing a window at target rank, the origins may need synchronization between themselves to correctly order the usage of the target.

Page 39

Exercise

http://www.cct.lsu.edu/~scheinin/MPI_Tutorial/Exercises/Jacobi/