Message Passing Interface


Page 1: Message passing interface

MPI Communications

Point to Point

Collective Communication

Data Packaging

Page 2: Message passing interface

Point-to-Point Communication: Send and Receive

• MPI_Send/MPI_Recv provide point-to-point communication
– synchronization protocol is not fully specified

• what are the possibilities?

Page 3: Message passing interface

Send and Receive Synchronization

• Fully Synchronized (Rendezvous)
– Send and Receive complete simultaneously
• whichever code reaches the Send/Receive first waits
– provides synchronization point (up to network delays)

• Buffered
– Receive must wait until message is received
– Send completes when message is moved to buffer, clearing memory of message for reuse

Page 4: Message passing interface

Send and Receive Synchronization

• Asynchronous
– Sending process may proceed immediately
• does not need to wait until message is copied to buffer
• must check for completion before using message memory
– Receiving process may proceed immediately
• will not have message to use until it is received
• must check for completion before using message

Page 5: Message passing interface

MPI Send and Receive

• MPI_Send/MPI_Recv are synchronous, but buffering is unspecified
– MPI_Recv suspends until message is received
– MPI_Send may be fully synchronous or may be buffered
• implementation dependent

• Variations allow synchronous or buffered behavior to be specified explicitly
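A minimal sketch (not part of the original slides) of the blocking calls above: rank 0 sends four ints to rank 1; the tag 0 and the message contents are arbitrary illustration values.

/* Blocking point-to-point example: rank 0 sends, rank 1 receives. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, data[4] = {1, 2, 3, 4};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* MPI_Send returns once 'data' may be reused; whether it also
           waits for the matching Recv is implementation dependent. */
        MPI_Send(data, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        /* MPI_Recv suspends until the message has arrived. */
        MPI_Recv(data, 4, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d %d %d %d\n",
               data[0], data[1], data[2], data[3]);
    }

    MPI_Finalize();
    return 0;
}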

Page 6: Message passing interface

Asynchronous Send and Receive

• MPI_Isend() / MPI_Irecv() are non-blocking. Control returns to the program after the call is made.

• Syntax is the same as for Send and Recv, except an MPI_Request* parameter is added to Isend and replaces the MPI_Status* parameter for Irecv.

Page 7: Message passing interface

Detecting Completion

• MPI_Wait(&request, &status)
– request matches request on Isend or Irecv
– status returns status equivalent to status for Recv when complete
– Blocks for send until message is buffered or sent, so message variable is free
– Blocks for receive until message is received and ready
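A sketch of the non-blocking calls completed with MPI_Wait, assuming at least two ranks; the work comments mark where computation could overlap the transfer.

/* Non-blocking send/receive finished with MPI_Wait. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* ... useful work that does not touch 'value' ... */
        MPI_Wait(&request, &status);   /* now 'value' may be reused */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        /* ... useful work that does not read 'value' ... */
        MPI_Wait(&request, &status);   /* now 'value' holds the message */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}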

Page 8: Message passing interface

Detecting Completion

• MPI_Test(&request, &flag, &status)
– request, status as for MPI_Wait
– does not block
– flag indicates whether message is sent/received
– enables code which can repeatedly check for communication completion
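A sketch of polling with MPI_Test so the receiver keeps computing while the message is in flight (assumes at least two ranks; the work counter is just a stand-in for real computation).

/* Polling with MPI_Test: overlap computation with an outstanding receive. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value = 7, work = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int flag = 0;
        MPI_Request request;
        MPI_Status status;
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        while (!flag) {
            work++;                              /* stand-in for useful work */
            MPI_Test(&request, &flag, &status);  /* non-blocking check */
        }
        printf("got %d after %d work steps\n", value, work);
    }

    MPI_Finalize();
    return 0;
}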

Page 9: Message passing interface

Collective Communications

• One to Many (Broadcast, Scatter)

• Many to One (Reduce, Gather)

• Many to Many (All Reduce, Allgather)

Page 10: Message passing interface

Broadcast

• A selected processor sends to all other processors in the communicator

• Any type of message can be sent

• Size of message should be known by all (it could be broadcast first)

• Can be optimized within system for any given architecture

Page 11: Message passing interface

MPI_Bcast() Syntax

MPI_Bcast(mess, count, MPI_INT, root, MPI_COMM_WORLD);

mess            pointer to message buffer
count           number of items sent
MPI_INT         type of items sent
                Note: count and type should be the same on all processors
root            sending processor
MPI_COMM_WORLD  communicator within which broadcast takes place
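A sketch of the call above: the root fills mess[], then every rank calls MPI_Bcast with the same count, type, and root.

/* Broadcast example: rank 0 is the root. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, root = 0;
    int mess[3] = {0, 0, 0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == root) { mess[0] = 10; mess[1] = 20; mess[2] = 30; }

    /* after this call every rank's mess[] holds the root's values */
    MPI_Bcast(mess, 3, MPI_INT, root, MPI_COMM_WORLD);

    printf("rank %d has %d %d %d\n", rank, mess[0], mess[1], mess[2]);
    MPI_Finalize();
    return 0;
}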

Page 12: Message passing interface

MPI_Barrier()

MPI_Barrier(MPI_COMM_WORLD);

MPI_COMM_WORLD  communicator within which barrier takes place

Provides barrier synchronization without the message of a broadcast.
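One common use, sketched here: line the ranks up before and after a timed phase (MPI_Wtime is the standard MPI wall-clock timer).

/* Barrier example: synchronize ranks around a timed section. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    MPI_Barrier(MPI_COMM_WORLD);      /* all ranks line up here */
    double t0 = MPI_Wtime();
    /* ... work being timed ... */
    MPI_Barrier(MPI_COMM_WORLD);      /* wait for the slowest rank */
    double elapsed = MPI_Wtime() - t0;

    printf("elapsed: %f s\n", elapsed);
    MPI_Finalize();
    return 0;
}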

Page 13: Message passing interface

Reduce

• All Processors send to a single processor, the reverse of broadcast

• Information must be combined at receiver

• Several combining functions available
– MAX, MIN, SUM, PROD, LAND, BAND, LOR, BOR, LXOR, BXOR, MAXLOC, MINLOC

Page 14: Message passing interface

MPI_Reduce() syntax

MPI_Reduce(&dataIn, &result, count, MPI_DOUBLE, MPI_SUM, root, MPI_COMM_WORLD);

dataIn          data sent from each processor
result          stores result of combining operation
count           number of items in each of dataIn, result
MPI_DOUBLE      data type for dataIn, result
MPI_SUM         combining operation
root            rank of processor receiving data
MPI_COMM_WORLD  communicator

Page 15: Message passing interface

MPI_Reduce()

• Data and result may be arrays -- combining operation applied element-by-element

• Illegal to alias dataIn and result
– avoids large overhead in function definition
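A sketch combining the last two slides: each rank contributes a 3-element array, the sums are formed element-by-element on the root, and dataIn and result are kept as separate buffers.

/* Reduce example: element-by-element sum of arrays onto the root. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, root = 0;
    double dataIn[3], result[3];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    dataIn[0] = rank;  dataIn[1] = 2.0 * rank;  dataIn[2] = 3.0 * rank;

    MPI_Reduce(dataIn, result, 3, MPI_DOUBLE, MPI_SUM, root,
               MPI_COMM_WORLD);

    if (rank == root)
        printf("sums: %f %f %f\n", result[0], result[1], result[2]);

    MPI_Finalize();
    return 0;
}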

Page 16: Message passing interface

MPI_Scatter()

• Spreads array to all processors

• Source is an array on the sending processor

• Each receiver, including sender, gets a piece of the array corresponding to their rank in the communicator

Page 17: Message passing interface

MPI_Gather()

• Opposite of Scatter

• Values on all processors (in the communicator) are collected into an array on the receiver

• Array locations correspond to ranks of processors
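A sketch of Scatter followed by Gather; it assumes the program is run with exactly 4 ranks so that one element of the source array goes to each rank.

/* Scatter/Gather example: root spreads one int per rank, then collects. */
#include <mpi.h>
#include <stdio.h>

#define NPROCS 4

int main(int argc, char *argv[]) {
    int rank, size, piece, root = 0;
    int source[NPROCS] = {10, 20, 30, 40};
    int collected[NPROCS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != NPROCS) {               /* this sketch expects 4 ranks */
        if (rank == 0) printf("run with %d ranks\n", NPROCS);
        MPI_Finalize();
        return 1;
    }

    /* each rank (including the root) receives source[rank] */
    MPI_Scatter(source, 1, MPI_INT, &piece, 1, MPI_INT, root,
                MPI_COMM_WORLD);

    piece += rank;                      /* do something with the local piece */

    /* collected[i] on the root holds the piece sent back by rank i */
    MPI_Gather(&piece, 1, MPI_INT, collected, 1, MPI_INT, root,
               MPI_COMM_WORLD);

    if (rank == root)
        printf("%d %d %d %d\n", collected[0], collected[1],
               collected[2], collected[3]);

    MPI_Finalize();
    return 0;
}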

Page 18: Message passing interface

Collective Communications, under the hood

Page 19: Message passing interface

Many to Many Communications

• MPI_Allreduce
– Syntax like reduce, except no root parameter
– All nodes get result

• MPI_Allgather
– Syntax like gather, except no root parameter
– All nodes get resulting array

• Underneath -- virtual butterfly network
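A sketch of MPI_Allreduce: the same arguments as MPI_Reduce but with no root, and every rank receives the combined result.

/* Allreduce example: every rank ends up with the global sum. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    double local, globalSum;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = rank + 1.0;                /* each rank's contribution */
    MPI_Allreduce(&local, &globalSum, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    printf("rank %d sees sum %f\n", rank, globalSum);
    MPI_Finalize();
    return 0;
}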

Page 20: Message passing interface

Data packaging

• Needed to combine irregular, non-contiguous data into single message

• Pack/Unpack -- explicitly pack data into a buffer, send it, then unpack the data from the buffer

• Derived data types -- MPI heterogeneous data types which can be sent as a message

Page 21: Message passing interface

MPI_Pack() syntax

MPI_Pack(Aptr, count, MPI_DOUBLE, buffer, size, &pos, MPI_COMM_WORLD);

Aptr            pointer to data to pack
count           number of items to pack
MPI_DOUBLE      type of items
buffer          buffer being packed
size            size of buffer (in bytes)
pos             position in buffer (in bytes), updated
MPI_COMM_WORLD  communicator

Page 22: Message passing interface

MPI_Unpack()

• reverses operation of MPI_Pack()

MPI_Unpack(buffer, size, &pos, Aptr, count, MPI_DOUBLE, MPI_COMM_WORLD);
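A sketch of the pack/send/unpack sequence: rank 0 packs an int and a double into one buffer and sends it as MPI_PACKED; rank 1 receives and unpacks it. The 100-byte buffer size is an arbitrary illustration value.

/* Pack/Unpack example: heterogeneous data in a single message. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, n = 5, pos = 0;
    double x = 3.14;
    char buffer[100];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Pack(&n, 1, MPI_INT, buffer, 100, &pos, MPI_COMM_WORLD);
        MPI_Pack(&x, 1, MPI_DOUBLE, buffer, 100, &pos, MPI_COMM_WORLD);
        /* pos now holds the number of bytes actually packed */
        MPI_Send(buffer, pos, MPI_PACKED, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Status status;
        MPI_Recv(buffer, 100, MPI_PACKED, 0, 0, MPI_COMM_WORLD, &status);
        /* unpack in the same order the data was packed */
        MPI_Unpack(buffer, 100, &pos, &n, 1, MPI_INT, MPI_COMM_WORLD);
        MPI_Unpack(buffer, 100, &pos, &x, 1, MPI_DOUBLE, MPI_COMM_WORLD);
        printf("rank 1 unpacked %d and %f\n", n, x);
    }

    MPI_Finalize();
    return 0;
}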