MPI: Message Passing Interface
Mehmet Balman, Cmpe 587, Dec 2001



Page 1: MPI Message Passing Interface

MPI: Message Passing Interface

Mehmet Balman, Cmpe 587

Dec, 2001

Page 2: MPI Message Passing Interface

Parallel Computing

• Separate workers or processes

• Interact by exchanging information

Types of parallel computing:

• SIMD (single instruction multiple data)

• SPMD (single program multiple data)

• MPMD (multiple program multiple data)

Hardware models:

• Distributed memory (Paragon, IBM SPx, workstation network)

•Shared memory (SGI Power Challenge, Cray T3D)

Page 3: MPI Message Passing Interface

Communication with other processes

•Cooperative

all parties agree to transfer data

• One-sided

one worker performs transfer of data

Page 4: MPI Message Passing Interface

What is MPI?

•A message-passing library specification

• Multiple processes communicate by message passing

• A library of functions and macros that can be used in C, FORTRAN, and C++ programs

•For parallel computers, clusters, and heterogeneous networks

Who designed MPI?

• Vendors: IBM, Intel, TMC, Meiko, Cray, Convex, nCUBE

• Library writers: PVM, p4, Zipcode, TCGMSG, Chameleon, Express, Linda

•Broad Participation

Page 5: MPI Message Passing Interface

Development history (1992-1994):

•Began at Williamsburg Workshop in April, 1992

•Organized at Supercomputing '92 (November)

•Met every six weeks for two days

•Pre-final draft distributed at Supercomputing '93

•Final version of draft in May, 1994

•Public and vendor implementations available

Page 6: MPI Message Passing Interface

Features of MPI

• Point-to-point communication

• blocking, nonblocking

• synchronous, asynchronous

• ready, buffered

•Collective routines

• built-in, user defined

•Large # of data movement routines

•Built-in support for grids and graphs

•125 functions (MPI is large)

•6 basic functions (MPI is small)

•Communicators combine context and groups for message security

Page 7: MPI Message Passing Interface

Example:

#include "mpi.h"

#include <stdio.h>

int main( int argc, char **argv){

int rank, size;

MPI_Init( &argc, &argv );

MPI_Comm_rank( MPI_COMM_WORLD, &rank );

MPI_Comm_size( MPI_COMM_WORLD, &size );

printf( "Hello world! I'm %d of %d\n", rank, size );

MPI_Finalize(); return 0;

}

Page 8: MPI Message Passing Interface

What happens when an MPI job is run

1. The user issues a directive to the operating system, which has the effect of placing a copy of the executable program on each processor

2. Each processor begins execution of its copy of the executable

3. Different processes can execute different statements by branching within the program. Typically the branching is based on process rank, as in the sketch below.
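A minimal sketch (not from the original slides) of rank-based branching, where rank 0 takes a coordinator role and the other ranks act as workers; the printed messages are illustrative only:

#include "mpi.h"
#include <stdio.h>

int main( int argc, char **argv ){
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    if (rank == 0)
        printf( "rank 0: coordinating %d workers\n", size - 1 );   /* e.g., distribute work */
    else
        printf( "rank %d: working\n", rank );                      /* e.g., compute a partial result */
    MPI_Finalize();
    return 0;
}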

Envelope of a message: (control block)

• the rank of the receiver

• the rank of the sender

• a tag

• a communicator

Page 9: MPI Message Passing Interface

Two mechanisms for partitioning the message space:

Tags (0-32767)

Communicators (e.g., MPI_COMM_WORLD)

Page 10: MPI Message Passing Interface

The six basic functions: MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, MPI_Recv

MPI_Send( start, count, datatype, dest, tag, comm )

MPI_Recv(start, count, datatype, source, tag, comm, status)

MPI_Bcast(start, count, datatype, root, comm)

MPI_Reduce(start, result, count, datatype, operation, root, comm)
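A minimal sketch (not from the original slides) combining these calls: rank 0 sends one integer to rank 1. The value 42 and tag 0 are arbitrary; run with at least two processes (e.g., mpirun -np 2 a.out).

#include "mpi.h"
#include <stdio.h>

int main( int argc, char **argv ){
    int rank, size, value;
    MPI_Status status;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    if (size >= 2){                         /* needs at least two processes */
        if (rank == 0){
            value = 42;
            MPI_Send( &value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD );            /* dest = 1, tag = 0 */
        } else if (rank == 1){
            MPI_Recv( &value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status );   /* source = 0, tag = 0 */
            printf( "rank 1 received %d from rank 0\n", value );
        }
    }
    MPI_Finalize();
    return 0;
}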

Page 11: MPI Message Passing Interface

Collective patterns

Page 12: MPI Message Passing Interface

Collective Computation Operations

Operation Name    Meaning
MPI_MAX           Maximum
MPI_MIN           Minimum
MPI_SUM           Sum
MPI_PROD          Product
MPI_LAND          Logical And
MPI_BAND          Bitwise And
MPI_LOR           Logical Or
MPI_BOR           Bitwise Or
MPI_LXOR          Logical Exclusive Or
MPI_BXOR          Bitwise Exclusive Or
MPI_MAXLOC        Maximum and Location of Maximum
MPI_MINLOC        Minimum and Location of Minimum

MPI_Op_create( user_function, commute, op )      /* commute: true if the operation is commutative */

MPI_Op_free( op )
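A minimal sketch (not from the original slides) of a user-defined reduction. The operation absmax (elementwise maximum of absolute values) and the per-process values are made up for illustration:

#include "mpi.h"
#include <stdio.h>

/* user-defined reduction: elementwise maximum of absolute values */
void absmax( void *invec, void *inoutvec, int *len, MPI_Datatype *dtype ){
    int i;
    float *in = (float *) invec, *inout = (float *) inoutvec;
    for (i = 0; i < *len; i++){
        float a = in[i] < 0 ? -in[i] : in[i];
        float b = inout[i] < 0 ? -inout[i] : inout[i];
        inout[i] = (a > b) ? a : b;
    }
}

int main( int argc, char **argv ){
    int rank;
    float x, result;
    MPI_Op op;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    x = (rank % 2) ? -(float)rank : (float)rank;   /* some per-process value */
    MPI_Op_create( absmax, 1, &op );               /* commute = true */
    MPI_Reduce( &x, &result, 1, MPI_FLOAT, op, 0, MPI_COMM_WORLD );
    if (rank == 0) printf( "absmax = %f\n", result );

    MPI_Op_free( &op );
    MPI_Finalize();
    return 0;
}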

Page 13: MPI Message Passing Interface

User defined communication groups

Communicator: contains a context and a group. Group: just a set of processes.

MPI_Comm_create( oldcomm, group, &newcomm )

MPI_Comm_group( oldcomm, &group )

MPI_Group_free( &group )

MPI_Group_incl MPI_Group_excl

MPI_Group_range_incl MPI_Group_range_excl

MPI_Group_union MPI_Group_intersection
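A minimal sketch (not from the original slides) of how these calls combine: build a group of the even-ranked processes and create a communicator for it. The choice of even ranks is arbitrary.

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char **argv ){
    int rank, size, i, nevens;
    int *evens;
    MPI_Group world_group, even_group;
    MPI_Comm even_comm;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    /* build a group containing the even-ranked processes */
    MPI_Comm_group( MPI_COMM_WORLD, &world_group );
    nevens = (size + 1) / 2;
    evens = (int *) malloc( nevens * sizeof(int) );
    for (i = 0; i < nevens; i++) evens[i] = 2 * i;
    MPI_Group_incl( world_group, nevens, evens, &even_group );

    /* create a communicator for that group; odd ranks get MPI_COMM_NULL */
    MPI_Comm_create( MPI_COMM_WORLD, even_group, &even_comm );
    if (even_comm != MPI_COMM_NULL){
        int newrank;
        MPI_Comm_rank( even_comm, &newrank );
        printf( "world rank %d is rank %d in the even communicator\n", rank, newrank );
        MPI_Comm_free( &even_comm );
    }

    MPI_Group_free( &even_group );
    MPI_Group_free( &world_group );
    free( evens );
    MPI_Finalize();
    return 0;
}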

Page 14: MPI Message Passing Interface

Non-blocking operations

•MPI_Isend(start, count, datatype, dest, tag, comm, request)

•MPI_Irecv(start, count, datatype, source, tag, comm, request)

•MPI_Wait(request, status)

•MPI_Waitall, MPI_Waitany, MPI_Waitsome

•MPI_Test( request, flag, status)

Non-blocking operations return immediately.
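A minimal sketch (not from the original slides): pairs of neighbouring ranks exchange one integer with MPI_Isend/MPI_Irecv and wait on both requests, avoiding the deadlock that a pair of blocking sends could cause.

#include "mpi.h"
#include <stdio.h>

int main( int argc, char **argv ){
    int rank, size, partner, sendbuf, recvbuf;
    MPI_Request reqs[2];
    MPI_Status stats[2];

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    partner = (rank % 2 == 0) ? rank + 1 : rank - 1;   /* pair up neighbouring ranks */
    sendbuf = rank;
    if (partner < size){                               /* the last rank may be unpaired */
        MPI_Irecv( &recvbuf, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[0] );
        MPI_Isend( &sendbuf, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[1] );
        MPI_Waitall( 2, reqs, stats );                 /* both operations have now completed */
        printf( "rank %d received %d from rank %d\n", rank, recvbuf, partner );
    }
    MPI_Finalize();
    return 0;
}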

Page 15: MPI Message Passing Interface

Communication Modes

•Synchronous mode ( MPI_Ssend): the send does not complete until a matching receive has begun.

•Buffered mode ( MPI_Bsend): the user supplies a buffer to the system for its use.

•Ready mode ( MPI_Rsend): user guarantees that matching receive has been posted.

Non-blocking versions: MPI_Issend, MPI_Irsend, MPI_Ibsend

int bufsize;                        /* must be large enough for all outstanding buffered sends */
char *buf = malloc( bufsize );
MPI_Buffer_attach( buf, bufsize );
...
MPI_Bsend( ... same as MPI_Send ... );
...
MPI_Buffer_detach( &buf, &bufsize );

Page 16: MPI Message Passing Interface

Datatypes

Two main purposes:

• Heterogeneity: parallel programs across different processors

• Noncontiguous data: structures, vectors with non-unit stride

MPI datatype          C datatype
MPI_CHAR              signed char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE
MPI_PACKED

Page 17: MPI Message Passing Interface

Build derived type:

void Build_derived_type( INDATA_TYPE* indata, MPI_Datatype* message_type_ptr ){
    int block_lengths[3];
    MPI_Aint displacements[3];
    MPI_Aint addresses[4];
    MPI_Datatype typelist[3];

    typelist[0] = MPI_FLOAT;
    typelist[1] = MPI_FLOAT;
    typelist[2] = MPI_INT;

    block_lengths[0] = block_lengths[1] = block_lengths[2] = 1;

    MPI_Address( indata, &addresses[0] );
    MPI_Address( &(indata->a), &addresses[1] );
    MPI_Address( &(indata->b), &addresses[2] );
    MPI_Address( &(indata->n), &addresses[3] );

    displacements[0] = addresses[1] - addresses[0];
    displacements[1] = addresses[2] - addresses[0];
    displacements[2] = addresses[3] - addresses[0];

    MPI_Type_struct( 3, block_lengths, displacements, typelist, message_type_ptr );
    MPI_Type_commit( message_type_ptr );
}

Page 18: MPI Message Passing Interface

Other derived data types

• int MPI_Type_contiguous(int count, MPI_Datatype oldtype, MPI_Datatype *newtype)

elements are contiguous entries in an array

• int MPI_Type_vector(int count, int block_length, int stride, MPI_Datatype element_type, MPI_Datatype *new_type)

elements are equally spaced entries of an array (see the sketch after this list)

• int MPI_Type_indexed(int count, int array_of_blocklengths[], int array_of_displacements[], MPI_Datatype element_type, MPI_Datatype *new_type)

elements are arbitrary entries of an array
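A minimal sketch (not from the original slides) of MPI_Type_vector: rank 0 sends column 2 of a 4x5 matrix to rank 1 as a single message. The matrix contents and dimensions are arbitrary; run with at least two processes.

#include "mpi.h"
#include <stdio.h>

#define ROWS 4
#define COLS 5

int main( int argc, char **argv ){
    int rank, size, i;
    float A[ROWS][COLS];
    float col[ROWS];
    MPI_Datatype column_type;
    MPI_Status status;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    /* ROWS blocks of 1 float, each COLS floats apart: one matrix column */
    MPI_Type_vector( ROWS, 1, COLS, MPI_FLOAT, &column_type );
    MPI_Type_commit( &column_type );

    if (rank == 0 && size >= 2){
        for (i = 0; i < ROWS * COLS; i++) A[i / COLS][i % COLS] = (float)i;
        MPI_Send( &A[0][2], 1, column_type, 1, 0, MPI_COMM_WORLD );           /* send column 2 */
    } else if (rank == 1){
        MPI_Recv( col, ROWS, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status );      /* arrives contiguously */
        for (i = 0; i < ROWS; i++) printf( "col[%d] = %f\n", i, col[i] );
    }

    MPI_Type_free( &column_type );
    MPI_Finalize();
    return 0;
}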

Page 19: MPI Message Passing Interface

Pack/unpack:

void Get_data4( int my_rank, float* a_ptr, float* b_ptr, int* n_ptr ){
    int root = 0;
    char buffer[100];
    int position;

    if (my_rank == 0){
        printf( "Enter a, b, and n\n" );
        scanf( "%f %f %d", a_ptr, b_ptr, n_ptr );
        position = 0;
        MPI_Pack( a_ptr, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD );
        MPI_Pack( b_ptr, 1, MPI_FLOAT, buffer, 100, &position, MPI_COMM_WORLD );
        MPI_Pack( n_ptr, 1, MPI_INT, buffer, 100, &position, MPI_COMM_WORLD );
        MPI_Bcast( buffer, 100, MPI_PACKED, root, MPI_COMM_WORLD );
    } else {
        MPI_Bcast( buffer, 100, MPI_PACKED, root, MPI_COMM_WORLD );
        position = 0;
        MPI_Unpack( buffer, 100, &position, a_ptr, 1, MPI_FLOAT, MPI_COMM_WORLD );
        MPI_Unpack( buffer, 100, &position, b_ptr, 1, MPI_FLOAT, MPI_COMM_WORLD );
        MPI_Unpack( buffer, 100, &position, n_ptr, 1, MPI_INT, MPI_COMM_WORLD );
    }
}

Page 20: MPI Message Passing Interface

Profiling

static int nsend = 0;

int MPI_Send( void *start, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm ){
    nsend++;                                   /* count the calls */
    return PMPI_Send( start, count, datatype, dest, tag, comm );
}

Page 21: MPI Message Passing Interface

Architecture of MPI

• Complex communication operations can be expressed portably in terms of lower-level ones

• All MPI functions are implemented in terms of the macros and functions that make up the ADI (Abstract Device Interface)

The ADI is responsible for:

1. specifying a message to be sent or received

2. moving data between the API and the message-passing hardware

3. managing lists of pending messages (both sent and received),

4. providing basic information about the execution environment (e.g., how many tasks there are)

Page 22: MPI Message Passing Interface

Upper layers of MPICH

Page 23: MPI Message Passing Interface

Channel Interface

Routines to send and receive envelope (control) information:

•MPID_SendControl (MPID_SendControlBlock)

•MPID_RecvAnyControl

•MPID_ControlMsgAvail

Send and receive data:

•MPID_SendChannel

•MPID_RecvFromChannel

Page 24: MPI Message Passing Interface

Channel Interface

Three different data-exchange mechanisms:

Eager (default)

Data is sent to the destination immediately; it is buffered on the receiver side.

Rendezvous (MPI_Ssend)

Data is sent to the destination only when requested.

Get (shared memory)

Data is read directly by the receiver.

Page 25: MPI Message Passing Interface

Lower layers of MPICH

Page 26: MPI Message Passing Interface

Summary

• Point-to-point and collective operations

•Blocking

•Non-blocking

•Asynchronous

•Synchronous

•Buffered

•Ready

•Abstraction for processes

•Rank within a group

•Virtual topologies

•Data types

•User-defined

•predefined

•pack/unpack

•Architecture of MPI

•ADI (Abstract Device Interface)

•Channel Interface