Lecture 20 - Stanford University, web.stanford.edu/class/cme213/files/lectures/Lecture_20.pdf

CME 213 Eric Darve SPRING 2017


LINEAR ALGEBRA: MATRIX-VECTOR PRODUCTS


Application example: matrix-vector product

● We are going to use this example to illustrate additional MPI functionalities.
● This will lead us to process groups and topologies.
● First, we go over two implementations that use the functionalities we have already covered.
● Two simple approaches:
  • Row partitioning of the matrix, or
  • Column partitioning


Row partitioning

This is the most natural approach.

[Figure: matrix A and vector b; b is assembled on every process with Allgather()]

Step 1: replicate b on each process: MPI_Allgather()
Step 2: perform the product
See MPI code: matvecrow/
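The two steps of row partitioning can be sketched in plain C without an MPI runtime. This is a serial emulation, not the course's matvecrow/ code: the loop over p stands in for the processes, and the function name, the row-major storage of A, and the assumption that the number of processes divides n are all illustrative choices.

```c
#include <stdlib.h>
#include <string.h>

/* Row-partitioned y = A*b, emulating nproc processes serially.
 * A is n x n, stored row-major. Process p owns rows
 * [p*chunk, (p+1)*chunk) and the matching slice of b and y.
 * Assumption of this sketch: nproc divides n. */
void matvec_row_partition(int nproc, int n, const double *A,
                          const double *b, double *y)
{
    int chunk = n / nproc;
    double *b_full = malloc((size_t)n * sizeof *b_full);
    if (b_full == NULL)
        return;

    for (int p = 0; p < nproc; ++p) {
        /* Step 1 (stands in for MPI_Allgather): process p assembles
         * the full vector from the slice owned by every process q. */
        for (int q = 0; q < nproc; ++q)
            memcpy(b_full + q * chunk, b + q * chunk,
                   (size_t)chunk * sizeof *b);

        /* Step 2: process p multiplies its block of rows by b. */
        for (int i = p * chunk; i < (p + 1) * chunk; ++i) {
            double sum = 0.0;
            for (int j = 0; j < n; ++j)
                sum += A[(size_t)i * n + j] * b_full[j];
            y[i] = sum;
        }
    }
    free(b_full);
}
```

In a real MPI program each process would run only its own iteration of the outer loop, and the inner copy would be a single collective call.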


Column partitioning

Step 1: calculate partial products with each process

[Figure: matrix A, vector b, and the partial products]


Column partitioning (cont'd)

● Step 2: reduce all partial results: MPI_Reduce()
● Step 3: send sub-blocks to all processes: MPI_Scatter()
● Steps are very similar to row partitioning.

[Figure: vector Ab]
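The three steps of column partitioning can be emulated serially in the same spirit. Again this is only a sketch under stated assumptions (hypothetical function name, row-major A, number of processes dividing n); the comments mark which loop plays the role of which MPI call.

```c
#include <stdlib.h>

/* Column-partitioned y = A*b, emulating nproc processes serially.
 * A is n x n, stored row-major. Process p owns columns
 * [p*chunk, (p+1)*chunk) and the matching slice of b.
 * Assumption of this sketch: nproc divides n. */
void matvec_col_partition(int nproc, int n, const double *A,
                          const double *b, double *y)
{
    int chunk = n / nproc;
    /* partial[p*n + i] is entry i of the partial product of process p. */
    double *partial = calloc((size_t)nproc * n, sizeof *partial);
    double *reduced = calloc((size_t)n, sizeof *reduced);
    if (partial == NULL || reduced == NULL) {
        free(partial);
        free(reduced);
        return;
    }

    /* Step 1: each process multiplies its column block by its slice of b. */
    for (int p = 0; p < nproc; ++p)
        for (int i = 0; i < n; ++i)
            for (int j = p * chunk; j < (p + 1) * chunk; ++j)
                partial[(size_t)p * n + i] += A[(size_t)i * n + j] * b[j];

    /* Step 2 (stands in for MPI_Reduce with MPI_SUM): sum on the root. */
    for (int p = 0; p < nproc; ++p)
        for (int i = 0; i < n; ++i)
            reduced[i] += partial[(size_t)p * n + i];

    /* Step 3 (stands in for MPI_Scatter): process p receives its
     * sub-block; here the sub-blocks are stored back to back in y. */
    for (int p = 0; p < nproc; ++p)
        for (int k = 0; k < chunk; ++k)
            y[p * chunk + k] = reduced[p * chunk + k];

    free(partial);
    free(reduced);
}
```

Note that every process produces a partial vector of full length n, which is why a reduction is needed before the result can be distributed.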


A better partitioning

● If the number of processes becomes large compared to the matrix size, we need a 2D partitioning:
● Each colored square can be assigned to a process.
● This allows using more processes.
● In addition, a theoretical analysis (more on this later) shows that this scheme runs faster.


Outline of algorithm: step 1

First column contains b

Send b to the diagonal processes

Send b down each column. This is a broadcast operation.


Steps 2 and 3

● Step 2: perform matrix-vector product locally
● Step 3: reduce across columns and store result in column 0.

[Figure: reduction across columns]


Communication cost (in a nutshell)

Why is 2D partitioning better?

[Figure: two panels labeled "Larger blocks" and "Narrow columns", with annotations "Reduction: 2n" and "Reduction: n/2"]


Difficulties with 2D partitioning

● This type of decomposition brings some difficulties.
● We used two collective operations:
  • A broadcast inside a column.
  • A reduction inside a row.
● To do this in MPI, we need two concepts:
  • Communicators or process groups. These define a subset of all the processes. For each subset, collective operations are allowed, e.g., a broadcast for the group of processes inside a column.
  • Process topologies. For matrices, there is a natural 2D topology with (i,j) block indexing. MPI supports such grids (in any dimension). Using MPI grids (called "Cartesian topologies") simplifies many MPI commands.


PROCESS GROUPS AND COMMUNICATORS


Process groups

● Groups are needed for many reasons.
● They enable collective communication operations across a subset of processes.
● They make it easy to assign independent tasks to different groups of processes.
● They provide a good mechanism to integrate a parallel library into an MPI code.


Groups and communicators

● A group is an ordered set of processes.
● Each process in a group is associated with a unique integer rank. Rank values start at zero and go to N-1, where N is the number of processes in the group.
● A group is always associated with a communicator object.
● A communicator encompasses a group of processes that may communicate with each other. All MPI messages must specify a communicator.
● For example, the handle for the communicator that comprises all tasks is MPI_COMM_WORLD.
● From the programmer's perspective, a group and a communicator are almost the same. The group routines are primarily used to specify which processes should be used to construct a communicator.
● Processes may be in more than one group/communicator. They have a unique rank within each group/communicator.


Main functions

MPI provides over 40 routines related to groups, communicators, and virtual topologies!

int MPI_Comm_group(MPI_Comm comm, MPI_Group *group)

Returns the group associated with a communicator, e.g., MPI_COMM_WORLD.

int MPI_Group_incl(MPI_Group group, int p, int *ranks, MPI_Group *new_group)

ranks: integer array with p entries.

Creates a new group new_group with p processes, which have ranks from 0 to p-1. Process i is the process that has rank ranks[i] in group.

int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *new_comm)

Creates a new communicator based on group.

See MPI code: groups/
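A minimal sketch of how these three routines fit together is given below. It is not the course's groups/ code: the choice of putting the even world ranks in the new group, and the names even_group and even_comm, are assumptions made for illustration. It needs an MPI installation (compile with mpicc, launch with mpirun).

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Group associated with MPI_COMM_WORLD. */
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    /* Illustrative choice: the new group holds the even world ranks. */
    int p = (world_size + 1) / 2;
    int *ranks = malloc((size_t)p * sizeof *ranks);
    if (ranks == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);
    for (int i = 0; i < p; ++i)
        ranks[i] = 2 * i;

    MPI_Group even_group;
    MPI_Group_incl(world_group, p, ranks, &even_group);

    /* Collective over MPI_COMM_WORLD; processes that are not in
     * even_group receive MPI_COMM_NULL. */
    MPI_Comm even_comm;
    MPI_Comm_create(MPI_COMM_WORLD, even_group, &even_comm);

    if (even_comm != MPI_COMM_NULL) {
        int new_rank, sum = 0;
        MPI_Comm_rank(even_comm, &new_rank);
        /* A collective operation restricted to the subset. */
        MPI_Allreduce(&world_rank, &sum, 1, MPI_INT, MPI_SUM, even_comm);
        printf("world rank %d -> rank %d in even_comm, sum = %d\n",
               world_rank, new_rank, sum);
        MPI_Comm_free(&even_comm);
    }

    MPI_Group_free(&even_group);
    MPI_Group_free(&world_group);
    free(ranks);
    MPI_Finalize();
    return 0;
}
```

Process i of the new group is the process with world rank ranks[i], so world rank 2 becomes rank 1 in even_comm.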


PROCESS TOPOLOGIES


Process topologies

● Many problems are naturally mapped to certain topologies such as grids.
● This is the case, for example, for matrices, or for 2D and 3D structured grids.
● The two main types of topologies supported by MPI are Cartesian grids and graphs.
● MPI topologies simplify many common MPI tasks.
● MPI topologies are virtual: there may be no relation between the physical structure of the network and the process topology.


Advantages of using topologies

● Convenience: virtual topologies may be useful for applications with specific communication patterns.
● Communication efficiency: a particular implementation may optimize the process mapping based upon the physical characteristics of a given parallel machine.
  • For example, nodes that are nearby on the grid (East/West/North/South neighbors) may be close in the network (lowest communication time).
● The mapping of processes onto an MPI virtual topology is dependent upon the MPI implementation.


MPI functions for topologies

Many functions are available. We only cover the basic ones.

int MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *periods, int reorder, MPI_Comm *comm_cart)

ndims: number of dimensions.

dims[i]: size of the grid along dimension i. The total size of the grid (the product of the dims[i]) should not exceed the number of processes in comm_old.

periods: array used to specify whether or not the topology has wraparound connections. If periods[i] is non-zero, then the topology has wraparound connections along dimension i.

reorder: determines whether the processes in the new group are to be reordered. If reorder is false, then the rank of each process in the new group is identical to its rank in the old group.
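As an illustration of the call, the sketch below builds a 3 x 2 grid with wraparound along dimension 0 only and lets each process print its coordinates. The grid shape, the variable names, and the use of MPI_Cart_coords to query the coordinates are choices made for this example; it assumes an MPI installation and at least 6 processes.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 6) {
        /* The grid must not be larger than the group of comm_old. */
        fprintf(stderr, "this example needs at least 6 processes\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int dims[2]    = {3, 2};  /* 3 rows, 2 columns */
    int periods[2] = {1, 0};  /* wraparound along dimension 0 only */
    int reorder    = 0;       /* keep the ranks of MPI_COMM_WORLD */

    MPI_Comm comm_cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, reorder, &comm_cart);

    /* Processes beyond the first 6 are left out of the grid and
     * receive MPI_COMM_NULL. */
    if (comm_cart != MPI_COMM_NULL) {
        int rank, coords[2];
        MPI_Comm_rank(comm_cart, &rank);
        MPI_Cart_coords(comm_cart, rank, 2, coords);
        printf("rank %d has coordinates (%d,%d)\n",
               rank, coords[0], coords[1]);
        MPI_Comm_free(&comm_cart);
    }

    MPI_Finalize();
    return 0;
}
```

Setting reorder to a non-zero value would allow the implementation to renumber the processes to better match the hardware.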


Example

The processes are ordered according to their rank row-wise in increasing order.

0 (0,0)   1 (0,1)
2 (1,0)   3 (1,1)
4 (2,0)   5 (2,1)
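The row-wise ordering above amounts to a row-major mapping between ranks and (row, column) coordinates. The helper names and the grid_coords type below are hypothetical and are not part of MPI; in an MPI program the same information comes from the Cartesian communicator.

```c
/* Row-major mapping between ranks and (row, column) coordinates on a
 * grid with ncols columns, matching the ordering shown above. */
typedef struct {
    int row;
    int col;
} grid_coords;

int coords_to_rank(int row, int col, int ncols)
{
    return row * ncols + col;
}

grid_coords rank_to_coords(int rank, int ncols)
{
    grid_coords c = { rank / ncols, rank % ncols };
    return c;
}
```

For the 3 x 2 grid, coords_to_rank(2, 1, 2) gives rank 5 and rank_to_coords(4, 2) gives (2,0).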


Periodic Cartesian grids

● We chose periodicity along the first dimension (periods[0]=1), which means that any reference beyond the first or last row index is wrapped around cyclically.
● For example, row index i=-1 is mapped into i=2.
● There is no periodicity imposed on the second dimension. Any reference to a column index outside of its defined range results in an error. Try it!
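The wraparound rule can be written as a small index function. This is a sketch of the arithmetic only: wrap_index is a hypothetical helper, and the value -1 used to flag an out-of-range index on a non-periodic dimension is a sentinel chosen for this example, not an MPI constant.

```c
/* Map a possibly out-of-range index i onto a dimension of extent n.
 * Periodic dimensions wrap around cyclically. For non-periodic
 * dimensions an out-of-range index is reported with -1. */
int wrap_index(int i, int n, int periodic)
{
    if (periodic)
        return ((i % n) + n) % n;  /* also correct for negative i */
    return (i >= 0 && i < n) ? i : -1;
}
```

With 3 rows and periodicity along the first dimension, wrap_index(-1, 3, 1) returns 2, in agreement with the example above.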


Obtaining your rank and coordinates

int MPI_Cart_rank(MPI_Comm comm_cart, int *coords, int *rank)

int MPI_Cart_coords(MPI_Comm comm_cart, int rank, int maxdims, int *coords)

● These functions retrieve a rank from coordinates, or the coordinates of a rank, in the grid. This may be useful to get information about other processes.
● coords are the Cartesian coordinates of a process. The size of this array is the number of dimensions.
● Remember that the function MPI_Comm_rank is still available to query your own rank.
● See MPI code: mpi_cart/


Getting the rank of your neighbors

int MPI_Cart_shift(MPI_Comm comm_cart, int dir, int s_step, int *rank_source, int *rank_dest)

● dir: direction
● s_step: shift length
● rank_dest contains the group rank of the neighboring process in the specified dimension and distance.
● rank_source is the rank of the process for which the calling process is the neighboring process in the specified dimension and distance.
● Thus, the group ranks returned in rank_dest and rank_source can be used as parameters for MPI_Sendrecv().

[Figure: rank_source and rank_dest for s_step = 4]
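The pairing of MPI_Cart_shift with MPI_Sendrecv can be sketched on a 1D periodic grid (a ring) in which every process passes its rank to the next one. The ring layout, the shift of 1, and the variable names are assumptions of this example; it requires an MPI installation.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* 1D periodic grid (a ring) over all processes. */
    int dims[1] = {size};
    int periods[1] = {1};
    MPI_Comm ring;
    MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 0, &ring);

    int rank, rank_source, rank_dest;
    MPI_Comm_rank(ring, &rank);

    /* Shift by 1 along dimension 0. */
    MPI_Cart_shift(ring, 0, 1, &rank_source, &rank_dest);

    /* Send our rank to rank_dest and receive from rank_source. */
    int send_val = rank, recv_val = -1;
    MPI_Sendrecv(&send_val, 1, MPI_INT, rank_dest, 0,
                 &recv_val, 1, MPI_INT, rank_source, 0,
                 ring, MPI_STATUS_IGNORE);

    printf("rank %d: sent to %d, received %d from %d\n",
           rank, rank_dest, recv_val, rank_source);

    MPI_Comm_free(&ring);
    MPI_Finalize();
    return 0;
}
```

On a non-periodic dimension, MPI_Cart_shift returns MPI_PROC_NULL for a neighbor that falls outside the grid, and a send or receive involving MPI_PROC_NULL completes immediately without transferring data, so the same MPI_Sendrecv call also works at the boundaries.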


Splitting a Cartesian topology

● It is very common to split a Cartesian topology along certain dimensions.
● For example, we may want to create a group for the columns or rows of a matrix.

int MPI_Cart_sub(MPI_Comm comm_cart, int *keep_dims, MPI_Comm *comm_subcart)

● keep_dims: array of boolean flags; keep_dims[i] determines whether dimension i is retained in the new communicators or split, e.g., if false then a split occurs.


Example

[Figure: 3D process grid with axes x, y, z]

keep_dims[] = {true, false, true}

keep_dims[] = {false, false, true}
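To see what a given keep_dims produces, one can count the resulting sub-communicators and their sizes: kept dimensions stay inside each sub-communicator, while dropped dimensions index the different sub-communicators. The helper below is a hypothetical bookkeeping function, not an MPI routine, and the 2 x 3 x 4 grid mentioned afterwards is an assumed shape chosen for illustration.

```c
/* For a Cartesian grid with extents dims[0..ndims-1], compute how many
 * sub-communicators a split with keep_dims produces and how many
 * processes each of them contains. */
void cart_sub_shape(int ndims, const int dims[], const int keep_dims[],
                    int *num_subcomms, int *subcomm_size)
{
    *num_subcomms = 1;
    *subcomm_size = 1;
    for (int d = 0; d < ndims; ++d) {
        if (keep_dims[d])
            *subcomm_size *= dims[d];  /* dimension kept inside each piece */
        else
            *num_subcomms *= dims[d];  /* dimension split across pieces */
    }
}
```

On a 2 x 3 x 4 grid, keep_dims = {true, false, true} gives 3 planes of 8 processes each, and keep_dims = {false, false, true} gives 6 lines of 4 processes each.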


Application example: 2D partitioning

First column contains b

Send b to the diagonal processes

Send b down each column. Broadcast!

Start with 2D communicator

Use column group

2D topology for matrix

Send to diagonal block

Column-wise broadcast

matvec2D

Reduction! Use row group

See MPI code: matvec2D/

Code for row reduction
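The code shown on the original slide is not part of this transcript. As a stand-in, here is a minimal sketch of a row reduction on a 2D Cartesian grid; it is not the course's matvec2D/ code. The use of MPI_Dims_create to pick the grid shape, the single double per process in place of a real sub-block of the product, and all variable names are assumptions of this example.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Let MPI pick a balanced 2D grid shape for the available processes. */
    int dims[2] = {0, 0};
    int periods[2] = {0, 0};
    MPI_Dims_create(size, 2, dims);

    MPI_Comm comm_2d;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &comm_2d);

    int rank, coords[2];
    MPI_Comm_rank(comm_2d, &rank);
    MPI_Cart_coords(comm_2d, rank, 2, coords);

    /* Keep dimension 1 (the column index): each sub-communicator
     * contains the processes of one grid row. */
    int keep_dims[2] = {0, 1};
    MPI_Comm comm_row;
    MPI_Cart_sub(comm_2d, keep_dims, &comm_row);

    /* Rank, inside comm_row, of the process located in column 0. */
    int root;
    int root_coords[1] = {0};
    MPI_Cart_rank(comm_row, root_coords, &root);

    /* Stand-in for the local block product: one value per process. */
    double partial = (double)(coords[1] + 1);
    double row_sum = 0.0;
    MPI_Reduce(&partial, &row_sum, 1, MPI_DOUBLE, MPI_SUM, root, comm_row);

    if (coords[1] == 0)
        printf("row %d: reduced value = %g\n", coords[0], row_sum);

    MPI_Comm_free(&comm_row);
    MPI_Comm_free(&comm_2d);
    MPI_Finalize();
    return 0;
}
```

In the matrix-vector product, partial would be the local sub-vector computed by each block, and the count passed to MPI_Reduce would be its length. The column-wise broadcast follows the same pattern with keep_dims = {1, 0} and MPI_Bcast.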


Topologies for finite-element calculations

● A typical situation is that processes need to communicate with their neighbors.
● This becomes complicated to organize for unstructured grids.
● In that case, graph topologies are very convenient. They allow defining a neighbor relationship in a general way, using a graph. Example: MPI_Graph_create
● Examples of collective communications:
  • MPI_Neighbor_allgather(): gather data from neighbor processes, and all processes get the result
  • MPI_Neighbor_alltoall(): processes send to and receive from all neighbor processes
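MPI_Graph_create describes the graph with two integer arrays: a cumulative neighbor count per node and a flattened list of neighbors. The sketch below fills these arrays for a ring of n nodes, which is only a stand-in for the adjacency of a real unstructured mesh; the function and parameter names are hypothetical.

```c
/* Build the two arrays in the layout MPI_Graph_create expects, for a
 * ring of n nodes (n >= 3) where node i is linked to i-1 and i+1.
 * graph_index[i] is the cumulative neighbor count of nodes 0..i;
 * graph_edges lists the neighbors of node 0, then node 1, and so on.
 * graph_index needs room for n entries, graph_edges for 2*n entries. */
void ring_graph(int n, int graph_index[], int graph_edges[])
{
    for (int i = 0; i < n; ++i) {
        graph_edges[2 * i]     = (i + n - 1) % n;  /* left neighbor  */
        graph_edges[2 * i + 1] = (i + 1) % n;      /* right neighbor */
        graph_index[i] = 2 * (i + 1);
    }
}

/* Number of neighbors of node i; *first receives the offset of its
 * first neighbor in the edges array. */
int graph_neighbors(const int graph_index[], int i, int *first)
{
    *first = (i == 0) ? 0 : graph_index[i - 1];
    return graph_index[i] - *first;
}
```

In an MPI program the two arrays would be passed as the index and edges arguments of MPI_Graph_create(comm_old, n, index, edges, reorder, &comm_graph), and the neighbor collectives would then operate on the resulting communicator.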