mpj express
![Page 1: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/1.jpg)
MPJ Express
Alon Vice, Ayal Ofaim
![Page 2: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/2.jpg)
Contributors
Aamir Shafi, Jawad Manzoor, Kamran Hamid, Mohsan Jameel, Rizwan Hanif, Amjad Aziz
Bryan Carpenter
Mark Baker
Guillermo Taboada, Sabela Ramos
Hong Ong
![Page 3: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/3.jpg)
OUTLINE
Motivation.
Hello World & Embarrassingly Parallel Toy Example.
Performance Evaluation.
The runtime System.
MPJ commands:
– Point to point communication.
– Collective communication.
Summary.
![Page 4: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/4.jpg)
Why Java?
Portability
A popular language in colleges and the software industry:
– Large pool of software developers
– A useful educational tool
Improved compile-time and runtime checking of the code
Support for multithreading
Rich collection of support libraries
![Page 5: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/5.jpg)
“Hello World” MPJ Express Program

    import mpi.*;

    public class HelloWorld {

        public static void main(String args[]) throws Exception {

            MPI.Init(args);
            int size = MPI.COMM_WORLD.Size();
            int rank = MPI.COMM_WORLD.Rank();

            System.out.println("I am process <"+rank+">");

            MPI.Finalize();
        }
    }

    aamirshafi@velour:~/work/mpj-user$ mpjrun.sh -np 4 HelloWorld
    MPJ Express (0.38) is started in the multicore configuration
    I am process <1>
    I am process <0>
    I am process <3>
    I am process <2>
![Page 6: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/6.jpg)
An Embarrassingly Parallel Toy Example
Figure: one master process and four workers (Worker 0, Worker 1, Worker 2, Worker 3).
![Page 7: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/7.jpg)

    aamirshafi@velour:~/work/mpj-user$ mpjrun.sh -np 5 ToyExample
    MPJ Express (0.38) is started in the multicore configuration
    1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4
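
The source of ToyExample did not survive as text, so the following is only a sketch of a master/worker pattern that would produce the output above. It assumes that the four workers run as ranks 1 to 4, that each fills a four-element chunk with its own rank, and that the master concatenates the chunks in rank order. Plain Java threads stand in for MPJ Express processes; the class and method names are hypothetical, and this is not the MPJ Express API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the toy example: threads play the role of
// MPJ Express processes, and Futures play the role of messages.
final class ToyExampleSketch {

    // Work done by one worker: fill a chunk with the worker's own rank.
    static int[] workerChunk(int rank, int chunkSize) {
        int[] chunk = new int[chunkSize];
        Arrays.fill(chunk, rank);
        return chunk;
    }

    // The master starts the workers, then collects the chunks in rank order.
    static int[] runMaster(int workers, int chunkSize) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<int[]>> pending = new ArrayList<>();
            for (int rank = 1; rank <= workers; rank++) {
                final int r = rank;
                Callable<int[]> task = () -> workerChunk(r, chunkSize);
                pending.add(pool.submit(task));
            }
            int[] result = new int[workers * chunkSize];
            for (int i = 0; i < workers; i++) {
                int[] chunk = pending.get(i).get(); // blocks until this worker is done
                System.arraycopy(chunk, 0, result, i * chunkSize, chunkSize);
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Join the values with single spaces, as in the slide output.
    static String format(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(runMaster(4, 4)));
    }
}
```

Because the master reads the chunks in rank order, the result is deterministic even though the workers may finish in any order.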
![Page 8: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/8.jpg)
Performance Evaluation of Point to Point Communication
Normally ping pong benchmarks are used to calculate:
– Latency: how long does it take to send N bytes from sender to receiver?
– Throughput: how much bandwidth is achieved?
Latency is a useful measure for studying the performance of “small” messages.
Throughput is a useful measure for studying the performance of “large” messages.
Evaluations on GigE and Myrinet systems are shown in the next four slides.
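
How a ping pong run turns into these two numbers can be sketched as follows. This is a minimal illustration in plain Java, not the benchmark used for the slides: two threads bounce a byte array through `SynchronousQueue`s, and the helper names (`oneWayLatencyMicros`, `throughputMbps`, `timePingPong`) are hypothetical. It assumes the one-way time is half of the averaged round-trip time.

```java
import java.util.concurrent.SynchronousQueue;

final class PingPongSketch {

    // One-way latency in microseconds: half the average round-trip time.
    static double oneWayLatencyMicros(long totalNanos, int roundTrips) {
        return totalNanos / (2.0 * roundTrips) / 1_000.0;
    }

    // Throughput in megabits per second for messages of the given size.
    static double throughputMbps(int messageBytes, double oneWayMicros) {
        return (messageBytes * 8.0) / oneWayMicros; // bits per microsecond equals Mbit/s
    }

    // Bounce a message between two threads and return the total elapsed nanoseconds.
    static long timePingPong(int messageBytes, int roundTrips) throws InterruptedException {
        SynchronousQueue<byte[]> ping = new SynchronousQueue<>();
        SynchronousQueue<byte[]> pong = new SynchronousQueue<>();
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < roundTrips; i++) {
                    pong.put(ping.take()); // the receiver returns each message at once
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();
        byte[] message = new byte[messageBytes];
        long start = System.nanoTime();
        for (int i = 0; i < roundTrips; i++) {
            ping.put(message);     // "ping"
            message = pong.take(); // "pong"
        }
        long elapsed = System.nanoTime() - start;
        echo.join();
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        int bytes = 1024;
        int trips = 10_000;
        long nanos = timePingPong(bytes, trips);
        double latency = oneWayLatencyMicros(nanos, trips);
        System.out.printf("latency: %.2f us, throughput: %.2f Mbit/s%n",
                latency, throughputMbps(bytes, latency));
    }
}
```

Within a single JVM the threads only hand over a reference, so the printed figures say nothing about a real network; the point is the arithmetic that converts a timed ping pong loop into latency and throughput.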
![Page 9: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/9.jpg)
Latency Comparison on GigE
![Page 10: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/10.jpg)
Throughput Comparison on GigE
![Page 11: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/11.jpg)
Latency Comparison on Myrinet
![Page 12: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/12.jpg)
Throughput Comparison on Myrinet
![Page 13: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/13.jpg)
![Page 14: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/14.jpg)
The Runtime System
![Page 15: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/15.jpg)
![Page 16: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/16.jpg)
“Blocking”: Send() on the sender, Recv() on the receiver. The CPU waits until the call completes.
“Non Blocking”: Isend() on the sender, Irecv() on the receiver. The CPU does computation while the transfer is in progress, then waits in Wait().
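
The difference between the two modes can be sketched with plain Java futures. This is an analogy only: `CompletableFuture` stands in for the request returned by a non-blocking call and `join()` stands in for Wait(). The `transfer` method is a hypothetical placeholder for the communication, and none of this is the MPJ Express API.

```java
import java.util.concurrent.CompletableFuture;

final class BlockingVsNonBlockingSketch {

    // Hypothetical placeholder for a message transfer that takes some time.
    static int[] transfer(int[] data) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return data.clone();
    }

    // Local computation that does not depend on the message.
    static long compute(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // "Blocking": the caller waits for the transfer, then computes.
    static long blocking(int[] data, int n) {
        int[] received = transfer(data); // CPU waits here
        return compute(n) + received.length;
    }

    // "Non Blocking": start the transfer, compute meanwhile, then wait.
    static long nonBlocking(int[] data, int n) {
        CompletableFuture<int[]> request =
                CompletableFuture.supplyAsync(() -> transfer(data)); // plays the role of Isend()/Irecv()
        long local = compute(n);          // CPU does computation
        int[] received = request.join();  // plays the role of Wait()
        return local + received.length;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(blocking(data, 100));
        System.out.println(nonBlocking(data, 100));
    }
}
```

Both variants return the same value; the non-blocking one simply overlaps the computation with the transfer instead of idling.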
![Page 17: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/17.jpg)
Implementation of point-to-point communication
Each Send() and Recv() method internally creates a buffer.
Various modes of blocking and non-blocking communication primitives are implemented using two protocols:
– Eager Send: the sender sends a control message to the receiver, then the actual data.
– Rendezvous: the sender sends a control message to the receiver, the receiver returns an acknowledgement, and only then is the actual data sent.
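
The message order of the two protocols can be written down as a small trace. This is a sketch in plain Java, not MPJ Express code. The class and method names are hypothetical, and the size-based rule in `send` is an assumption added for illustration: the slides do not say how a protocol is chosen, although picking eager send for small messages and rendezvous for large ones is common practice in MPI libraries.

```java
import java.util.ArrayList;
import java.util.List;

final class ProtocolSketch {

    // Eager send: the data follows the control message without waiting.
    static List<String> eagerSend() {
        List<String> trace = new ArrayList<>();
        trace.add("sender -> receiver: control message");
        trace.add("sender -> receiver: actual data");
        return trace;
    }

    // Rendezvous: the data is held back until the receiver acknowledges.
    static List<String> rendezvous() {
        List<String> trace = new ArrayList<>();
        trace.add("sender -> receiver: control message");
        trace.add("receiver -> sender: acknowledgement");
        trace.add("sender -> receiver: actual data");
        return trace;
    }

    // Assumed selection rule: messages up to eagerLimitBytes go eagerly.
    static List<String> send(int messageBytes, int eagerLimitBytes) {
        return messageBytes <= eagerLimitBytes ? eagerSend() : rendezvous();
    }

    public static void main(String[] args) {
        int eagerLimitBytes = 4096; // illustrative threshold, not a library default
        System.out.println("1 KB message:");
        send(1024, eagerLimitBytes).forEach(step -> System.out.println("  " + step));
        System.out.println("8 KB message:");
        send(8192, eagerLimitBytes).forEach(step -> System.out.println("  " + step));
    }
}
```

Eager send saves one message exchange at the cost of buffering data the receiver may not be ready for; rendezvous pays for the extra handshake but avoids that buffering.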
![Page 18: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/18.jpg)
![Page 19: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/19.jpg)
Image from MPI standard doc
![Page 20: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/20.jpg)
Reduce collective operations
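reduce: the processes hold the data 1, 2, 3, 4, 5; the combined result 15 is delivered to one process.
allreduce: the processes hold the data 1, 2, 3, 4, 5; the combined result 15 is delivered to every process (15, 15, 15, 15, 15).
Reduction operations: MPI.PROD, MPI.SUM, MPI.MIN, MPI.MAX, MPI.LAND, MPI.BAND, MPI.LOR, MPI.BOR, MPI.LXOR, MPI.BXOR, MPI.MINLOC, MPI.MAXLOC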
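
The semantics of the two operations can be sketched in plain Java, with one array element standing in for the value held by each process. This is an illustration of what reduce and allreduce compute, not the MPJ Express API; the class and method names are hypothetical, and `Integer::sum` and `Math::max` play the roles of MPI.SUM and MPI.MAX.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

final class ReduceSketch {

    // reduce: combine one value per process; a single process gets the result.
    static int reduce(int[] valuePerProcess, IntBinaryOperator op) {
        int result = valuePerProcess[0];
        for (int i = 1; i < valuePerProcess.length; i++) {
            result = op.applyAsInt(result, valuePerProcess[i]);
        }
        return result;
    }

    // allreduce: the same combination, but every process receives the result.
    static int[] allreduce(int[] valuePerProcess, IntBinaryOperator op) {
        int[] resultPerProcess = new int[valuePerProcess.length];
        Arrays.fill(resultPerProcess, reduce(valuePerProcess, op));
        return resultPerProcess;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(reduce(data, Integer::sum));                      // 15
        System.out.println(Arrays.toString(allreduce(data, Integer::sum))); // [15, 15, 15, 15, 15]
        System.out.println(reduce(data, Math::max));                         // 5
    }
}
```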
![Page 21: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/21.jpg)
Toy Example with Collectives
![Page 22: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/22.jpg)
![Page 23: MPJ Express](https://reader035.vdocuments.us/reader035/viewer/2022081502/56816265550346895dd2cc2d/html5/thumbnails/23.jpg)
Summary
MPJ Express (www.mpj-express.org) is an environment for MPI-like parallel programming in Java.
It was conceived as having an expandable set of “devices”, allowing different underlying implementations of message passing.
The software explicitly manages internal memory used for sending and receiving messages.
We parallelized Gadget-2 using MPJ Express and managed to get good performance.