Distributed simulation with MPI in ns-3. Joshua Pelkey and Dr. George Riley. WNS3, March 25, 2011

Distributed simulation with MPI in ns-3

Joshua Pelkey and Dr. George Riley

WNS3, March 25, 2011

Overview

• Standard sequential simulation techniques with substantial network traffic
  – Lengthy execution times
  – Large amount of computer memory

• Parallel and distributed discrete event simulation [1]
  – Allows a single simulation program to run on multiple interconnected processors
  – Reduced execution time! Larger topologies!

Overview (cont.)

• Important Note
  – It is mandatory that distributed simulations produce the same results as identical sequential simulations

Overview: terminology

• Logical Process (LP)
  – An individual sequential simulation

• Rank or system id
  – The unique number assigned to each LP

Figure 1. Simple point-to-point topology, distributed

Overview: related work

• Parallel/Distributed ns (PDNS) [2]
• Georgia Tech Network Simulator (GTNetS) [3]
  – Both use a federated approach and a conservative (blocking) mechanism

Implementation Details in ns-3

• LP communication
  – Message Passing Interface (MPI) standard
  – Send/Receive time-stamped messages
  – MpiInterface in ns-3

• Synchronization
  – Conservative algorithm using lookahead
  – DistributedSimulator in ns-3
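The idea of conservative synchronization with lookahead can be sketched in a few lines of Python. This is a toy model, not the ns-3 DistributedSimulator: the function name, the event format, and the ring-forwarding rule are all assumptions made for illustration. In each round every LP may safely process events earlier than the globally smallest pending timestamp plus the lookahead, because no remote message can arrive before that time.

```python
import heapq

def run_conservative(num_lps, lookahead_us, first_events, stop_time_us):
    """Toy conservative loop (hypothetical model, not ns-3 code).

    first_events: list of (rank, time_us, hops) tuples.
    Processing an event with hops > 0 sends a message to the next rank,
    which arrives lookahead_us later (the remote link delay).
    Returns the processed (time_us, rank) log and the number of rounds.
    """
    if lookahead_us <= 0:
        raise ValueError("conservative synchronization needs lookahead > 0")
    queues = [[] for _ in range(num_lps)]
    for rank, time_us, hops in first_events:
        heapq.heappush(queues[rank], (time_us, hops))

    log, rounds = [], 0
    while True:
        pending = [q[0][0] for q in queues if q]
        if not pending or min(pending) > stop_time_us:
            break
        # Global agreement: nothing can arrive before this horizon.
        horizon = min(pending) + lookahead_us
        rounds += 1
        outbox = []
        for rank, q in enumerate(queues):
            while q and q[0][0] < horizon and q[0][0] <= stop_time_us:
                time_us, hops = heapq.heappop(q)
                log.append((time_us, rank))
                if hops > 0:
                    arrival = time_us + lookahead_us
                    assert arrival >= horizon  # causality is preserved
                    outbox.append(((rank + 1) % num_lps, arrival, hops - 1))
        # Deliver the time-stamped messages after the round completes.
        for dest, time_us, hops in outbox:
            heapq.heappush(queues[dest], (time_us, hops))
    return log, rounds

# Two LPs, 200 ms lookahead, one message bouncing three times.
log, rounds = run_conservative(2, 200_000, [(0, 0, 3)], 10_000_000)
```

Times are kept as integer microseconds so that timestamps compare exactly. In a real distributed run, the minimum over all LPs would be obtained through a collective operation rather than a local loop.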

Implementation Details in ns-3 (cont.)

• Assigning rank to nodes
  – Handled manually in simulation script

• Remote point-to-point links
  – Created automatically between nodes with different ranks through point-to-point helper
  – When a packet is set to cross a remote point-to-point link, the packet is transmitted via MPI using our interface

• Merged since ns-3.8
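The rule for remote links can be illustrated with a small Python sketch. It is not the ns-3 point-to-point helper; the function name, node names, and topology are hypothetical. A link counts as remote exactly when its two endpoints carry different ranks.

```python
def split_links(node_rank, links):
    """Separate point-to-point links into local and remote ones.

    node_rank: dict mapping node name -> manually assigned rank.
    links: list of (node_a, node_b) pairs.
    """
    local, remote = [], []
    for a, b in links:
        if node_rank[a] == node_rank[b]:
            local.append((a, b))   # both ends live on the same LP
        else:
            remote.append((a, b))  # packets here would cross LPs via MPI
    return local, remote

# Hypothetical 4-node chain split across two ranks.
node_rank = {"n0": 0, "n1": 0, "n2": 1, "n3": 1}
links = [("n0", "n1"), ("n1", "n2"), ("n2", "n3")]
local_links, remote_links = split_links(node_rank, links)
```

Here only the n1 to n2 link is remote, so it is the single place where the simulation is divided.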

Implementation Details in ns-3: limitations

• All nodes created on all LPs, regardless of rank
  – It is up to the user to only install applications on the correct rank

• Nodes are assigned rank manually
  – An MpiHelper class could be used to assign rank to nodes automatically. This would enable easy distribution of existing simulation scripts.

• Pure distributed wireless is currently not supported
  – At least one point-to-point link must exist in order to divide the simulation
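Because every LP builds the full topology, a script has to filter which applications it installs. A minimal Python sketch of that filter follows, assuming a hypothetical list of planned (application, node) pairs; none of these names come from ns-3.

```python
def apps_for_rank(planned_apps, node_rank, my_rank):
    """Return only the applications whose node belongs to this LP."""
    return [(app, node) for app, node in planned_apps
            if node_rank[node] == my_rank]

# Hypothetical plan: every LP sees the same plan and the same ranks.
node_rank = {"n0": 0, "n1": 0, "n2": 1, "n3": 1}
planned_apps = [("sender", "n0"), ("sink", "n3")]

rank0_apps = apps_for_rank(planned_apps, node_rank, my_rank=0)
rank1_apps = apps_for_rank(planned_apps, node_rank, my_rank=1)
```

Each application ends up on exactly one LP, which is what keeps the distributed run equivalent to the sequential one.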

Performance Study

• DARPA NMS campus network simulation
  – Using nms-p2p-nix-distributed example available in ns-3
  – Allows creation of very large topologies
  – Any number of campus networks (CNs) are created and connected together
  – Different campus networks can be placed on different LPs
  – Tested with 2 CNs, 4 CNs, 6 CNs, 8 CNs, and 10 CNs

Performance Study: campus network topology

Figure 2. Campus network topology block [4] (figure labels: 200 ms, 10 µs)

Performance Study: Georgia Tech clusters used

• Hogwarts Cluster
  – 6 nodes, each with 2 quad-core processors and 48 GB of RAM

• Ferrari Cluster
  – Mix of machines, including 3 quad-core nodes and 8 dual-core nodes

Performance Study: simulations on Hogwarts

Figure 3. Campus network simulations on Hogwarts with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs

Performance Study: simulations on Ferrari

Figure 4. Campus network simulations on Ferrari with (A) 2 CNs (B) 4 CNs (C) 6 CNs (D) 8 CNs (E) 10 CNs

Performance Study: speedup

Figure 5. Speedup using distributed simulation for campus network topologies on the (A) Hogwarts cluster and (B) Ferrari cluster

Performance Study: speedup (cont.)

• Roughly linear speedup for Hogwarts (8.3 with 10 CNs), but not for Ferrari (2.4 with 10 CNs). Further investigation revealed that Ferrari consisted of a mix of machines, with the first two nodes considerably faster

            2 CNs   4 CNs   6 CNs   8 CNs   10 CNs
Hogwarts    1.8     3.3     5.8     6.9     8.3
Ferrari     1.9     1.6     2.0     2.3     2.4

Table 1: Speedup for Hogwarts and Ferrari
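One way to read Table 1 is as parallel efficiency, meaning speedup divided by the number of LPs. The Python sketch below assumes one LP per campus network, which the slides suggest but do not state outright; the variable and function names are illustrative only.

```python
# Speedup values copied from Table 1.
campus_networks = [2, 4, 6, 8, 10]
speedup = {
    "Hogwarts": [1.8, 3.3, 5.8, 6.9, 8.3],
    "Ferrari": [1.9, 1.6, 2.0, 2.3, 2.4],
}

def efficiency(speedups, lp_counts):
    """Parallel efficiency = speedup / number of LPs (1.0 is ideal).

    Assumption: one LP per campus network.
    """
    return [s / n for s, n in zip(speedups, lp_counts)]

hogwarts_eff = efficiency(speedup["Hogwarts"], campus_networks)
ferrari_eff = efficiency(speedup["Ferrari"], campus_networks)

for name, effs in (("Hogwarts", hogwarts_eff), ("Ferrari", ferrari_eff)):
    print(name, " ".join(f"{e:.2f}" for e in effs))
```

Under that assumption Hogwarts stays above 0.8 at every size, while Ferrari falls from 0.95 with 2 CNs to 0.24 with 10 CNs, which matches the observation that only its first two nodes were fast.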

Performance Study: changing the lookahead

• By changing the delay between campus networks, the lookahead was varied (200 ms to 10 µs)

• For Hogwarts and Ferrari, the 10 µs simulations ran, on average, 25% and 47% slower, respectively

• As expected, a smaller lookahead decreases the potential speedup, because the simulators must synchronize more frequently
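A rough worst-case estimate shows why the lookahead matters. If each synchronization round advances simulated time by at least one lookahead, the number of rounds is bounded by about the simulated time divided by the lookahead. The sketch below uses a hypothetical 10 s of simulated time, which is not a figure from the study.

```python
def max_sync_rounds(sim_time_us, lookahead_us):
    """Approximate upper bound on synchronization rounds.

    Assumes every round advances simulated time by at least one
    lookahead; sparse event queues can need far fewer rounds.
    """
    if lookahead_us <= 0:
        raise ValueError("lookahead must be positive")
    return -(-sim_time_us // lookahead_us)  # ceiling division on integers

SIM_TIME_US = 10_000_000  # hypothetical 10 s of simulated time

rounds_200ms = max_sync_rounds(SIM_TIME_US, 200_000)
rounds_10us = max_sync_rounds(SIM_TIME_US, 10)
ratio = rounds_10us // rounds_200ms
```

Moving from 200 ms to 10 µs raises this bound by a factor of 20,000. The measured slowdowns of 25% and 47% are far smaller, which suggests the bound is loose for this workload.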

Future Work

• MpiHelper class to facilitate creating distributed topologies
  – Nodes assigned rank automatically
  – Existing simulation scripts could be distributed easily

• Distributing the topology could occur at the node level, rather than the application level
  – Ghost nodes, save memory

• Pure distributed wireless support

Summary

• Distributed simulation in ns-3 allows a user to run a single simulation in parallel on multiple processors

• Very large-scale simulations can be run in ns-3 using the distributed simulator

• Distributed simulation in ns-3 offers potentially optimal linear speedup compared to identical sequential simulations

References

[1] R. M. Fujimoto. Parallel and Distributed Simulation Systems. Wiley Interscience, 2000.

[2] PDNS - Parallel/Distributed ns. http://www.cc.gatech.edu/computing/compass/pdns, March 2004.

[3] G. F. Riley. The Georgia Tech Network Simulator. In Proceedings of the ACM SIGCOMM workshop on Models, methods and tools for reproducible network research, MoMeTools ’03, pages 5-12, New York, NY, USA, 2003. ACM.

[4] Standard baseline NMS challenge topology. http://www.ssfnet.org/Exchange/gallery/baseline, July 2002.