1
Parallel Simulation Made Easy With OMNeT++
Y. Ahmet Şekerciuğlu (1), András Varga (2), Gregory K. Egan (1)
(1) CTIE, Monash University, Melbourne, Australia
(2) Omnest Global, Inc.
2
What is OMNeT++?
An open-source, generic simulation framework -- best suited for simulation of communication networks, multiprocessors and other complex distributed systems (further examples: queuing systems, hardware architectures, server farm, business processes, call centers)
C++-based simulation kernel plus a set of libraries and tools (GUI and command-line); platform: Unix, Windows
Active user community (mailing list has about 240 subscribers)
Home page: www.omnetpp.org
Commercial version also exists: www.omnest.com
3
Model Structure
Component-oriented approach:
The basic building block is a simple module (programmed in C++).
Simple modules can be grouped to form compound modules.
Modules are connected with each other.
4
Defining the Topology
NED (Network Description Language) defines topology: what modules exist, how they are connected and assembled to form larger modules

    //
    // Host with an Ethernet interface
    //
    module EtherStation
        parameters: ...
        gates: ...
        submodules:
            app: EtherTrafficGen;
            llc: EtherLLC;
            mac: EtherMAC;
        connections:
            app.out --> llc.hl_in;
            app.in <-- llc.hl_out;
            llc.ll_in <-- mac.hl_out;
            llc.ll_out --> mac.hl_in;
            mac.ll_in <-- in;
            mac.ll_out --> out;
    endmodule

The graphical editor GNED operates directly on NED files
5
Defining the Behaviour
Behaviour is encapsulated in simple modules.
A simple module:
sends messages
reacts to received messages
collects statistics
Simple modules are programmed in C++:
one can choose between process-oriented or event-oriented programming
the simulation class library covers commonly needed functionality, such as random number generation, statistics collection (histograms, etc.), queues and other containers, support for topology discovery and routing, etc.
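The event-oriented style mentioned above can be illustrated with a minimal, self-contained sketch. It deliberately does not use the OMNeT++ class library: the names `Message`, `ToyModule`, `CountingSink` and `runUntilEmpty` are hypothetical stand-ins, chosen only to show a module that reacts to messages, sends new ones and collects a statistic.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration, NOT the OMNeT++ API.
struct Message {
    double arrivalTime;   // simulation time at which the message is delivered
    std::string name;
};

// Orders the future event set so that the earliest message is on top.
struct LaterFirst {
    bool operator()(const Message& a, const Message& b) const {
        return a.arrivalTime > b.arrivalTime;
    }
};

using FutureEventSet = std::priority_queue<Message, std::vector<Message>, LaterFirst>;

// Event-oriented module: the event loop calls handleMessage() once per event.
class ToyModule {
  public:
    virtual ~ToyModule() = default;
    virtual void handleMessage(const Message& msg, FutureEventSet& fes) = 0;
};

// Counts received messages and answers every "job" with an "ack" one time unit later.
class CountingSink : public ToyModule {
  public:
    int received = 0;          // simple statistic
    double lastArrival = 0.0;  // arrival time of the most recent message

    void handleMessage(const Message& msg, FutureEventSet& fes) override {
        assert(msg.arrivalTime >= lastArrival);  // events must arrive in time order
        ++received;
        lastArrival = msg.arrivalTime;
        if (msg.name == "job")
            fes.push({msg.arrivalTime + 1.0, "ack"});  // schedule a follow-up event
    }
};

// Minimal event loop: pop the earliest event and hand it to the module.
// Returns the number of events processed.
inline int runUntilEmpty(ToyModule& module, FutureEventSet& fes) {
    int processed = 0;
    while (!fes.empty()) {
        Message msg = fes.top();
        fes.pop();
        module.handleMessage(msg, fes);
        ++processed;
    }
    return processed;
}
```

To try it, push a few `Message` values into a `FutureEventSet` and call `runUntilEmpty` with a `CountingSink`. In the process-oriented style the same logic would instead be written as one long-running function that blocks while waiting for messages.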
6
Running the Model Under the GUI
Without extra programming, one can:
run or single-step the simulation
examine scheduled events
explore modules and see message flow
monitor state of simulation and execution speed
see what’s happening inside the model
examine model object tree
7
Exploring Model Internals
examine contents of queues, messages and other objects
look at state variables and statistics
trace what one module is doing
step to next event in a module
find out pointer values for C++ debugging (gdb)
8
Exploring Model Internals
find all messages, all statistics objects or all queues (NEW), or any objects by their names, and inspect them
look at results being recorded
and much more…
9
Modular Architecture
UI and simulations are separated, and interact via a well-defined API
provides command-line and graphical user interface; user interfaces can be customized, or specialized ones can be created
enables embedding of simulations into larger applications
[Figure: architecture diagram with two regions labelled "user interface" and "simulation", and the blocks main(), ENVIR, CMDENV or TKENV or ..., SIM, Simulation model, Model Component Library]
10
Large-Scale Network Simulations
PDES: Parallel Discrete Event Simulation
Motivation:
speedup: make use of multiple CPUs to reduce execution time
ability to run large models by distributing resource requirements
We want to use clusters that can provide supercomputing power at affordable costs -- inexpensive workstations connected via a high-speed network
example: VPAC Linux cluster contains 96 IBM xSeries (dual 2.8GHz Xeon) PCs running Linux 2.4 yielding 629.7 Gflops; Myrinet interconnection provides 4μs end-to-end delay
communication method: MPI
MPI (Message Passing Interface) is a standard for high-performance computing
several implementations exist: LAM/MPI, MPICH, plus vendor-specific implementations
11
Why do we need large-scale simulations?
Research on Internet protocols and technologies extensively relies on simulation
Systems are too large and too complex for analytic treatment
Small experimental networks do not reflect large-scale dynamics
Large-scale simulations (10,000-1,000,000 nodes) are needed to:
… properly understand dynamics of routing protocols
… test various extensions proposed to improve performance of current Internet protocols
… demonstrate scalability of multicast protocols
… plus more
12
Parallel DES
Partitioning to Logical Processes (LPs):
LP1 (on CPU1)
LP2 (on CPU2)
LP3 (on CPU3)
Each partition maps to a separate LP with its own virtual time and list of scheduled events (Future Events Set)
LPs are executed on different processors
A synchronization mechanism (e.g. null messages; Chandy-Misra-Bryant, 1979) is needed to prevent causality violations
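The conservative idea behind null messages can be sketched as follows. An LP keeps one clock per input channel (the latest timestamp promised by that sender, through a real or a null message) and processes only those local events that are strictly earlier than the smallest of these clocks. The struct and member names below are hypothetical and the code is a simplified illustration under these assumptions, not the OMNeT++ implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch of conservative (null-message style) synchronization
// for a single logical process (LP).
struct LogicalProcess {
    double now = 0.0;                   // local virtual time
    double lookahead = 0.0;             // e.g. minimum link delay towards neighbours
    std::vector<double> channelClock;   // per input channel: latest timestamp promised by the sender
    std::vector<double> pendingEvents;  // timestamps of locally scheduled events

    // Earliest Input Time: no message with a smaller timestamp can still arrive.
    double earliestInputTime() const {
        if (channelClock.empty())
            return std::numeric_limits<double>::infinity();
        return *std::min_element(channelClock.begin(), channelClock.end());
    }

    // A real or null message on channel `ch` can only raise that channel's clock.
    void onMessage(std::size_t ch, double timestamp) {
        assert(ch < channelClock.size());
        channelClock[ch] = std::max(channelClock[ch], timestamp);
    }

    // Process every pending event that is strictly earlier than the EIT,
    // so that a message arriving exactly at the EIT cannot be overtaken.
    // Returns the number of events processed.
    int processSafeEvents() {
        std::sort(pendingEvents.begin(), pendingEvents.end());
        const double eit = earliestInputTime();
        int processed = 0;
        while (!pendingEvents.empty() && pendingEvents.front() < eit) {
            now = pendingEvents.front();
            pendingEvents.erase(pendingEvents.begin());
            ++processed;
        }
        return processed;
    }

    // Timestamp carried by an outgoing null message: a promise that this LP
    // will not send anything with an earlier timestamp.
    double nullMessageTimestamp() const { return now + lookahead; }
};
```

With channel clocks 5.0 and 3.0, only events before time 3.0 are safe. Once the slower neighbour promises time 6.0, the bound moves to 5.0 and further events can be processed. A larger lookahead lets neighbours advance further per null message.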
13
PDES Support in OMNeT++
To try running existing OMNeT++ models in parallel, you only need to:
1. enable parallel simulation: parallel-simulation=true
2. specify partitioning in configuration file
3. run
14
PDES Support in OMNeT++
Nearly every model can be run in parallel. Constraints:
modules may communicate via sending messages only (no direct method call or member access) unless mapped to the same processor
no global variables
limitations on direct sending (no sending to a submodule of another module, unless mapped to the same processor)
lookahead must be present in the form of link delays
currently we only support static topologies (this can be improved)
Models run without modification (no special instrumentation needed)
Partitioning is part of configuration, no model change required
follows “separation of model from experiments” principle
Code will be publicly released before the end of 2003 (available on request until then)
15
Extensible PDES Architecture
Pluggable communication library (“transport layer”); currently implemented:
MPI (Message Passing Interface), named pipes, shared directory (for demonstration and debug purposes only)
Pluggable PDES algorithm; currently implemented:
Null Message Algorithm, Ideal Simulation Protocol (for benchmarking), no synchronization (to demonstrate the need for synchronization)
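Why running without synchronization only serves as a demonstration can be seen in a small sketch. It assumes an LP that handles events in the order in which they become available and counts every event whose timestamp lies in the LP's past. The function name `countCausalityViolations` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: counts causality violations when an LP processes events
// in the order they become available, without any synchronization.
// Each entry of `processingOrder` is the timestamp of one event.
inline int countCausalityViolations(const std::vector<double>& processingOrder) {
    int violations = 0;
    double localTime = 0.0;  // virtual time of the LP
    for (double ts : processingOrder) {
        if (ts < localTime)
            ++violations;     // event lies in the LP's past: it arrived too late
        else
            localTime = ts;   // advance virtual time
    }
    return violations;
}
```

For example, the order 1.0, 2.0, 5.0, 3.0, 6.0 contains one violation: the event stamped 3.0 (say, a message from a slower LP) shows up after the local clock has already reached 5.0. A synchronization protocol exists to rule out exactly this situation.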
16
Parallel Simulation Architecture
[Figure: layered diagram of one Partition (LP): Simulation Model; Simulation Kernel (event scheduling, sending, receiving); Parallel simulation subsystem (Synchronization, Communication); communications library (MPI, sockets, etc.)]
17
Communication Layer
Must implement the following abstract interface:

    /**
     * Provides an abstraction layer above MPI,
     * PVM, shared-memory communications, etc…
     */
    class cParsimCommunications
    {
      public:
        virtual void init() = 0;
        virtual void shutdown() = 0;

        virtual cCommBuffer *createCommBuffer() = 0;
        virtual void recycleCommBuffer(cCommBuffer *buffer) = 0;

        virtual void send(cCommBuffer *buffer, int tag, int destination) = 0;
        virtual void broadcast(cCommBuffer *buffer, int tag) = 0;
        virtual void receiveBlocking(cCommBuffer *buffer, int& rcvdTag, int& srcProcId) = 0;
        virtual bool receiveNonblocking(cCommBuffer *buffer, int& rcvdTag, int& srcProcId) = 0;
        virtual void synchronize() = 0;
    };

    class cMPICommunications : public cParsimCommunications { … };
    class cNamedPipeCommunications : public cParsimCommunications { … };
    class cFileCommunications : public cParsimCommunications { … };

Communication buffers encapsulate pack/unpack operations. The cCommBuffer interface (abstract class) has multiple implementations for MPI, etc.
Simulation objects are able to pack/unpack themselves to/from communication buffers, using methods from the cCommBuffer interface.
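The pack/unpack idea can be sketched with a toy byte buffer. The slides do not show the actual cCommBuffer methods, so `ToyCommBuffer`, `ToyMessage` and all their member names are hypothetical. The sketch also assumes sender and receiver share the same byte order and type sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical byte buffer, loosely modelled on the idea of a communication
// buffer; the real cCommBuffer interface may look different.
class ToyCommBuffer {
  public:
    void pack(int value) { packBytes(&value, sizeof value); }
    void pack(double value) { packBytes(&value, sizeof value); }
    void pack(const std::string& s) {
        pack(static_cast<int>(s.size()));  // length prefix, then the characters
        packBytes(s.data(), s.size());
    }

    void unpack(int& value) { unpackBytes(&value, sizeof value); }
    void unpack(double& value) { unpackBytes(&value, sizeof value); }
    void unpack(std::string& s) {
        int len = 0;
        unpack(len);
        assert(len >= 0);
        s.resize(static_cast<std::size_t>(len));
        if (len > 0)
            unpackBytes(&s[0], s.size());
    }

    // True once every packed byte has been read back.
    bool fullyRead() const { return readPos_ == bytes_.size(); }

  private:
    void packBytes(const void* src, std::size_t n) {
        const char* p = static_cast<const char*>(src);
        bytes_.insert(bytes_.end(), p, p + n);
    }
    void unpackBytes(void* dst, std::size_t n) {
        assert(readPos_ + n <= bytes_.size());  // reading past the end is a bug
        std::memcpy(dst, bytes_.data() + readPos_, n);
        readPos_ += n;
    }

    std::vector<char> bytes_;
    std::size_t readPos_ = 0;
};

// A simulation object that knows how to serialize itself into a buffer.
struct ToyMessage {
    std::string name;
    int kind = 0;
    double timestamp = 0.0;

    void packInto(ToyCommBuffer& buf) const {
        buf.pack(name);
        buf.pack(kind);
        buf.pack(timestamp);
    }
    // Fields must be unpacked in exactly the order they were packed.
    void unpackFrom(ToyCommBuffer& buf) {
        buf.unpack(name);
        buf.unpack(kind);
        buf.unpack(timestamp);
    }
};
```

Keeping serialization inside the object means the transport (MPI, pipes, files) only ever sees an opaque buffer, which is what makes the communication layer pluggable.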
18
Model Partitioning
OMNeT++ uses placeholder modules and proxy gates:
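[Figure: nodeA and nodeB partitioned across two processes. On CPU0, nodeA is instantiated and nodeB is a placeholder; on CPU1, nodeB is instantiated and nodeA is a placeholder. The two partitions are linked by communication (MPI, pipe, etc.)]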
19
Model Partitioning, cont’d
If compound modules themselves are distributed across LPs, the solution is slightly more complicated:
[Figure: a compound module distributed across CPU0, CPU1 and CPU2; the elements in each partition are labelled "simple module", "(placeholder)" and "(placeholder for compound module)"]
20
Placeholder Approach
Advantage of placeholder approach: when simulating telecommunication networks, all nodes (routers, ASes, hosts, etc.) are present (at least as placeholders) in all LPs, so algorithms such as topology discovery for routing can proceed unhindered.
[Figure: LP1 (on CPU1), with nodes of other partitions shown as placeholders]
21
Synchronization Layer
Parallel simulation protocols must implement the following abstract interface:

    /**
     * Abstract base class for parallel simulation algorithms...
     */
    class cParsimSynchronizer : public cScheduler
    {
      public:
        virtual void startRun() = 0;
        virtual void endRun() = 0;

        /**
         * Scheduler function -- it comes from cScheduler interface...
         */
        virtual cMessage *getNextEvent() = 0;

        /**
         * Hook, called when a cMessage is sent out of the segment...
         */
        virtual void processOutgoingMessage(cMessage *msg, int procId, int moduleId, int gateId, void *data) = 0;
    };
22
Synchronization Layer
Currently implemented parallel simulation algorithms:
23
Example: Distributed CQN
Closed Queuing Network (CQN), described in “Performance Evaluation of Conservative Algorithms”, R. Bagrodia et al., 2000
N tandem queues (switch+queues); exponential service times; propagation delay on all links
Lookahead: propagation delay on links
[Figure: CQN tandems distributed across CPU0, CPU1 and CPU2]
24
Example: Distributed CQN
OMNeT++ model for CQN wraps tandems into compound modules
25
Configuring for Parallel Execution
Configuration file:
enable parallel simulation
select communication library and parallel simulation protocol
assign modules to processors
Each partition is simulated in its own process.

    [General]
    parallel-simulation=true
    #parsim-communications-class="cFileCommunications"
    parsim-communications-class="cMPICommunications"
    parsim-synchronization-class="cNullMessageProtocol"

    [Partitioning]
    *.tandemQueue[0]*.segment-id=0
    *.tandemQueue[1]*.segment-id=1
    *.tandemQueue[2]*.segment-id=2
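How the [Partitioning] entries might map module paths to partitions can be sketched with a simplified wildcard matcher. This is only an illustration: it assumes that `*` matches any character sequence and that the first matching rule wins, which may differ from the actual OMNeT++ pattern semantics. The function names and the module paths in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified wildcard match: '*' matches any (possibly empty) character
// sequence, every other character matches itself. Assumption for illustration;
// the real configuration pattern rules may differ.
inline bool wildcardMatch(const std::string& pattern, const std::string& text,
                          std::size_t p = 0, std::size_t t = 0) {
    if (p == pattern.size())
        return t == text.size();
    if (pattern[p] == '*') {
        // Try every possible length for the part of `text` consumed by '*'.
        for (std::size_t k = t; k <= text.size(); ++k)
            if (wildcardMatch(pattern, text, p + 1, k))
                return true;
        return false;
    }
    return t < text.size() && pattern[p] == text[t] &&
           wildcardMatch(pattern, text, p + 1, t + 1);
}

// One rule per configuration line: (module path pattern, segment id).
using PartitionRules = std::vector<std::pair<std::string, int>>;

// Returns the segment id of the first rule whose pattern matches the module
// path, or -1 if no rule matches.
inline int segmentIdFor(const PartitionRules& rules, const std::string& modulePath) {
    for (const auto& rule : rules)
        if (wildcardMatch(rule.first, modulePath))
            return rule.second;
    return -1;
}
```

With the three rules from the configuration above, a path such as `cqn.tandemQueue[1].queue[3]` would resolve to segment 1. Each tandem, including everything inside it, therefore ends up in one process.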
26
CQN Partitioning in Tkenv
If simulation executes under the GUI, placeholder modules and proxy gates are shown
27
Running Parallel Simulation
If GUI is used, operation of the Null Message Algorithm can be followed in trace windows
28
Experimental Results
The present simulation framework was used to verify the efficiency criterion for the Null Message Algorithm: LE >> 1 and λ=LE/P >> 1 are necessary for efficient PDES execution
see the paper “A Practical Efficiency Criterion For The Null Message Algorithm”, András Varga, Y. Ahmet Şekerciuğlu, Gregory K. Egan, in the Proceedings
29
Ongoing Work
Optimisations on the parallel simulation kernel
Create support for node mobility across LPs
Test large-scale IPv6 simulations (using the IPv6Suite for OMNeT++, developed at CTIE, Monash University, Australia)
Further verification and refinement of the efficiency criteria for the Null-Message Algorithm