An Integrated Experimental Environment for Distributed Systems and Networks
B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar. An Integrated Experimental Environment for Distributed Systems and Networks. Presented by Sunjun Kim and Jonathan di Costanzo, 2009/04/13.
![Page 1: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/1.jpg)
An Integrated Experimental Environment for Distributed Systems and Networks
B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar
Presented by Sunjun Kim and Jonathan di Costanzo
2009/04/13
![Page 2: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/2.jpg)
Outline
▪ Motivation
▪ Netbed structure
▪ Validation and testing
▪ Netbed contribution
▪ Conclusion
![Page 3: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/3.jpg)
![Page 4: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/4.jpg)
Background
▪ Researchers need a platform on which they can develop, debug, and evaluate their systems
▪ One lab is not enough: it lacks resources
  ▪ More computers are needed
  ▪ Scalability in terms of distance and number of nodes cannot be reached
  ▪ Developing large-scale experiments takes a huge amount of time
![Page 5: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/5.jpg)
Previous approaches
▪ Simulation: NS
  ▪ Controlled, repeatable environment
  ▪ Loses accuracy due to abstraction
▪ Live networks: PlanetLab
  ▪ Achieves realism
  ▪ Experiments are not easy to repeat
▪ Emulation: Dummynet, nse
  ▪ Controlled packet loss and delay
  ▪ Manual configuration is tedious
![Page 6: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/6.jpg)
Netbed ideas
▪ Derives from “Emulab Classic”
  ▪ A universally available time- and space-shared network emulator
  ▪ Automatic configuration from an ns script
▪ Adds virtual topologies for network experimentation
  ▪ Integrates simulation, emulation, and live-network experimentation with wide-area nodes in a single framework
![Page 7: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/7.jpg)
Netbed goals
▪ Accuracy
  ▪ Provide an artifact-free environment
▪ Universality
  ▪ Anyone can use anything in whatever way they want
▪ Conservative policy for resource allocation
  ▪ No multiplexing (virtual machines)
  ▪ The resources of one node can be fully utilized
![Page 8: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/8.jpg)
Resources
▪ Local-area resources
▪ Distributed resources
▪ Simulated resources
▪ Emulated resources
▪ WAN emulator (integrated yet)
▪ PlanetLab, ModelNet (still in progress)
![Page 9: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/9.jpg)
![Page 10: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/10.jpg)
Netbed structure
▪ Resources
▪ Life cycle
![Page 11: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/11.jpg)
Local-Area resources
▪ 3 clusters
  ▪ 168 PCs in Utah, 48 in Kentucky, and 40 in Georgia
▪ Each node can be used as an edge node, router, traffic-shaping node, or traffic generator
▪ A machine is used exclusively by one experiment while that experiment runs
▪ An OS is provided but is entirely replaceable
![Page 12: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/12.jpg)
Local-Area resources
![Page 13: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/13.jpg)
Distributed resources
▪ Also called wide-area resources
  ▪ 50-60 nodes at approximately 30 sites
  ▪ Provide the characteristics of a live network
▪ Very few nodes
  ▪ These nodes are shared among many users
  ▪ FreeBSD jail mechanism (a kind of virtual machine)
  ▪ Non-root access
![Page 14: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/14.jpg)
Distributed resources
![Page 15: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/15.jpg)
Simulated resources
▪ Based on nse (ns emulation)
  ▪ Enables interaction with real traffic
▪ Provides scalability beyond physical resources
  ▪ Many simulated nodes can be multiplexed onto one physical node
![Page 16: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/16.jpg)
Emulated resources
▪ VLANs
  ▪ Emulate wide-area links within a local-area network
▪ Dummynet
  ▪ Emulates queues and bandwidth limits, introducing delay and packet loss between physical nodes
  ▪ Dummynet nodes act as Ethernet bridges, transparent to experimental traffic
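Note: the effect of a shaped link on a single packet can be sketched roughly as follows. This is a minimal model under stated assumptions, not Dummynet's implementation: it ignores queueing, and the names `one_way_delay_ms` and `shape` are hypothetical, as are the example parameters.

```python
import random

def one_way_delay_ms(packet_bytes, bandwidth_bps, latency_ms):
    """Serialization time on the shaped link plus its fixed latency, in ms."""
    serialization_ms = packet_bytes * 8 / bandwidth_bps * 1000
    return serialization_ms + latency_ms

def shape(packet_sizes, bandwidth_bps, latency_ms, loss_rate, rng):
    """Return (size, delay_ms) for every packet that survives random loss."""
    delivered = []
    for size in packet_sizes:
        if rng.random() < loss_rate:
            continue  # dropped by the emulated link
        delivered.append((size, one_way_delay_ms(size, bandwidth_bps, latency_ms)))
    return delivered

rng = random.Random(1)  # fixed seed so a run can be repeated
result = shape([1500] * 1000, 1_500_000, 20, 0.01, rng)
```

With these assumed figures, a 1500-byte packet on a 1.5 Mbps, 20 ms link needs about 8 ms of serialization plus 20 ms of latency.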
![Page 17: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/17.jpg)
![Page 18: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/18.jpg)
Life cycle
![Page 19: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/19.jpg)
Life cycle
$ns duplex-link $A $B 1.5Mbps 20ms
▪ Stages: Specification, Parsing, Global Resource Allocation, Node Self-Configuration, Experiment Control
▪ Swap Out / Swap In
[Figure: experiment life cycle diagram]
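Note: to make the first stages concrete, the link line on this slide can be turned into an abstract link record roughly as follows. This is only an illustrative sketch: Netbed's front end is a Tcl/ns parser, and the function names, unit tables, and record layout here are assumptions.

```python
import re

# Assumed unit tables; only the spellings used on the slide are really needed.
BANDWIDTH_UNITS = {"bps": 1.0, "kbps": 1e3, "mbps": 1e6, "gbps": 1e9}
DELAY_UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}

def parse_quantity(text, units):
    """Split a value such as '1.5Mbps' into a number scaled by its unit."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)([A-Za-z]+)", text)
    if match is None or match.group(2).lower() not in units:
        raise ValueError(f"cannot parse quantity: {text!r}")
    return float(match.group(1)) * units[match.group(2).lower()]

def parse_duplex_link(line):
    """Turn one '$ns duplex-link' line into a plain dictionary."""
    tokens = line.split()
    if len(tokens) < 6 or tokens[:2] != ["$ns", "duplex-link"]:
        raise ValueError(f"not a duplex-link line: {line!r}")
    node_a, node_b, bandwidth, delay = tokens[2:6]
    return {
        "nodes": (node_a.lstrip("$"), node_b.lstrip("$")),
        "bandwidth_bps": parse_quantity(bandwidth, BANDWIDTH_UNITS),
        "delay_s": parse_quantity(delay, DELAY_UNITS),
    }

link = parse_duplex_link("$ns duplex-link $A $B 1.5Mbps 20ms")
```

A record like `link` is the kind of abstraction that later stages could bind to physical or simulated resources.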
![Page 20: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/20.jpg)
Accessing Netbed
▪ Experiment creation
  ▪ A project leader proposes a project on the web
  ▪ Netbed staff accept or reject the project
  ▪ All experiments are accessible from the web
▪ Experiment management
  ▪ Log on to the allocated nodes or to the usershost (file server)
  ▪ The file server sends the OS images and the home and project directories to the other nodes
![Page 21: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/21.jpg)
Accessing Netbed
![Page 22: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/22.jpg)
Specification
▪ Experimenters use ns scripts written in Tcl
  ▪ They can use as many functions and loops as they want
▪ Netbed defines a small set of ns extensions
  ▪ Specific hardware can be chosen
  ▪ Simulation, emulation, or real implementation
  ▪ Program objects can be defined using a Netbed-specific ns extension
▪ A graphical UI can also be used
![Page 23: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/23.jpg)
Parsing
▪ Front-end Tcl/ns parser
  ▪ Recognizes the subset of ns relevant to topology and traffic generation
▪ Database
  ▪ Stores an abstraction of everything about the experiment
    ▪ Fixed generated events
    ▪ Information about hardware, users, and experiments
    ▪ Procedures
![Page 24: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/24.jpg)
Parsing
![Page 25: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/25.jpg)
Global Resource Allocation
▪ Binds abstractions from the database to physical or simulated entities
  ▪ Best effort to match the specification
  ▪ On-demand allocation (no reservations)
▪ 2 different algorithms for local and distributed nodes (different constraints)
  ▪ Simulated annealing
  ▪ Genetic algorithm
![Page 26: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/26.jpg)
Global Resource Allocation
▪ Over-reservation of the bottleneck
  ▪ Inter-switch bandwidth is too small (2 Gbps)
  ▪ This goes against their conservative policy
▪ Dynamic changes to the topology are allowed
  ▪ Nodes can be added and removed
▪ Consistent naming across instantiations
  ▪ Virtualization of IP addresses and host names
![Page 27: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/27.jpg)
Node Self-Configuration
▪ Dynamic linking and loading from the DB
  ▪ Gives each node the proper context (hostname, disk image, script to start the experiment)
▪ No persistent configuration state
  ▪ Only volatile memory on the node
  ▪ If required, the current soft state can be stored in the DB as hard state
    ▪ Swap out / swap in
![Page 28: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/28.jpg)
Node Self-Configuration
▪ Local nodes
  ▪ All nodes are rebooted in parallel
  ▪ Each node contacts the masterhost, which loads the kernel specified by the database
  ▪ A second-level boot may be required
▪ Distributed nodes
  ▪ Boot from a CD-ROM, then contact the masterhost
  ▪ A new FreeBSD jail is instantiated
  ▪ Tested Master Control Client
![Page 29: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/29.jpg)
Experiment Control
▪ Netbed supports dynamic experiment control
  ▪ Start, stop, and resume processes, traffic generators, and network monitors
▪ Signals between nodes
  ▪ Uses a publish/subscribe event routing system
  ▪ Static events are retrieved from the DB
  ▪ Dynamic events are possible
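Note: the publish/subscribe idea can be sketched in a few lines. This is an in-process toy, not Netbed's event system: the `EventBus` class, the topic name `trafficgen`, and the event tuples are all hypothetical, and no real timing or network delivery is modeled.

```python
from collections import defaultdict

class EventBus:
    """Tiny in-process publish/subscribe router (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver payload to every handler subscribed to topic."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)

log = []
bus = EventBus()
bus.subscribe("trafficgen", lambda event: log.append(("trafficgen", event)))

# Static events: known in advance (in Netbed they would come from the DB).
static_events = [(10, "trafficgen", "start"), (60, "trafficgen", "stop")]
for _, topic, action in sorted(static_events):
    bus.publish(topic, action)

# Dynamic event: injected while the experiment is running.
bus.publish("trafficgen", "resume")
```

The publisher never needs to know which nodes listen; it only names a topic.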
![Page 30: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/30.jpg)
Experiment Control
▪ The ns configuration file provides only high-level control
▪ Experimenters can also exercise low-level control
  ▪ On local nodes: root privileges
    ▪ Kernel modification and access to raw sockets
  ▪ On distributed nodes: jail-restricted root privileges
    ▪ Access to raw sockets with a specific IP address
▪ Each local node has a separate network, isolated from the experimental one
  ▪ Lets us control a node through a tunnel, as if we were on it, without interfering with the experiment
![Page 31: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/31.jpg)
Preemption and Scheduling
▪ Netbed tries to prevent idling
  ▪ 3 metrics: traffic, use of pseudo-terminal devices, and CPU load average
  ▪ To be sure, a message is sent to the user, who can object manually
  ▪ A challenge for distributed nodes with several jails
▪ Netbed offers automated batch experiments
  ▪ For when no interaction is required
  ▪ Lets an experiment wait for available resources
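Note: a rough sketch of combining the three idleness metrics is shown below. The thresholds, field names, and the rule that every node must be quiet are assumptions made for illustration; the slides do not give Netbed's actual rule.

```python
from dataclasses import dataclass

@dataclass
class NodeActivity:
    packets_per_min: float   # experimental network traffic
    tty_active: bool         # recent use of pseudo-terminal devices
    load_average: float      # CPU load average

def node_is_idle(activity, max_packets_per_min=10.0, max_load=0.1):
    """A node counts as idle only if all three metrics are quiet."""
    return (
        activity.packets_per_min <= max_packets_per_min
        and not activity.tty_active
        and activity.load_average <= max_load
    )

def experiment_is_idle(nodes, **thresholds):
    """An experiment is a swap-out candidate when every node is idle."""
    return bool(nodes) and all(node_is_idle(n, **thresholds) for n in nodes)

quiet = NodeActivity(packets_per_min=0.0, tty_active=False, load_average=0.01)
busy = NodeActivity(packets_per_min=5000.0, tty_active=False, load_average=0.0)
candidate = experiment_is_idle([quiet, quiet])
```

Because such a heuristic can be wrong, the user is still notified and can object before anything is swapped out.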
![Page 32: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/32.jpg)
![Page 33: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/33.jpg)
Validation
▪ 1st row: emulation overhead
▪ Dummynet gives better results than nse
![Page 34: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/34.jpg)
Validation
▪ They expect better results from future improvements to nse
![Page 35: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/35.jpg)
Validation
▪ 5 nodes communicate over 10 links
▪ Evaluation of a derivative of DOOM
▪ The goal is to send 30 tics/sec
![Page 36: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/36.jpg)
Testing
▪ Challenges
  ▪ Depends on physical artifacts (cannot be cloned)
  ▪ Should evaluate arbitrary programs
  ▪ Must run continuously
▪ Minibed: 8 separate Netbed nodes
▪ Test mode: prevents hardware modifications
▪ Full-test mode: provides isolated hardware
![Page 37: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/37.jpg)
![Page 38: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/38.jpg)
Practical benefits
▪ All-in-one set of tools
▪ Automated and efficient realization of virtual topologies
▪ Efficient use of resources through time-sharing and space-sharing
▪ Increased fault tolerance (resource virtualization)
![Page 39: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/39.jpg)
Practical benefits
▪ Examples
  ▪ The “dumbbell” network
    ▪ 3 h 15 min --> 3 min
  ▪ Better utilization of a scarce and expensive infrastructure: 12 months and 168 PCs in Utah
    ▪ Time-sharing (swapping): 1064 nodes
    ▪ Space-sharing (isolation): 19.1 years
  ▪ Virtualization of names and IP addresses
    ▪ No problems with swapping
![Page 40: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/40.jpg)
Key services
Experiment creation and swapping
▪ Phases: mapping, reservation, reboot issuing, reboot, miscellaneous
▪ Booting a custom disk image takes twice as long
![Page 41: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/41.jpg)
Key services
Mapping local resources: assign
▪ Matches the user’s requirements
▪ Based on simulated annealing
▪ Tries to minimize the number of switches and the inter-switch bandwidth
▪ Takes less than 13 seconds
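Note: the flavor of this mapping problem can be sketched with a small simulated-annealing loop. This is not the assign program: the cost function (count of virtual links that cross switches), the swap move, the cooling schedule, and all names are assumptions chosen to keep the example short. It needs at least two nodes.

```python
import math
import random

def cut_links(assignment, links):
    """Number of virtual links whose endpoints sit on different switches."""
    return sum(1 for a, b in links if assignment[a] != assignment[b])

def anneal(nodes, links, switches, capacity, steps=5000, t0=2.0, cooling=0.999, seed=0):
    if len(nodes) > len(switches) * capacity:
        raise ValueError("not enough switch ports for the requested nodes")
    rng = random.Random(seed)
    # Start by filling switches in order; swaps below keep per-switch counts fixed.
    assignment = {n: switches[i // capacity] for i, n in enumerate(nodes)}
    cost = cut_links(assignment, links)
    best, best_cost = dict(assignment), cost
    temperature = t0
    for _ in range(steps):
        a, b = rng.sample(nodes, 2)
        assignment[a], assignment[b] = assignment[b], assignment[a]
        new_cost = cut_links(assignment, links)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost  # accept the move, even if it is worse
            if cost < best_cost:
                best, best_cost = dict(assignment), cost
        else:
            assignment[a], assignment[b] = assignment[b], assignment[a]  # undo
        temperature *= cooling
    return best, best_cost

# Two triangles of virtual nodes, listed interleaved so the first guess is poor.
nodes = ["a1", "b1", "a2", "b2", "a3", "b3"]
links = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
         ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]
best, best_cost = anneal(nodes, links, ["sw1", "sw2"], capacity=3)
```

Accepting some worse moves while the temperature is high is what lets the search leave a poor initial placement.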
![Page 42: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/42.jpg)
Key services
Mapping local resources: assign
![Page 43: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/43.jpg)
Key services
Mapping distributed resources: wanassign
▪ Different constraints
  ▪ Nodes are fully connected via the Internet
  ▪ “Last mile”: type instead of topology
  ▪ Specific topologies may be guaranteed by requesting particular network characteristics (bandwidth, latency, and loss)
  ▪ Based on a genetic algorithm
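Note: a genetic algorithm for this kind of matching can be sketched as below. This is not wanassign itself: the fitness function (latency mismatch only), the crossover and mutation operators, the population settings, and the tiny latency matrix are all assumptions for illustration. It expects at least two virtual nodes and no more virtual nodes than physical ones.

```python
import random

def error(mapping, wanted, measured):
    """Total mismatch between requested and measured latency, in ms."""
    return sum(abs(measured[mapping[i]][mapping[j]] - want)
               for (i, j), want in wanted.items())

def crossover(p1, p2, rng):
    """Prefix of one parent, completed with unused genes of the other."""
    cut = rng.randrange(1, len(p1))
    head = p1[:cut]
    tail = [g for g in p2 if g not in head][:len(p1) - cut]
    return head + tail

def mutate(individual, n_physical, rng):
    """Move one virtual node to an unused physical node, if any exists."""
    mutated = list(individual)
    unused = [p for p in range(n_physical) if p not in mutated]
    if unused:
        mutated[rng.randrange(len(mutated))] = rng.choice(unused)
    return mutated

def evolve(n_virtual, wanted, measured, generations=100, pop_size=30, seed=0):
    rng = random.Random(seed)
    n_physical = len(measured)
    population = [rng.sample(range(n_physical), n_virtual) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: error(ind, wanted, measured))
        parents = population[:pop_size // 2]  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < 0.3:
                child = mutate(child, n_physical, rng)
            children.append(child)
        population = parents + children
    return min(population, key=lambda ind: error(ind, wanted, measured))

# measured[i][j]: assumed latency in ms between physical wide-area nodes i and j.
measured = [[0, 10, 50], [10, 0, 80], [50, 80, 0]]
wanted = {(0, 1): 10}  # virtual nodes 0 and 1 should be about 10 ms apart
best = evolve(2, wanted, measured)
```

Each individual maps virtual node `i` to physical node `best[i]`, and both operators keep that mapping free of duplicates.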
![Page 44: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/44.jpg)
Key services
Mapping distributed resources: wanassign
▪ 16 nodes and 100 edges: ~1 sec
▪ 256 nodes and 40 edges per node: 10 min to 2 h
![Page 45: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/45.jpg)
Key services
Disk reloading
▪ 2 possibilities
  ▪ Complete disk image loading
  ▪ Incremental synchronization (hash tables on files or blocks)
▪ Good
  ▪ Faster (in their specific case)
  ▪ No corruption
▪ Bad
  ▪ Wastes time when similar images are needed repeatedly
  ▪ Pace reloading of freed nodes (reserved for 1 user)
![Page 46: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/46.jpg)
Key services
Disk reloading: Frisbee
▪ Performance techniques
  ▪ Uses a domain-specific algorithm to skip unused blocks
  ▪ Delivers images via a custom reliable multicast protocol
▪ 117 sec for 80 nodes, writing 550 MB instead of 3 GB
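Note: the benefit of skipping unused blocks can be sketched as follows. This is a toy model, not Frisbee: the `allocated` set stands in for whatever filesystem knowledge the real tool uses, the function names are hypothetical, and the final ratio simply takes 3 GB as 3000 MB.

```python
def build_image(disk, allocated):
    """Keep only (index, data) pairs for blocks the filesystem marks as used."""
    return [(index, disk[index]) for index in sorted(allocated)]

def write_image(image, target):
    """Write the image's blocks into target; return how many blocks were written."""
    for index, data in image:
        target[index] = data
    return len(image)

disk = [b"boot", b"", b"data", b"", b"", b"more"]  # 6 blocks, 3 in use
allocated = {0, 2, 5}
image = build_image(disk, allocated)
target = [b""] * len(disk)
written = write_image(image, target)

fraction_written = 550 / 3000  # slide figures, taking 3 GB as 3000 MB
```

Only half of the toy disk is written here; with the slide's figures, roughly 18% of the full disk size is written.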
![Page 47: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/47.jpg)
Key services
Scaling of simulated resources
▪ Simulated nodes are multiplexed onto 1 physical node
  ▪ Must keep up with real time, taking into account the user’s specification: rate of events
▪ Test of a live TCP at 2 Mb CBR
  ▪ 850 MHz PC with UDP background traffic at 2 Mb CBR / 50 ms
  ▪ Able to handle 150 links for 300 nodes
  ▪ Routing is a problem in very complex topologies
![Page 48: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/48.jpg)
Example of a new possibility
▪ Batch experiments can be programmed to vary only one parameter at a time
▪ The Armada file system from Oldfield & Kotz
  ▪ 7 bandwidths x 5 latencies x 3 application settings x 4 configurations of 20 nodes
  ▪ 420 tests in 30 hrs (~4.3 min per experiment)
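Note: the arithmetic behind this sweep can be checked with a short script. The parameter values are placeholders (the slide gives only how many of each there are), and the variable names are hypothetical.

```python
from itertools import product

# Placeholder labels; only the counts (7, 5, 3, 4) come from the slide.
bandwidths = [f"bw{i}" for i in range(7)]
latencies = [f"lat{i}" for i in range(5)]
app_settings = [f"app{i}" for i in range(3)]
node_configs = [f"cfg{i}" for i in range(4)]

# One batch experiment per combination of the four parameters.
experiments = list(product(bandwidths, latencies, app_settings, node_configs))
minutes_per_experiment = 30 * 60 / len(experiments)
```

The product has 7 x 5 x 3 x 4 = 420 combinations, and 1800 minutes spread over 420 runs gives about 4.3 minutes each.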
![Page 49: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/49.jpg)
![Page 50: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/50.jpg)
Summary
▪ Netbed integrates 3 test environments
▪ Reuses ns scripts
▪ Quick setup of the test environment
▪ Virtualization techniques provide the artifact-free environment
▪ Enables qualitatively new experimental techniques
![Page 51: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/51.jpg)
Future Work
▪ Reliability / fault tolerance
▪ Distributed debugging: checkpoint / rollback
▪ Security “Petri dish”
![Page 52: An integrated Experimental Environment for Distributed Systems and Networks](https://reader035.vdocuments.us/reader035/viewer/2022070422/568164d8550346895dd71df2/html5/thumbnails/52.jpg)
Thank you
Any questions?