An Integrated Experimental Environment for Distributed Systems and Networks

An integrated Experimental Environment for Distributed Systems and Networks B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar Presented by Sunjun Kim Jonathan di Costanzo 2009/04/13


Page 1: An  integrated Experimental Environment  for  Distributed  Systems and Networks

An integrated Experimental Environment for Distributed Systems and Networks

B. White, J. Lepreau, L. Stoller, R. Ricci, S. Guruprasad, M. Newbold, M. Hibler, C. Barb, A. Joglekar

Presented by Sunjun Kim, Jonathan di Costanzo

2009/04/13

Page 2: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Outline

▪ Motivation
▪ Netbed structure
▪ Validation and testing
▪ Netbed contribution
▪ Conclusion


Page 4: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Background

Researchers need a platform on which they can develop, debug, and evaluate their systems

One lab is not enough: it lacks resources
▪ More computers are needed
▪ Scalability in terms of distance and number of nodes cannot be reached
▪ Developing large-scale experiments takes a huge amount of time

Page 5: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Previous approaches

Simulation: NS
▪ Controlled, repeatable environment
▪ Loses accuracy due to abstraction

Live networks: PlanetLab
▪ Achieves realism
▪ Experiments are not easy to repeat

Emulation: Dummynet, NSE
▪ Controlled packet loss and delay
▪ Manual configuration is tedious

Page 6: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Netbed ideas

Derives from “Emulab Classic”
▪ A universally available time- and space-shared network emulator
▪ Automatic configuration from an ns script

Adds virtual topologies for network experimentation

Integrates simulation, emulation, and live-network experimentation with wide-area nodes in a single framework

Page 7: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Netbed goals

Accuracy
▪ Provide an artifact-free environment

Universality
▪ Anyone can use anything the way they want

Conservative resource allocation policy
▪ No multiplexing (virtual machines)
▪ The resources of one node can be fully utilized

Page 8: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Resources

▪ Local-area resources
▪ Distributed resources
▪ Simulated resources
▪ Emulated resources

WAN emulator (integrated yet)

PlanetLab, ModelNet (still in progress)


Page 10: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Netbed structure

Resource

Life cycle


Page 11: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Local-Area resources

3 clusters
▪ 168 PCs in Utah, 48 in Kentucky, and 40 in Georgia

Each node can be used as
▪ Edge node, router, traffic-shaping node, or traffic generator

Exclusive use of a machine during an experiment

A default OS is provided but is entirely replaceable

Page 12: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Local-Area resources


Page 13: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Distributed resources

Also called wide-area resources
▪ 50-60 nodes at approximately 30 sites
▪ Provide the characteristics of a live network

Very few nodes
▪ These nodes are shared among many users
▪ FreeBSD Jail mechanism (a kind of virtual machine)
▪ Non-root access

Page 14: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Distributed resources


Page 15: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Simulated resources

Based on nse (ns emulation)
▪ Enables interaction with real traffic

Provides scalability beyond physical resources
▪ Many simulated nodes can be multiplexed on one physical node

Page 16: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Emulated resources

VLANs
▪ Emulate wide-area links within a local-area network

Dummynet
▪ Emulates queue and bandwidth limitations, introducing delays and packet loss between physical nodes
▪ Dummynet nodes act as Ethernet bridges, transparent to experimental traffic
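As a rough illustration of what bandwidth, delay, and loss shaping means for a single packet, the sketch below models one shaped link in Python. It is not Dummynet code: the `ShapedLink` class and its fields are hypothetical names, queueing is ignored, and the 1.5 Mbps / 20 ms values are borrowed from the ns example line on the Life cycle slide.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShapedLink:
    """Hypothetical model of one shaped link (not Dummynet's real interface)."""
    bandwidth_bps: float     # emulated bandwidth limit, in bits per second
    delay_s: float           # added one-way delay, in seconds
    loss_rate: float = 0.0   # probability that a packet is dropped

    def transit_time(self, packet_bytes: int) -> float:
        """Serialization time plus the fixed delay; queueing is ignored."""
        return packet_bytes * 8 / self.bandwidth_bps + self.delay_s

    def deliver(self, packet_bytes: int, rng: random.Random) -> Optional[float]:
        """Return the transit time, or None if the packet is dropped."""
        if rng.random() < self.loss_rate:
            return None
        return self.transit_time(packet_bytes)


link = ShapedLink(bandwidth_bps=1.5e6, delay_s=0.020)
# A 1500-byte packet needs 8 ms of serialization plus the 20 ms delay.
one_packet = link.transit_time(1500)
```

Passing a seeded `random.Random` to `deliver` keeps a lossy run repeatable, which matches the repeatability goal stated for emulation.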


Page 18: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Life cycle


Page 19: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Life cycle

Example specification line: $ns duplex-link $A $B 1.5Mbps 20ms

Stages shown in the figure:
▪ Specification
▪ Parsing
▪ Global Resource Allocation
▪ Node Self-Configuration
▪ Experiment Control
▪ Swap Out / Swap In
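To make the parsing stage concrete, here is a minimal Python sketch that turns the `duplex-link` line from this slide into a structured record. It is an assumption-laden illustration, not the Netbed front end (which is a Tcl/ns parser): `LinkSpec`, `parse_duplex_link`, and the unit tables are hypothetical, and only the exact six-token form shown on the slide is accepted.

```python
import re
from typing import Dict, NamedTuple


class LinkSpec(NamedTuple):
    """Hypothetical record for one parsed link."""
    a: str
    b: str
    bandwidth_bps: float
    delay_s: float


# Unit tables are assumptions covering only the spellings used on the slide.
_BANDWIDTH_UNITS = {"kbps": 1e3, "mbps": 1e6, "gbps": 1e9}
_DELAY_UNITS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}


def _parse_quantity(text: str, units: Dict[str, float]) -> float:
    """Split '1.5Mbps' into a number and a unit, then scale to base units."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)([A-Za-z]+)", text)
    if match is None or match.group(2).lower() not in units:
        raise ValueError(f"unrecognized quantity: {text!r}")
    return float(match.group(1)) * units[match.group(2).lower()]


def parse_duplex_link(line: str) -> LinkSpec:
    """Parse '$ns duplex-link $A $B <bandwidth> <delay>'."""
    tokens = line.split()
    if len(tokens) != 6 or tokens[:2] != ["$ns", "duplex-link"]:
        raise ValueError(f"not a duplex-link line: {line!r}")
    a, b, bandwidth, delay = tokens[2:]
    return LinkSpec(
        a.lstrip("$"),
        b.lstrip("$"),
        _parse_quantity(bandwidth, _BANDWIDTH_UNITS),
        _parse_quantity(delay, _DELAY_UNITS),
    )


spec = parse_duplex_link("$ns duplex-link $A $B 1.5Mbps 20ms")
```

A record like `spec` is the kind of abstraction a later allocation step could bind to physical or simulated entities.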

Page 20: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Accessing Netbed

Experiment creation
▪ A project leader proposes a project on the web
▪ Netbed staff accept or reject the project
▪ All experiments are accessible from the web

Experiment management
▪ Log on to the allocated nodes or to the usershost (fileserver)
▪ The fileserver sends the OS images and the home and project directories to the other nodes

Page 21: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Accessing Netbed


Page 22: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Specification

Experimenters use ns scripts written in Tcl
▪ They can use as many functions and loops as they want

Netbed defines a small set of ns extensions
▪ Possibility of choosing specific hardware
▪ Simulation, emulation, or real implementation
▪ Program objects can be defined using a Netbed-specific ns extension

Possibility of using a graphical UI

Page 23: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Parsing

Front-end Tcl/ns parser
▪ Recognizes the subset of ns relevant to topology and traffic generation

Database
▪ Stores an abstraction of everything about the experiment
   ▪ Fixed generated events
   ▪ Information about hardware, users, and experiments
   ▪ Procedures

Page 24: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Parsing


Page 25: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Global Resource Allocation

Binds abstractions from the database to physical or simulated entities
▪ Best effort to match the specification
▪ On-demand allocation (no reservations)

2 different algorithms for local and distributed nodes (different constraints)
▪ Simulated annealing
▪ Genetic algorithm

Page 26: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Global Resource Allocation

Over-reservation of the bottleneck
▪ Inter-switch bandwidth is too small (2 Gbps)
▪ Goes against their conservative policy

Dynamic changes of the topology are allowed
▪ Add and remove nodes

Consistent naming across instantiations
▪ Virtualization of IP addresses and host names

Page 27: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Node Self-Configuration

Dynamic linking and loading from the DB
▪ Gives each node the proper context (hostname, disk image, script to start the experiment)

No persistent configuration state
▪ Only volatile memory on the node
▪ If required, the current soft state can be stored in the DB as hard state
   ▪ Swap out / Swap in

Page 28: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Node Self-Configuration

Local nodes
▪ All nodes are rebooted in parallel
▪ Each node contacts the masterhost, which loads the kernel directed by the database
▪ A second-level boot may be required

Distributed nodes
▪ Boot from a CD-ROM, then contact the masterhost
▪ A new FreeBSD Jail is instantiated
▪ Testbed Master Control Client

Page 29: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Experiment Control

Netbed supports dynamic experiment control
▪ Start, stop, and resume processes, traffic generators, and network monitors

Signals between nodes
▪ Uses a publish/subscribe event routing system
▪ Static events are retrieved from the DB
▪ Dynamic events are possible
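The publish/subscribe idea on this slide can be sketched in a few lines of Python. This is only an in-process toy under stated assumptions: `EventBus`, the `"trafficgen"` topic, and the event dictionaries are hypothetical, and the real system routes events between nodes over the network rather than inside one process.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """Hypothetical in-process stand-in for an event routing system."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for every event published on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict[str, Any]) -> int:
        """Deliver an event to all subscribers; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


log = []
bus = EventBus()
# Two "nodes" subscribe to traffic-generator control events.
bus.subscribe("trafficgen", lambda e: log.append(("nodeA", e["action"])))
bus.subscribe("trafficgen", lambda e: log.append(("nodeB", e["action"])))

# Static events would come from the database; here they are a plain list.
static_events = [
    {"topic": "trafficgen", "action": "start"},
    {"topic": "trafficgen", "action": "stop"},
]
for event in static_events:
    bus.publish(event["topic"], event)
```

A dynamic event is then just another `publish` call issued while the experiment is running.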

Page 30: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Experiment Control

The ns configuration file provides only high-level control

Experimenters can also exercise low-level control
▪ On local nodes: root privileges
   ▪ Kernel modification and access to raw sockets
▪ On distributed nodes: Jail-restricted root privileges
   ▪ Access to raw sockets with a specific IP address

Each local node has a separate network, isolated from the experimental one
▪ Enables controlling a node via a tunnel, as if we were on it, without interfering

Page 31: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Preemption and Scheduling

Netbed tries to prevent idling
▪ 3 metrics: traffic, use of pseudo-terminal devices, and CPU load average
▪ To be sure, a message is sent to the user, who can object manually
▪ A challenge for distributed nodes with several Jails

Netbed offers automated batch experiments
▪ For when no interaction is required
▪ Allows an experiment to wait for available resources
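The three idleness metrics above could be combined roughly as in the following Python sketch. All thresholds, field names, and the rule that every metric must agree are assumptions made for illustration; the slides do not state how Netbed weighs the metrics.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds; the slides give the metrics but not their values.
MAX_IDLE_PACKETS_PER_MIN = 10.0
MIN_TTY_IDLE_MINUTES = 120.0
MAX_IDLE_LOAD_AVERAGE = 0.05


@dataclass
class NodeActivity:
    packets_per_min: float    # experimental network traffic
    tty_idle_minutes: float   # time since any pseudo-terminal was last used
    load_average: float       # CPU load average


def node_is_idle(activity: NodeActivity) -> bool:
    """A node counts as idle only if all three metrics say so."""
    return (
        activity.packets_per_min <= MAX_IDLE_PACKETS_PER_MIN
        and activity.tty_idle_minutes >= MIN_TTY_IDLE_MINUTES
        and activity.load_average <= MAX_IDLE_LOAD_AVERAGE
    )


def experiment_is_idle(nodes: List[NodeActivity]) -> bool:
    """Flag an experiment only when it has nodes and every one is idle."""
    return bool(nodes) and all(node_is_idle(n) for n in nodes)


quiet = NodeActivity(packets_per_min=2.0, tty_idle_minutes=300.0, load_average=0.01)
busy = NodeActivity(packets_per_min=5000.0, tty_idle_minutes=1.0, load_average=1.7)
```

Requiring agreement across metrics is deliberately conservative, and a flagged experiment would still go through the user notification step listed on the slide before any swap-out.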


Page 33: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Validation

1st row: emulation overhead
▪ Dummynet gives better results than nse

Page 34: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Validation

They expect better results with future improvements to nse

Page 35: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Validation

5 nodes communicate over 10 links

Evaluation of a derivative of DOOM

The goal is to send 30 tics/sec

Page 36: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Testing

Challenges
▪ Depends on physical artifacts (cannot be cloned)
▪ Should evaluate arbitrary programs
▪ Must run continuously

Minibed: 8 separate Netbed nodes

Test mode: prevents hardware modifications

Full-test mode: provides isolated hardware


Page 38: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Practical benefits

▪ All-in-one set of tools
▪ Automated and efficient realization of virtual topologies
▪ Efficient use of resources through time-sharing and space-sharing
▪ Increased fault tolerance (resource virtualization)

Page 39: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Practical benefits

Examples
▪ The “dumbbell” network
   ▪ 3h15 --> 3 min
▪ Improved utilization of a scarce and expensive infrastructure: 12 months and 168 PCs in Utah
   ▪ Time-sharing (swapping): 1064 nodes
   ▪ Space-sharing (isolation): 19.1 years
▪ Virtualization of names and IP addresses
   ▪ No problems with swapping

Page 40: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Experiment creation and swapping
▪ Chart components: Mapping, Reservation, Reboot issuing, Reboot, Miscellaneous
▪ Booting on a custom disk image doubles the boot time

Page 41: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Mapping local resources: assign
▪ Matches the user’s requirements
▪ Based on simulated annealing
▪ Tries to minimize the number of switches and the inter-switch bandwidth
▪ Takes less than 13 seconds
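The following Python sketch shows simulated annealing applied to a toy version of this mapping problem. It is not the `assign` program: the cost function (links that cross switches plus a small charge per switch used), the move rule, and every parameter value are simplifying assumptions chosen to keep the example short.

```python
import math
import random
from typing import Dict, List, Tuple


def cost(assignment: Dict[str, str], links: List[Tuple[str, str]]) -> float:
    """Toy cost: inter-switch links plus a small charge per switch in use."""
    cut = sum(1 for a, b in links if assignment[a] != assignment[b])
    return cut + 0.1 * len(set(assignment.values()))


def anneal(nodes, links, capacity, seed=0, steps=5000, t0=2.0, cooling=0.999):
    """Map virtual nodes to switches; returns (best_assignment, best_cost)."""
    rng = random.Random(seed)
    switches = list(capacity)
    load = {s: 0 for s in switches}
    assignment = {}
    for n in nodes:  # first-fit start (raises StopIteration if capacity is short)
        s = next(sw for sw in switches if load[sw] < capacity[sw])
        assignment[n] = s
        load[s] += 1
    current = cost(assignment, links)
    best, best_cost = dict(assignment), current
    temp = t0
    for _ in range(steps):
        temp *= cooling
        node, new = rng.choice(nodes), rng.choice(switches)
        old = assignment[node]
        if new == old or load[new] >= capacity[new]:
            continue  # no-op move, or the target switch is full
        assignment[node] = new
        delta = cost(assignment, links) - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            load[old] -= 1
            load[new] += 1
            current += delta
            if current < best_cost:
                best, best_cost = dict(assignment), current
        else:
            assignment[node] = old  # reject: undo the move
    return best, best_cost


# Two triangles joined by one link, mapped onto two 4-port switches.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
links = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"), ("a3", "b1")]
capacity = {"sw1": 4, "sw2": 4}
best, best_cost = anneal(nodes, links, capacity)
```

Accepting occasional uphill moves while the temperature is high lets the search escape a poor first-fit start, and the fixed seed keeps a run repeatable.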

Page 42: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Mapping local resources: assign


Page 43: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Mapping distributed resources: wanassign
▪ Different constraints
   ▪ Fully connected via the Internet
   ▪ “Last mile”: type instead of topology
   ▪ Specific topologies may be guaranteed by requesting particular network characteristics (bandwidth, latency, and loss)
▪ Based on a genetic algorithm

Page 44: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Mapping distributed resources: wanassign
▪ 16 nodes and 100 edges: ~1 sec
▪ 256 nodes and 40 edges per node: 10 min to 2 h

Page 45: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Disk reloading
▪ 2 possibilities
   ▪ Complete disk image loading
   ▪ Incremental synchronization (hash tables on files or blocks)
▪ Good
   ▪ Faster (in their specific case)
   ▪ No corruption
▪ Bad
   ▪ Wastes time when similar images are needed repeatedly
   ▪ Paces the reloading of freed nodes (reserved for 1 user)
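The incremental-synchronization alternative mentioned above (hashes over blocks) can be illustrated with a short Python sketch. It only shows the idea of comparing per-block hashes to find what must be rewritten; the function names and the tiny 4-byte block size are assumptions for the demo and do not come from Netbed.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4  # unrealistically small, so the example stays readable


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash every fixed-size block of a disk image."""
    return [
        hashlib.sha256(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    ]


def blocks_to_rewrite(current: bytes, target: bytes,
                      block_size: int = BLOCK_SIZE) -> List[int]:
    """Indices of target blocks that differ from, or are missing on, the disk."""
    have = block_hashes(current, block_size)
    want = block_hashes(target, block_size)
    return [
        i for i, digest in enumerate(want)
        if i >= len(have) or have[i] != digest
    ]


current_disk = b"AAAABBBBCCCC"
target_image = b"AAAAXXXXCCCCDDDD"
changed = blocks_to_rewrite(current_disk, target_image)  # blocks 1 and 3 differ
```

The trade-off against complete image loading is that every block of the current disk must be read and hashed before anything is written.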

Page 46: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Disk reloading: Frisbee
▪ Performance techniques
   ▪ Uses a domain-specific algorithm to skip unused blocks
   ▪ Delivers images via a custom reliable multicast protocol
▪ 117 sec for 80 nodes, writing 550 MB instead of 3 GB
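The "skip unused blocks" technique can be sketched as follows in Python. This is a toy under stated assumptions, not Frisbee itself: the allocation bitmap `used` is simply given (a real tool would derive it from filesystem metadata), the helper names are hypothetical, and the multicast delivery is not modelled.

```python
from typing import List, Tuple


def pack_used_blocks(disk: bytes, used: List[bool],
                     block_size: int) -> List[Tuple[int, bytes]]:
    """Keep only the blocks the filesystem marks as allocated."""
    return [
        (i, disk[i * block_size:(i + 1) * block_size])
        for i, in_use in enumerate(used)
        if in_use
    ]


def restore(chunks: List[Tuple[int, bytes]], n_blocks: int,
            block_size: int) -> bytes:
    """Write the packed blocks back; skipped blocks are left zeroed here."""
    out = bytearray(n_blocks * block_size)
    for index, data in chunks:
        out[index * block_size:index * block_size + len(data)] = data
    return bytes(out)


disk = b"boot" + b"\x00" * 4 + b"data" + b"junk"
used = [True, False, True, False]  # hypothetical allocation bitmap
packed = pack_used_blocks(disk, used, block_size=4)
rebuilt = restore(packed, n_blocks=4, block_size=4)

# Figures from the slide, treating 3 GB as 3000 MB: about 18% of the disk is written.
written_fraction = 550 / 3000
```

Stale contents of free blocks (`junk` above) are never transferred, which is where the saving in written data comes from.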

Page 47: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Key services

Scaling of simulated resources
▪ Simulated nodes are multiplexed on 1 physical node
   ▪ Must keep up with real time, taking into account the user’s specification: rate of events
▪ Test of a live TCP flow at 2 Mb CBR
   ▪ 850 MHz PC with UDP background traffic at 2 Mb CBR / 50 ms
   ▪ Able to handle 150 links for 300 nodes
   ▪ Routing is a problem in very complex topologies

Page 48: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Example of a new possibility

Batch experiments can be programmed to vary only 1 parameter at a time

The Armada file system from Oldfield & Kotz
▪ 7 bandwidths x 5 latencies x 3 application settings x 4 configurations of 20 nodes
▪ 420 tests in 30 hrs (~4.3 min per experiment)
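The arithmetic behind this sweep is easy to check with a short Python sketch. Only the counts (7, 5, 3, 4) and the 30-hour total come from the slide; the concrete parameter values and names below are placeholders invented for illustration.

```python
from itertools import product

# Placeholder values: the slide gives how many settings there are, not what they are.
bandwidths_mbps = [1, 2, 5, 10, 20, 50, 100]        # 7 bandwidths
latencies_ms = [0, 10, 20, 50, 100]                 # 5 latencies
app_settings = ["setting1", "setting2", "setting3"] # 3 application settings
node_configs = ["cfg1", "cfg2", "cfg3", "cfg4"]     # 4 configurations of 20 nodes

# One batch experiment per combination of parameters.
runs = [
    {"bandwidth_mbps": bw, "latency_ms": lat, "app": app, "config": cfg}
    for bw, lat, app, cfg in product(
        bandwidths_mbps, latencies_ms, app_settings, node_configs
    )
]

total_hours = 30
minutes_per_run = total_hours * 60 / len(runs)  # 1800 min / 420 runs, about 4.3 min
```

Each entry in `runs` corresponds to one batch experiment that could be queued without user interaction.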


Page 50: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Summary

▪ Netbed deals with 3 test environments
▪ Reuse of ns scripts
▪ Quick setup of the test environment
▪ Virtualization techniques provide an artifact-free environment
▪ Enables qualitatively new experimental techniques

Page 51: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Future Work

Reliability/Fault Tolerance

Distributed Debugging: Checkpoint/Rollback

Security “Petri Dish”


Page 52: An  integrated Experimental Environment  for  Distributed  Systems and Networks

Thank you

Any questions?