flexlab: a realistic, controlled, and friendly environment for evaluating networked systems

44
1 Flexlab: A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems Jonathon Duerig, Robert Ricci, Junxing Zhang, Daniel Gebhardt, Sneha Kasera, Jay Lepreau University of Utah HotNets-V November 30, 2006

Upload: rafael

Post on 21-Jan-2016

17 views

Category:

Documents


1 download

DESCRIPTION

Flexlab: A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems. Jonathon Duerig , Robert Ricci, Junxing Zhang, Daniel Gebhardt, Sneha Kasera, Jay Lepreau University of Utah HotNets-V November 30, 2006. Emulators (Emulab Sucks). Examples: Modelnet & Emulab - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

1

Flexlab: A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

Jonathon Duerig, Robert Ricci, Junxing Zhang, Daniel Gebhardt, Sneha Kasera, Jay Lepreau

University of Utah

HotNets-VNovember 30, 2006

Page 2: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

2

Emulators (Emulab Sucks)

• Examples: Modelnet & Emulab• The Good: Control, repeatability, wide variety

of network conditions• The Bad: Artificial network conditions

Page 3: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

3

Overlay Testbeds(PlanetLab Sucks)

• Examples: RON & PlanetLab• The Good: Real network conditions• The Bad: Overloaded, No privileged operations, Poor

repeatability, Hard to develop/debug

Page 4: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

4

Goal: Best of Both Worlds (Don’t Suck)

Page 5: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

5

Model-driven Emulation(How not to suck)

Page 6: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

6

Key Points

• Flexlab is an emulation framework into which different network models may be plugged

• Exploit an overlay testbed to generate measurements for some example models– Models make different fidelity, overhead, and

repeatability trade-offs

• Application-Centric Internet Modeling

Page 7: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

7

Flexlab: Application

Page 8: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

8

Flexlab: Application Monitor

Page 9: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

9

Flexlab: Network Model

Page 10: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

10

Flexlab: Measurement Repository

Page 11: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

11

Flexlab: Path Emulator

Page 12: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

12

Flexlab: Feedback

Page 13: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

13

ACIM:Application-Centric Internet Modeling

Page 14: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

14

Imagine Ideal Fidelity

Page 15: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

15

ACIM Architecture

Page 16: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

16

ACIM Challenges

• Hardening implementation to deal with PlanetLab unreliability

• CPU starvation on PlanetLab– Host artifacts in throughput– Packet loss from libpcap

• Reverse path congestion• Measuring bottleneck queue size in time• Discovering when bottleneck link is saturated

Page 17: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

17

ACIM Network Conditions

Page 18: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

18

ACIM Available Bandwidth

• Throughput == available bandwidth iff agent is saturating && bottleneck link is saturated

• Agent saturating socket buffer full

• Bottleneck queue saturated queue filling up RTT increasing recently

Page 19: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

19

Sample Experiment

Page 20: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

20

Sample Results

Page 21: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

21

Sample Results

Page 22: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

22

Sample Results

Page 23: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

23

Sample Results

Page 24: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

24

Network Model Trade-offs

Page 25: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

25

Sample Real Application: BitTorrent.with Static Model

Page 26: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

26

BitTorrent w/ ACIM Model

Page 27: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

27

BitTorrent w/ PlanetLab

What is “correct”? Challenging to determine; work-in-progress.

Page 28: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

28

Conclusions

• Contribution: Modeling Framework for Emulation– Models can allow the experimenter to trade-off fidelity,

repeatability, and overhead

• Contribution: Application-Centric Internet Modeling• Contribution: Running on Emulab and PlanetLab

in alpha stage

Page 29: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

29

Backup Slides

Page 30: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

30

Why not just add more nodes to every PlanetLab site? (cf. public review)

• Remaining problems:– Poor repeatability– Hard to develop/debug– No privileged operations

• Malicious traffic cannot be tested• Some Flexlab network models reduce network load• Emulab node pool stat muxed and shared more

efficiently than per-site pools• Overload can (will?) still happen with PL’s pure

shared-host model• Major practical barriers: admin, cost

Page 31: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

31

PlanetLab Overload (What)

Page 32: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

32

PlanetLab Overload (Why)

• Only a few nodes per site– Sites supply their own nodes– No incentive to increase number of nodes

• No admission control• No resource guarantees• No incentive to minimize usage• Typically tedious to set up experiments

(exceptions: Emulab portal, Plush, other?)

Page 33: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

33

Network Model 1: Static

Page 34: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

34

Static Trade-offs

• Low fidelity

• Fixed continuous overhead

• Complete repeatability

Page 35: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

35

Network Model 2: Dynamic

Page 36: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

36

Dynamic Trade-offs

• Moderate fidelity

• Overhead proportional to number of paths used

• High repeatability

Page 37: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

37

Low-Frequency Measurements Miss Changes (Changepoint Analysis)

Path

20 Sec. Period

2 Sec. Period

Src Dest Count Count Avg magnitude of 2 sec changes

Commodity Commodity 2 20 39%

Commodity Internet2 1 13 15%

Internet2 Internet2 0 0 -

Page 38: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

38

Flexlab and VINIEntirely different kinds of realism and control

• Flexlab: passes “experiment” traffic over shared path– Real Internet conditions from other traffic on same path, but

app. traffic is not from real users– Control: of all software– Environment: friendly local dev. environ, dedicated hosts

• VINI: can pass “real traffic” over dedicated link– Real routing, real neighbor ISPs, potentially traffic from real

users, but network resources are not realistic/representative– Dedicated pipes with dedicated bandwidth, that insulate

experiment from normal Internet conditions– Control: restricted to VINI’s APIs (Click, XORP, etc)– Environment: distributed environ; shared host resources.

Page 39: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

39

Dealing with PlanetLab Unreliability

• Our initial design was optimistic

• Nodes fail– There is no set of ‘good nodes’– Agents must react robustly to node failure

• Most errors are transient– Log everything– Replay packet analysis

Page 40: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

40

CPU Starvation on PlanetLab

• Host Artifacts– Long period when agent can’t read or write– Empty socket buffer or full receive window– Solution: Detect and ignore

• Packet loss from libpcap– Long period without reading libpcap buffer– Many packets are dropped at once– Solution: Detect and ignore

Page 41: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

41

Handling Reverse Path Congestion

• Can cause ack compression• Throughput Measurement

– Throughput numbers become much noisier– We abuse the TCP timestamp option– PlanetLab: homogenous OS environment– Extending it would require hacking client

• RTT Measurement– Future work

Page 42: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

42

Measuring Bottleneck Queue Size

• Important to emulate loss episodes due to congestion

• No one knows how in terms of bytes/packets

• Easier to measure in terms of time:– full = RTT when queue is full– empty = RTT when queue is empty– queue_time = full - empty

Page 43: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

43

Initial Conditions

• Needed to bootstrap ACIM– ACIM uses traffic to generate conditions– But conditions must exist for first traffic

• We created a measurement framework– All pairs of sites are measured– Put data into measurement repository

• Set initial conditions to latest measurements

Page 44: Flexlab:  A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems

44

Path Emulator (detail)