CS 4700 / CS 5700 Network Fundamentals

Lecture 17: Network Modeling (Not Everyone has a Datacenter)

Post on 01-Jan-2016



Wide-Area Network Research

- Most research now focused on large-scale systems
- Challenge: testing and evaluation
  - How to perform wide-area tests in a repeatable, reliable manner
  - ModelNet, Emulab
- Challenge: understanding/capturing Internet topologies
  - Graph characterization: dK-series



Outline

- ModelNet
- dK

A Case for Network Emulation

- Need a way to test large-scale Internet services
  - Peer-to-peer, overlay networks, novel protocols
- Testing in the real world
  - PlanetLab…
  - Results not reproducible or predictable
  - Difficult to deploy and administer research software
- Simulation tools
  - Allow control over test environment
  - May miss important system interactions
- Emulation
  - Emulators subject application traffic to the end-to-end bandwidth constraints, latency, and loss rate of a user-specified topology
  - Previous implementations not scalable


ModelNet

A scalable, cluster-based, comprehensive network emulation environment


Design

- Users run a configurable number of application instances on Edge Nodes within the cluster
  - Each instance is a Virtual Edge Node (VN)
  - Each VN has a unique IP address
- Edge nodes route traffic through a cluster of Core Routers
  - Equipped with large memories, modified FreeBSD kernels
- Core routers route traffic through emulated links or “pipes”
  - Each pipe has its own packet queue and queuing discipline


ModelNet Phases

- Create
  - Generates a network topology as a graph
  - From Internet traces, BGP dumps, synthetic topology generators, etc.
  - Annotate graph with loss rates, failure distributions…
- Distillation
  - Transforms GML graph into pipe topology
- Assignment
  - Maps pipe topology to core nodes, distributing emulation load across core nodes
  - Finding ideal mapping is NP-complete
  - ModelNet uses greedy k-clusters assignment
    - For k core nodes, randomly select k nodes in distilled topology
    - Greedily select links from connected component in round robin
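The greedy k-clusters step can be sketched as follows. This is a minimal illustration of the description above, not ModelNet's actual code: the function name `greedy_k_clusters`, the representation of links as node pairs, and the fallback for links that no cluster can reach are all assumptions made for the example.

```python
import random
from collections import defaultdict

def greedy_k_clusters(links, k, seed=0):
    """Assign each link (pipe) of the distilled topology to one of k cores.

    links: list of (u, v) node pairs; requires k <= number of distinct nodes.
    Returns a dict mapping link index -> core index.
    """
    rng = random.Random(seed)
    nodes = sorted({n for link in links for n in link})
    seeds = rng.sample(nodes, k)              # one random starting node per core
    component = [{s} for s in seeds]          # nodes reached so far by each core
    incident = defaultdict(list)              # node -> indices of its links
    for i, (u, v) in enumerate(links):
        incident[u].append(i)
        incident[v].append(i)

    owner = {}
    while len(owner) < len(links):
        progressed = False
        for core in range(k):                 # round robin over the cores
            # First unassigned link that touches this core's component.
            pick = next((i for n in sorted(component[core])
                         for i in incident[n] if i not in owner), None)
            if pick is None:
                continue
            owner[pick] = core
            component[core].update(links[pick])
            progressed = True
        if not progressed:
            # Assumed fallback: links unreachable from every seed are spread evenly.
            for i in range(len(links)):
                if i not in owner:
                    owner[i] = len(owner) % k
    return owner

links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
assignment = greedy_k_clusters(links, k=2)
```

Because every core grows its own connected region one link at a time, neighboring pipes tend to land on the same core, which is what keeps cross-core traffic down.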


ModelNet Phases

- Binding
  - Multiplex multiple VNs onto each physical edge node
  - Bind each physical edge node to a core router
  - Generate shortest-path routes between all VNs and install them in core routing tables
- Run
  - Executes target application code on edge nodes
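As an illustration of the route-generation part of Binding, the sketch below computes a route for every ordered VN pair as an ordered list of pipe ids. It assumes that "shortest" means lowest total latency and uses Dijkstra's algorithm; the pipe dictionary layout and the names `shortest_route` and `routing_table` are hypothetical.

```python
import heapq

def shortest_route(pipes, src, dst):
    """pipes: dict pipe_id -> (from_node, to_node, latency_ms).
    Returns the lowest-latency route as a list of pipe ids, or None."""
    out = {}
    for pid, (u, v, lat) in pipes.items():
        out.setdefault(u, []).append((v, lat, pid))
    best = {src: 0}
    via = {}                                  # node -> (previous node, pipe used)
    heap = [(0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue                          # stale heap entry
        for nxt, lat, pid in out.get(node, []):
            if dist + lat < best.get(nxt, float("inf")):
                best[nxt] = dist + lat
                via[nxt] = (node, pid)
                heapq.heappush(heap, (dist + lat, nxt))
    if dst != src and dst not in via:
        return None
    route = []
    while dst != src:                         # walk back from dst to src
        dst, pid = via[dst]
        route.append(pid)
    return route[::-1]

def routing_table(pipes, vns):
    """One entry per ordered VN pair, so n VNs need n * (n - 1) routes."""
    return {(s, d): shortest_route(pipes, s, d) for s in vns for d in vns if s != d}

pipes = {
    "p1": ("a", "r", 5), "p2": ("r", "b", 5), "p3": ("a", "b", 20),
    "p4": ("b", "r", 5), "p5": ("r", "a", 5),
}
table = routing_table(pipes, ["a", "b"])
```

Storing one route per ordered pair is where the quadratic growth of the routing table comes from.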


Inside the Core

- Route traffic through emulated “pipes”
  - Each route is an ordered list of pipes
  - Packets move through pipes by reference
  - Routing table requires O(n^2) space
- Packet Scheduling
  - When a packet arrives, put it at the tail of the first pipe in its route
  - Scheduler stores a heap of pipes sorted by earliest deadline (exit time of the first packet in the pipe's queue)
  - Once every clock tick
    - Traverse pipes in heap for packets that are ready to exit
    - Move packets to tail of next pipe or schedule for delivery
    - Calculate new deadlines
- Multi-core Configuration
  - Next pipe in route may be on a different machine
  - If so, core node tunnels packet descriptor to next node
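The scheduling loop described above can be sketched in a few lines. This is a simplified single-core model under stated assumptions: pipes impose only a fixed latency (no bandwidth limit, loss, or queue bound), time is counted in integer ticks, and the class and method names are invented for the example.

```python
import heapq
from collections import deque

class Pipe:
    def __init__(self, latency):
        self.latency = latency      # ticks a packet spends in this pipe
        self.queue = deque()        # FIFO of (exit_time, packet descriptor)

class Core:
    def __init__(self, pipes):
        self.pipes = pipes          # pipe id -> Pipe
        self.heap = []              # (deadline, pipe id) for non-empty pipes
        self.delivered = []         # (time, packet name)

    def enqueue(self, pid, packet, now):
        pipe = self.pipes[pid]
        was_empty = not pipe.queue
        pipe.queue.append((now + pipe.latency, packet))
        if was_empty:               # deadline = exit time of the head packet
            heapq.heappush(self.heap, (pipe.queue[0][0], pid))

    def inject(self, name, route, now):
        # The descriptor is passed by reference; the route is an ordered pipe list.
        self.enqueue(route[0], {"name": name, "route": route, "hop": 0}, now)

    def tick(self, now):
        while self.heap and self.heap[0][0] <= now:
            _, pid = heapq.heappop(self.heap)
            pipe = self.pipes[pid]
            while pipe.queue and pipe.queue[0][0] <= now:
                _, pkt = pipe.queue.popleft()
                pkt["hop"] += 1
                if pkt["hop"] < len(pkt["route"]):
                    self.enqueue(pkt["route"][pkt["hop"]], pkt, now)  # next pipe
                else:
                    self.delivered.append((now, pkt["name"]))        # leaves core
            if pipe.queue:          # recompute this pipe's deadline
                heapq.heappush(self.heap, (pipe.queue[0][0], pid))

core = Core({"a": Pipe(latency=2), "b": Pipe(latency=3)})
core.inject("p1", route=["a", "b"], now=0)
for t in range(1, 7):
    core.tick(t)
print(core.delivered)               # [(5, 'p1')]: 2 ticks in "a" plus 3 in "b"
```

Keeping the heap keyed by pipe rather than by packet means each tick only inspects pipes whose head packet is actually due.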


Scalability Issues

- Traffic traversing the core is limited by the cluster’s physical internal bandwidth
- ModelNet must buffer up to the full bandwidth-delay product of the target network
  - 250 MB of packet buffer space to carry flows at an aggregate bandwidth of 10 Gb/s with 200 ms round-trip latency
- Assumes perfect routing protocol
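The 250 MB figure is a bandwidth-delay product. A quick check, reading the aggregate bandwidth as 10 gigabits per second (with gigabytes per second the product would be eight times larger); the function name is made up for the example:

```python
def buffer_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)

needed = buffer_bytes(10_000_000_000, 200)
print(needed // 1_000_000, "MB")    # 250 MB
```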


Baseline Accuracy

- Want to ensure that, under load, packets are subject to correct end-to-end delays
- Used kernel logging to track ModelNet performance and accuracy
- Results show that by running the ModelNet scheduler at highest kernel priority
  - Packets are delivered within 1 ms of the target end-to-end value
  - Accuracy is maintained up to 100% CPU usage


Scalability

- Additional Cores
  - Adding core routers allows ModelNet to deliver higher throughput
  - Communication between core routers introduces overhead
  - Higher cross-core communication results in less throughput benefit
- VN Multiplexing
  - Higher degrees of multiplexing enable larger network emulation
  - Inaccuracies introduced due to context switching, scheduling, resource contention, etc.


Accuracy vs. Scalability

Reduce overhead by deviating from target network requirements

Changes should minimally impact application behavior

Ideally, system reports degree and nature of emulation inaccuracy


Scalability via Distillation

- Pure hop-by-hop emulation
  - Distilled topology is isomorphic to target network
  - High per-packet overhead
- End-to-end distillation
  - Remove all interior network nodes
  - Collapse each path into a single pipe
    - Latency = sum of latencies along path
    - Reliability = product of link reliabilities along path
  - Low per-packet overhead
  - Does not emulate link contention along path
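The two formulas for end-to-end distillation translate directly into code. In this sketch each hop is a hypothetical dict with a latency and a loss rate, and reliability is taken to be 1 minus the loss rate; bandwidth is left out because the slide does not say how it is combined.

```python
from math import prod

def collapse_path(hops):
    """Collapse a multi-hop path into the parameters of one end-to-end pipe.

    hops: list of dicts with 'latency_ms' and 'loss' (probability in [0, 1]).
    """
    latency = sum(h["latency_ms"] for h in hops)          # latencies add up
    reliability = prod(1.0 - h["loss"] for h in hops)     # reliabilities multiply
    return {"latency_ms": latency, "loss": 1.0 - reliability}

path = [
    {"latency_ms": 10, "loss": 0.0},
    {"latency_ms": 20, "loss": 0.5},
    {"latency_ms": 5, "loss": 0.5},
]
pipe = collapse_path(path)
```

Two flows whose paths share an interior link end up in separate collapsed pipes, which is why this form of distillation cannot reproduce contention on that link.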


Time Dilation on ModelNet

- The challenge
  - Need to emulate networks with more resources
  - E.g. fast CPU (20 GHz), large b/w networks (TB/s)
  - But only commodity machines available
- Solution
  - ModelNet + time dilation via virtual machines
  - Run application inside single VMs
  - Slow down time inside VM
  - Result: everything looks faster/bigger/fatter
    - More CPU cycles/time, packets/time, disk I/O /time


How It’s Done

- Must isolate VM from outside measures of time
  - Time based on shared data structure provided by VMM
  - Scale data structure by a Time Dilation Factor (TDF)
  - Also scale hardware timer by TDF
- How do we scale only some resources?
  - Slow the others back down!!
  - Example: speed up network by TDF = 10
    - B/w increases by a factor of 10, but delay decreases by a factor of 10
    - So increase the physical delay by a factor of 10

[Figure: diagram with labels Virtual Machine Monitor (VMM), NodeMgr, LocalAdmin, VM1, VM2, …, VMn]
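The TDF = 10 example can be worked through numerically. The sketch assumes a guest whose clock runs TDF times slower than real time, so rates look TDF times higher and durations TDF times shorter; the function names and units are illustrative only.

```python
def perceived(physical_bw_mbps, physical_delay_ms, tdf):
    """What the dilated guest observes for a given physical link."""
    return physical_bw_mbps * tdf, physical_delay_ms / tdf

def physical_settings(target_bw_mbps, target_delay_ms, tdf):
    """Physical link settings that make the guest observe the targets."""
    return target_bw_mbps / tdf, target_delay_ms * tdf

# A 100 Mb/s, 20 ms physical link looks like 1000 Mb/s and 2 ms at TDF = 10.
print(perceived(100, 20, tdf=10))

# To present 1000 Mb/s while keeping the perceived delay at 20 ms,
# configure the physical link as 100 Mb/s with 200 ms of delay.
print(physical_settings(1000, 20, tdf=10))
```

Scaling the physical delay up by the TDF is the "slow the others back down" step: only bandwidth ends up looking faster to the guest.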


ModelNet Summary

- ModelNet, antithesis of PlanetLab
  - Testing of unmodified applications
  - Reproducible results
  - Experimentation using a broad range of network topologies and characteristics
  - Large-scale experiments (thousands of nodes and gigabits of cross traffic)
  - Can scale to emulate non-existent resource levels
- But what if you want real deployment on-demand?
  - Emulab / NetBed


Emulab / NetBed

- A shared configuration on-demand testbed
  - What if you don’t have your own cluster?
  - What if you need to test specific environments/HW?
  - What if you need this in 5 mins?
- Emulab / NetBed
  - Hardware: 328 PCs, high speed Gb Cisco switches
  - Software: OS-loader and manager via web interface
    - Wipe all disks, load OS-images, configure routers in <2 mins
    - Reboot and give ssh access


Emulab Web Interface


Outline

- ModelNet
- dK

Importance of Network Topology

- Access to real-world network topologies is vital for research
- New routing and other protocol design, development, testing, etc.
  - Analysis: performance of a routing algorithm strongly depends on topology
  - Generation: empirical estimation of scalability
- Network robustness, resilience under attack, worm spreading, etc.


Network Topology Research

[Figure: diagram with labels "Static Topologies" and "Dynamic Topologies"]


Trade Secrets

- Unfortunately, large-scale network topologies are often proprietary
  - Think about BGP
  - ISPs want to hide their internal topology
- Real datasets are rare
  - Small scale
  - Out of date
  - Static (i.e. not dynamic)


Towards Synthetic Topologies

- Question: can we use graph models to capture real network topologies?
  - Fit a model to a real topology
  - Use a generator to produce synthetic topologies that are similar, but not identical, to the real topology
- Benefits
  - Privacy: synthetic graphs are not proprietary
  - Randomization: produce an infinite number of stochastic snapshots
  - Scalable: generator can produce similar topologies of any size

Important Topology Metrics

- Degree distribution
- Clustering
- Assortativity
- Distance distribution
- Betweenness distribution
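To make the first three metrics concrete, here is a small pure-Python sketch on a hypothetical four-node graph (a triangle with one extra node attached). The helper names are invented for the example, clustering is computed as the mean of per-node local clustering coefficients, and assortativity as the Pearson correlation of the degrees at the two ends of each edge.

```python
from itertools import combinations

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Fraction of nodes having each degree."""
    counts = {}
    for nb in adj.values():
        counts[len(nb)] = counts.get(len(nb), 0) + 1
    return {k: c / len(adj) for k, c in counts.items()}

def mean_clustering(adj):
    """Average over nodes of (links among neighbors) / (possible such links)."""
    total = 0.0
    for nb in adj.values():
        pairs = list(combinations(nb, 2))
        if pairs:                               # degree < 2 contributes 0
            total += sum(1 for x, y in pairs if y in adj[x]) / len(pairs)
    return total / len(adj)

def assortativity(adj):
    """Degree correlation across edges; undefined (division by zero) for regular graphs."""
    ends = [(len(adj[u]), len(adj[v])) for u in adj for v in adj[u]]  # both directions
    mean = sum(x for x, _ in ends) / len(ends)
    cov = sum((x - mean) * (y - mean) for x, y in ends) / len(ends)
    var = sum((x - mean) ** 2 for x, _ in ends) / len(ends)
    return cov / var

adj = build_adjacency([("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")])
print(degree_distribution(adj), mean_clustering(adj), assortativity(adj))
```

A negative assortativity, as this toy graph produces, means high-degree nodes tend to attach to low-degree ones.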

Problems

- No way to reproduce most of the important metrics
- No guarantee that some other/new metric will not later be found important


The Approach

- Look at inter-dependencies among topology characteristics
- See if, by reproducing the most basic, simple characteristics, we can also reproduce all other characteristics, including the practically important ones
- Try to find the characteristic(s) that define all others
- Key observation: graphs are structures of connections between nodes


Definition of dK-distributions

- dK-distributions are degree correlations within simple connected graphs of size d
- For example
  - 1K distribution: node degree distribution
  - 2K distribution: joint node degree distribution (degree correlations across edges)
  - 3K distribution: degree correlations across connected triples of nodes, which capture the clustering coefficient


An Example of dK

- xK is the distribution of subgraphs with particular degrees
  - dK-1 describes node degree distribution
  - dK-2 describes joint node degree distribution
  - dK-3 captures clustering coefficient


- dK-0: average degree = 2
- dK-1: P(1)=1, P(2)=2, P(3)=1
- dK-2: P(1,3)=1, P(2,2)=1, P(2,3)=2
- dK-3: P(1,3,2)=2, P(2,2,3)=1
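These counts can be reproduced in code. The slide's figure is not in the transcript, so the graph below is an assumption: a triangle (b, c, d) with one pendant node a attached to d, which is one graph consistent with all four lines. The 3K counts are split into open wedges, keyed (end degree, center degree, end degree), and triangles, keyed by sorted degrees.

```python
from collections import Counter
from itertools import combinations

edges = [("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]   # assumed example graph
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
deg = {n: len(nb) for n, nb in adj.items()}

k0 = sum(deg.values()) / len(deg)                           # 0K: average degree
k1 = Counter(deg.values())                                  # 1K: nodes per degree
k2 = Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)  # 2K: edges per degree pair

wedges, seen_triangles = Counter(), Counter()
for center, nb in adj.items():
    for x, y in combinations(sorted(nb), 2):
        if y in adj[x]:
            # Closed triple: visited once from each of its three corners.
            seen_triangles[tuple(sorted((deg[x], deg[center], deg[y])))] += 1
        else:
            lo, hi = sorted((deg[x], deg[y]))
            wedges[(lo, deg[center], hi)] += 1
triangles = Counter({key: n // 3 for key, n in seen_triangles.items()})

print(k0, dict(k1), dict(k2), dict(wedges), dict(triangles))
```

The two wedges are a-d-b and a-d-c, and the single triangle is b-c-d, matching P(1,3,2)=2 and P(2,2,3)=1.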


Nice properties of dK-series

- Constructability: we can construct graphs having properties Pd (dK-graphs)
- Inclusion: if a graph has property Pd, then it also has all properties Pi, with i < d (dK-graphs are also iK-graphs)
- Convergence: the set of graphs having property Pn consists of only one element, G itself (dK-graphs converge to G)
- Guarantees that all (even not yet defined!) graph metrics can be captured by sufficiently high d
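Inclusion can be checked by hand for d = 2: the 2K counts already determine the 1K and 0K values. Every edge contributes one "end" to each of its two degree classes, and a node of degree k owns exactly k ends. A short sketch (function names are illustrative, and isolated degree-0 nodes are ignored because they appear in no edge):

```python
from collections import Counter

def one_k_from_two_k(two_k):
    """two_k maps (k, k') with k <= k' to the number of edges joining such degrees.
    Returns the number of nodes of each degree."""
    ends = Counter()
    for (ka, kb), n_edges in two_k.items():
        ends[ka] += n_edges
        ends[kb] += n_edges          # a (k, k) edge adds two degree-k ends
    return {k: n // k for k, n in ends.items()}

def zero_k_from_one_k(one_k):
    """Average degree from the node-degree counts."""
    return sum(k * n for k, n in one_k.items()) / sum(one_k.values())

two_k = {(1, 3): 1, (2, 2): 1, (2, 3): 2}
one_k = one_k_from_two_k(two_k)
print(one_k, zero_k_from_one_k(one_k))
```

Using the 2K counts P(1,3)=1, P(2,2)=1, P(2,3)=2, this recovers one node of degree 1, two of degree 2, one of degree 3, and an average degree of 2.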


Inclusion and dK-randomness

[Figure: diagram with labels 0K, 1K, 2K, nK, Given G, 0K-random, 1K-random, 2K-random]


How Do We Generate Graphs?

- A number of different approaches
  - Stochastic
  - Pseudograph
  - Matching
  - Rewiring
- Some are extensible to d = 3, others are not
- New research proposed d = 2.5, to make generation tractable


Stochastic approach

- Classical (Erdős–Rényi) random graphs are the 0K-random graphs of the stochastic approach
- Easily generalizable for any d
  - Reproduce the expected value of the dK-distributions by connecting random d-plets of nodes with (conditional) probabilities extracted from G
- Best for theory
- Worst in practice
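For d = 0 the stochastic construction is the classical G(n, p) model: connect every pair of nodes independently with probability p. Choosing p = k̄ / (n − 1) makes the expected average degree equal to the target k̄. The function name and parameters below are assumptions for the sketch.

```python
import random
from itertools import combinations

def zero_k_random(n, avg_degree, seed=None):
    """G(n, p) graph whose expected average degree is avg_degree."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)         # each node has n - 1 potential neighbors
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = zero_k_random(200, avg_degree=4.0, seed=1)
print(2 * len(edges) / 200)          # realized average degree, close to 4
```

Only the expected value is matched, so any single sample deviates from the target, which hints at why the approach is convenient to analyze but weak in practice.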


Pseudograph approach

- Reproduces dK-distributions exactly
- Constructs not necessarily connected pseudographs
- Extended for d = 2
- Failed to generalize for d > 2: d-sized subgraphs start to overlap over edges at d = 3


Pseudograph details

- 1K
  1. Dissolve graph into a random soup of nodes
  2. Crystallize it back
- 2K
  1. Dissolve graph into a random soup of edges
  2. Crystallize it back

[Figure: nodes and edges with stub labels k1, k2, k3, k4, and "k1-ends"]
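The 1K "dissolve and crystallize" procedure corresponds to what is commonly called the configuration model: give each node as many loose edge ends (stubs) as its degree, shuffle all stubs, and pair them up. A minimal sketch, with a made-up function name and assuming the degree sum is even:

```python
import random
from collections import Counter

def one_k_pseudograph(degrees, seed=None):
    """Random pseudograph with exactly the given degree sequence.

    degrees[i] is the degree of node i. The result may contain self-loops
    and parallel edges and need not be connected.
    """
    rng = random.Random(seed)
    # Dissolve: node i contributes degrees[i] stubs to the soup.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Crystallize: pair consecutive stubs into edges.
    return list(zip(stubs[0::2], stubs[1::2]))

degrees = [1, 2, 2, 3]
edges = one_k_pseudograph(degrees, seed=0)

realized = Counter()
for u, v in edges:                   # a self-loop counts twice toward its node
    realized[u] += 1
    realized[v] += 1
```

The degree sequence is reproduced exactly by construction, at the cost of possible self-loops and multi-edges, which is what makes the result a pseudograph rather than a simple graph.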


dK-Randomizing Rewiring

- Can generate random graphs from original
- Generalizes to any d
- But cannot generate desired graph from dK-distributions
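For d = 1, dK-randomizing rewiring amounts to the familiar double-edge swap: pick two edges (a, b) and (c, d) and replace them with (a, d) and (c, b), which leaves every node's degree unchanged. The sketch below is an illustration under assumptions: it rejects swaps that would create self-loops or duplicate edges, does not check connectivity, and uses invented names.

```python
import random

def one_k_rewire(edges, n_swaps, seed=None):
    """Randomize a simple graph while preserving every node's degree."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:       # try both orientations of the second edge
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if a == d or c == b or new1 in present or new2 in present:
            continue                 # would create a self-loop or a duplicate edge
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges

original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 0), (0, 4), (2, 6)]
rewired = one_k_rewire(original, n_swaps=20, seed=42)
```

The procedure needs an existing graph to start from, which matches the limitation noted above: it randomizes a given graph but cannot build one from a dK-distribution alone.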


Algorithms

- All algorithms deliver consistent results for d = 0
- All algorithms, except stochastic(!), deliver consistent results for d = 1 and d = 2
- Both rewiring algorithms deliver consistent results for d = 3
- Eventual choice
  - Use pseudograph to construct 1K graphs
  - Use targeted rewiring to build higher-d graphs


Skitter Scalar Metrics

Metric   0K      1K      2K      3K      skitter
<k>      6.31    6.34    6.29    6.29    6.29
r        0       -0.24   -0.24   -0.24   -0.24
<C>      0.001   0.25    0.29    0.46    0.46
d        5.17    3.11    3.08    3.09    3.12
σd       0.27    0.4     0.35    0.35    0.37
λ1       0.2     0.03    0.15    0.1     0.1
λn-1     1.8     1.97    1.85    1.9     1.9


HOT Scalar Metrics

Metric   0K      1K      2K      3K      HOT
<k>      2.47    2.59    2.18    2.10    2.10
r        -0.05   -0.14   -0.23   -0.22   -0.22
<C>      0.002   0.009   0.001   0       0
d        8.48    4.41    6.32    6.55    6.81
σd       1.23    0.72    0.71    0.84    0.57
λ1       0.01    0.034   0.005   0.004   0.004
λn-1     1.989   1.967   1.996   1.997   1.997


HOT 0K

[Figure: True HOT Graph compared with HOT 0K]

HOT 1K

[Figure: True HOT Graph compared with HOT 1K]

HOT 2K

[Figure: True HOT Graph compared with HOT 2K]

HOT 3K

[Figure: True HOT Graph compared with HOT 3K]
