DARRYL VEITCH The University of Melbourne ACTIVE MEASUREMENT IN NETWORKS MetroGrid Workshop GridNets 2007 19 October 2007


Page 1:

DARRYL VEITCH

The University of Melbourne

ACTIVE MEASUREMENT IN NETWORKS

MetroGrid Workshop

GridNets 2007

19 October 2007

Page 2:

A RENAISSANCE IN NETWORK MEASUREMENT

• Data centric view of networking

• Must arise from real problems, or observations on data

• Abstractions based on data

• Results validated by data

• In fact: the scientific method in networking

• Not just getting numbers: discovery

• Knowing, understanding, improving network performance

• Not just monitoring

• Traffic patterns: link, path, node, applications, management

• Quality of Service: delay, loss, reliability

• Protocol dynamics: TCP, VoIP, …

• Network infrastructure: routing, security, DNS, bottlenecks, latency, …

Page 3:

THE RISE OF MEASUREMENT

• Now have dedicated conferences:

• Passive and Active Measurement Conference (PAM), since 2000

• ACM Internet Measurement Conference (IMC), since 2001

• Papers between 1966 and 1987 (P. F. Pawlita, ITC-12, Italy)

• Queueing theory: several thousand

• Traffic measurement: around 50

Page 4:

ACTIVE VERSUS PASSIVE MEASUREMENT

• Typical Active Aims

• “End-to-End” or “Network edge”

• End-to-End Loss, Delay, Connectivity, “Discovery”, …

• Long/short term monitoring: Network health; Route state

• Internet view: Application performance and robustness

• Typical Passive Aims

• “At-a-point” or “Network Core”

• Link utilisation, Link traffic patterns, Server workloads

• Long term monitoring: Dimensioning, Capacity Planning, Source modelling

• Engineering view: Network performance

Page 6:

ACTIVE PROBING VERSUS NETWORK TOMOGRAPHY

• Network Tomography

• Typically “network wide”: multiple destinations and/or sources

• Simple black box node/link models, strong assumptions

• Classical inference with twists

• Typical metrics: per link/path loss/delay/throughput

• Active Probing

• Typically over a single path

• Use tandem FIFO queue model

• Exploit discrete packet effects in semi-heuristic queueing analysis

• Typical metrics: link capacities, available bandwidth

Page 7:

TWO ENTREES

• Loss Tomography

Removing the temporal independence assumption

Arya, Duffield, Veitch, 2007

• Active Probing

Towards (rigorous) optimal probing

Baccelli, Machiraju, Veitch, Bolot, 2007

Page 8:

TOMO-GRAPHY

Tomos: section

Graphia: writing

Page 9:

EXAMPLES OF TOMOGRAPHY

• Atom probe tomography (APT)

• Computed tomography (CT) (formerly CAT)

• Cryo-electron tomography (Cryo-ET)

• Electrical impedance tomography (EIT)

• Magnetic resonance tomography (MRT)

• Optical coherence tomography (OCT)

• Positron emission tomography (PET)

• Quantum tomography

• Single photon emission computed tomography (SPECT)

• Seismic tomography

• X-ray tomography

Page 10:

COMPUTED TOMOGRAPHY

Page 11:

NETWORK TOMOGRAPHY

The Metrics

• Link Traffic (volume, variance) (Traffic Matrix estimation)

• Link Loss (average, temporal)

• Link Delay (variance, distribution)

• Link Topology

• Path Properties (network 'kriging')

• Joint problems (use loss or delay to infer topology)

Began with Vardi [1996]

“Network Tomography: estimating source-destination traffic intensities from link data”

Classes of Inversion Problems

• End-to-end measurements → internal metrics

• Internal measurements → path metrics

Page 12:

THE EARLY LITERATURE (INCOMPLETE!)

• Loss/Delay/Topology Tomography

• AT&T (Duffield, Horowitz, Lo Presti, Towsley et al.)

• Rice (Coates, Nowak et al.)

• Evolution:

• loss, delay → topology

• Exact MLE → EM MLE → Heuristics

• Multicast → Unicast (striping)

• Traffic Matrix Tomography

• AT&T (Zhang, Roughan, Donoho et al.)

• Sprint ATL (Nucci, Taft et al.)

Page 13:

LOSS TOMOGRAPHY USING MULTICAST PROBING

[Figure: multicast probes are sent down a logical tree; the 0/1 outcomes recorded at the receivers (1 = received, 0 = lost) feed a loss estimator, which outputs the loss rates in the logical tree.]

Page 14:

THE LOSS MODEL

Node and Link Processes

• loss process on link

• probe 'bookkeeping' process for node

The stochastic loss process on a link acts deterministically on the probes arriving at it.

[Figure: example 0/1 sequences for the link loss process and the bookkeeping process.]

Page 15:

ADDING PROBABILITY: LOSS DEPENDENCIES

Independent Probes

• Spatial: loss processes on different links independent

• Temporal: losses within each link independent

Model reduces to a single parameter per link, the passage or transmission probability

Page 16:

FROM LINK PASSAGE TO PATH PASSAGE PROBABILITIES

Path probabilities: only ancestors matter

Sufficient to estimate path probabilities
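The relation between link and path passage probabilities can be sketched in a few lines of Python. This is a minimal illustration under the spatial independence assumption stated earlier; the names `parent`, `alpha`, and `A`, and the three-link tree, are hypothetical and not taken from the slides.

```python
def path_passage(parent, alpha):
    """Return A[k], the probability that a probe from the root reaches node k.

    parent maps each non-root node to its parent; alpha maps each non-root
    node k to the passage probability of the link ending at k.
    """
    A = {}
    for k in alpha:
        prob, node = 1.0, k
        while node in parent:          # only the ancestors of k matter
            prob *= alpha[node]
            node = parent[node]
        A[k] = prob
    return A


def link_passage(parent, A):
    """Invert path_passage: alpha[k] = A[k] / A[parent of k], with A[root] = 1."""
    return {k: A[k] / A.get(parent[k], 1.0) for k in A}


# Hypothetical tree: root 0 -> node 1 -> receivers 2 and 3.
parent = {1: 0, 2: 1, 3: 1}
alpha = {1: 0.9, 2: 0.8, 3: 0.5}
A = path_passage(parent, alpha)
```

Because `link_passage` recovers every link probability as a ratio of path probabilities, estimating the path probabilities is enough.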

Page 17:

ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE

Aim: estimate

[Figure: table of per-probe 0/1 receiver outcomes (1 = received, 0 = lost).]

Original MINC loss estimator for binary tree

[Cáceres, Duffield, Horowitz, Towsley 1999]

ACCESSING INTERNAL PATHS

Obtain a quadratic in
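For the simplest case, a tree with one shared path and two receivers, the estimate has a closed form. The sketch below is written in the spirit of the cited MINC estimator and is not the authors' code; the function name and the sample data are hypothetical. It assumes independent losses, so that the receiver probabilities factor as g1 = A·a1, g2 = A·a2 and g12 = A·a1·a2.

```python
def minc_two_receiver(outcomes):
    """Estimate passage probabilities in a two-receiver binary tree.

    outcomes is a list of (r1, r2) pairs, one per multicast probe, where
    1 means the receiver got the probe and 0 means it was lost.
    Returns (A, a1, a2): shared-path and the two leaf-link estimates.
    """
    n = len(outcomes)
    g1 = sum(r1 for r1, _ in outcomes) / n          # P(receiver 1 sees probe)
    g2 = sum(r2 for _, r2 in outcomes) / n          # P(receiver 2 sees probe)
    g12 = sum(r1 * r2 for r1, r2 in outcomes) / n   # P(both see probe)
    if g12 == 0:
        raise ValueError("no probe reached both receivers; estimate undefined")
    A = g1 * g2 / g12       # follows from g1 = A*a1, g2 = A*a2, g12 = A*a1*a2
    return A, g1 / A, g2 / A


# Hypothetical record of 10 probes.
outcomes = [(1, 1)] * 4 + [(1, 0)] * 2 + [(0, 1)] * 1 + [(0, 0)] * 3
A_hat, a1_hat, a2_hat = minc_two_receiver(outcomes)
```

With these counts g1 = 0.6, g2 = 0.5 and g12 = 0.4, so the shared-path estimate is 0.75.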

Page 18:

ESTIMATION: JOINT PATH PASSAGE PROBABILITY, SINGLE PROBE

OBTAIN PATHS RECURSIVELY

Page 19:

TEMPORAL INDEPENDENCE: HOW FAR TO RELAX?

Before

• Spatial: link loss processes independent

• Temporal: link loss processes Bernoulli

• Parameters: link passage probabilities

After

• Spatial: link loss processes independent

• Temporal: link loss processes stationary, ergodic

• Parameters: joint link passage probabilities over index sets

Full characterisation/identification possible!

Page 20:

TARGET LOSS CHARACTERISTIC

• Loss run-length distribution (density, mean)

• Importance

• Impacts delay-sensitive applications like VoIP (FEC tuning)

• Characterizes bottleneck links

[Figure: loss/pass bursts along a probe sequence, and a plot of Pr against loss-run length.]

Page 21:

ACCESSING TEMPORAL PARAMETERS: GENERAL PROPERTIES

• Sufficiency of joint link passage probabilities

• Mean Loss-run length

• Loss-run distribution
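As one illustration of sufficiency, the mean loss-run length of a stationary, ergodic 0/1 link process can be written using only the passage probability and the joint passage probability of two consecutive probes. The notation here is an assumption, not recovered from the slides: Z_i = 1 if probe i passes the link, and the symbols α and α with an index-set subscript are defined in the block itself.

```latex
\mu_{\text{loss}}
  = \frac{\Pr(Z_i = 0)}{\Pr(Z_i = 1,\, Z_{i+1} = 0)}
  = \frac{1 - \alpha}{\alpha - \alpha_{\{i,i+1\}}},
\qquad
\alpha = \Pr(Z_i = 1),\quad
\alpha_{\{i,i+1\}} = \Pr(Z_i = 1,\, Z_{i+1} = 1).
```

The numerator is the fraction of probes lost and the denominator is the rate at which loss runs begin, since a run starts exactly when a pass is followed by a loss.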

Page 22:

JOINT PASSAGE PROBABILITIES

• Joint link passage probability

• Joint path passage probability

Page 23:

ESTIMATION: JOINT PATH PASSAGE PROBABILITY

computed recursively over index sets

No. of equations equal to degree of node k

= 0 for

Page 24:

ESTIMATION: JOINT PATH PASSAGE PROBABILITY

• Estimation in general trees

• Requires solving polynomials with degree equal to the degree of node k

• Numerical computations for trees with large degree

• Recursion over smaller index sets

• Simpler variants

• Subtree-partitioning

• Requires solutions to only linear or quadratic equations

• No loss of samples

• Also simplifies existing MINC estimators

• Avoid recursion over index sets by considering only subsets of receiver events which imply

Page 25:

SIMULATION EXPERIMENTS

• Setup

• Loss process

• Discrete-time Markov chains

• On-off processes: pass-runs geometric, loss-runs truncated Zipf

• Estimation

• Passage probability

• Joint passage probability for a pair of consecutive probes

• Mean loss-run length:

• Relative error:
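The setup above can be mimicked on a single link with a short simulation. This is only a sketch: it applies the estimates directly to a simulated link sequence rather than through the tomographic estimator, and the transition probabilities, sequence length, and function names are hypothetical choices.

```python
import random


def simulate_markov_loss(n, p_loss_after_pass, p_pass_after_loss, seed=0):
    """Simulate n steps of a two-state chain; 1 = probe passes, 0 = probe lost."""
    rng = random.Random(seed)
    z, state = [], 1
    for _ in range(n):
        z.append(state)
        if state == 1:
            state = 0 if rng.random() < p_loss_after_pass else 1
        else:
            state = 1 if rng.random() < p_pass_after_loss else 0
    return z


def temporal_estimates(z):
    """Return (alpha, alpha_pair, mean loss-run length) estimated from z."""
    n = len(z)
    alpha = sum(z) / n                                            # passage probability
    alpha_pair = sum(a * b for a, b in zip(z, z[1:])) / (n - 1)   # both of a consecutive pair pass
    mean_run = (1 - alpha) / (alpha - alpha_pair)
    return alpha, alpha_pair, mean_run


def relative_error(estimate, truth):
    return (estimate - truth) / truth


z = simulate_markov_loss(200_000, p_loss_after_pass=0.05, p_pass_after_loss=0.5)
alpha_hat, pair_hat, run_hat = temporal_estimates(z)
true_alpha = 0.5 / (0.5 + 0.05)   # stationary pass probability of the chain
true_run = 1 / 0.5                # loss runs are geometric with mean 1/q
```

Calling `relative_error(run_hat, true_run)` then gives the kind of figure of merit listed on this slide.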

Page 26:

EXPERIMENTS

• Estimation for shared path in case of two-receiver binary trees

[Figure panels: Markov chain; on-off process]

Page 27:

EXPERIMENTS

• Estimation for shared path in case of two-receiver binary trees

[Figure panels: Markov chain; on-off process]

Page 28:

EXPERIMENTS

• Estimation for larger trees

[Figure label: Link 1]

Trees taken from router-level map of AT&T network produced by Rocketfuel (2253 links, 731 nodes)

Random shortest-path multicast trees with 32 receivers. Degree of internal nodes from 2 to 6, maximum height 6

Page 29:

VARIANCE

• Estimation for shared path in case of ternary tree

Standard errors shown for various temporal estimators

Page 30:

CONCLUSIONS

• Estimators for temporal loss parameters, in addition to loss rates

• Estimation of any joint probability possible for a pattern of probes

• Class of estimators to reduce computational burden

• Subtree-partition: simplifies existing MINC estimators

• Future work

• Asymptotic variance

• MLE for special cases (Markov chains)

• Hypothesis tests

• Experiments with real traffic

Page 31:

FRANÇOIS BACCELLI, SRIDHAR MACHIRAJU, DARRYL VEITCH, JEAN BOLOT

OPTIMAL PROBING IN CONVEX NETWORKS

Page 32:

PREVIOUS WORK

The Case Against Poisson Probing

• Zero sampling bias is not unique to Poisson (in the non-intrusive case)

• PASTA talks about bias, is silent on variance

• PASTA is not optimal for variance or MSE

• PASTA is about sampling only, is silent on inversion

Probe Pattern Separation Rule: select inter-probe (or probe pattern) separations as i.i.d. positive random variables, bounded away from zero.

Example: Uniform (i.i.d.) separations on [0.9μ, 1.1μ]

• Aims for variance reduction

• Avoids phase locking (which leads to sample-path 'bias')

• Allows freedom of probe stream design
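The uniform example of the Separation Rule is easy to generate. The following is a minimal sketch; the function name `probe_times`, the `spread` parameter, and the 10 ms mean spacing are hypothetical choices for illustration.

```python
import random


def probe_times(n, mean_sep, spread=0.1, seed=0):
    """Return n probe send times with i.i.d. uniform separations.

    Separations are drawn from [(1 - spread) * mean_sep, (1 + spread) * mean_sep],
    so every gap is at least (1 - spread) * mean_sep > 0.
    """
    if not 0 <= spread < 1:
        raise ValueError("spread must lie in [0, 1)")
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.uniform((1 - spread) * mean_sep, (1 + spread) * mean_sep)
        times.append(t)
    return times


times = probe_times(1000, mean_sep=0.01)   # hypothetical 10 ms mean spacing
gaps = [b - a for a, b in zip(times, times[1:])]
```

With `spread=0.1` this reproduces the interval [0.9μ, 1.1μ]: the gaps stay close to periodic while the randomness guards against phase locking.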

Page 33:

LIMITATIONS

• Results derived in context of delay only

• No optimality result (expected to be highly system and traffic dependent)

Page 34:

NEW WORK

• Extended all results to loss case (loss and delay in uniform framework)

• For convex networks, have universal optimum for variance

• But: sample-path bias

• Give family of probing strategies with

• Zero bias and strong consistency

• Variance as close to optimal as desired (tunable)

• Fast simulation

• Consistent with spirit of Separation Rule

But are networks convex?

• Some systems where the answer is known to be Yes

• virtual work of M/G/1

• loss process of M/M/1/K

• Insight into when No

• Real data where the answer seems to be Yes

Page 35:

TWO PROBLEMS: SAMPLING AND INVERSION

Sampling

• For end-to-end delay of a probe of size x

• Only have probe samples

Inversion

From the measured delay data of the perturbed network, may want:

• Unperturbed delays

• Link capacities

• Available bandwidth

• Cross traffic parameters at hop 3

• TCP fairness metric at hop 5, …

Here we focus on sampling only, and do so using non-intrusive probing

Page 36:

THE QUESTION OF GROUND TRUTH

Non-intrusive Delay

• Delay process using zero-sized probes still meaningful

• Each probe carries a delay: samples available

Non-intrusive Loss

• Losses with x=0 are hard to find: meaningful but useless

• Lost probes are not available (where? how?)

• What do probes which are not lost tell us about loss?

Cannot base non-intrusive probing on virtual (x=0) probes

Page 37:

GENERAL GROUND TRUTH

Approach

• Define end-to-end process directly in continuous time

• Must be a function of system in equilibrium: no probes, no perturbation!

• Non-intrusive probing defined directly as a sampling of this process

• Interpretation: what a probe would have seen if sent in at the sampling time

Note

• Process may be very general: not just isolated probes but probe patterns!

• Applies equally to loss or delay

• This is the only way: even virtual probes can be intrusive!

Page 38:

LOSS GROUND TRUTH EXAMPLES

1-hop

• FIFO, buffer size K bytes, droptail

• If packet based instead, becomes independent of x !

2-hop

• Depends on x

Other examples:

• Packet pair jitter

• Indicator of packet loss in a train

• Largest jitter in a chain, or number lost
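The 1-hop example can be made concrete with two small predicates. The slide's own formula did not survive, so this is an assumed reading of "FIFO, buffer size K bytes, droptail": a probe is lost when it does not fit in the remaining buffer space. Function and parameter names are hypothetical.

```python
def lost_byte_buffer(backlog_bytes, x, K):
    """Droptail FIFO with a K-byte buffer: a probe of x bytes is lost if it does not fit."""
    return backlog_bytes + x > K


def lost_packet_buffer(backlog_packets, K):
    """Droptail FIFO with room for K packets: the outcome ignores the probe size."""
    return backlog_packets >= K


# A 200-byte probe is dropped where a 100-byte probe would still fit.
big_probe_lost = lost_byte_buffer(backlog_bytes=900, x=200, K=1000)
small_probe_lost = lost_byte_buffer(backlog_bytes=900, x=100, K=1000)
```

Evaluating either predicate on the unperturbed backlog at a given time gives a loss ground truth of the kind described: what a probe would have experienced had it been sent then.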

Page 39:

NIPSTA: NON-INTRUSIVE PROBING SEES TIME AVERAGES

Strong Consistency: Empirical averages seen by probe samples converge to the true expectation of the continuous-time process, assuming:

• Stationarity and ergodicity of CT

• Stationarity and ergodicity of PT (easy)

• Independence of CT and PT (easy)

• Joint ergodicity between CT and PT

Result follows just as in delay work, using marked point processes, Palm Calculus and Ergodic Theory.

Not Restricted to Means:

• Temporal quantities also, eg jitter

• Probe-train sampling strategies

Page 40:

OPTIMAL VARIANCE OF SAMPLE MEAN

Covariance function

Sample Mean Estimator

Periodic Probing

Page 41:

PERIODIC PROBING IS OPTIMAL!

Compare integral terms: if R is convex, then can use Jensen’s inequality.

Problem:

• Periodic is not mixing, so joint ergodicity is not assured (phase lock)

• If phase lock occurs, get sample-path 'bias' (estimator not strongly consistent)

No amount of probe train design can beat periodic!
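The argument can be sketched as follows. The notation is assumed rather than recovered from the slide: X is the stationary ground-truth process with covariance function R, probes are sent at times T_1 < … < T_N of a renewal stream independent of X with mean separation μ, and f_k is the density of the sum of k consecutive separations.

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} X(T_i)\right)
  = \frac{R(0)}{N}
  + \frac{2}{N^{2}} \sum_{k=1}^{N-1} (N-k) \int_{0}^{\infty} R(t)\, f_k(t)\, dt ,
\qquad
\int_{0}^{\infty} R(t)\, f_k(t)\, dt \;\ge\; R\!\left(\int_{0}^{\infty} t\, f_k(t)\, dt\right) = R(k\mu).
```

For periodic probing the separations are deterministic, so each integral equals R(kμ) exactly and the lower bound is attained term by term.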

Page 42:

GAMMA RENEWAL PROBING

Gamma Law

Gamma Renewal Family:

• As the shape parameter increases, variance drops at constant mean

• Poisson is beaten once the shape parameter exceeds 1

• Process tends to periodic as the shape parameter tends to infinity

• Sensitivity to periodicities increases, however

Proof:

• Again uses Jensen’s inequality

• Uses a conditioning trick and a technical result on Gamma densities
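A numerical check of this ordering is possible in closed form for one convex covariance. The sketch below assumes R(t) = exp(-t), which is a choice made here for illustration and not taken from the talk; `beta` is an assumed name for the Gamma shape parameter, and all function names are hypothetical. For i.i.d. separations the expected covariance at lag k is the k-th power of a single-step factor `rho`.

```python
import math


def sample_mean_variance(n, rho, r0=1.0):
    """Variance of the mean of n probe samples when E[R(T_{i+k} - T_i)] = r0 * rho**k."""
    pair_sum = sum((n - k) * rho**k for k in range(1, n))
    return r0 / n + 2.0 * r0 * pair_sum / n**2


def rho_gamma(mean_sep, beta):
    """E[exp(-tau)] for one Gamma separation with shape beta and mean mean_sep."""
    return (1.0 + mean_sep / beta) ** (-beta)


def rho_periodic(mean_sep):
    """exp(-tau) for a deterministic separation tau = mean_sep."""
    return math.exp(-mean_sep)


n, mean_sep = 100, 1.0
var_poisson = sample_mean_variance(n, rho_gamma(mean_sep, 1.0))    # beta = 1 is Poisson
var_gamma10 = sample_mean_variance(n, rho_gamma(mean_sep, 10.0))
var_periodic = sample_mean_variance(n, rho_periodic(mean_sep))
```

Under these assumptions the three variances are ordered Poisson > Gamma (beta = 10) > periodic, and `rho_gamma` approaches `rho_periodic` as `beta` grows.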

Page 43:

TEST WITH REAL DATA FOR MEAN DELAY

Page 44:

EXAMPLE WITH REAL DATA

Page 45:

EXAMPLE WITH REAL DATA

Page 46:

CONCLUSION TO A MORNING

• Active measurement continues to develop!

• Grid applications will no doubt lead to interesting new problems

Thank you for your attention