trajectory sampling for direct traffic observation

Trajectory Sampling for Direct Traffic Observation

Matthias Grossglauser

joint work with Nick Duffield

AT&T Labs – Research

Traffic Engineering

overload!

Two large flows

Traffic Engineering

overload!

New egress point for first flow

Multi-homed customer

Traffic Engineering

overload!

OSPF shortest path splitting

Traffic Engineering

• Goal: domain-wide control & management to
  – Satisfy performance goals
  – Use resources efficiently
• Knobs:
  – Configuration & topology: provisioning, capacity planning
  – Routing: OSPF weights, MPLS tunnels, BGP policies, …
  – Traffic classification (diffserv), admission control, …
• Measurements are key: closed control loop
  – Characterize demand: what’s coming in?
  – Observe network state: how is the network reacting? (low-level adaptivity!)
  – Check performance: what’s the customer’s QoS?

Traffic Matrix vs. Path Matrix

• Traffic matrix
  – # bytes from ingress i to egress j
• Path matrix
  – Spatial flow of traffic through domain
  – # bytes for every path from i to j
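The difference between the two matrices can be sketched in a few lines of Python. This is only an illustration: the node names, paths, and byte counts below are made up, and the functions `traffic_matrix` and `path_matrix` are hypothetical helpers, not part of any tool mentioned in the slides.

```python
from collections import defaultdict

# Hypothetical flow records: (path as a tuple of node names, bytes carried).
flows = [
    (("A", "C", "D"), 500),
    (("A", "B", "D"), 300),
    (("B", "D"), 200),
    (("A", "C", "D"), 100),
]

def traffic_matrix(flows):
    """Bytes per (ingress, egress) pair; the interior of each path is ignored."""
    tm = defaultdict(int)
    for path, nbytes in flows:
        tm[(path[0], path[-1])] += nbytes
    return dict(tm)

def path_matrix(flows):
    """Bytes per full path, so different routes between the same endpoints stay separate."""
    pm = defaultdict(int)
    for path, nbytes in flows:
        pm[path] += nbytes
    return dict(pm)

tm = traffic_matrix(flows)
pm = path_matrix(flows)
```

In this toy data the traffic matrix has a single A-to-D entry, while the path matrix keeps the A-C-D and A-B-D routes apart, which is the extra spatial information the path matrix carries.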

flow 1 flow 2 flow 3 flow 4

Flow Measurement

• IP flow abstraction
  – Set of packets with “same” src and dest IP addresses
  – Packets that are “close” together in time (a few seconds)
• Cisco NetFlow
  – Router maintains a cache of statistics about active flows
  – Router exports a measurement record for each flow
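The flow abstraction above can be sketched as a simple grouping pass. This is a minimal illustration under stated assumptions, not NetFlow's actual cache logic: the packet tuple layout, the 5-second `timeout`, and the record fields are all hypothetical choices made for the example.

```python
def group_into_flows(packets, timeout=5.0):
    """Group packets into flow records.

    packets: iterable of (timestamp, src, dst, size), sorted by timestamp.
    A packet joins the open flow for its (src, dst) pair if it arrives within
    `timeout` seconds of that flow's last packet; otherwise it starts a new flow.
    """
    active = {}    # (src, dst) -> index of the currently open record
    records = []   # one dict per flow, in order of first packet
    for ts, src, dst, size in packets:
        key = (src, dst)
        idx = active.get(key)
        if idx is None or ts - records[idx]["last"] > timeout:
            records.append({"src": src, "dst": dst, "first": ts,
                            "last": ts, "packets": 0, "bytes": 0})
            idx = len(records) - 1
            active[key] = idx
        rec = records[idx]
        rec["last"] = ts
        rec["packets"] += 1
        rec["bytes"] += size
    return records

# Hypothetical packet trace: two a->b packets close in time, one much later,
# and one packet from a different source.
trace = [(0.0, "a", "b", 100), (1.0, "a", "b", 200),
         (10.0, "a", "b", 50), (10.5, "c", "b", 40)]
flow_records = group_into_flows(trace)
```

Each resulting record corresponds to one exported measurement record in the sense of the slide; the late a->b packet opens a second flow because it is not “close” in time to the first two.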

Inferring the Path Matrix from the Traffic Matrix

Network State Uncertainty

• Hard to get an up-to-date snapshot of…
• …routing
  – Large state space
  – Vendor-specific implementation
  – Deliberate randomness
  – Multicast
• …element states
  – Links, cards, protocols, …
• …element performance
  – Packet loss, delay at links

missing “down” alarms spurious down

missing alarms

Direct Traffic Observation

• Goal: direct observation
  – No network model & state estimation
• Basic idea:
  – Sample packets at each link
  – Sampling decision based on hash over packet content
  – Consistent sampling yields trajectories
  – Labels based on second hash function
• Exploit entropy in packet content to obtain a statistically representative set of trajectories

Sampling and Labeling

• Fields of interest collected only once
• Multicast: trajectory is a tree

Fields Included in Hashes

Collisions: Identical Packets

Sampling and Labeling Hashes

• x: subset of packet bits, represented as binary number
• Sampling hash
  – h(x) = x mod A
  – Sample if h(x) < r
  – r/A: thinning factor
• Labeling hash
  – g(x) = x mod M
• Make appropriate choice of A, M
  – predictable patterns should “mix” well
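The two hashes above can be sketched directly. The moduli below are assumptions made for the example (the slides only ask that A and M make predictable patterns mix well; prime values are one common way to aim for that), and `packet_bits_to_int` and `observe` are hypothetical helpers. Which packet bits go into x is not decided here.

```python
A = 10007   # hypothetical sampling modulus (a prime, chosen for illustration)
R = 100     # sample if h(x) < R, so the thinning factor R/A is roughly 1%
M = 65521   # hypothetical label alphabet size (also a prime)

def sampling_hash(x, a=A):
    """h(x) = x mod A."""
    return x % a

def is_sampled(x, a=A, r=R):
    """Sample the packet if h(x) < r."""
    return sampling_hash(x, a) < r

def label(x, m=M):
    """g(x) = x mod M."""
    return x % m

def packet_bits_to_int(selected_bytes):
    """Interpret the selected packet bytes as one big binary number x."""
    return int.from_bytes(selected_bytes, "big")

def observe(link_packets):
    """Labels a measurement point would report for the packets crossing one link."""
    xs = (packet_bits_to_int(p) for p in link_packets)
    return {label(x) for x in xs if is_sampled(x)}

# Made-up packet contents; link 2 carries a subset of link 1's packets.
packets = [bytes([i % 256, (i * 7) % 256, (i * 13) % 256, i // 256])
           for i in range(2000)]
labels_link1 = observe(packets)
labels_link2 = observe(packets[::2])
```

Because the decision depends only on packet content, every link makes the same choice for the same packet, so a sampled packet is reported everywhere along its path. That is what lets the collector stitch the per-link labels into trajectories.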

Pseudo-Random Sampling

• Goal: infer metrics of interest from trajectory samples
  – E.g., what fraction of traffic of customer x on a link y?
• Question: is sample set statistically representative?
  – Obvious for “really random” sampling
  – Distribution of a field in the sampled subset = real distribution?
  – In other words: does the complement of the field provide enough entropy?

Quality of Deterministic Sampling

• Experiment: statistical test to check if sampled and full distributions are close
  – Chi-square statistic to verify independence hypothesis
  – Hypothesis: sampled distribution consistent with full distribution
  – Confidence level C(T) for hypothesis, where T is the chi-square statistic and C is the cdf of χ² with I-1 degrees of freedom
  – n_j: # samples in bin j
  – m_ij: # (un)sampled packets in bin j

[Table: contingency table of the counts m_ij and n_j]
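A standard Pearson chi-square statistic for such a table can be sketched as follows. The layout is an assumption, since the slide's table did not survive extraction: one row of unsampled counts and one row of sampled counts, with one column per bin. The function name is hypothetical. With 2 rows and I bins, the test has (2-1)(I-1) = I-1 degrees of freedom, which matches the slide.

```python
def chi_square_statistic(unsampled, sampled):
    """Pearson chi-square statistic T for a 2 x I contingency table.

    unsampled[j], sampled[j]: packet counts in bin j (assumed layout).
    Returns (T, degrees_of_freedom).
    """
    if len(unsampled) != len(sampled):
        raise ValueError("both rows need the same number of bins")
    rows = [unsampled, sampled]
    row_totals = [sum(row) for row in rows]
    col_totals = [u + s for u, s in zip(unsampled, sampled)]
    total = sum(row_totals)
    t = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count if the sampling decision is independent of the bin.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                t += (observed - expected) ** 2 / expected
    return t, len(unsampled) - 1

# Made-up counts: the first table is exactly proportional, the second is skewed.
t_even, dof_even = chi_square_statistic([90, 180, 270], [10, 20, 30])
t_skew, dof_skew = chi_square_statistic([50, 50], [10, 40])
```

If SciPy is available, C(T) can then be obtained as `scipy.stats.chi2.cdf(T, dof)`; that step is left out here to keep the sketch dependency-free.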

Chi-square Test on Source Address

If C(T) [?] 1, then accept hypothesis

Bitwise Independence

• 2x2 contingency table formed by
  – sampling decision
  – l-th bit of packet

Optimal Sampling

• Fix amount of measurement traffic c per time period

• Problem:
  – n: number of samples in sampling period
  – M: alphabet size, m=log2(M) bits/label
  – nm: total amount of measurement traffic [bits]
  – Goal: maximize # unique labels, subject to nm<c
• Result:
  – optimal alphabet size M*=c log(2)
  – optimal number of samples n*=M*/log(M*)
  – example: c=1Mb/period
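The result can be evaluated numerically for the example budget. This sketch makes several assumptions that are not stated in the slides: 1 Mb is taken as 10^6 bits, "log" is read as the natural logarithm (which makes n* times log2(M*) equal to c), and the expected number of unique labels is modelled as n(1 - 1/M)^(n-1), which holds if labels are independent and uniform over the M values.

```python
import math

def optimal_alphabet_size(c_bits):
    """M* = c log(2), with log read as the natural logarithm (assumption)."""
    return c_bits * math.log(2)

def optimal_num_samples(c_bits):
    """n* = M* / log(M*)."""
    m_star = optimal_alphabet_size(c_bits)
    return m_star / math.log(m_star)

def samples_within_budget(c_bits, alphabet_size):
    """Largest n with n * log2(M) = c for a given alphabet size M."""
    return c_bits / math.log2(alphabet_size)

def expected_unique_labels(n, alphabet_size):
    """Expected number of labels seen exactly once.

    Assumes independent, uniformly distributed labels: a sample's label is
    unique if none of the other n - 1 samples picked the same value.
    """
    return n * (1 - 1 / alphabet_size) ** (n - 1)

c = 1_000_000                      # assumed meaning of "1Mb/period"
m_star = optimal_alphabet_size(c)
n_star = optimal_num_samples(c)

def unique_at(alphabet_size):
    return expected_unique_labels(samples_within_budget(c, alphabet_size), alphabet_size)

unique_opt, unique_small, unique_large = unique_at(m_star), unique_at(m_star / 4), unique_at(4 * m_star)
```

Under these assumptions M* is roughly 693,000 (about 19.4 bits per label) and n* is roughly 51,500 samples per period. A four times smaller alphabet loses unique labels to collisions, and a four times larger one wastes the budget on longer labels.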

Label Collisions and Trajectory Ambiguity

Ambiguity cont.

• Rule for acyclic subgraphs + unicast packets:
  – unambiguous if each connected component of the subgraph is
    • (a) a source tree
    • (b) a sink tree without loss

Inference Experiment

• Experiment: infer from trajectory samples
  – Estimate fraction of traffic from customer
  – Source address customer
  – Source address sampling + label
• Fraction of customer traffic on backbone link:

  estimate = (# unique labels common on b, c) / (# unique labels on b)
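The estimator above can be sketched with plain sets. Two assumptions are made here: b is read as the backbone link and c as the customer's link, and a label counts as "unique" on a link if it was reported exactly once there in the measurement period, so that colliding labels are discarded. The function names are hypothetical.

```python
from collections import Counter

def unique_labels(labels):
    """Labels reported exactly once on a link (assumed meaning of 'unique')."""
    counts = Counter(labels)
    return {lab for lab, k in counts.items() if k == 1}

def estimate_customer_fraction(labels_on_b, labels_on_c):
    """(# unique labels common on b, c) / (# unique labels on b).

    Returns None when no unique label was seen on b, since the ratio is undefined.
    """
    unique_b = unique_labels(labels_on_b)
    unique_c = unique_labels(labels_on_c)
    if not unique_b:
        return None
    return len(unique_b & unique_c) / len(unique_b)

# Made-up label reports: label 5 collides on b, label 9 collides on c.
labels_b = [1, 2, 3, 4, 5, 5, 6, 7]
labels_c = [1, 2, 9, 9, 10]
fraction = estimate_customer_fraction(labels_b, labels_c)
```

In this toy input six labels are unique on b and two of them also appear as unique labels on c, so the estimate is one third.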

Estimated Fraction (c=1000bit)

Estimated Fraction (c=10kbit)

Sampling Device

MPLS: simple additional logic to look “behind” label stack

Sampling Device Implementation

• Interface vs. processing speed
  – OC-192: 10 Gbps
  – State of the art DSP:
    • Proc: 600M MACs x 32 bit: 20 Gbps
    • I/O: 300MHz x 256 bit: 70 Gbps
  – Moore’s law vs. interface speed growth
• Vendor interest: Cisco, Juniper, Avici

Summary

• Advantages
  – Trajectory sampling estimates path matrix… and other metrics: loss, link delay
  – Direct observation: no routing model + network state estimation
  – No router state
  – Multicast (source tree), DDoS (sink tree)
  – Control over measurement overhead
  – Small measurement delay
• Disadvantages
  – Requires support on linecards
• Open questions & research problems
  – Collection, storage, querying (in progress)
  – Management interface
