1 cs 268: lecture 14 internet measurements scott shenker and ion stoica computer science division...

23
1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720-1776

Upload: dina-walsh

Post on 18-Jan-2018

214 views

Category:

Documents


0 download

DESCRIPTION

3  1 st slide: Title  2 nd slide: motivations and problem formulation -Why is the problem important? -What is challenging/hard about your problem  3 rd slide: main idea of your solution  4 th slide: status  5 th slide: future plans and schedule

TRANSCRIPT

Page 1: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

1

CS 268: Lecture 14

Internet Measurements

Scott Shenker and Ion StoicaComputer Science Division

Department of Electrical Engineering and Computer SciencesUniversity of California, Berkeley

Berkeley, CA 94720-1776

Page 2: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

2

Project Presentations

Five slides

Five minutes

Five questions

You’re outta there!

Page 3: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

3

1st slide: Title

2nd slide: motivations and problem formulation- Why is the problem important?- What is challenging/hard about your problem

3rd slide: main idea of your solution

4th slide: status

5th slide: future plans and schedule

Page 4: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

4

Poll

How many people know:

- Lamport clocks

- Timestamp vectors

- Byzantine agreement

- Epidemic/gossip dissemination algorithms

- Consistency

- Bayou

Page 5: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

5

Internet Measurements

This field went from casual sideline to serious science with Vern Paxson’s thesis in 1997

Now there is a massive literature in Internet measurements

Two devoted conferences (PAM and IMC) in addition to standard networking conferences

Page 6: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

6

Today’s Lecture

Provide a quick overview of the kinds of techniques used and questions addressed

Tom Anderson (UW) will lead us in a design exercise

Page 7: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

7

Philosophical Question

Why do we take Internet measurements?

Page 8: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

8

Measurement Techniques

Passive:- Traces- Netflow data- Network telescopes- BGP looking glass sites- ……

Active:- Ping- Traceroutes- ……

Page 9: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

9

Measurement Infrastructure

NIMI

Planetlab

Telescopes (CIED)

Traceroute servers

…..

Page 10: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

10

Varieties of Measurement Studies

Basic characterization from direct measurements- Raw statistics- Model fitting- Just because it is “direct” doesn’t mean it is easy!

• Look at the care taken by Paxson in the two readings

Deep inference- Estimating something that can’t be directly measured

Page 11: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

11

Examples of Direct Studies

Packet sizes and protocol type

Flow sizes

Route stability

DNS usage

Stationarity

…….

Page 12: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

12

A Few Results

Most flows are small, but most bytes in large flows

Most flows are slow, but most bytes are in fast flows- Rate/size highly correlated (not due to slow start)

Losses are reasonably modeled as Poisson arrival of loss events followed by geometric loss train

Internet routing does not yield good paths….

In 2003, Bytes: TCP 83% UDP 16% Packets: TCP 75% UDP 22% Flows: TCP 56% UDP 33%

Page 13: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

13

Traffic Variances

A Bellcore team discovered that ethernet traffic had strage behavior

No matter what time scale they averaged it over, the variances were still large!

How could that be?

Page 14: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

14

Self Similarity, LRD, and All That….

Correlation: r(t) = < x(s+t)x(s)> - <x(s)>2

Short-range correlation: Sum r(t) finite

Long-range correlation (LRD): Sum r(t) infinite

Example of LRD: r(t) ~ t-.5

Page 15: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

15

Aggregates and Variance

Consider a sum of M iid random variables:- Variance of sums ~ 1/M

Consider process with short-range correlations:- On large-enough time scales, intervals are iid- Variance should decay as 1/M

Only with LRD can the variance decay more slowly- Often as a power law with p<1

Page 16: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

16

LRD Comes from Many Sources?

Poisson arrivals, with long-tailed sizes is one mechanism- Most files small- Most bytes in large files

Pareto distributions are very common!

Page 17: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

17

Zipf and Pareto

Zipf: F(rank) ~ rank-b

- F is frequency, size, etc.- 1 < b

Pareto: P(size) ~ size-a

- 1 < a < 2- Log(size) has geometric distribution!- Pareto distributed interarrival times means logs are uncorrelated

a = 1+1/b and b = 1/(a-1)

The tail of the distribution can’t be ignored!

Page 18: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

18

Examples of Zipf/Pareto

Income distribution Asteroid sizes Letter frequencies

File sizes Telnet packet interarrivals Web access frequencies (impact on caching) AS connectivity

Page 19: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

19

Examples of Inference Tools

Bandwidth estimation

Critical path analysis

Network tomography

Flow rate causes

…..

Page 20: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

20

Bandwidth Estimation (I)

Send pair of packets back-to-back

If not intermediate packets intervene, their spacing is a function of the bandwidth of the bottleneck link

Send a series of packet-pairs, measure the minimal delays

Doesn’t work well: often overestimate capacity!

Can do better with many packets back-to-back

Page 21: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

21

Bandwidth Estimation (II)

Send variable sized packets, k hops (using TTL).

Delay at link i: Di = Ai + P/Ci

- Ai is size-independent delay

- P is packet size

- Ci is capacity of link

Model delay of k hops: Sum (i = 1 to k) Di

Page 22: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

22

Possible Causes of Flow Rates

Application: doesn’t generate data fast enough

Opportunity: never leaves slow-start

Receiver: limited dby receive window

Sender: limited by sending buffer

Bandwidth: using full bottleneck bandwidth

Congestion: flow is responding to packet loss

Transport: none of the above…

Page 23: 1 CS 268: Lecture 14 Internet Measurements Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences

23

Differentiating Causes

Take trace of flow

Estimate RTT

Look at window epochs

Identify where TCP is in cycle