Measuring the Internet, Part 2: Digging In
21.2.2007
Matti Siekkinen [[email protected]]
University of Oslo
Outline
Root Cause Analysis of TCP throughput
  What limits the throughput of a TCP connection?
Bandwidth estimation
  Capacity estimation
  Available bandwidth estimation
Root Cause Analysis of TCP throughput
Yin Zhang, Lee Breslau, Vern Paxson and Scott Shenker: On the Characteristics and Origins of Internet Flow Rates. SIGCOMM 2002
M. Siekkinen, G. Urvoy-Keller, E. Biersack, and T. En-Najjary: Root Cause Analysis for Long-Lived TCP Connections. CoNEXT 2005
M. Siekkinen, D. Collange, G. Urvoy-Keller, and E. Biersack: Performance Limitations of ADSL Users: A Case Study. PAM 2007
M. Siekkinen, G. Urvoy-Keller, and E. Biersack: On the Interaction Between TCP and Internet Applications. ITC 2007
Root Cause Analysis of TCP throughput
Motivation and Objectives
Taxonomy of TCP rate limitation causes
Approach to infer limitation causes
Case study on Performance Analysis of ADSL Clients
Conclusions
The Internet: over the last 5 years…
Traffic volumes and number of users have skyrocketed
Access link capacities have multiplied
Dominance shifted from Web+FTP to peer-to-peer applications
TCP still the dominating transport protocol
Carries over 90% of traffic
The Internet: questions raised
ISPs would like to know how their clients are doing
  What are the performance limitations that Internet applications are facing?
  Why does a client with 4 Mbit/s ADSL access obtain a total download rate of only a few KB/s with eDonkey?
  Why, after upgrading my link, do I see no improvement in throughput?
The Internet does not directly provide answers
The network is dumb!
Root Cause Analysis of TCP Throughput
What? Analysis and inference of the reasons that prevent a given TCP connection from achieving a higher throughput.
  These reasons are called limitation causes
Why TCP? TCP typically carries over 90% of all traffic
Root Cause Analysis of TCP throughput
Motivation and Objectives
Taxonomy of TCP rate limitation causes
Approach to infer limitation causes
Case study on Performance Analysis of ADSL Clients
Conclusions
What can limit TCP Throughput?
Application
TCP end point
  Receiver window limitation
Network
  Bottleneck link
TCP protocol
  Slow Start and Congestion Avoidance
An example of xplot time-sequence diagram
[Figure: xplot time-sequence diagram showing sent data packets, retransmitted data, received acknowledgments, outstanding bytes, the size of the receiver advertised window, the receiver advertised window limit, and pushed data packets marked with a diamond]
Limitation Causes: Application
The application does not even attempt to use all network resources
TCP connections are partitioned into two periods:
  Bulk Transfer Period (BTP): the application constantly provides data to transfer
    o TCP never runs out of data in buffer B1
  Application Limited Period (ALP): the opposite of a BTP
    o TCP has to wait for data because B1 is empty
[Figure: sender and receiver protocol stacks (application over TCP) connected through the network; buffer B1 sits between the sending application and TCP]
Application that sends small amounts of data at a constant rate
  Streaming applications
    Skype: Internet telephony application
    Web radios
  Throttling applications
    P2P: eDonkey (rate control by the user)
Application that sends larger bursts separated by idle periods
  BitTorrent, HTTP/1.1 (persistent)
[Figure: packet timeline with transfer periods separated by idle periods containing only keep-alive messages]
Limitation Causes: TCP Receiver
Receiver advertised window limits the rate
  max amount of outstanding bytes = min(cwnd, rwnd)
  ⇒ Sender is idle, waiting for ACKs to arrive
Flow control
  Sending application overflows the receiving application
  Buffer B2 is full
Configuration problem (unintentional)
  default receiver advertised window is set too low
  window scaling is not enabled
[Figure: sender and receiver protocol stacks connected through the network; buffer B2 sits between TCP and the receiving application]
Limitation Causes: Network
Limitation is due to congestion at a bottleneck link
  packets get dropped when buffers are full
  shared bottleneck: obtain only a fraction of its capacity
  non-shared bottleneck: obtain all of its capacity
Limitation Causes: TCP protocol
Limiting factor is TCP’s congestion avoidance or slow start algorithm
Transfer ends before the rate grows enough to hit limits set by network or receiving TCP
Root Cause Analysis of TCP throughput
Motivation and Objectives
Taxonomy of TCP rate limitation causes
Approach to infer limitation causes
Case study on Performance Analysis of ADSL Clients
Conclusions
Approach
Analyze passive traffic measurements
  Capture and store all TCP/IP headers, analyze later off-line
Observe traffic at a single measurement point
  Applicable in diverse situations
  E.g. at the entry point to an ISP's network
    o Know all about clients' downloads and uploads
Bidirectional packet traces
Connection-level analysis
Challenges (1/3)
Single measurement point anywhere along the path
  Cannot / don't want to control it
  Complicates estimation of parameters (RTT and cwnd)
[Figure: measurement point A near the sender sees RTT ≈ d1 (piece of cake…); measurement point B in the middle of the path sees RTT ≈ d3 + d4, but how to get d4? Did ack2 trigger data2?]
Challenges (2/3)
A lot of data to analyze
  Potentially millions of connections per trace
Deep analysis
  For each connection of each trace
    o Compute a lot of metrics
    o Divide connections into pieces
      • Analyze separately and compute more metrics
    o Need to keep track of everything
InTraBase
Challenges (3/3)
Find the right metrics to characterize all limitations
  Not too many
  Need to gather a lot of experience
Get it right!
  Several methods for computing a particular metric
    o Choose the "best" for the situation
    o Try to maximize correctness of results
    o E.g. 5 ways to estimate RTTs
  Careful validations
    o Benchmark with a lot of reference traces
    o Cross-validate metrics
Procedure to analyze a connection
Divide & Conquer
1. Partition connections into BTPs and ALPs
  o Filter out application impact
2. Analyze the bulk transfer periods for limitation by
  o TCP receiver
  o TCP protocol
  o Network
Methods are based on metrics computed from packet headers
Why filter out application effect?
Many TCP/IP-level traffic studies do not account for the application effect
  RTTs, burstiness…
  Try to study network properties but end up measuring application effect instead!
Distinguishing BTPs from ALPs: Isolate & Merge algorithm
Phase 1: Isolate
  Fact: TCP always tries to send MSS-size packets
  Consequence: small packets (size < MSS) and idle time indicate application limitation (see the sketch below)
    o The buffer between the application and TCP is empty
[Figure: packet timeline where ALPs are marked by idle times longer than an RTT, packets smaller than the MSS, and stretches with a large fraction of small packets]
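The following is a minimal Python sketch of the Isolate heuristic described above, assuming one connection's packets are available as (timestamp, payload size) pairs and that the MSS and an RTT estimate are given. The function names and the 50% small-packet threshold are illustrative assumptions, not the exact rules or parameters used in InTraBase.

```python
def isolate(packets, mss, rtt):
    """Split one connection's data packets into candidate BTPs and ALPs.

    packets: list of (timestamp, payload_size) tuples in time order.
    A new period starts whenever the idle time between two packets exceeds
    the RTT (TCP had nothing to send, so the application was the limit).
    """
    periods, start = [], 0
    for i in range(1, len(packets)):
        idle = packets[i][0] - packets[i - 1][0]
        if idle > rtt:  # idle time > RTT => buffer between application and TCP was empty
            periods.append((start, i - 1, label_period(packets[start:i], mss)))
            start = i
    periods.append((start, len(packets) - 1, label_period(packets[start:], mss)))
    return periods

def label_period(pkts, mss, small_fraction=0.5):
    """Label a period as ALP when a large fraction of its packets are smaller
    than the MSS (TCP only sends sub-MSS packets when the application limits it)."""
    small = sum(1 for _, size in pkts if 0 < size < mss)
    data = sum(1 for _, size in pkts if size > 0)
    return "ALP" if data == 0 or small / data > small_fraction else "BTP"
```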
Distinguishing BTPs from ALPs: Isolate & Merge algorithm
Phase 2: Merge
  Why?
    o After Isolate, BTPs may be separated by very short ALPs
    o Analyze the impact of the application
      • How much do ALPs decrease the overall throughput?
  How?
    o Merge subsequent transfer periods separated by an ALP to create a new BTP
    o Mergers controlled with a drop parameter
    o Iterate until all possible mergers are performed
BTP Analysis
1. Compute limitation scores for each BTP
  4 quantitative scores computed from various metrics
    o Each score ∈ [0,1]
    o Metrics: retransmission rate, packet inter-arrival time pattern, path capacity, RTT, etc., computed from packet headers
2. Perform classification of BTPs into limitation causes
  Map the combination of limitation scores to the dominant cause
  Threshold-based scheme
Example: Receiver window limitation score
Uses two time series:
  outstanding bytes (O)
  receiver advertised window (R)
  Computed over RTT-long intervals
  Computed using rwnd, TCP #seq, TCP #ack, packet timestamps, and RTTs
    o RTTs are themselves computed from the above parameters
Compute R - O for each pair of values
  indicates how close the sender is to the limit set by the receiver advertised window
  output 1 if R ≈ O, and 0 otherwise
The limitation score is the average value from the R-O comparison (see the sketch below)
  indicates the fraction of time being limited by the receiver advertised window
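A minimal sketch of this score, assuming the two time series are already aligned as one (R, O) sample per RTT interval; the closeness test (within one MSS) is an illustrative assumption rather than the exact rule from the paper.

```python
def rwnd_limitation_score(R, O, mss=1460):
    """Fraction of RTT intervals in which the outstanding bytes O nearly fill
    the receiver advertised window R, i.e. the sender is rwnd-limited.

    R, O: equal-length sequences of bytes, sampled once per RTT.
    """
    hits = [1 if (r - o) <= mss else 0 for r, o in zip(R, O)]
    return sum(hits) / len(hits) if hits else 0.0
```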
Example: Network limitation scores
Retransmission score
  fraction of retransmitted bytes
Dispersion score
  assesses the impact of the bottleneck on the throughput
  DS = 1 - tput/r, where r is the capacity of the path (see the sketch below)
  if DS is close to zero (tput ≈ r): non-shared bottleneck link
  else: shared bottleneck link
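A minimal sketch of these two scores, assuming the retransmitted and total byte counts, the achieved throughput, and a path capacity estimate are already available from the per-BTP metrics; the function names are illustrative.

```python
def retransmission_score(retransmitted_bytes, total_bytes):
    """Fraction of bytes that were retransmitted within the BTP."""
    return retransmitted_bytes / total_bytes if total_bytes else 0.0

def dispersion_score(throughput, path_capacity):
    """DS = 1 - tput/r: a value close to 0 suggests a non-shared bottleneck
    (the connection obtains the full capacity r); larger values suggest a
    shared bottleneck."""
    return 1.0 - throughput / path_capacity
```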
b score
high b score ⇒ TCP receiver limited
low b score ⇒ network limited
The b score relates to the time the sender is idle waiting for ACKs to arrive
[Figure: time-sequence diagram highlighting intervals where the sender waits for ACKs]
Classification scheme
4 thresholds need to be set
[Figure: threshold-based decision tree over the b score, dispersion score, retransmission score, and receiver window limitation score]
Classification: calibrating the thresholds
Difficult task: diversity vs. control
  Reference data needs to be representative & diverse enough
    o No simulations
  Need to control experiments in some way to get what we want
Reference data with partially controlled experiments
  Try to generate transfers limited by a certain cause
  FTP downloads from Fedora Core mirror sites
    o 232 sites covering all continents
  Artificial bottleneck links with rshaper
    o network limitation
  Nistnet to add delay
    o receiver limitation (Wr/RTT < bw)
  Control the number of simultaneous downloads
    o unshared vs. shared bottleneck
[Figure: experiment setup with downloads from mirror sites in Australia, Japan, Finland, and the USA over the Internet to Eurecom, passing through Rshaper and Nistnet]
Classification: calibrating the thresholds (example)
bottleneck set at 1 Mbit/s, 1 download at a time
[Figure: calibration plot from the controlled downloads, indicating where threshold th1 is set]
Root Cause Analysis of TCP throughput
Motivation and Objectives
Taxonomy of TCP rate limitation causes
Approach to infer limitation causes
Case study on Performance Analysis of ADSL Clients
Conclusions
Motivation
Stress test for our techniques
  Do we learn useful things?
Installed InTraBase at France Telecom to study traffic at their ADSL access network
Root cause analysis techniques implemented within InTraBase
Measurement Setup
24 hours of traffic on March 10, 2006
  290 GB of TCP traffic
  64% downstream, 36% upstream
Observed packets from ~3000 clients, analyzed only 1335
  Excluded clients that did not generate enough traffic for RCA
[Figure: measurement setup with two pcap probes placed between the Internet, the collect network, and the access network]
Warming up…
Connections
  Size distribution highly skewed
  Use only 1% of them for RCA
    o Represent > 85% of all traffic
Clients
  Heavy-hitters: 15% of clients generate 85-90% of traffic (up & down)
  Low access link utilization
    o Why?
Results of Limitation Analysis
Applied the root cause analysis algorithms
Obtained a striking result
  Application is the main performance limitation cause for over 80% of clients
  What's going on?
Application analysis: Application limited traffic
Quite stable and symmetric volumes
Over 80% of all traffic
eDonkey and "other" dominate
[Figure: application-limited traffic volume over time, broken down into P2P, eDonkey, and other]
Application analysis: Saturated access link
No recognized P2P
Asymmetric port 80/8080 downstream
Real Web traffic?
Connecting the evidence…
Most clients' performance limited by applications
Very low link utilization for application-limited traffic
Most of the application-limited traffic seems to be P2P
  Peers often have asymmetric uplink and downlink capacities
  P2P applications/users enforce upload rate limits
⇒ Most clients' download performance seems to suffer from P2P clients drastically limiting their upload rates
[Figure: a downloading client with low link utilization fetching from uploading clients that have low uplink capacity and upload rate limiters]
Conclusions
Causes can be on different layers and in different locations
  Application, TCP, or IP
  End hosts, network
Root causes of TCP throughput can be inferred using bidirectional packet traces at a single measurement point located anywhere on the TCP/IP path.
TCP root cause analysis techniques enable:
  performance evaluation of applications,
  evaluation of network utilization, and
  identification of TCP configuration problems.
Outline
Root Cause Analysis of TCP throughput
  What limits the throughput of a TCP connection?
Bandwidth estimation
  Capacity estimation
  Available bandwidth estimation
Bandwidth estimation
Ravi S. Prasad, Marg Murray, K. C. Claffy, Constantine Dovrolis: Bandwidth Estimation: Metrics, Measurement Techniques, and Tools. IEEE Network, 2003.
Vinay Ribeiro, Rolf Riedi, Jiri Navratil, Rich Baraniuk, Les Cottrell: PathChirp: Efficient Available Bandwidth Estimation. PAM 2003.
Manish Jain, Constantine Dovrolis: End-to-End Available Bandwidth: Measurement Methodology, Dynamics, and Relation with TCP Throughput. IEEE/ACM Transactions on Networking, 2003.
Constantine Dovrolis, Parmesh Ramanathan, David Moore: Packet Dispersion Techniques and Capacity Estimation. IEEE/ACM Transactions on Networking, 2004.
Rohit Kapoor, Ling-Jyh Chen, Li Lao, Mario Gerla, M. Y. Sanadidi: CapProbe: A Simple and Accurate Capacity Estimation Technique. ACM SIGCOMM 2004.
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Why is bandwidth estimation needed?
Route Selection in Overlays
Traffic Engineering
QoS Verification
What's the fuss about?
Routers and switches do not provide direct feedback to end-hosts (except ICMP, which is also of limited use)
  Mostly due to scalability, policy, and simplicity reasons
  Network administrators can read router/switch information using the SNMP protocol
End-to-end bandwidth estimation cannot be done in the above way
  No access because of administrative barriers
The Internet is a "black box"
End-systems can infer network state through end-to-end (e2e) measurements
  Without any feedback from routers
  Objectives: accuracy, speed, minimal intrusiveness
[Figure: probing packets sent between end-systems across the Internet cloud]
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Metrics and definitions
Capacity = maximum possible throughput
Available bandwidth = portion of capacity not currently used
Bulk transfer capacity = throughput that a new single long-lived TCP connection could obtain
We focus on capacity and available bandwidth estimation
Metrics and definitions
Example end-to-end path
[Figure: end-to-end path from a source host over link1 (access link), router1, link2, router2, and link3 (access link) to a destination host, with cross traffic entering and leaving at the routers]
Metrics and definitions
Capacity of this path is 100 Mbps
  Determined by the narrow link
Available bandwidth of this path is 50 Mbps
  Determined by the tight link
Per-link values (from the figure; see the sketch below):
  link1 (narrow link): capacity 100 Mbps, used 10 Mbps, available 90 Mbps
  link2: capacity 2500 Mbps, used 1200 Mbps, available 1300 Mbps
  link3 (tight link): capacity 1000 Mbps, used 950 Mbps, available 50 Mbps
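To make the definitions concrete, a minimal sketch using the per-link numbers above (the dictionary layout is illustrative): the path capacity is the minimum link capacity, and the path available bandwidth is the minimum per-link available bandwidth.

```python
# Per-link (capacity, used) in Mbps, taken from the example above.
links = {"link1": (100, 10), "link2": (2500, 1200), "link3": (1000, 950)}

# Path capacity: capacity of the narrow link (smallest capacity) -> 100 Mbps.
path_capacity = min(cap for cap, _ in links.values())

# Path available bandwidth: smallest unused capacity (tight link) -> 50 Mbps.
path_available_bw = min(cap - used for cap, used in links.values())

print(path_capacity, path_available_bw)
```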
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Packet pair technique
Originally due to Jacobson & Keshav
Send two equal-sized packets back-to-back
  Packet size: L
  Packet transmission time at link i: L/C_i
P-P dispersion: time interval between the last bits of the two packets
Without any cross traffic, the dispersion at the receiver is determined by the narrow link (see the sketch below):
  Δ_out = max(Δ_in, L/C_i)
  Δ_R = max_{i=1,…,H} (L/C_i) = L/C
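A minimal sketch of the resulting estimator under the no-cross-traffic assumption above: the capacity is simply the probe size divided by the measured dispersion. Names and units are illustrative; real tools send many pairs and analyze the distribution of such estimates.

```python
def packet_pair_capacity(probe_size_bits, arrival_first, arrival_second):
    """Capacity estimate C = L / dispersion for one back-to-back packet pair.

    Only valid when no cross traffic perturbs the pair; arrival times in seconds.
    """
    dispersion = arrival_second - arrival_first
    return probe_size_bits / dispersion  # bits per second

# Example: 1500-byte probes arriving 1.2 ms apart => 10 Mbps narrow link.
print(packet_pair_capacity(1500 * 8, 0.0, 0.0012))
```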
The problem: Cross traffic interference
Cross traffic packets can affect the P-P dispersion
  P-P expansion: capacity underestimation
  P-P compression: capacity overestimation
Noise in P-P distribution depends on cross traffic load
Effect: Multimodal packet pair distribution
Typically, the P-P distribution includes several local modes
One of these modes (not always the strongest) is located at L/C (the capacity mode, CM)
Sub-Capacity Dispersion Range (SCDR) modes:
  P-P expansion due to common cross traffic packet sizes (e.g., 40B, 1500B)
Post-Narrow Capacity Modes (PNCMs):
  P-P compression at links that follow the narrow link
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Packet train dispersion
Measure a train of N packets instead of a packet pair
What happens when we increase N?
  Range decreases and the distribution tends to unimodal
  Mode is at the Asymptotic Dispersion Rate (ADR)
[Figure: packet train dispersion distribution with its mode at the Asymptotic Dispersion Rate = 15 Mbps, below the capacity]
Asymptotic Dispersion Rate
ADR is not the capacity
ADR is not the available bandwidth
ADR is always less than capacity
Pathrate algorithm
Phase I:
  Form packet pair distribution β by doing many packet pair experiments
  Determine the set of local modes M of distribution β
Phase II:
  Form bandwidth distribution β(N) by doing many packet train experiments with trains of length N
  If β(N) is not unimodal, repeat the previous step with increased N
  [c-, c+] is the range of the unique mode
Estimate the capacity as one of the local modes of β (see the sketch below):
  Ĉ = m_k = min{ m_i ∈ M : m_i > c+ }
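A minimal sketch of the two-phase logic above, given lists of per-experiment bandwidth estimates (in Mbps) from packet pairs and from long packet trains. The histogram-based mode detector, the fixed bin width, and the use of the 90th percentile as the upper edge c+ of the ADR mode are simplifying assumptions for illustration; Pathrate's real mode detection and the "increase N until unimodal" loop are omitted.

```python
from statistics import quantiles

def local_modes(samples, bin_width):
    """Centers of local maxima of a simple histogram of bandwidth samples."""
    lo, hi = min(samples), max(samples)
    nbins = max(1, int((hi - lo) / bin_width) + 1)
    counts = [0] * nbins
    for s in samples:
        counts[int((s - lo) / bin_width)] += 1
    return [lo + (i + 0.5) * bin_width
            for i in range(1, nbins - 1)
            if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]]

def pathrate_estimate(pair_bw, train_bw, bin_width=1.0):
    """Phase I: local modes M of the packet-pair distribution.
    Phase II: train measurements (assumed already unimodal, i.e. taken with a
    long enough train) give the ADR mode, whose upper edge c+ is approximated
    here by the 90th percentile.  Capacity: smallest mode of M above c+."""
    modes = local_modes(pair_bw, bin_width)
    c_plus = quantiles(train_bw, n=10)[-1]
    candidates = [m for m in modes if m > c_plus]
    return min(candidates) if candidates else None
```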
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Ideal Packet Dispersion
No cross-traffic
Capacity = (Packet Size) / (Dispersion)
Expansion of Dispersion
Cross-traffic (CT) serviced between the PP packets
Second packet queues due to cross traffic ⇒ expansion of dispersion ⇒ under-estimation
More pronounced when CT packet size < probe packet size
Compression of Dispersion
First packet queueing ⇒ compressed dispersion ⇒ over-estimation
More pronounced when CT packet size > probe packet size
CapProbe approach
Observations:
  First packet queues more than the second
    o Compression
    o Over-estimation
  Second packet queues more than the first
    o Expansion
    o Under-estimation
  Both expansion and compression are the result of probe packets experiencing queuing
    o The sum of the PP delays includes queuing delay
Filter out PP samples that do not have minimum queuing time
The dispersion of the PP sample with the minimum delay sum reflects the capacity
CapProbe Observation
For each packet pair, CapProbe calculates the delay sum: delay(packet_1) + delay(packet_2)
A PP with the minimum delay sum points out the capacity (see the sketch below)
[Figure: packet pair samples; the sample with the minimum delay sum marks the capacity]
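A minimal sketch of CapProbe's filtering idea under the assumptions above: each sample carries the one-way delays of both probe packets and the measured dispersion, and the sample with the smallest delay sum is taken as unqueued. Names and units are illustrative.

```python
def capprobe_capacity(samples, probe_size_bits):
    """samples: list of (delay_pkt1, delay_pkt2, dispersion) tuples in seconds.

    The pair whose delay sum is minimal is assumed to have suffered no queuing
    on either packet, so its dispersion reflects the narrow-link capacity.
    """
    best = min(samples, key=lambda s: s[0] + s[1])  # minimum delay sum
    return probe_size_bits / best[2]                # capacity in bits/s
```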
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Probing method: Self-induced congestion
Sender transmits Self-Loading Periodic Streams (SLoPS)
  Periodic packet stream of rate R
  K packets, packet size L, interarrival T = L/R
Receiver measures the One-Way Delay (OWD) of each packet
  D(k) = t_arv(k) - t_snd(k)
  OWD variations: Δ(k) = D(k+1) - D(k)
    o Independent of the clock offset between sender and receiver
Basic principle:
  If R > A, then Δ(k) > 0 for all k, where A is the available bandwidth
  Else, Δ(k) = 0 for all k
Self-loading periodic streams
Increasing OWDs mean R > A
Almost constant OWDs mean R < A
Pathload iterative algorithm
1. Sender sends a SLoPS with rate R(n)
2. Receiver determines from the OWDs whether R(n) > A or not
3. Receiver notifies the sender
4. Sender sends a new SLoPS with rate R(n+1)
  If R(n) > A then R(n+1) < R(n), else R(n+1) > R(n)
  Specifically:
    If R(n) > A, Rmax = R(n)
    If R(n) ≤ A, Rmin = R(n)
    R(n+1) = (Rmax + Rmin)/2
5. Terminate the algorithm if Rmax - Rmin ≤ ω (see the sketch below)
  ω is the user-specified estimation resolution
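A minimal sketch of this rate iteration as a binary search. The send_and_check callable is a hypothetical placeholder standing in for the sender/receiver machinery: it sends one SLoPS at the given rate and returns True if the receiver saw an increasing OWD trend (the test on the next slide).

```python
def pathload_estimate(send_and_check, r_min, r_max, omega):
    """Binary search for the available bandwidth A within [r_min, r_max].

    send_and_check(rate): True if the stream at `rate` showed increasing OWDs
    (i.e. rate > A).  omega: desired estimation resolution.
    """
    while r_max - r_min > omega:
        rate = (r_max + r_min) / 2.0
        if send_and_check(rate):   # R(n) > A: lower the upper bound
            r_max = rate
        else:                      # R(n) <= A: raise the lower bound
            r_min = rate
    return (r_min + r_max) / 2.0   # A lies somewhere in [r_min, r_max]
```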
How does the receiver determine if R(n) > A?
Detection of an increasing OWD trend
  Partition the K measured (relative) OWDs D1, D2, …, DK into Γ = √K groups of √K consecutive OWDs
  Compute the median OWD D̂_k of each group
    o More robust to outliers and errors
Pairwise Comparison Test (PCT), see the sketch below:
  S_PCT = ( Σ_{k=2}^{Γ} I(D̂_k > D̂_{k-1}) ) / (Γ - 1), where I(X) = 1 if X holds and 0 otherwise
  o An increasing trend if S_PCT > 0.55
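A minimal sketch of the PCT from one stream's list of per-packet OWDs, following the definitions above (about √K groups, group medians, pairwise comparisons); the 0.55 threshold is the one stated on the slide.

```python
from math import isqrt
from statistics import median

def pct_increasing_trend(owds, threshold=0.55):
    """Pairwise Comparison Test on one stream's one-way delays."""
    k = len(owds)
    gamma = isqrt(k)                    # number of groups, ~ sqrt(K)
    if gamma < 2:
        raise ValueError("need more OWD samples")
    group_size = k // gamma
    medians = [median(owds[i * group_size:(i + 1) * group_size])
               for i in range(gamma)]
    increases = sum(1 for i in range(1, gamma) if medians[i] > medians[i - 1])
    s_pct = increases / (gamma - 1)
    return s_pct > threshold            # True => treat the stream as R(n) > A
```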
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Chirp probing train
Train of packets with exponentially decreasing packet spacing
  Wide range of probing rates
  Efficient: few packets
  E.g. spread factor γ = 1.1, 34 packets ⇒ 1-100 Mbps
Chirps vs. SLoPS
Same principle of self-induced congestion (SIC)
Multiple rates in each chirp train
  Allows one estimate per chirp
  Potentially more efficient estimation
CBR Cross-Traffic Scenario
The point where queuing delay begins to increase gives the available bandwidth
Bursty Cross-Traffic Scenario
Goal: exploit information in queuing delay signature
PathChirp Algorithm Overview
I. Per-packet-pair available bandwidth E_k (k = packet number)
II. Per-chirp available bandwidth (see the sketch below):
  D = Σ_k E_k Δt_k / Σ_k Δt_k
III. Smooth the per-chirp estimates over a sliding time window of size τ
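A minimal sketch of the per-chirp aggregation in step II: a weighted average of the per-packet estimates E_k, using the inter-packet spacings Δt_k as weights. The inputs are assumed to come from the excursion analysis described on the following slides.

```python
def per_chirp_estimate(E, dt):
    """Per-chirp available bandwidth D = sum(E_k * dt_k) / sum(dt_k).

    E:  per-packet-pair available bandwidth estimates E_k
    dt: corresponding inter-packet spacings (the weights)
    """
    return sum(e * t for e, t in zip(E, dt)) / sum(dt)
```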
Computing Ek: Preliminaries
Definitions:
  q_k = delay of packet k
  R_k = packet size / Δt_k = instantaneous rate at packet k
SIC principle:
  q_{k+1} > q_k ⇒ E_k < R_k
  q_{k+1} < q_k ⇒ E_k > R_k
Computing Ek: Excursions
Segment the delay signature into excursions
  q_k > 0 for several consecutive packets
Valid excursions are those consisting of at least L packets
  E_k < R_k applies only to valid excursions
Computing Ek: Different cases
• Valid excursion, increasing queuing delay: E_k < R_k; set E_k = R_k
• Valid excursion, decreasing queuing delay: E_k > R_k; set E_k = R_n
• Last excursion: E_k < R_k; set E_k = R_n
• Invalid excursions: set E_k = R_n
Comparison with Pathload
100 Mbps links; pathChirp uses 10 times fewer bytes for comparable accuracy

Available bandwidth | Accuracy: pathChirp 10-90% | Accuracy: pathload avg. min-max | Efficiency: pathChirp | Efficiency: pathload
30 Mbps | 19-29 Mbps | 16-31 Mbps | 0.35 MB | 3.9 MB
50 Mbps | 39-48 Mbps | 39-52 Mbps | 0.75 MB | 5.6 MB
70 Mbps | 54-63 Mbps | 63-74 Mbps | 0.6 MB | 8.6 MB
Bandwidth estimation
Introduction
Metrics and definitions
Capacity estimation
  Pathrate
  CapProbe
Available bandwidth estimation
  Pathload
  PathChirp
Conclusions
Wrapping up
Zillions of other estimation tools & techniques that we did not look at
  Abing, netest, pipechar, STAB, pathneck, IGI/PTR, abget, Spruce, pathchar, clink, pchar, PPrate, …
Some practical issues
  Traffic shapers
  Non-FIFO queues
More scalable methods
  Passive measurements instead of active measurements
    o E.g. PPrate (2006) for capacity estimation: adapts Pathrate's algorithm
  One measurement host instead of two cooperating ones
    o E.g. abget (2006) for available bandwidth estimation: adapts Pathload
That’s all folks…