Advanced Networking Technologies (TU Ilmenau)
Chapter 7: Transport Layer Evolution
Advanced Networking (SS 17): 07 - Transport Layer Evolution
Content

• TCP congestion control schemes
• Multipath TCP
• SCTP
• SPDY and HTTP/2
• QUIC
TCP congestion control: What is the problem?

• Relies on packets getting lost!
• Root cause of all the buffer problems
• Degrades quality for other services
• Takes some time to measure with large buffers
• Assumes packets get lost due to congestion
  ■ What about wireless?
• Direct dependency on the bandwidth/delay product
  ■ Maximum window of 64 KB out of the box
  ■ Problems utilizing intercontinental links
  ■ Problems utilizing 10 Gb/s links
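A quick back-of-the-envelope sketch of the last two points (the RTT and link-speed figures are illustrative assumptions, not from the slides): with the classic 64 KB window, throughput is capped at window/RTT no matter how fast the link is.

```python
# The throughput of one TCP connection is bounded by window / RTT.
def max_throughput_mbps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on TCP throughput for a given window and round-trip time."""
    return window_bytes * 8 / rtt_s / 1e6

# Classic 64 KB window on an intercontinental path (~200 ms RTT):
print(max_throughput_mbps(64 * 1024, 0.200))  # ~2.6 Mb/s, far below 10 Gb/s
# Window needed to fill a 10 Gb/s link at 200 ms RTT (the BDP), in MB:
print(10e9 / 8 * 0.200 / 1e6)                 # 250.0, i.e. a 250 MB window
```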
TCP congestion control: Significance of the problem

• Wireless routers implemented TCP stitching ("normal" TCP on one side; I-TCP, METP, SNOOP or similar on the other)
• Nowadays:
  ■ Router vendors implement AQM strategies (discussed earlier)
  ■ Link layer has repeated transmits & PAUSE frames
  ■ WAN optimizers (decrease number of ACKs etc.)
➡ Everything is built around TCP
TCP evolution

• Can we let TCP evolve? Yes, but it takes time
  ■ New algorithms may make use of existing protocol fields
  ■ New fields via extensions
• RFCs take time to reach publication
• Need to be adopted by OSes & testers (chicken-or-egg problem)
• Must not break existing TCP algorithms
• Must not mess with fairness
• Major improvements these days: OS vendors "simply" implement new strategies
  ■ CTCP
  ■ CUBIC
  ■ BBR
TCP SACK option

• Introduces Selective Acknowledgments
• RFC 2018 from 1996
• Redefined in RFC 3517 from 2003 & RFC 6675 from 2012
• Supported by all main operating systems
• Negotiated during handshake
• Simple solution?! Problem solved?!
TCP SACK option – Implementation errors

OS                          | Misbehaviors (classes A1, A2, B–G)
FreeBSD 5.3-5.4             | ❌ ❌
FreeBSD 6.0-8.0             |
Linux 2.2.20-2.6.18         | ❌
Linux 2.6.31                |
MacOS X 10.5-10.6           |
OpenBSD 4.2-4.8             | ❌ ❌
OpenSolaris 2008.05-2009.06 | ❌ ❌
Solaris 10                  | ❌
Solaris 11                  | ❌
Windows 2000-2003           | ❌ ❌ ❌ ❌ ❌
Windows Vista-7             | ❌ ❌

➡ Degraded performance, but eventually consistent as timeouts delete SACK state (luckily)

Ekiz et al.: Misbehaviors in TCP SACK Generation, ACM SIGCOMM Computer Communication Review, 2011
TCP SACK option – Gain?

[Figure: average throughput [Mb/s] (0.00–4.00) vs. burst error rate (1E-07 to 1E-03) for Tahoe, Reno, NewReno, SACK, and Westwood+]

Nguyen et al.: An Implementation of the SACK-Based Conservative Loss Recovery Algorithm for TCP in ns-3 (extended version of WNS3), 2015
C-TCP

• Compound TCP, introduced by Microsoft in 2005
• 3 RFC drafts posted until November 2008
• Enabled by default in Windows server editions; must be enabled manually on clients
• Idea: two congestion windows
  ■ "Normal" loss-based one
  ■ One for delay, which also estimates the bottleneck queue
  ■ Summed up, i.e., win = min(cwnd + dwnd, awnd)
• Delay-based congestion window increases quickly when a long delay is observed
• Decreases to 0 afterwards to reach "normal" steady-state behavior
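A rough sketch of the window combination above. Only win = min(cwnd + dwnd, awnd) is from the slide; the function names, the backlog input, and the simple grow/shrink rules for dwnd are our assumptions (Tan et al. use a binomial increase and an estimated backlog threshold γ).

```python
def ctcp_send_window(cwnd: int, dwnd: int, awnd: int) -> int:
    """C-TCP sending window: loss-based cwnd plus delay-based dwnd,
    capped by the receiver's advertised window awnd."""
    return min(cwnd + dwnd, awnd)

def update_dwnd(dwnd: int, backlog: float, gamma: float) -> int:
    # If the estimated queue backlog exceeds gamma, the path looks congested:
    # shrink dwnd toward 0 so the flow falls back to normal TCP behaviour.
    if backlog >= gamma:
        return max(dwnd - int(backlog), 0)
    # Otherwise grow dwnd to exploit spare bandwidth (simplified here).
    return dwnd + 1

print(ctcp_send_window(cwnd=10, dwnd=5, awnd=12))  # advertised window caps: 12
```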
C-TCP – Window behavior

[Figure: window evolution over time t for TCP vs. CTCP, with the delay window DWND shown separately]

Tan et al.: A Compound TCP Approach for High-speed and Long Distance Networks, Microsoft Technical Report, 2005

C-TCP assumes a backlog of γ packets at the bottleneck
C-TCP – Throughput in lossy networks

[Figure: throughput (0–700 Mbps) vs. packet loss rate (0.01 down to 0) for Regular TCP, HSTCP, and CTCP]
C-TCP – Fairness

[Figure: bandwidth stolen (0–0.8) vs. packet loss rate (0.01 down to 0) for CTCP and HSTCP]
CUBIC

• Developed from an older, more complex algorithm: BIC
• Today's standard in Linux and MacOS
• Ideas:
  ■ Decrease queues in routers, send data at the expected bandwidth
  ■ Aggressively increase bandwidth periodically to probe for more
  ■ Scale the window using a cubic function
A. BIC Window Growth Function

Before delving into CUBIC, let us examine the features of BIC. The main feature of BIC is its unique window growth function.
Fig. 1 shows the growth function of BIC. When it gets a
packet loss event, BIC reduces its window by a multiplicative
factor β. The window size just before the reduction is set to the
maximum Wmax and the window size just after the reduction is
set to the minimum Wmin. Then, BIC performs a binary search
using these two parameters – by jumping to the “midpoint”
between Wmax and Wmin. Since packet losses have occurred at
Wmax, the window size that the network can currently handle
without loss must be somewhere between these two numbers.
However, jumping to the midpoint could be too much
increase within one RTT, so if the distance between the
midpoint and the current minimum is larger than a fixed
constant, called Smax, BIC increments the current window size
by Smax (linear increase). If BIC does not get packet losses at the
updated window size, that window size becomes the new
minimum. If it gets a packet loss, that window size becomes the
new maximum. This process continues until the window
increment is less than some small constant called Smin at which
point, the window is set to the current maximum. So the
growing function after a window reduction will be most likely
to be a linear one followed by a logarithmic one (marked as
“additive increase” and “binary search” respectively in Fig. 1).
If the window grows past the maximum, the equilibrium
window size must be larger than the current maximum and a
new maximum must be found. BIC enters a new phase called
“max probing.” Max probing uses a window growth function
exactly symmetric to those used in additive increase and binary
search – only in a different order: it uses the inverse of binary
search (which is logarithmic; its reciprocal will be exponential)
and then additive increase. Fig. 1 shows the growth function
during max probing. During max probing, the window grows
slowly initially to find the new maximum nearby, and after
some time of slow growth, if it does not find the new maximum
(i.e., packet losses), then it guesses the new maximum is further
away so it switches to a faster increase by switching to additive
increase where the window size is incremented by a large fixed
increment.
The good performance of BIC comes from the slow increase
around Wmax and linear increase during additive increase and
max probing.
B. CUBIC Window Growth Function

Although BIC achieves pretty good scalability, fairness, and stability in current high-speed environments, BIC's growth function can still be too aggressive for TCP, especially
under short RTT or low speed networks. Furthermore, the
several different phases of window control add a lot of
complexity in analyzing the protocol. We have been searching
for a new window growth function that while retaining most of
strengths of BIC (especially, its stability and scalability),
simplifies the window control and enhances its TCP
friendliness.
In this paper, we introduce a new high-speed TCP variant:
CUBIC. As the name of the new protocol represents, the
window growth function of CUBIC is a cubic function, whose
shape is very similar to the growth function of BIC. CUBIC is
designed to simplify and enhance the window control of BIC.
More specifically, the congestion window of CUBIC is
determined by the following function:
W_cubic(t) = C (t − K)³ + W_max        (1)

where C is a scaling factor, t is the elapsed time since the last
window reduction, W_max is the window size just before the last
window reduction, and K = ∛(W_max β / C), where β is a constant
multiplicative decrease factor applied for window reduction at the
time of a loss event (i.e., the window reduces to βW_max at the
time of the last reduction).
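Equation (1) as a small sketch, using the paper's form of K as stated above (RFC 8312 later writes K with (1 − β) in place of β); parameter defaults follow the paper's evaluation settings:

```python
def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.8) -> float:
    """W_cubic(t) = C (t - K)^3 + W_max with K = cbrt(W_max * beta / C)."""
    k = (w_max * beta / c) ** (1.0 / 3.0)
    return c * (t - k) ** 3 + w_max

# The plateau of the cubic sits exactly at W_max (t = K):
k = (100 * 0.8 / 0.4) ** (1.0 / 3.0)
print(round(cubic_window(k, w_max=100), 6))  # 100.0
# Concave growth toward W_max below K, convex "max probing" above K:
print(cubic_window(k - 2, 100) < 100 < cubic_window(k + 2, 100))  # True
```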
Fig. 2 shows the growth function of CUBIC with the origin
at Wmax. The window grows very fast upon a window reduction,
but as it gets closer to Wmax, it slows down its growth. Around
Wmax, the window increment becomes almost zero. Above that,
CUBIC starts probing for more bandwidth in which the
window grows slowly initially, accelerating its growth as it
moves away from Wmax. This slow growth around Wmax
enhances the stability of the protocol, and increases the
utilization of the network while the fast growth away from Wmax
ensures the scalability of the protocol.
The cubic function ensures the intra-protocol fairness among
the competing flows of the same protocol. To see this, suppose
that two flows are competing on the same end-to-end path. The
Fig. 1: The Window Growth Function of BIC (additive increase, binary search, max probing around W_max)

Fig. 2: The Window Growth Function of CUBIC (steady-state behavior and max probing around W_max)
Rhee et al.: CUBIC: A New TCP-Friendly High-Speed TCP Variant, ACM SIGOPS Operating Systems Review 42(5), 2008
W_cubic = C (t − ∛(W_max β / C))³ + W_max
CUBIC – Throughput over time

III. PERFORMANCE EVALUATION

In this section, we present some performance results regarding the TCP friendliness and stability of CUBIC and other high-speed TCP variants. For CUBIC, we set β to 0.8, C to 0.4, and Smax to 160. We use NS-2 for simulation. The network topology is a dumbbell. For each simulation run, we run four flows of a high-speed protocol and four flows of regular long-term TCP SACK over the same end-to-end paths for the entire duration of the simulation; their starting times and RTTs are slightly varied to reduce the phase effect. About 10% of background traffic is added in both forward and backward directions of the dumbbell setup. For all the experiments, unless noted explicitly, the buffer size of Drop Tail routers is set to 100% of BDP.

Experiment 1: TCP Friendliness in Short-RTT Networks (simulation script available on the BIC web site):
We test five high speed TCP variants: CUBIC, BIC, HSTCP,
Scalable TCP, and HTCP. We set RTT of the flows to be around 10 ms and vary the bottleneck bandwidth from 20 Mbps to 1 Gbps. Fig. 5 shows the throughput ratio of the long-term TCP flows over the high-speed flows (or TCP friendly ratio) measured from these runs.
The surprising result is that BIC and STCP show even worse TCP friendliness over 20 Mbps than over 100 Mbps. However, we are still not sure of the exact reason for this result. Over 100 Mbps, all the high-speed protocols show reasonable friendliness to TCP. As the bottleneck bandwidth increases from 100 Mbps to 1 Gbps, the ratios for BIC, HSTCP and STCP drop dramatically, indicating unfair use of bandwidth with respect to TCP. Under all these environments, regular TCP can still use the full bandwidth. Scalable TCP shows the worst TCP friendliness in these tests, followed by BIC and HSTCP. CUBIC and HTCP consistently give good TCP friendliness.
Experiment 2: TCP Friendliness in Long-RTT Networks (Simulation script available in the BIC web site)
Although the TCP mode improves the TCP friendliness of
the protocol, it does so mostly for short RTT situations. When the BDP is very large with long RTT, the aggressiveness of the window growth function (more specifically, the congestion epoch length) has more decisive effect on the TCP friendliness. As the epoch gets longer, it gives more time for TCP flows to grow their windows.
An important feature of BIC and CUBIC is that it keeps the epoch fairly long without losing scalability and network utilization. Generally, in AIMD, a longer congestion epoch means slower increase (or a smaller additive factor). However, this would reduce the scalability of the protocol, and also the network would be underutilized for a long time until the window becomes fully open (Note that it is true only if the multiplicative decrease factor is large; but we cannot keep the multiplicative factor too small since that implies much slower convergence to the equilibrium). Unlike AIMD, CUBIC increases the window to (or its vicinity of) Wmax very quickly and then holds the window there for a long time. This keeps the scalability of the protocol high, while keeping the epoch long and utilization high. This feature is unique both in BIC and CUBIC.
In this experiment, we vary the bottleneck bandwidth from 20Mbps to 1Gbps, and set RTT to 100ms. Fig. 6 shows the throughput ratio of long-term TCP over high-speed TCP variants. Over 20 Mbps, all the high speed protocols show reasonable friendliness to TCP. As the bandwidth gets larger than 20 Mbps, the ratio drops quite rapidly. Overall, CUBIC shows a better friendly ratio than the other protocols.
Experiment 3: Stability (simulation script available on the BIC web site)

Fig. 5: TCP-Friendly Ratio in Short-RTT Networks [TCP/high-speed throughput ratio (0–1.2) vs. link speed (20–1000 Mbps) for CUBIC, BIC, HSTCP, STCP, HTCP]
Fig. 4: CUBIC window curves with competing flows (NS simulation in a network with 500Mbps and 100ms RTT), C = 0.4, β = 0.8.
Fig. 6: TCP-Friendly Ratio in Long-RTT Networks [TCP/high-speed throughput ratio (0–1.2) vs. link speed (20–1000 Mbps) for CUBIC, BIC, HSTCP, STCP, HTCP]
CUBIC – Maybe fair, but takes forever

[Figure: cwnd (packets) vs. time (s) for two flows, "Convergence 10 Mbit/s Bottleneck", "Convergence 250 Mbit/s Bottleneck", and a 500 Mbit/s run]

Fig. 2. Cubic TCP cwnd time histories following startup of a second flow. Bandwidth is 10 Mbit/s (top), 250 Mbit/s (middle) and 500 Mbit/s (bottom). RTT is 200 ms, queue size 100% BDP, no web traffic.
This effect is reinforced by changes to the AIMD backoff factor. Standard TCP flows back off cwnd by 0.5 on detecting packet loss. Strategies such as BIC-TCP and Cubic-TCP instead use a backoff factor of 0.8. As a result, flows release bandwidth more slowly when informed of congestion, again having the effect of slowing convergence.

B. Slow convergence implies prolonged unfairness.

One consequence of slow convergence is that periods of extreme unfairness between flows may persist for long periods, even in situations where flows do eventually converge to fairness. Such situations are masked when fairness results are presented purely in terms of long-term averages. However, this behaviour is immediately evident, for example, in the time histories shown in Figure 2, and it seems clear that it has important practical implications. For example, two identical file transfers may have very different completion times depending on the order in which they are started. Also, long-lived flows can gain a substantial throughput advantage at the expense of shorter-lived flows. The latter seems particularly problematic as the majority of TCP flows are short to medium sized, and so a single long-lived flow may potentially penalize a large number of users (akin to a form of denial of service).

With regard to the last point, the impact of a long-lived flow on a short-lived flow is illustrated, for example, in Figure 5. Here, we measure the completion time for a download versus the size of the download. Measurements are shown (i) for the baseline case where no other flow shares the bottleneck link and (ii) for the case where a single long-lived flow shares the link and competes for bandwidth. It can be seen that in the baseline situation, Cubic-TCP, standard TCP and H-TCP all exhibit similar completion times. It is perhaps initially surprising that standard TCP performs so well in this test, in view of concerns about performance in high-speed paths. However, we note that the link in this example is provisioned with a BDP of buffering. A standard TCP flow slow-starts to

Fig. 3. Ratio of throughputs of two Cubic TCP flows with the same RTT (also sharing the same bottleneck link and operating the same congestion control algorithm) as path propagation delay is varied. Flow throughputs are averaged over the last 200 s of each test run and so approximate asymptotic behaviour, neglecting initial transients. Results are shown for 10 Mbit/s and 250 Mbit/s bottleneck bandwidths. The bottleneck queue size is 100% BDP, no web traffic.

Fig. 4. Impact of web traffic on convergence. Evolution of mean bandwidth, averaged over 20 test runs, following startup of a second flow. 200 background web flows (100 in each direction). Link bandwidth is 250 Mbit/s, RTT is 200 ms, queue size 100% BDP.
Leith et al.: Experimental evaluation of Cubic-TCP, 2008
Reinventing congestion control: BBR

• Estimates bottleneck bandwidth and round-trip propagation time (BBR)
• Developed at Google, available in recent Linux kernels
• Goal: reduce bufferbloat by optimizing TCP
• Idea: "congestion-based" – observe how much data is in flight
• Idea: keep queues filled at the sender only
• Another old idea: also use delay information (but differently from New Vegas etc.)
• Not clocked by ACKs, but paced
• Following slides based on:
  ■ Cardwell et al.: BBR Congestion Control, IETF Meeting 97, Seoul, 2016
  ■ Cardwell et al.: BBR: Congestion-Based Congestion Control, ACM Queue, 2016
BBR: Working point with increasing bandwidth

[Figure: delivery rate and RTT vs. amount in flight, annotated with BDP ("Optimal" working point), BDP + BufSize ("Where loss-based CC starts controlling"), and "Where loss-based CC works in lossy networks"]
Phases in BBR: 1. Exponential BW search

• Exponential BW search
• Increase, then decrease exponentially
• Probes for max bandwidth by monitoring in-flight data & ACKs

[Figure: delivery rate and RTT vs. amount in flight, as on the previous slide]
Phases in BBR: 2. Drain queues

• Exponentially decrease in-flight data
• Clears queues fast again
• By monitoring in-flight data & ACKs

[Figure: delivery rate and RTT vs. amount in flight, as on the previous slide]
Phases in BBR: 3. Refresh measurements

• Periodically increase the send rate to probe for more bw
• Periodically decrease it to probe for the minimal RTT
• Remember: BW × RTT = max in-flight data being processed

[Figure: delivery rate and RTT vs. amount in flight, as on the previous slide]
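All three phases rely on the same two estimates. A minimal sketch of that bookkeeping (the class and names are our invention, not from the Linux implementation; real BBR uses windowed max/min filters rather than all-time extrema):

```python
class BbrEstimator:
    def __init__(self):
        self.btl_bw = 0.0             # max observed delivery rate (bytes/s)
        self.rt_prop = float("inf")   # min observed round-trip time (s)

    def on_ack(self, delivery_rate: float, rtt: float) -> None:
        # Windowed max/min filters in real BBR; plain max/min here.
        self.btl_bw = max(self.btl_bw, delivery_rate)
        self.rt_prop = min(self.rt_prop, rtt)

    def bdp(self) -> float:
        # BW * RTT = max data in flight the path absorbs without queueing.
        return self.btl_bw * self.rt_prop

est = BbrEstimator()
est.on_ack(delivery_rate=1.25e6, rtt=0.040)  # 10 Mb/s path, 40 ms RTT
est.on_ack(delivery_rate=1.10e6, rtt=0.055)  # queue-inflated sample filtered out
print(est.bdp())  # 50000.0 bytes in flight at the optimal working point
```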
BBR behavior compared to CUBIC

BBR and CUBIC: Start-up behavior

[Figure: data sent or ACKed (MBytes) and RTT (ms) over time during the STARTUP, DRAIN, and PROBE_BW phases; CUBIC (red), BBR (green), ACKs (blue)]
Fairness to RENO & CUBIC: Sharing deep buffers with loss-based CC

• At first CUBIC/Reno gains an advantage by filling deep buffers
• But BBR does not collapse: its bw and RTT probing tends to drive the system toward fairness
• Deep-buffer data point (8×BDP case): bw = 10 Mbps, RTT = 40 ms, buffer = 8 × BDP
  ➡ CUBIC 6.31 Mbps vs. BBR 3.26 Mbps
• CUBIC with a small advantage here, but about fair
  ■ Depending on parameters
BBR – Throughput in lossy networks

• BBR vs. CUBIC: synthetic bulk TCP test with 1 flow, bottleneck_bw = 100 Mbps, RTT = 100 ms
• BBR fully uses the bandwidth, despite high loss
• What does this mean for fairness?
All fixed? Maybe, maybe not yet

http://blog.cerowrt.org/post/birthday_problem/
Excursion: Why is everybody worried about fairness?

• Back in the very old days, people tried to optimize network power:

    P = (total throughput λ̂) / (average delay D)

• D is the delay average weighted by throughput!
• We want to maximize P using only locally observable information; the optimum, however, depends on:
  ■ Capacity of all links on the path
  ■ # of users sharing each link on the path
  ■ Message rate of all of these users
• Unfortunately this is impossible

J. Jaffe: Flow Control Power is Nondecentralizable, IEEE Transactions on Communications, 1981
Maximizing P [Kleinrock 78]

• Maximizing P = λ̂/D yields:

    P′ = (λ′D − λD′) / D²

    P_max ⇒ P′ = 0 ⇒ λ′D = λD′ ⇒ D = λ · dD/dλ ⇒ D/λ = dD/dλ

• For a single link (capacity μ) modelled by an M/M/1 system (not being overburdened):

    D = 1/(μ − λ),   dD/dλ = 1/(μ − λ)²

    D/λ = dD/dλ ⇒ μ − λ = λ ⇒ λ = μ/2
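The derivation can be checked numerically: for the M/M/1 link, P(λ) = λ/D(λ) = λ(μ − λ), which indeed peaks at λ = μ/2 with value μ²/4.

```python
def power(lam: float, mu: float) -> float:
    """Network power P = throughput / delay for an M/M/1 link."""
    assert lam < mu, "queue must not be overloaded"
    delay = 1.0 / (mu - lam)
    return lam / delay  # = lam * (mu - lam)

mu = 10.0
# Scan offered loads from 0.01 to 9.99 and pick the one with the highest power.
best = max((power(l / 100.0, mu), l / 100.0) for l in range(1, 1000))
print(best)  # (25.0, 5.0): P peaks at lambda = mu/2 with value mu^2/4
```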
Network power: Example network

• Two autonomous servers send over links with capacities μ1 and μ2; each sends at its single-link optimum λi = μi/2 (so Di = 2/μi)
• This gives us:

    λ̂ = (μ1 + μ2)/2
    D = λ̂⁻¹ (2λ1/μ1 + 2λ2/μ2) = 4/(μ1 + μ2)
    P = (1/8)(μ1 + μ2)²

• Optimal if μ1 = μ2
Network power: Counter example

• Assume μ1 >> μ2 (still two autonomous servers)
• Let only server 1 send data:

    λ̂ = μ1/2
    D = λ̂⁻¹ (2λ1/μ1 + 0/μ2) = 2/μ1
    P = μ1²/4, larger than (1/8)(μ1 + μ2)²
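The two schedules compared numerically (the capacities are illustrative): when μ1 >> μ2, letting only server 1 send beats the symmetric schedule, so the "fair" allocation is not the power-optimal one.

```python
def p_both(mu1: float, mu2: float) -> float:
    """Power when both servers send at their single-link optimum."""
    return (mu1 + mu2) ** 2 / 8.0

def p_only_first(mu1: float) -> float:
    """Power when only the fast server sends."""
    return mu1 ** 2 / 4.0

mu1, mu2 = 100.0, 1.0
print(p_both(mu1, mu2), p_only_first(mu1))  # 1275.125 2500.0
```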
Network power: Conclusion

• (Now) obvious: optimizing the overall network requires knowledge of the whole network ➡ would not scale
• More fundamental: "No performance criterion based on λ̂ and D is decentralizable."
  ■ Details in [Jaf81]
• Focusing on global optimization metrics may simply not be the right thing to do…
  ■ Fairness between flows is a local criterion
  ■ Seems more suited
Content

• TCP congestion control schemes
• Multipath TCP
• SCTP
• SPDY and HTTP/2
• QUIC
Multipath TCP – Motivation

• TCP connections are bound to a host's IP addresses
• IP address determines routing between hosts (unless IP spoofing)
• In many scenarios insufficient (resilience, bandwidth, mobility):
  ■ Mobile users: handoff between WiFi and LTE
  ■ Channel bundling: why can't I have two DSL connections and use both?
  ■ Data centers: Fat-Tree topologies, aggregating links, advantages with resilience and blocking

C. Paasch: Decoupled from IP, TCP is at last able to support multihomed hosts, ACM Queue, March 2014
Multipath – Blocking in Fat Trees

➡ No full control over routing, but at least over the lowest layer
Multipath TCP – Objectives

• Scenario:
  ■ Work in any scenario with multiple IP addresses
  ■ No source routing or similar (use the cross product of the hosts' IP addresses)
• Fully backward compatible:
  ■ No change to the socket API
  ■ No change to middleboxes (firewalls, NICs – think of TSO, WAN optimizers)
• No unfairness to non-multipath-aware TCP
• All built on TCP option headers
Multipath TCP – Architecture

[Diagram, identical stack on both endpoints:]

    Application
    ----------------- Socket API
    MPTCP                          \
    TCP 1  TCP 2  ...  TCP n        } Transport Layer
    Network Layer
Multipath TCP – Connection setup

• Subflows created after devices agree to use MPTCP (middlebox-safe)
• Keys exchanged during setup
  ■ Used to bind other sessions cryptographically to the master session
• Kb is echoed back for stateless operation
  ■ Why? State already existing?
• Token to identify the MPTCP session is derived from the keys

[Figure: initial connection handshake and establishment of additional connections]
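A sketch of the token derivation mentioned above, following RFC 6824 (the token is the most significant 32 bits of the SHA-1 hash of the key); the example key is made up:

```python
import hashlib

def mptcp_token(key: bytes) -> int:
    """MPTCP session token: high-order 32 bits of SHA-1(key), per RFC 6824."""
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:4], "big")

# A subflow's MP_JOIN carries this token so the peer can locate the
# existing MPTCP session without revealing the key itself.
token = mptcp_token(bytes.fromhex("0102030405060708"))
print(f"{token:#010x}")
```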
Fun with middleboxes

• In NAT & firewall scenarios: the server cannot initiate a new connection
• ➡ Devices announce available addresses via the ADD_ADDR option
• The client may establish the second connection in this scenario
(More) Fun with middleboxes

• Middleboxes mess with TCP streams (aggregate, rewrite, etc.)
  ■ They rely on consistent sequence numbers per substream
  ■ Requires an additional mechanism to track the sequence in the overall stream
  ■ Absolutely independent from substream handling, i.e., ACKs are sent for substreams even if not expected in the aggregated stream
• Placement in the overall stream carried in a TCP option header?
  ■ Not feasible (aggregation, TSO)
• Sender specifies a fixed "mapping" of data to subflows
  ■ Receiver informed in advance
  ■ May be remapped for retransmits (if one flow dies)
  ■ Fall-back to single-flow TCP possible via an "infinite mapping"
MPTCP: Retransmits

• Obvious: needs to deal with retransmits at the subflow level
  ■ Middleboxes may introduce them
  ■ MPTCP instances may therefore use them too
• Subflows may fail (temporarily or permanently)
  ■ Needs retransmits over a different subflow
  ■ Implies changes in the mapping
  ■ The underlying TCP connection on the original flow still needs to retransmit
  ■ Would break the connection otherwise ➡ performance penalty
• Scheduling?
MPTCP: Congestion control (I)

• Observation: MPTCP "smears" congestion over the network
• Naïve solution for CC: use the congestion control of the subflows
  ■ Unfair advantage against regular TCP
  ■ Depends on the number of used flows
  ■ Also may not be optimal:
    Naïve: λ1 = λ2 = λ3
    Optimal: λ1 = λ3; λ2 = 0
• Also possible: measure & control subflows together
  ■ May lead to "flappiness", i.e., sudden load switches
  ■ May smear congestion too much

[Figure: example topology with flows λ1, λ2, λ3]
MPTCP: Congestion control (II)

• Loosely coupled subflows: RFC 6356 suggests, with each ACK:

    cwnd_i += #bytesAcked × MSS_i × min( α / cwnd_total , 1 / cwnd_i )

    α = cwnd_total × ( max_i(√cwnd_i / rtt_i) / Σ_i(cwnd_i / rtt_i) )²

• Losses still halve cwnd_i
• Throttle subflows so as not to exceed the rate of a "virtual" single TCP flow
  ■ Alpha controls the allowed violation of that condition
• Still: no load-balancing between interfering flows
• Several scenarios with unfairness towards Reno TCP and between MPTCP instances discovered (not Pareto-optimal)

C. Raiciu: Practical Congestion Control for Multipath Transport Protocols, Tech. Rep., 2009
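The coupled increase above can be sketched as follows (function and variable names are ours; cwnds in segments, rtts in seconds). Note that cwnd_total · max(√cwnd_i/rtt_i)² equals cwnd_total · max(cwnd_i/rtt_i²), which is how RFC 6356 states α.

```python
def lia_alpha(cwnds, rtts):
    """Aggressiveness factor alpha of the RFC 6356 linked-increases algorithm."""
    total = sum(cwnds)
    best = max(w / (r * r) for w, r in zip(cwnds, rtts))   # max cwnd_i / rtt_i^2
    denom = sum(w / r for w, r in zip(cwnds, rtts)) ** 2   # (sum cwnd_i / rtt_i)^2
    return total * best / denom

def lia_increase(i, cwnds, rtts, bytes_acked, mss):
    """Per-ACK cwnd increase for subflow i, capped by the uncoupled 1/cwnd_i."""
    alpha = lia_alpha(cwnds, rtts)
    return bytes_acked * mss * min(alpha / sum(cwnds), 1.0 / cwnds[i])

# Two identical subflows: each grows at half the rate of one regular TCP flow
# (1/cwnd = 0.1 per ACK), so together they take what a single flow would.
inc = lia_increase(0, cwnds=[10, 10], rtts=[0.1, 0.1], bytes_acked=1, mss=1)
print(round(inc, 6))  # 0.025
```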
MPTCP: Congestion control (III)

• Opportunistic linked-increases algorithm (OLIA)
• Also loosely coupled
• Addresses problems with Pareto optimality
• With each ACK:

    cwnd_i += ( (√cwnd_i / rtt_i) / Σ_i(cwnd_i / rtt_i) )²  +  α_i / cwnd_i

    (first term: like before! second term: controls aggressiveness for the subflow)

• α_i is positive for subflows that have not reached the estimated bandwidth/delay ratio

R. Khalili et al.: MPTCP Is Not Pareto-Optimal: Performance Issues and a Possible Solution, IEEE/ACM Transactions on Networking, 2013
Multipath TCP – Discussion

• Handshake:
  ■ Why is MP_CAPABLE sent three times?
  ■ Why is the second handshake based on a normal TCP handshake?
  ■ Why is it a four-way handshake?
• Scenarios:
  ■ Can I use MPTCP with a single NIC?
  ■ Can I use MPTCP with a single IP address?
  ■ Can I use MPTCP to increase performance if I have two DSL lines (with NAT)?
  ■ Does MPTCP help with delay problems?
  ■ Do applications need to be aware of MPTCP? What does MPTCP mean for "legacy" applications?
• Security:
  ■ IDS & firewalls?
  ■ SYN cookies?
Content

• TCP congestion control schemes
• Multipath TCP
• SCTP
• SPDY and HTTP/2
• QUIC
Stream Control Transmission Protocol (SCTP)

• Protocol developed to transport SS7 messages over IP
• Reliable and message-oriented
• Like a crossbreed of TCP and UDP
• First RFC 2960 in October 2000, current RFC 4960 (September 2007)
• Many shortcomings of TCP were already known during the design phase
• So, from scratch, SCTP supported:
  ■ Selective ACKs
  ■ Multistreaming
  ■ Multihoming
  ■ Heartbeats
  ■ TLV coding of extension headers
  ■ SYN-flood protection
  ■ Better protocol state handling (no half-open connections)
  ■ …
SCTP – Multistreaming

[Excerpt from IEEE Communications Magazine, April 2004:]

…upper-layer applications. In other words, the HOL effect is limited within the scope of individual streams, but does not affect the entire association.

Multistreaming and HOL blocking are illustrated in Fig. 4, where an SCTP association consisting of four streams is shown. Segments are identified by stream sequence numbers (SSNs) [1] that are unique within a stream, but different streams can have the same SSN. In the figure, SSN 11 in stream 1 has been delivered to the upper-layer application, and SSN 9 of the second stream is lost in the network; SSNs 10, 11, 12 are therefore queued in the buffer of the second stream, waiting for retransmitted SSN 9 to arrive. Arriving SSN 13 at stream 2 will also be queued. Similarly, SSN 4 of stream 3 is missing during the transmission, resulting in the blocking of SSNs 5, 6, and 7. For stream 4, SSN 21 is being delivered to the upper-layer application, while arriving SSN 23 will be queued in the buffer because of missing SSN 22. Note that when SSN 12 arrives at the buffer of stream 1, it can be delivered immediately even if the other streams are blocked. This illustrates that segments arriving on stream 1 can still be delivered to the upper-layer application, although streams 2 and 3 are (and stream 4 will be) blocked because of lost segments.

An example application of using SCTP multistreaming in Web browsing is shown in Fig. 5. Here, an HTML page is split into five objects: a Java applet, an ActiveX control, two images, and plain text. Instead of creating a separate connection for each object as in TCP, SCTP makes use of its multistreaming feature to speed up the transfer of HTML pages. By transmitting each object in a separate stream, the HOL effect between different objects can be eliminated. If one object is lost during the transfer, the others can still be delivered to the Web browser at the upper layer while the lost object is being retransmitted from the Web server. This results in a better response time to users while opening only one SCTP association for a particular HTML page.

CONGESTION CONTROL

SCTP congestion control is based on the well-proven rate-adaptive window-based congestion control scheme of TCP. This ensures that SCTP will reduce its sending rate during network congestion and prevent congestion collapse in a shared network. SCTP provides reliable transmission and detects lost, reordered, duplicate, or corrupt packets. It provides reliability by retransmitting lost or corrupt packets. However, there are several major differences between TCP and SCTP:

• SCTP incorporates a fast retransmit algorithm based on SACK gap reports similar to that of TCP SACK. This mechanism speeds up loss detection and increases the bandwidth utilization. One of the major differences between SCTP and TCP is that SCTP does not have an explicit fast recovery phase. SCTP achieves fast recovery automatically with the use of SACK [1].

• Compared to TCP, the use of SACK is mandatory in SCTP, which allows a more robust reaction in the case of multiple losses from a single window of data. This avoids a time-consuming slow start stage after multiple segment losses, thus saving bandwidth and increasing throughput.

• During slow start or congestion avoidance of SCTP, the congestion window (cwnd) is increased by the number of acknowledged bytes; in TCP it is increased by the number of ACK segments received. Since the TCP sender
■ Figure 3. An SCTP association consisting of four streams carrying data from one upper-layer application. [Diagram: application (source), streams 1–4 → SCTP stream buffers → IP/DLL/PHY → application (destination)]

■ Figure 4. An illustration showing HOL blocking of individual streams at the receiver. [Diagram: per-stream receive buffers with queued SSNs]
❑ Multi-streaming at the transport layer avoids head-of-line blocking
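The effect summarized above can be illustrated with a toy delivery model (a sketch only, not real SCTP: the sequence numbering and the round-robin stream assignment are invented for illustration):

```python
def tcp_like(arrived):
    """Single ordered stream: only the gap-free prefix reaches the application."""
    got = set(arrived)
    n = 0
    while n + 1 in got:
        n += 1
    return n

def sctp_like(arrived, stream_of):
    """Per-stream ordering: a gap stalls only the stream it belongs to."""
    got, deliverable = set(arrived), 0
    streams = {}
    for seq in sorted(stream_of):
        streams.setdefault(stream_of[seq], []).append(seq)
    for seqs in streams.values():
        for seq in seqs:
            if seq not in got:
                break                       # this stream waits for retransmission
            deliverable += 1                # the other streams keep delivering
    return deliverable

# 8 messages spread round-robin over 4 streams; message 2 is lost in transit
stream_of = {seq: (seq - 1) % 4 + 1 for seq in range(1, 9)}
arrived = [seq for seq in range(1, 9) if seq != 2]
print(tcp_like(arrived))              # 1: everything behind the gap is blocked
print(sctp_like(arrived, stream_of))  # 6: only stream 2 (messages 2 and 6) stalls
```

With one loss, the single ordered stream delivers only one message to the application, while the multi-stream receiver still delivers six.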
Advanced Networking (SS 17): 07 - Transport Layer Evolution
S. Fu: SCTP: State of the art in research, products, and technical challenges, IEEE Communications Magazine 42(4):64-76, May 2004
46
SCTP – Connection management
❑ Four-way handshake
❑ Server allocates state only AFTER the cookie echo
❑ INIT + INIT ACK may contain TLV-coded options
❑ What does this mean for extensibility? Think of the cookie mechanism
❑ Connection identified by two tags (cf. IPsec SA)
❑ Shutdown leads to an immediate packet flush
❑ No half-open connections
❑ Smaller protocol state machine
(Figure: SCTP four-way handshake and connection-close message sequences)
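The cookie mechanism can be sketched in a few lines (an illustrative model of the concept, not an actual SCTP stack; the cookie contents and the HMAC-SHA-256 choice are assumptions): the server puts the association parameters plus a MAC into the INIT ACK cookie and allocates state only when a valid COOKIE ECHO returns.

```python
import hashlib
import hmac
import os
import time

SECRET = os.urandom(32)  # server-local secret; real stacks rotate this periodically

def make_cookie(client_tag: int, server_tag: int) -> bytes:
    """State cookie for INIT ACK: association parameters + MAC, no server state yet."""
    body = (client_tag.to_bytes(4, "big") + server_tag.to_bytes(4, "big")
            + int(time.time()).to_bytes(8, "big"))
    return body + hmac.new(SECRET, body, hashlib.sha256).digest()

def verify_cookie(cookie: bytes) -> bool:
    """On COOKIE ECHO: recompute the MAC; only a valid cookie creates association state."""
    body, mac = cookie[:-32], cookie[-32:]
    return hmac.compare_digest(hmac.new(SECRET, body, hashlib.sha256).digest(), mac)

cookie = make_cookie(client_tag=0x1234, server_tag=0xCAFE)
forged = cookie[:-1] + bytes([cookie[-1] ^ 1])   # attacker flips one MAC bit
print(verify_cookie(cookie), verify_cookie(forged))  # True False
```

This is why the server survives INIT floods: until the cookie comes back, all "state" travels inside the packet.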
47
SCTP – Chunks (I)
❑ SCTP common header contains only port numbers, a "verification tag" (i.e. connection ID) & a CRC-32 checksum
❑ Any payload & protocol data is transported in "chunks"
❑ Used even for internal purposes, e.g. address configuration
❑ Multiple chunks may be aggregated in one packet
3.2 Counting Outstanding Bytes
As pointed out, cwnd has an influence on the network load and thus on the throughput. Therefore, the way the outstanding bytes, that limit cwnd, are counted is important and should be examined.

Looking at an SCTP packet containing several data chunks, the amount of user data can vary significantly with the size of the individual chunks (i.e. messages), assuming the same packet length.
Fig. 1. SCTP packet format (lengths in bytes): (a) IP header (20) | SCTP common header (12) | data chunk header (16) | 1436 bytes of user data; (b) IP header (20) | SCTP common header (12) | 33 data chunks, each with a 16-byte chunk header and 28 bytes of user data.
In Figure 1(b) the packet contains 33 DATA chunks with 28 bytes of user data each, adding up to 924 bytes of user data, compared to 1436 bytes in the packet in Figure 1(a). Both packets have a size of 1484 bytes. Whereas the overhead is just 1% in (a), the headers add up to 36% in (b) and can be more than 60% for even smaller user message sizes.

Therefore, we have to distinguish between the amount of data that is injected into the network and the user data that arrive at the application layer. Whereas the first has a direct impact on the network load, the second results in the goodput. Both depend on the number of packets (1) that are allowed by the cwnd.
NoOfPackets = ⌈ cwnd / Size_P ⌉    (1)

Calculating the size of a packet (Size_P), the headers for IP (H_IP) and SCTP (H_SCTP) have to be considered as well as the size of the DATA chunks (Size_Chunk):

Size_P = H_IP + H_SCTP + CPP · Size_Chunk    (2)

The number of chunks per packet (CPP) is calculated as

CPP = ⌊ (MTU − H_IP − H_SCTP) / (UMS + P_UMS + H_Chunk) ⌋    (3)

The average user message size (UMS) per packet and the corresponding padding bytes (P_UMS) feature the variable parts of the packets.
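A quick numerical check of Eqs. (1)-(3) (a sketch: header sizes H_IP = 20, H_SCTP = 12, H_Chunk = 16 are taken from Fig. 1 and MTU = 1500 is assumed; the resulting overhead figures come out close to, though not exactly, the percentages quoted above, which may count only part of the headers):

```python
import math

H_IP, H_SCTP, H_CHUNK = 20, 12, 16   # header sizes in bytes, per Fig. 1
MTU = 1500                           # assumed Ethernet MTU

def pad(ums):                        # chunks are padded to 4-byte multiples
    return (-ums) % 4

def chunks_per_packet(ums):          # Eq. (3)
    return (MTU - H_IP - H_SCTP) // (ums + pad(ums) + H_CHUNK)

def packet_size(ums):                # Eq. (2); returns (Size_P, CPP)
    cpp = chunks_per_packet(ums)
    return H_IP + H_SCTP + cpp * (H_CHUNK + ums + pad(ums)), cpp

def packets_needed(cwnd, ums):       # Eq. (1)
    size_p, _ = packet_size(ums)
    return math.ceil(cwnd / size_p)

for ums in (1436, 28):               # the two cases of Fig. 1
    size_p, cpp = packet_size(ums)
    overhead = 1 - cpp * ums / size_p
    print(f"UMS={ums:5d}: CPP={cpp:2d}, packet={size_p} B, "
          f"user data={cpp * ums} B, overhead={100 * overhead:.0f}%")
```

Both layouts produce 1484-byte packets, but the goodput differs by more than a third.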
I. Rüngeler et al.: Congestion and Flow Control in the Context of the Message-Oriented Protocol SCTP, Networking 2009
48
SCTP – Chunks (II)
❑ General chunk header format
❑ Well-known chunk types:
   0 - Payload Data (DATA)
   1 - Initiation (INIT)
   2 - Initiation Acknowledgement (INIT ACK)
   3 - Selective Acknowledgement (SACK)
   4 - Heartbeat Request (HEARTBEAT)
   …
❑ If the chunk type is unknown, the highest 2 bits of the chunk type code determine how to proceed:
   00 - Stop processing the rest of the SCTP packet
   01 - Stop and report an 'Unrecognized Chunk Type'
   10 - Skip this chunk and continue processing
   11 - Skip this chunk and continue processing, but report an error
(Chunk layout: Chunk type (8 bit) | Chunk flags (8 bit) | Chunk length (16 bit), followed by the chunk-specific value and padding of up to 3 bytes)
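A minimal parser for this chunk layout (a sketch; the example packet bytes are made up, only the header handling follows the format above):

```python
import struct

ACTIONS = {0b00: "stop", 0b01: "stop+report", 0b10: "skip", 0b11: "skip+report"}

def parse_chunk(buf, off=0):
    """Parse one chunk header: type (8), flags (8), length (16, incl. header)."""
    ctype, flags, length = struct.unpack_from("!BBH", buf, off)
    value = buf[off + 4 : off + length]
    next_off = off + length + ((-length) % 4)   # chunks are 4-byte aligned
    return ctype, flags, value, next_off

def unknown_type_action(ctype):
    """The top 2 bits of an unknown chunk type select the receiver's action."""
    return ACTIONS[ctype >> 6]

# Two chunks in one packet: a SACK (type 3) and an unknown type 0xC1
pkt = (struct.pack("!BBH", 3, 0, 8) + b"\x00" * 4
       + struct.pack("!BBH", 0xC1, 0, 5) + b"\xAA" + b"\x00" * 3)  # 3 pad bytes
t1, _, v1, off = parse_chunk(pkt)
t2, _, v2, _ = parse_chunk(pkt, off)
print(t1, unknown_type_action(t2))   # the top bits of 0xC1 are 11: skip+report
```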
49
SCTP – Data chunk
❑ Chunks may add their own header
❑ Example: Payload data chunk
❑ Flags carry the reorder requirement and fragmentation flags
❑ Payload Protocol Identifier is passed to the application transparently
❑ Stream fields are used to transport multiple data streams over an SCTP connection
(DATA chunk layout: Type = 0 (8 bit) | Chunk flags (8 bit) | Chunk length (16 bit) | Transmission Sequence Number (TSN, 32 bit) | Stream Identifier (16 bit) | Stream Sequence Number (16 bit) | Payload Protocol Identifier (32 bit) | User Data | padding (up to 3 bytes))
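Packing and unpacking these DATA chunk fields might look like this (a sketch; the flag handling is simplified to the U bit only, the B/E fragmentation bits are left out):

```python
import struct

def build_data_chunk(tsn, sid, ssn, ppid, payload, unordered=False):
    """DATA chunk (type 0): flags, length, TSN, stream id, stream seq, PPID, data."""
    flags = 0x04 if unordered else 0x00      # U bit; B/E bits omitted in this sketch
    length = 16 + len(payload)               # length covers the 16-byte header too
    padding = (-length) % 4                  # pad chunk to a 4-byte boundary
    return (struct.pack("!BBHIHHI", 0, flags, length, tsn, sid, ssn, ppid)
            + payload + b"\x00" * padding)

def parse_data_chunk(buf):
    ctype, flags, length, tsn, sid, ssn, ppid = struct.unpack_from("!BBHIHHI", buf)
    return dict(tsn=tsn, stream=sid, ssn=ssn, ppid=ppid, data=buf[16:length])

chunk = build_data_chunk(tsn=1000, sid=2, ssn=7, ppid=0, payload=b"hello")
print(parse_data_chunk(chunk))
```

The (Stream Identifier, Stream Sequence Number) pair is what carries the multi-streaming shown earlier: ordering is enforced per SID, while the TSN stays global for reliability.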
50
SCTP – Multiple paths
❑ Alternate paths are probed by HEARTBEAT messages including a 64-bit nonce
❑ Addresses are exchanged during the INIT sequence
❑ Allows secure setup of alternative paths
❑ Support for dynamic addresses added with RFC 5061
■ Addresses added and removed using authenticated chunks (iff globally addressable)
■ Still requiring verification
❑ Messages are only sent over the primary path
❑ Switch after failure detection
❑ Does not directly allow for load-sharing!
❑ Multipath SCTP: https://tools.ietf.org/html/draft-tuexen-tsvwg-sctp-multipath-13 (December 2016, but no significant changes lately)
51
SCTP – Current State
❑ Not widely deployed
❑ Many reasons:
❑ No killer feature
❑ Application developers must explicitly enable it
❑ Firewalls & NATs?
❑ RFC for NAT support not even done yet
http://www.caida.org/data/realtime/passive/?monitor=equinix-chicago-dirA&row=timescales&col=sources&sources=proto&graphs_sing=ts&counters_sing=bits&timescales=24&timescales=168&timescales=672&timescales=17520
52
Content
❑ TCP congestion control schemes
❑ Multipath TCP
❑ SCTP
❑ SPDY and HTTP/2
❑ QUIC
53
SPDY and HTTP/2
❑ Is this not application layer?!
http://www.caida.org/data/realtime/passive/?monitor=equinix-chicago-dirA
55
SPDY and HTTP/2
❑ In 2009, Google announced an HTTP successor: SPDY
❑ Goal: 50% reduction of page load time
❑ Includes HTTP header compression
❑ As of 2015 it is deprecated
❑ Now HTTP/2 is the gold standard (RFC 7540)
❑ Shares many of the ideas of SPDY
❑ Addressed key problem:
❑ HTTP/1.1 pipelining is broken due to misbehaving applications and head-of-line blocking
❑ In practice mostly disabled
❑ Problems with TCP congestion control
❑ Solution: build multi-stream support on top of TCP/TLS
■ Idea similar to SCTP but heavily optimized for web traffic & backward compatible with home routers
56
HTTP/2 – Binary encoding
Ilya Grigorik: High Performance Browser Networking, O'Reilly, 2013
❑ HTTP/2 emulates "normal" HTTP to the application
❑ Internal encoding uses binary data & compression
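The binary encoding can be illustrated with the 9-byte frame header from RFC 7540 (a sketch only: a real connection additionally needs the client preface and a SETTINGS exchange, and the HEADERS payload here is a single made-up HPACK byte):

```python
import struct

FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x4: "SETTINGS", 0x8: "WINDOW_UPDATE"}

def build_frame(ftype, flags, stream_id, payload):
    """Frame header: 24-bit length, 8-bit type, 8-bit flags, 31-bit stream id."""
    header = struct.pack("!I", len(payload))[1:]               # 24-bit length
    header += struct.pack("!BBI", ftype, flags, stream_id & 0x7FFFFFFF)
    return header + payload

def parse_frame(buf):
    length = int.from_bytes(buf[:3], "big")
    ftype, flags, sid = struct.unpack_from("!BBI", buf, 3)
    return FRAME_TYPES.get(ftype, "?"), flags, sid & 0x7FFFFFFF, buf[9:9 + length]

# Client-initiated streams use odd ids; a HEADERS frame on stream 1:
frame = build_frame(0x1, 0x04, 1, b"\x82")   # flags=END_HEADERS, one HPACK byte
print(parse_frame(frame))
```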
57
HTTP/2 – Streams
❑ Multiple streams may be interleaved
❑ Prevents head-of-line blocking
❑ Client-initiated streams carry odd numbers
❑ Proactive object delivery by the server over server-initiated streams
❑ Promises allow the server to advertise upcoming proactively pushed objects
❑ Streams may be prioritized
Ilya Grigorik: High Performance Browser Networking, O'Reilly, 2013
58
HTTP/2 – Performance (I)
❑ Obvious: object pushing & binary encoding optimize speed
Fig. 4. Page load time with an ADSL Livebox, 50 ms latency. (Figure: bar chart of page load time in seconds, scale 0-4.5 s, per Web site.)
time of 400 ms. There is still some benefit: the page load time decreases by 20% on average. Naturally, we expect to see worse performance on a 3G network. The reality is that there was not enough packet loss on the 3G network to influence the page load time. The recent study by AT&T on SPDY's performance in [20] stated that the performance of SPDY was worse than that of HTTP/1.1 over cellular networks. One would expect this to be valid for HTTP/2, as it is an evolution of SPDY.
Fig. 5. Page load time with a 3G modem, 400 ms latency. (Figure: bar chart of page load time in seconds, scale 0-25 s, HTTP/1.1 vs. HTTP/2 per Web site.)
3) Local Area Network tests: Latency. Because the majority of Internet browsing is moving to mobile devices, it is worthwhile to look at the influence of latency and packet loss on HTTP/2. To this end, we first vary the network latency on our local platform.

Figure 6 shows the page load time in HTTP/1.1 and HTTP/2 for various latency values. For each value, we plotted the minimum and maximum value, the lower and upper quartiles, along with the median. Interestingly, an increasing latency widens the difference between HTTP/1.1 and HTTP/2, which means that HTTP/2 reacts well to latency. This suggests that this positive influence might also occur on cellular networks, as they suffer from higher latency.

Packet loss. We saw that HTTP/2 reacts positively to high latency. But another important characteristic of cellular networks is significant packet loss. That is why we conduct a similar experiment, this time with a fixed latency and varying the
Fig. 6. Impact of latency, 0% loss. By pairs, left: HTTP/1.1, right: HTTP/2. (Figure: box plots of page load time in seconds, scale 0-14 s, versus latency 0-200 ms.)
packet loss. Figure 7 shows a poor behaviour: the higher the packet loss, the smaller the benefits of HTTP/2. Furthermore, the page load time ratio between HTTP/2 and HTTP/1.1 often exceeds 1, meaning that HTTP/2 takes longer than HTTP/1.1.

This can be explained as follows: HTTP/2 uses only one TCP connection to communicate between the client and the server. When this single connection suffers from packet loss, all streams running over this unique TCP connection are negatively impacted. In HTTP/1.1 the situation is different, as several TCP connections are open between the client and the server, and this mitigates the packet loss problem. AT&T in [20] already found similar results for SPDY, which is the ancestor of HTTP/2.
Fig. 7. Impact of packet loss, 100 ms latency. (Figure: HTTP/2 over HTTP/1.1 page-load-time ratio, scale 0-1.6, per Web site, at 0% and 6% loss.)
From an overall perspective, HTTP/2 decreases page load times because it gets past the head-of-line blocking issue by using multiplexing. However, several studies [9] [10] [20] have already stated that SPDY was negatively impacted by packet loss on cellular networks. This statement is likely to hold true for HTTP/2 because it keeps the same idea as SPDY of multiplexing requests over a single TCP connection. This problem stems from the underlying transport protocol, and as such only a switch to another transport protocol can solve it.
B. Evaluations on Server push and Priority
Besides the multiplexing and compression mechanisms, there is a second class of new features which is optional
H. Saxcé et al.: Is HTTP/2 Really Faster Than HTTP/1.1?, 18th IEEE Global Internet Symposium, 2015
59
HTTP/2 – Performance (II)
❑ Key question: Does the larger congestion control window outweigh the loss due to head-of-line blocking?
❑ Discuss: Why may HOL still occur?
❑ Discuss: What is the impact of loss and delay?
60
HTTP/2 – Performance (III)
61
HTTP/2 – Performance (IV)
62
Content
❑ TCP congestion control schemes
❑ Multipath TCP
❑ SCTP
❑ SPDY and HTTP/2
❑ QUIC
63
Quick UDP Internet Connections (QUIC)
❑ New transport layer protocol introduced by Google to remove shortcomings of SPDY/HTTP/2 over TCP
❑ Currently an IETF draft
❑ See https://tools.ietf.org/html/draft-ietf-quic-transport-04
❑ Goals:
❑ Multi-streaming without HOL
❑ Multi-homing
❑ Backward compatible
❑ Built-in security (i.e. TLS)
❑ Reduced latency through a simpler handshake
❑ Decoupling of the congestion control algorithm from the protocol
❑ FEC
64
QUIC in the protocol stack
❑ QUIC operates at session/transport/application layer
❑ UDP only used for backward compatibility (port 80 or 443)
❑ Sessions identified by a 64-bit connection ID
(Stack comparison: HTTP/2 | TLS 1.2 | TCP | IP, versus HTTP/2 API | QUIC | UDP | IP)
J. Iyengar: QUIC - Redefining Internet Transport
65
QUIC – Packet format (according to RFC draft)
❑ Long header:
(first octet: marker bit + 7-bit type (8 bit) | Connection ID (64 bit) | Packet Counter (32 bit) | Version (32 bit))
❑ Short header:
(flags (8 bit) | Connection ID (0 or 64 bit) | Packet Counter (8, 16 or 32 bit))
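A sketch of packing these two headers (field order and sizes as on the slide, which follows an early IETF draft; the flag values in the short header are invented for illustration, and only the 32-bit packet counter variant is shown):

```python
import struct

def long_header(ptype: int, conn_id: int, pkt_num: int, version: int) -> bytes:
    """Long header: marker bit + 7-bit type, 64-bit connection ID,
    32-bit packet counter, 32-bit version."""
    assert 0 <= ptype < 0x80
    return struct.pack("!BQII", 0x80 | ptype, conn_id, pkt_num, version)

def short_header(conn_id: int, pkt_num: int, omit_conn_id: bool = False) -> bytes:
    """Short header: flags octet, optional 64-bit connection ID, then the
    packet counter (8-, 16- or 32-bit in the draft; 32-bit here)."""
    flags = 0x43 if omit_conn_id else 0x03   # illustrative flag values only
    out = struct.pack("!B", flags)
    if not omit_conn_id:
        out += struct.pack("!Q", conn_id)
    return out + struct.pack("!I", pkt_num)

hdr = long_header(0x02, conn_id=0xDEADBEEF, pkt_num=1, version=0x51303334)
print(len(hdr), len(short_header(0xDEADBEEF, 2)))  # 17 13
```

Omitting the connection ID in the short header saves 8 bytes per packet once the 5-tuple unambiguously identifies the session.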
66
QUIC – Connection “establishment”
(Figure: handshake message sequences for TCP + TLS versus QUIC (equivalent to TCP + TLS))
0-RTT! No! Just no timeouts – properties may be cached “forever”
Magic?
67
QUIC – Actual connection establishment
❑ Indication by the server: alternate-protocol:443:quic,p=0.02
❑ Client initiates with version and server name
❑ Server "rejects", giving certificates, configuration & a "source-address token" to prevent spoofing
❑ Normal "0-RTT" handshake follows
❑ Always contains the source-address token
❑ Contains the server's DNS name
❑ Discuss:
❑ What does this handshake mean for DoS resistance?
❑ What does it mean for PFS?
❑ What happens if the first packet is reordered?
Server starts to commit resources
68
QUIC – Change in security model significant
Figure 1: Generic replay attack discovered by Daniel Kahn Gillmor in the IETF TLS working group discussion around TLS 1.3 [Res15b]. The 0-RTT data "request" could, e.g., be an HTTP request "POST /buy-something". (Message sequence between client, attacker, and server: the client sends 0-RTT key-exchange messages and a 0-RTT data "request"; the server accepts 0-RTT, processes the "request", and answers with key-exchange response messages; the attacker enforces a loss of state (e.g., a reboot) and replays the 0-RTT key-exchange messages and data "request"; the server now rejects 0-RTT after the state loss for security reasons; the client completes the final key exchange and resends the data "request" under the final key (to ensure reliable transmission), so the server processes the "request" again.)
Note that the contrived requirement that the attacker is able to reboot the server (while the client keeps waiting for a response) vanishes in a real-world scenario with distributed server clusters, where the attacker instead simply forwards the 0-RTT messages to two servers and drops the first server's response. The described attack hence in particular affects the cryptographic design of QUIC, which (among others) specifically targets settings with distributed clusters. Holding up the originally envisioned 0-RTT full replay protection being impossible, Langley and Chang write in the specification of July 2015 [LC15] (Rev 20150720) that this design is "destined to die" and will be replaced by (an adapted version of) the TLS 1.3 handshake. We, however, argue here that QUIC's strategy in Rev 20130620 still supports some kind of replay resistance, only at a different level. TLS 1.3, in contrast, forgoes any protection mechanisms and instead accepts replays as inevitable (on the channel level). Developers using TLS 1.3 are supposed to be provided with a different API call for sending 0-RTT data [Res16e, Appendix B.1], indicating its replayability, and are responsible for taking replays into account for such data.

There is, then, a significant conceptual gap between replays (of key-exchange messages and keys) on the key-exchange level, and the replay of user data faced on the level of the overall secure channel protocol in the 0-RTT setting. While the former can effectively be prevented within the key exchange protocol, this does not necessarily prevent the latter, which can be (and in practice is) induced by the network stack of the channel actively and automatically re-sending (presumably) rejected 0-RTT data under the main key. The latter type of logical, network-stack replays is hence fundamentally beyond what key exchange protocols can protect against.
M. Fischlin et al.: Replay Attacks on Zero Round-Trip Time: The Case of the TLS 1.3 Handshake Candidates, 2nd IEEE European Symposium on Security and Privacy (EuroS&P 2017)
69
QUIC – Countering opportunistic ACK attacks
❑ Danger of opportunistic ACKs: hostile client
❑ Uses HTTP to "download" a huge file
❑ Injects ACKs even though it has not received the data
❑ Server uses up much of its bandwidth
❑ TCP offers no protection itself
❑ QUIC does so by allowing servers to skip sequence ranges
❑ Design criterion
❑ May reduce the load induced by hostile clients
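A toy model of this countermeasure (a sketch: only the idea of skipping sequence numbers comes from the slide; the skip probability, class names, and detection logic are invented for illustration). The server occasionally skips a packet number it never sends; a client that ACKs a skipped number cannot have received it and must be lying:

```python
import random

class Sender:
    def __init__(self, seed=42):
        self.rng = random.Random(seed)   # fixed seed keeps the example repeatable
        self.next_pn = 1
        self.skipped = set()

    def send(self):
        if self.rng.random() < 0.1:      # occasionally skip a packet number
            self.skipped.add(self.next_pn)
            self.next_pn += 1
        pn = self.next_pn
        self.next_pn += 1
        return pn

    def on_ack(self, pns):
        if any(pn in self.skipped for pn in pns):
            raise ValueError("opportunistic ACK: peer ACKed a never-sent packet")

srv = Sender()
sent = [srv.send() for _ in range(50)]
srv.on_ack(sent)                          # honest client: only real numbers, fine
try:
    srv.on_ack(list(range(1, srv.next_pn)))  # hostile client blindly ACKs everything
except ValueError as e:
    print(e)
```

An honest client never notices the gaps (it simply never sees those numbers), while a client fabricating ACKs for the whole range trips over them.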
70
QUIC – Production but work in progress
❑ Latest value found: 9.05% of Google traffic is QUIC
❑ General standardization of QUIC – in progress
❑ Using BBR with QUIC – in progress
❑ FEC support – removed due to performance decrease
❑ Multihoming & multipath – not implemented yet
❑ Requirement due to some middleboxes: there must always be a WORKING fallback path to TCP
❑ Other applications?
❑ Currently very tight bundling to HTTP/2
❑ Various difficulties: first packet may be retransmitted silently, fallback requirement, privacy issues due to tracking of the connection ID?
❑ See https://tools.ietf.org/html/draft-kuehlewind-quic-applicability-00