Richard Hughes-Jones, The University of Manchester, www.hep.man.ac.uk/~rich/ then “Talks”
PFLDnet, Marina Del Rey, 7-9 Feb 2007, R. Hughes-Jones, Manchester
How do transport protocols affect applications & the relative importance of different protocol properties
Panel Discussion
Richard Hughes-Jones The University of Manchester
www.hep.man.ac.uk/~rich/ then “Talks”
Panellists
Pascale Primet, INRIA, France
Ralph Niederberger, Research Center Juelich, Germany
Tim Sheppard
Katsushi Kobayashi, National Institute of Advanced Industrial Science & Technology, Japan
Michael Welzl, University of Innsbruck, Austria
Some Areas for Discussion
What is the interaction between Application and Transport Protocol?
What is the relative importance of fairness vs throughput?
  rtt fairness (OK, what is fairness?)
  mtu fairness
  TCP friendliness
How do AIMD rate fluctuations relate to stability & sharing?
  Stability of Achievable Throughput
Does provable stability of protocols matter?
Is the computational complexity of a protocol important?
What is the relative importance of convergence time?
  Link utilisation (by this flow or all flows)
Should there be a bias towards "mice"? (Applications)
Is conceptual simplicity of the protocol important?
Action of the transport protocol: help or hindrance to the application?
Remote Compute Farms: Application Req-Resp
[Figure: DataBytesOut (Delta) and DataBytesIn (Delta) vs time (ms)]
CERN-Manc Round trip time 20 ms
Web100 hooks for TCP status
64 byte Request (green)
1 Mbyte Response (blue)
TCP in slow start
1st event takes 19 rtt or ~380 ms
[Figure: DataBytesOut (Delta), DataBytesIn (Delta) and CurCwnd (Value) vs time (ms)]
TCP Congestion window gets re-set on each Request
TCP stack RFC 2581 & RFC 2861 reduction of Cwnd after inactivity
Even after 10 s, each response takes 13 rtt or ~260 ms
[Figure: TCP achievable throughput (Mbit/s) and Cwnd vs time (ms)]
Transfer achievable throughput 120 Mbit/s peak
Event rate very low
Application not happy!
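The rtt counts quoted on this slide can be turned into an effective per-response throughput and an upper bound on the event rate. The sketch below is only arithmetic on the slide's numbers; the function name `response_stats` is illustrative, and 1 Mbyte is assumed to mean 10**6 bytes.

```python
def response_stats(response_bytes, rtt_s, rtts_per_response):
    """Return (transfer time in s, effective Mbit/s, max responses per s)."""
    transfer_s = rtts_per_response * rtt_s
    effective_mbit_s = response_bytes * 8 / transfer_s / 1e6
    return transfer_s, effective_mbit_s, 1.0 / transfer_s


RTT_S = 0.020               # CERN-Manc round trip time from the slide
RESPONSE_BYTES = 1_000_000  # assumption: 1 Mbyte taken as 10**6 bytes

first_event = response_stats(RESPONSE_BYTES, RTT_S, 19)  # 1st event, slow start
later_event = response_stats(RESPONSE_BYTES, RTT_S, 13)  # after Cwnd reduction

for name, (t, mbit, rate) in (("first", first_event), ("later", later_event)):
    print(f"{name}: {t * 1e3:.0f} ms, {mbit:.1f} Mbit/s, {rate:.1f} responses/s")
```

Under these assumptions each later response averages roughly 31 Mbit/s, well below the 120 Mbit/s peak, and a single connection cannot serve more than about four 1 Mbyte events per second.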
VLBI Application Protocol
Data wave front sent to Correlator
VLBI signal wave front
Visualising CBR/TCP
When packet loss is detected, TCP:
  Reduces Cwnd
  Halves the sending rate
Expect a delay in the message arrival time
[Figure: time vs message number, showing the expected arrival time at CBR, the actual arrival time, the packet loss and the resulting delay in the stream]
Stephen Kershaw
CBR/TCP: UKLight JBO-JIVE-Manc
[Figure: Effect of loss rate on message arrival time. Time (s) vs message number (x 10^4) for drop rates of 1 in 5k, 1 in 10k, 1 in 20k, 1 in 40k and no loss; annotation: timely data arrival]
Message size: 1448 Bytes
Wait time: 22 us
Data Rate: 525 Mbit/s
Route: JB-UKLight-JIVE-UKLight-Man
RTT ~27 ms
TCP buffer 32 Mbytes
BDP @ 512 Mbit/s: 1.8 Mbyte
Estimate catch-up possible if loss < 1 in 1.24M
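The data rate and bandwidth-delay product on this slide follow from the message size, wait time and RTT. A minimal check, assuming decimal units (1 Mbit = 10**6 bit) and treating the wait time as the full interval between messages; the function names are illustrative:

```python
def cbr_rate_mbit_s(message_bytes, wait_us):
    """Constant bit rate produced by one message every wait_us microseconds."""
    return message_bytes * 8 / (wait_us * 1e-6) / 1e6


def bdp_bytes(rate_mbit_s, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_mbit_s * 1e6 * (rtt_ms * 1e-3) / 8


rate = cbr_rate_mbit_s(1448, 22)  # message size and wait time from the slide
bdp = bdp_bytes(512, 27)          # 512 Mbit/s and ~27 ms RTT from the slide
buffer_bytes = 32e6               # assumption: 32 Mbytes taken as 32 * 10**6

print(f"CBR rate: {rate:.1f} Mbit/s")
print(f"BDP: {bdp / 1e6:.2f} Mbyte, TCP buffer is {buffer_bytes / bdp:.1f} x BDP")
```

This gives about 526 Mbit/s and a BDP of about 1.7 Mbyte, in line with the rounded figures on the slide, so the 32 Mbyte TCP buffer is many times larger than the BDP.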
And now for the protocols …
SC2004 Disk-Disk bbftp
bbftp file transfer program uses TCP/IP
UKLight: Path: London-Chicago-London; PCs: Supermicro + 3Ware RAID0
MTU 1500 bytes; Socket size 22 Mbytes; rtt 177 ms; SACK off
Move a 2 GByte file
Web100 plots:
Standard TCP: Average 825 Mbit/s (bbcp: 670 Mbit/s)
Scalable TCP: Average 875 Mbit/s (bbcp: 701 Mbit/s, ~4.5 s of overhead)
Disk-TCP-Disk at 1 Gbit/s
[Figures: two Web100 plots, each showing TCP achievable throughput (Mbit/s; instantaneous BW and average BW) and CurCwnd (Value) vs time (ms)]
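The 22 Mbyte socket size matches the bandwidth-delay product of a 1 Gbit/s path at 177 ms, and the quoted averages imply a transfer time of just under 20 s for the 2 GByte file. A small sketch of that arithmetic, assuming decimal units (1 GByte = 10**9 bytes); the helper names are illustrative:

```python
def bdp_mbytes(rate_bit_s, rtt_s):
    """Bandwidth-delay product in Mbytes (10**6 bytes)."""
    return rate_bit_s * rtt_s / 8 / 1e6


def transfer_time_s(file_bytes, rate_bit_s):
    """Time to move file_bytes at a sustained average rate."""
    return file_bytes * 8 / rate_bit_s


bdp = bdp_mbytes(1e9, 0.177)               # 1 Gbit/s path, rtt 177 ms
t_standard = transfer_time_s(2e9, 825e6)   # Standard TCP average
t_scalable = transfer_time_s(2e9, 875e6)   # Scalable TCP average

print(f"BDP: {bdp:.1f} Mbytes")
print(f"Standard TCP: {t_standard:.1f} s, Scalable TCP: {t_scalable:.1f} s")
```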
Transport Protocols
TCP
  Reno; HS-TCP; Scalable; H-TCP; C-TCP; BIC; CUBIC; LTCP
XCP
UDP
  Some applications NEED this form of delivery
RTP / RTSP
  Lots of streaming applications available now
DCCP
multicast
DCCP: Datagram Congestion Control Protocol
Unreliable
  No re-transmissions
Has modular congestion control
  Can detect congestion and take avoiding action
  Different algorithms can be selected (ccid)
    TCP-like
    TCP Friendly Rate Control
DCCP is like UDP with congestion control
DCCP is like TCP without reliability
Application uses
  Multi-media: send new data instead of re-sending useless old data
  Applications that can choose data encoding & transmission rate
  e-VLBI: discussing a special ccid
RFCs: DCCP RFC 4340; CCIDs RFC 4341, RFC 4342
e-VLBI considering a ccid: UDP with congestion detection (API extension)
  Detect potential problems with other network users (unexpected route changes)
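To give a feel for what "TCP Friendly Rate Control" means for an application that chooses its own sending rate, the sketch below evaluates the TCP throughput equation from the TFRC specification (RFC 3448, later RFC 5348), which the TFRC-based CCID builds on. It is not taken from the slides: the function name is illustrative, and the example values (1448 byte packets, 27 ms RTT) are borrowed from the e-VLBI tests purely as an assumption.

```python
import math


def tfrc_rate_bytes_s(s, rtt, p, b=1, t_rto=None):
    """Allowed sending rate in bytes/s from the TFRC throughput equation.

    s: packet size in bytes, rtt: round trip time in s,
    p: loss event rate (0 < p <= 1), b: packets acknowledged per ACK,
    t_rto: retransmission timeout in s (defaults to 4 * rtt).
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate p must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denom = (rtt * math.sqrt(2 * b * p / 3)
             + t_rto * (3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom


for loss in (1e-3, 1e-4, 1e-5):
    mbit = tfrc_rate_bytes_s(1448, 0.027, loss) * 8 / 1e6
    print(f"loss event rate {loss:g}: about {mbit:.0f} Mbit/s")
```

The allowed rate falls as the loss event rate or the RTT rises, which is the sense in which such a flow stays "friendly" to TCP flows sharing the path.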
Fairness and Throughput
Larger MTU is faster!
Smaller RTT is faster!
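One common way to see why both statements hold for standard TCP is the well-known Mathis approximation, rate ≈ (MSS / RTT) × C / sqrt(p), with C = sqrt(3/2) for periodic loss. The slides do not give this formula; it is used here only as an illustrative model, and the MSS values, RTTs and loss probability are assumptions.

```python
import math


def mathis_rate_mbit_s(mss_bytes, rtt_s, loss_prob, c=math.sqrt(1.5)):
    """Approximate steady-state standard TCP throughput in Mbit/s."""
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_prob)) / 1e6


LOSS = 1e-5  # assumed packet loss probability, the same for every flow

# Assumed MSS: 1448 bytes for a 1500 byte MTU, 8948 bytes for a 9000 byte MTU
for mss in (1448, 8948):
    for rtt in (0.020, 0.177):
        rate = mathis_rate_mbit_s(mss, rtt, LOSS)
        print(f"MSS {mss} bytes, RTT {rtt * 1e3:.0f} ms: about {rate:.0f} Mbit/s")
```

In this model, flows that see the same loss rate get throughput proportional to MSS and inversely proportional to RTT, which is the mtu and rtt unfairness raised for discussion.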
Rate fluctuations, Stability & Sharing
Low performance on fast long distance paths
  AIMD (add a=1 pkt to cwnd per RTT, decrease cwnd by factor b=0.5 on congestion)
  Net effect: recovers slowly, does not effectively use available bandwidth, so poor throughput
  Unequal sharing
TCP Reno single stream
  Congestion has a dramatic effect
  Recovery is slow
  Increase recovery rate
SLAC to CERN
  RTT increases when the flow achieves best throughput
Les Cottrell PFLDnet 2005
Remaining flows do not take up slack when flow removed
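The AIMD rule quoted on this slide (a=1 packet per RTT, b=0.5 on congestion) can be sketched in a few lines to show why recovery is slow on a fast long distance path. This is an idealised model, not a TCP implementation: congestion is assumed to occur exactly when cwnd reaches a fixed cap, and the 1 Gbit/s, 177 ms, 1448 byte example values are borrowed from other slides as assumptions.

```python
import math


def aimd_trace(cwnd_start, cwnd_cap, rtts, a=1.0, b=0.5):
    """cwnd (packets) at the start of each RTT; congestion when cwnd hits the cap."""
    cwnd = cwnd_start
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd >= cwnd_cap:
            cwnd = cwnd * b   # multiplicative decrease on congestion
        else:
            cwnd = cwnd + a   # additive increase, a packets per RTT
    return trace


def recovery_rtts(cwnd_at_loss, a=1.0, b=0.5):
    """RTTs of additive increase needed to climb from b*cwnd back to cwnd."""
    return math.ceil(cwnd_at_loss * (1 - b) / a)


MSS_BYTES = 1448   # assumed segment size
RTT_S = 0.177      # assumed long-distance rtt
LINK_BIT_S = 1e9   # assumed bottleneck rate

cwnd_full = LINK_BIT_S * RTT_S / 8 / MSS_BYTES   # packets needed to fill the path
rtts_needed = recovery_rtts(cwnd_full)
print(f"cwnd to fill the path: {cwnd_full:.0f} packets")
print(f"recovery after one loss: {rtts_needed} RTTs, "
      f"about {rtts_needed * RTT_S / 60:.0f} minutes")
```

Under these assumptions a single loss costs thousands of RTTs of reduced rate, which is the motivation for the faster-recovering stacks listed earlier.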
Which Protocol for my Network?
Transports for LightPaths
Host to host Lightpath
  One Application
  No congestion
  Lightweight framing
Lab to Lab Lightpath
  Many applications share
  Classic congestion points
  TCP stream sharing and recovery NEEDED
  Advanced TCP stacks
Transports for Academic Networks
Many different technologies, often low Bandwidths
Cautious/conservative Transport Protocols
  Standard TCP
  Linux & BIC
  Microsoft & C-TCP
High Bandwidth Backbones
  But care needed with Access links (Countries and Campus)
  Many Application flows
Note the Digital Divide
Roles for Advanced TCP stack and other transports.
Transports for Global Internet
Summary: Some Areas for Discussion
What is the interaction between Application and Transport Protocol?
What is the relative importance of fairness vs throughput?
  rtt fairness (OK, what is fairness?)
  mtu fairness
  TCP friendliness
How do AIMD rate fluctuations relate to stability & sharing?
  Stability of Achievable Throughput
Does provable stability of protocols matter?
Is the computational complexity of a protocol important?
What is the relative importance of convergence time?
  Link utilisation (by this flow or all flows)
Should there be a bias towards "mice"? (Applications)
Is conceptual simplicity of the protocol important?
Thanks to the Panellists
Pascale Primet, INRIA, France
Ralph Niederberger, Research Center Juelich, Germany
Tim Sheppard
Katsushi Kobayashi, National Institute of Advanced Industrial Science & Technology, Japan
Michael Welzl, University of Innsbruck, Austria
CBR/TCP: Catch-up?
If throughput is NOT limited by TCP buffer size / Cwnd, maybe we can re-sync with CBR arrival times.
Need to store CBR messages in the TCP buffer during the Cwnd drop
Then transmit faster than the CBR rate to catch up
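A rough way to size the catch-up problem is to estimate how much data queues while the sending rate is below the CBR rate, then how long it takes to drain that backlog at a higher rate. The sketch below is a simplified model and not the estimate used in the talk: it assumes Reno-style recovery (+1 packet per RTT) from half of a window that only just sustained the CBR rate, and the 625 Mbit/s catch-up rate and 10 Mbyte backlog in the usage lines are hypothetical.

```python
def recovery_backlog(cbr_bit_s, rtt_s, mss_bytes):
    """Return (backlog in bytes, recovery time in s) after one halving of cwnd.

    The sending rate is modelled as dropping to half the CBR rate and climbing
    linearly back, so the backlog is the area of that triangle.
    """
    cwnd_pkts = cbr_bit_s * rtt_s / 8 / mss_bytes   # window sustaining the CBR rate
    recovery_s = (cwnd_pkts / 2) * rtt_s            # +1 packet per RTT
    backlog = 0.5 * (cbr_bit_s / 2) * recovery_s / 8
    return backlog, recovery_s


def catch_up_s(backlog_bytes, cbr_bit_s, send_bit_s):
    """Time to drain a backlog while new CBR data keeps arriving."""
    if send_bit_s <= cbr_bit_s:
        raise ValueError("must send faster than the CBR rate to catch up")
    return backlog_bytes * 8 / (send_bit_s - cbr_bit_s)


# 525 Mbit/s, 27 ms and 1448 byte messages as in the UKLight test
backlog, recovery = recovery_backlog(525e6, 0.027, 1448)
print(f"recovery {recovery:.1f} s, backlog about {backlog / 1e6:.0f} Mbytes")

# Hypothetical: a 10 Mbyte backlog drained at 625 Mbit/s
print(f"catch-up time: {catch_up_s(10e6, 525e6, 625e6):.1f} s")
```

In this worst-case model the backlog is far larger than a 32 Mbyte TCP buffer, which illustrates why catch-up depends on the throughput not being limited by the buffer or Cwnd in the first place.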
[Figure: time vs message number, showing the expected arrival time at CBR, the actual arrival time, the packet loss and the delay in the stream; slope = 1/throughput]
Stephen Kershaw