1
TCP & Future very high speed networks demo
Collapse in G5K?
Stanford Trainwreck Workshop, April 1st, 2008
Romaric Guillier, Pascale Vicat-Blanc Primet
LIP, Ecole Normale Supérieure de Lyon, INRIA, France
2
Outline
•Context and challenges
•G5K experimental facility & NXE engine
•Demo principles
•Results analysis
•Conclusion and perspective
3
Network & traffic evolution
TCP: a "one size fits all" solution forever?
[Timeline: 70's, 2000's, 2010's (?)]
4
Network infrastructure changes?
1) Aggregation factor (uplink capacity / downlink capacity: K)
2) Multiplexing factor (number of contributing sources: M)
3) Heterogeneity of access speeds (Kb/s - Gb/s: 6 orders of magnitude)
4) Heterogeneity of RTT (1ms - 300ms)
K = C/Ca ≈ 1
A few big flows may congest the links
Long RTTs issue
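The aggregation factor can be illustrated with a short calculation. This is only a sketch: it assumes C is the uplink capacity and Ca the per-source access capacity, and the function names are hypothetical. The 10 Gb/s and 1 Gb/s values are the ones used later in the demo setup.

```python
import math

GBPS = 10**9  # bits per second in one gigabit per second


def aggregation_factor(uplink_bps: int, access_bps: int) -> float:
    """K = C / Ca: uplink capacity divided by per-source access capacity (assumed reading)."""
    return uplink_bps / access_bps


def sources_to_saturate(uplink_bps: int, access_bps: int) -> int:
    """Smallest number of sources sending at full access rate that fills the uplink."""
    return math.ceil(uplink_bps / access_bps)


print(aggregation_factor(10 * GBPS, 1 * GBPS))   # demo setup: K = 10
print(sources_to_saturate(10 * GBPS, 1 * GBPS))  # 10 full-rate flows fill the uplink
print(sources_to_saturate(1 * GBPS, 1 * GBPS))   # K = 1: a single flow is enough
```

When K approaches 1, the number of full-rate flows needed to congest the link drops to one, which is the "few big flows" concern of this slide.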
5
Structural change: problem of sharing
TCP: a transport protocol not designed for a context of very high node degrees & low aggregation factors.
Example of the G5K network
6
Evolution of Traffic demand?
1) Increase of the traffic load with respect to the offered access capacity
2) Increase of the traffic heterogeneity: ratio of time-sensitive and throughput-sensitive flows in the mix
3) Change in the ratio of collaborative nodes (TCP) and non-collaborative nodes (UDP or multiple TCP streams) in the mix
4) File size distribution: Poisson => heavy tail ( α ≈ 2 => α ≈ 1 )
5) Increase of the mean flow size: Kbytes => Gbytes (elephant) (6 orders)
6) Increase of the symmetry of the traffic (P2P vs Client-Server)
Internet stability strongly relies on the cooperative behavior of all end hosts.
Will this assumption hold in ten years?
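To see why the shift from α ≈ 2 to α ≈ 1 matters, the sketch below evaluates the standard closed-form mean of a Pareto distribution, which is α·x_min/(α − 1) for α > 1 and infinite otherwise. The minimum size `x_min = 1000` and the function name are illustrative assumptions, not values taken from the slides.

```python
import math


def pareto_mean(alpha: float, x_min: float) -> float:
    """Mean of a Pareto(alpha, x_min) distribution; infinite when alpha <= 1."""
    if alpha <= 1:
        return math.inf
    return alpha * x_min / (alpha - 1)


# x_min = 1000 is a hypothetical minimum file size, in arbitrary units.
for alpha in (2.0, 1.5, 1.1, 1.0):
    print(alpha, pareto_mean(alpha, x_min=1000.0))
```

As α moves toward 1, the mean grows without bound: a handful of very large transfers ends up dominating the traffic volume.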
7
Outline
•Context and challenges
•G5K experimental facility & NXE engine
•Demo principles
•Results analysis
•Conclusion and perspective
8
G5K: large scale experimental facility
9 sites gathering 5000 fully reservable & reconfigurable processors
Fully private & controllable 10Gb/s core network
Private 10Gb/s link to NL-DAS3
Private 1Gb/s link to JP-Naregi
Collaboration with RENATER-4
[Map of the RENATER-4 network. Legend: 2.5 Gbit/s links, dark fibre, 10Gb/s dedicated lambdas. Labels include CERN and Sophia.]
Example of a site: Grid5000@Lyon
9
Reservation and Batch Scheduler
10
Experiment setup
[Diagram: 100 independent sources and 100 independent sinks, with 1 GbE and 10 GbE links; nodes run iperf / iperfd.]
11
Long RTT emulation : GNET10
12
NXE engine: Experiment workflow
13
Outline
•Context and challenges
•G5K experimental facility & NXE engine
•Demo principles
•Results analysis
•Conclusion and perspective
14
Scenario description
Experiment 1: rtt_ref = 7 ms
Experiment 2: rtt_ref = 87 ms
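The two reference RTTs differ mainly in how much data a sender must keep in flight to fill its link. A minimal sketch of the bandwidth-delay product, assuming the reference flow is limited by the 1 Gb/s access link used in the demo (the function name is hypothetical):

```python
ACCESS_RATE_BPS = 10**9  # 1 Gb/s access link, as in the demo setup


def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: rate (bit/s) x RTT (ms) / 8000."""
    return rate_bps * rtt_ms // 8000


for rtt_ms in (7, 87):
    print(f"rtt_ref = {rtt_ms} ms -> {bdp_bytes(ACCESS_RATE_BPS, rtt_ms)} bytes in flight")
# rtt_ref = 7 ms -> 875000 bytes in flight
# rtt_ref = 87 ms -> 10875000 bytes in flight
```

At 87 ms the window needed to sustain line rate is roughly twelve times larger than at 7 ms, so the long-RTT flow takes much longer to recover after a loss.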
15
Cross-Traffic generation
K = 10
Number of sources = 100
Access link = 1 Gbps
RTT = [10; 20] ms
Flows size (Pareto = 1:2; = 1000)
Congestion factor = 0 to 2
Non cooperative flows ratio = 0 to 0.8
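The parameters above can be turned into per-source numbers. The sketch below rests on an assumption the slide does not state explicitly: that the congestion factor is the aggregate offered load divided by the bottleneck capacity, with the bottleneck equal to K times the access rate. Function names are hypothetical.

```python
def per_source_rate_bps(congestion_factor: float, k: int,
                        access_bps: int, n_sources: int) -> float:
    """Mean rate each source must offer to reach the given congestion factor.

    Assumption: congestion factor = aggregate offered load / bottleneck capacity,
    with bottleneck capacity = k * access_bps.
    """
    bottleneck_bps = k * access_bps
    return congestion_factor * bottleneck_bps / n_sources


def split_sources(n_sources: int, non_coop_ratio: float) -> tuple[int, int]:
    """Return (cooperative, non-cooperative) source counts for a given ratio."""
    non_coop = round(n_sources * non_coop_ratio)
    return n_sources - non_coop, non_coop


print(per_source_rate_bps(2, k=10, access_bps=10**9, n_sources=100))  # 200 Mb/s each
print(split_sources(100, 0.8))                                        # (20, 80)
```

Under this reading, a congestion factor of 2 asks every one of the 100 sources for an average of 200 Mb/s, well below its 1 Gb/s access rate yet twice what the shared link can carry.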
16
Experiment schedule
Experiment 1: normal RTT => T_reference's delay = 7 ms
Experiment 2: long RTT => T_reference's delay = 87 ms
Phases: normal, 0.5-congest, 1-congest, 2-congest, 2-cg+udp, 0.5-congest, 2-cg+udp+//
Time marks: 45s, 105s, 165s, 225s, 285s, 345s
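A small lookup can map an elapsed time to a position in the schedule. This is a sketch under a loud assumption: the six time marks are read as the boundaries between seven consecutive phases, which the slide does not state. The function returns only a phase index (0 to 6), because the exact order of the last two phase labels is ambiguous in the transcript.

```python
from bisect import bisect_right

# Assumed phase boundaries, in seconds since the start of the experiment.
BOUNDARIES_S = [45, 105, 165, 225, 285, 345]


def phase_index(t_s: float) -> int:
    """Index (0..6) of the phase active at time t_s, given the assumed boundaries."""
    return bisect_right(BOUNDARIES_S, t_s)


print(phase_index(0))    # 0: first phase
print(phase_index(100))  # 1: second phase
print(phase_index(400))  # 6: last phase
```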
17
Reference transfer behavior
Transfer delay, optimal value: no congestion in the network
Transfer delay, critical value: delay > 10x optimal delay
Real time progress of the transfer
data transferred (%)
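The optimal and critical values can be written down directly. In this sketch the optimal delay is taken as file size divided by link rate, ignoring protocol overhead; the 1 GB file size is a hypothetical value, since the slides do not give the size of the reference transfer.

```python
def optimal_delay_s(size_bytes: int, rate_bps: int) -> float:
    """Transfer time with no congestion, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bps


def is_critical(delay_s: float, optimal_s: float) -> bool:
    """Critical value from the slide: delay more than 10x the optimal delay."""
    return delay_s > 10 * optimal_s


optimal = optimal_delay_s(size_bytes=10**9, rate_bps=10**9)  # hypothetical 1 GB at 1 Gb/s
print(optimal)                     # 8.0 seconds
print(is_critical(60.0, optimal))  # False: slower, but not critical
print(is_critical(90.0, optimal))  # True: beyond 10x the optimal delay
```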
18
Demonstration is running….
19
Sagittaire cluster
Model: Sun Fire V20z
70 nodes
CPU: AMD Opteron 250, 2.4GHz / 1MB / 400MHz
* 70 nodes x 2 cpus per node = 140 cpus
* 140 cpus x 1 core(s) per cpu = 140 cores
Memory: 2 GB
Network:
* Gigabit Ethernet
* Gigabit Ethernet (management)
Driver: tg3
Storage: 73 GB / SCSI
Driver: mptspi
20
Demonstration workflow
21
Outline
•Context and challenges
•G5K experimental facility & NXE engine
•Demo principles
•Results analysis
•Conclusion and perspective
22
What we observed: Experiment 1
a) 940Mb/s (-0%)
b) 940Mb/s (-0%)
c) 800Mb/s (-10%)
d) 270Mb/s (-55%)
e) 270Mb/s (-55%)
f) 100Mb/s (-80%)
g) 600Mb/s (-40%)
Phases: normal, 0.5-cng, 1-cng, 2-cg, 2-cg+udp, 0.5-cng, 2-cg+udp+//
Comparison with oracle
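The observed rates can be restated as slowdown factors of the reference transfer. This sketch assumes that transfer time scales inversely with throughput and takes the uncongested 940 Mb/s as the baseline. The factors are a separate derived quantity, not the percentages printed on the slide, whose baseline is not stated.

```python
BASELINE_MBPS = 940

# Throughput of the reference transfer in each phase, as listed on the slide.
OBSERVED_MBPS = {"a": 940, "b": 940, "c": 800, "d": 270, "e": 270, "f": 100, "g": 600}


def slowdown(baseline_mbps: float, observed_mbps: float) -> float:
    """How many times longer a transfer takes at the observed rate."""
    return baseline_mbps / observed_mbps


for phase, rate in OBSERVED_MBPS.items():
    print(f"{phase}) {rate} Mb/s -> x{slowdown(BASELINE_MBPS, rate):.2f}")
```

Phase f) comes out at a slowdown of about 9.4, close to the 10x threshold that the demo defines as a critical transfer delay.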
23
Outline
•Context and challenges
•G5K experimental facility & NXE engine
•Demo principles
•Results analysis
•Conclusion and perspectives
24
Conclusion
A large-scale deployment of very high speed networks to the home may lead to a critical vicious cycle where:
• long RTT flows are starving
• interactive and streaming traffic suffers prohibitive delays
• users become less & less patient
• flows become more and more aggressive
• traffic increases
• more and more bandwidth is wasted in retransmissions…
Need to rethink the way we share the capacity (via TCP only):
• global cooperative behavior assumption?
• fairness objective function?
• end-to-end principle?
25
Testbed & tools used during the demo
•G5K or Grid5000 : http://www.grid5000.fr
•NXE http://www.ens-lyon.fr/LIP/RESO/Software/NXE/index.html
•AIST-GtrcNET-10: http://projects.gtrc.aist.go.jp/gnet/gnet10p3e.html
•Iperf http://dast.nlanr.net/Projects/Iperf/
•CLPBar http://clpbar.sourceforge.net/
for any information, please contact: [email protected]