VLBI Data Transfer Tests: Recent and Current Work

ESLEA VLBI Bits&Bytes Workshop, 4-5 May 2006
Richard Hughes-Jones, The University of Manchester
www.hep.man.ac.uk/~rich/ then “Talks”

Outline

Throughput tests on Mark5s
  TCP memory-to-memory tests
  CPU load tests
Data delay on a TCP link: how suitable is TCP?
  4th Year MPhys Project, Stephen Kershaw & James Keenan
  The effect of distance
Throughput on the 630 Mbit/s JB-JIVE UKLight link
  TCP performance

Jodrell’s VLBI Mark5 Problem

Why were Jodrell to JIVE VLBI data transfers unable to reach 512 Mbit/s, even on UKLight?
Why can the Onsala Mk5 achieve 512 Mbit/s to the JIVE Mk5?
  Onsala can even achieve high rates transatlantic (iGrid2005, SC|05?)
  Identical Mk5 hardware to JBO
  Same kernel and drivers
  Longer links
A hint was given as the general network load increased:
  Normally Onsala to JIVE iperf TCP gives ~900-950 Mbit/s: VLBI OK at 512 Mbit/s
  Sometimes Onsala to JIVE iperf TCP gives ~750 Mbit/s: VLBI not OK at 512 Mbit/s
Is it the network?

VLBI Network Topology

[Two slides of network topology diagrams; figures not recoverable from the transcript]

TCP Tests: Jodrell’s Mark5

Standard Mark5 PCs, 1.2 GHz PIII
End-host iperf TCP flow, memory-to-memory only
960 Mbit/s with rtt 1 ms, JBO - Manchester
Falls to 770 Mbit/s with rtt 15 ms, JBO - JIVE
JBO - Manchester: 94.7% kernel mode, 1.5% idle
JBO - JIVE: 96.3% kernel mode, 0.05% idle
No loss, no timeouts
200 times more TCPPureACK seen for JBO - Manchester
TCPHPACKs about the same (help with meanings, please)

[Plot mk5-606-jive_9Dec05: % CPU in kernel, user, nice and idle modes vs trial]
[Plot mk5-606-g7_9Dec05: % CPU in kernel, user, nice and idle modes vs trial]
[Plot: No. of Pure ACKs vs trial, for mk5-606-jive_9Dec05 and mk5-606-g7_9Dec05]

TCP Throughput & CPU Load

Measure iperf TCP throughput and CPU load
Run a CPU-intensive task at different priorities (nice: high number = low priority)

JBO - Manchester, 1.2 GHz PIII
  TCP throughput falls as the priority of the CPU task increases
  % kernel mode drops and % nice increases as priority increases
  Kernel mode shares the CPU with % nice
  No loss, no timeouts

JBO - Manchester, Asus NCCH-DL, 2.8 GHz Xeon
  TCP throughput constant as priority increases
  % kernel and % nice constant
  No loss, no timeouts

[Plots, panel titles mk5-606-g7_10Dec05 and mk5-606-g7_17Jan05: % CPU mode on send (kernel, user, nice, idle) and throughput (Mbit/s) vs nice value (large value = low priority), each with a "no CPU load" reference]

Onsala has a faster clock!
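The test procedure above (iperf throughput measured while a CPU-bound task runs at a chosen nice value) could be scripted along the following lines. This is only a sketch: it assumes a Linux host with the standard `nice` utility and iperf version 2, the helper names and the busy-loop load are hypothetical stand-ins, and the commands are built but not executed.

```python
import sys


def cpu_load_command(nice_value):
    """Command line for a CPU-bound busy loop run at the given nice value.

    Linux nice values run from 0 (normal priority) to 19 (lowest priority).
    The busy loop is a hypothetical stand-in for the CPU-intensive task.
    """
    if not 0 <= nice_value <= 19:
        raise ValueError("nice value must be between 0 and 19")
    busy_loop = [sys.executable, "-c", "while True: pass"]
    return ["nice", "-n", str(nice_value)] + busy_loop


def iperf_client_command(server, seconds=10):
    """iperf (version 2) TCP client: -c server, -t duration, -f m for Mbit/s."""
    return ["iperf", "-c", server, "-t", str(seconds), "-f", "m"]


def build_plan(server, nice_values):
    """One (nice value, load command, iperf command) entry per test point."""
    return [(n, cpu_load_command(n), iperf_client_command(server))
            for n in nice_values]


if __name__ == "__main__":
    # "receiver.example" is a placeholder host name, not one from the talk.
    for nice_value, load_cmd, iperf_cmd in build_plan("receiver.example", range(0, 20, 2)):
        print(nice_value, " ".join(load_cmd[:3]), "...", " ".join(iperf_cmd))
```

Each entry would be run by starting the load command, running the iperf client, recording the reported rate and CPU mode percentages, then stopping the load.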

TCP Throughput while Reading SuperStor

Measure iperf TCP throughput while reading data from disk to memory
Reading SuperStor from disk to memory only: 1.48 Gbit/s
Reading SuperStor with iperf running: 1.15 Gbit/s; iperf TCP rate 420 Mbit/s
15 ms SS read spacing: ~1 Gbit/s to memory
Corresponding CPU load
[Plots mk5-606-g7_17Jan05: throughput (Mbit/s) vs nice value (large value = low priority) with 15 ms SS read spacing; % CPU mode on send (kernel, user, nice, idle) vs test number, with "no CPU load" reference]


TCP Delay and VLBI Transfers

Manchester 4th Year MPhys Project

by

Stephen Kershaw & James Keenan

VLBI Application Protocol

VLBI data is constant bit rate (CBR)
tcpdelay: an instrumented TCP program that emulates sending CBR data
Records relative 1-way delay
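The idea of a relative 1-way delay can be sketched as follows. The source does not describe how tcpdelay handles the clock offset between sender and receiver, so subtracting the smallest observed delay is an assumption made here for illustration, and the function name is hypothetical.

```python
def relative_one_way_delays(send_times_us, recv_times_us):
    """Per-message delay relative to the fastest message seen.

    send_times_us: timestamps taken by the sender (sender clock, microseconds).
    recv_times_us: timestamps taken by the receiver (receiver clock, microseconds).
    The two clocks need not be synchronised: subtracting the minimum raw
    difference removes a constant offset (an assumption, not tcpdelay's
    documented method).
    """
    if len(send_times_us) != len(recv_times_us):
        raise ValueError("need one receive timestamp per send timestamp")
    raw = [recv - send for send, recv in zip(send_times_us, recv_times_us)]
    baseline = min(raw)
    return [delay - baseline for delay in raw]


if __name__ == "__main__":
    # CBR sender: one message every 20 us; receiver clock is offset by ~1000 us.
    send = [0, 20, 40, 60]
    recv = [1000, 1020, 1066, 1080]
    print(relative_one_way_delays(send, recv))
```

With these made-up timestamps the third and fourth messages arrive late relative to the first two, which is the kind of excursion the delay plots in the talk display.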

[Diagram: time lines for Sender, TCP & Network, and Receiver; messages Data1 to Data4 with Timestamp1 to Timestamp5, with a packet loss and the RTT marked]
[Diagram: Sender/Receiver time line showing a segment and its ACK]
Segment time on wire = bits in segment / BW
Remember the bandwidth-delay product: BDP = RTT * BW
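As a worked example of the two formulas, the sketch below evaluates them for a 1448-byte segment and a 27 ms round trip, the rtt quoted later in the talk for the Man-ukl-ams-prod-man route. The 1 Gbit/s bandwidth is an assumption (the talk does not state the link speed used for its BDP figure), and the function names are illustrative.

```python
def segment_time_us(segment_bytes, bandwidth_bps):
    """Time a segment occupies the wire: bits in segment / BW, in microseconds."""
    return segment_bytes * 8 / bandwidth_bps * 1e6


def bdp_bytes(rtt_s, bandwidth_bps):
    """Bandwidth-delay product, BDP = RTT * BW, converted from bits to bytes."""
    return rtt_s * bandwidth_bps / 8


if __name__ == "__main__":
    bandwidth = 1e9  # assumed 1 Gbit/s link
    print("segment time: %.3f us" % segment_time_us(1448, bandwidth))
    print("BDP: %.2f MByte" % (bdp_bytes(0.027, bandwidth) / 1e6))
```

Under that assumption the BDP comes out at about 3.4 MByte, which agrees with the figure given on the later 1448-byte 1-way delay slide.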

Check the Send Time

10,000 messages; message size 1448 B; wait time 0; TCP buffer 64k
[Plot "Send time, 10000 packets": send time (s) vs message number, with a 1 s scale marker]
Slope 0.44 ms/message
Expect 42 messages/rtt, i.e. ~0.6 ms/message
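The expected figure can be checked with a little arithmetic. The sketch assumes an rtt of 26 ms (the value marked on the 1-way delay detail plot) and reads "64k" as 65536 bytes; both are assumptions, as are the function names.

```python
def ms_per_message(rtt_ms, messages_per_rtt):
    """Average send time per message when a fixed number go out each rtt."""
    return rtt_ms / messages_per_rtt


def whole_messages_in_buffer(buffer_bytes, message_bytes):
    """Upper bound on complete messages that fit in the TCP buffer."""
    return buffer_bytes // message_bytes


if __name__ == "__main__":
    rtt_ms = 26.0  # assumed rtt
    print("expected: %.2f ms/message" % ms_per_message(rtt_ms, 42))
    print("64k buffer holds at most %d messages of 1448 B"
          % whole_messages_in_buffer(65536, 1448))
    # Inverting the measured slope gives the messages actually sent per rtt.
    print("measured slope implies %.0f messages/rtt" % (rtt_ms / 0.44))
```

The buffer bound (45 messages) is close to the 42 messages/rtt quoted, while the measured slope of 0.44 ms/message corresponds to roughly 59 messages per rtt.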

Send Time Detail

[Plot: send time (s) vs message number, 100 ms scale marker; annotations: message 76 to message 102 (26 messages), about 25 us, one rtt]

1-Way Delay

10,000 messages; message size 1448 B; wait time 0; TCP buffer 64k
[Plot "1 way delay, 10000 packets": 1-way delay vs message number, 100 ms scale marker]

1-Way Delay Detail

10,000 messages; message size 1448 B; wait time 0; TCP buffer 64k
[Plot: 1-way delay vs message number, 100 ms scale marker; annotations: = 1.5 x RTT; = 1 x RTT (26 ms); ≠ 0.5 x RTT]
Why not 1 rtt? Why does it vary?
Effect of “send time delay”? TCP slow start?

Comparison of Send Time & 1-Way Delay

[Plot: send time (s) vs message number, 100 ms scale marker; annotations: message 76 to message 102 (26 messages)]

1-Way Delay: 10,000 Packets

[Plot: 1-way delay (μs) vs packet number; annotations: packet 1214; 1575 packets; ~5.5 x RTT]

1-Way Delay: 724 Byte Messages

10,000 messages; message size 724 bytes; wait times 20, 25, 30, 35, 40, 45 μs; TCP buffer 64k
[Plot: 1-way delay vs message number, 100 ms scale marker]
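To relate these wait times to the constant bit rates being emulated, the sketch below converts between message size, wait time and user-data rate. It assumes the wait time is the full period between successive messages (the time spent inside the send call is ignored); the function names are illustrative.

```python
def cbr_rate_mbps(message_bytes, wait_us):
    """User-data rate for one message every wait_us microseconds.

    Bits per microsecond is numerically equal to Mbit/s.
    """
    return message_bytes * 8 / wait_us


def wait_for_rate_us(message_bytes, rate_mbps):
    """Message spacing needed to emulate a given constant bit rate."""
    return message_bytes * 8 / rate_mbps


if __name__ == "__main__":
    for wait in (20, 25, 30, 35, 40, 45):
        print("724 B every %2d us: %6.1f Mbit/s" % (wait, cbr_rate_mbps(724, wait)))
    print("1448 B messages at 512 Mbit/s need a spacing of %.1f us"
          % wait_for_rate_us(1448, 512))
```

Under this assumption the wait times in the test span roughly 130 to 290 Mbit/s of user data.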

1-Way Delay: 724 Byte Messages, Detail

10,000 messages; message size 724 bytes; wait times 20, 25, 30, 35, 40, 45 μs; TCP buffer 64k
[Plot: 1-way delay vs packet number, 100 ms scale marker]
Regular cycle of ~125 packets

1-Way Delay: 1448 Byte Messages

Route: Man-ukl-ams-prod-man
Rtt 27 ms
10,000 messages; message size 1448 bytes; wait time 0 μs
BDP = 3.4 MByte; TCP buffer 10 MByte
[Plot "one-way": 1-way delay (us) vs packet no., 50 ms scale marker]
[Web100 plot: PktsOut (Delta), PktsIn (Delta) and CurCwnd (Value) vs time (ms)]
Web100 plot starts after 5.6 s, due to clock sync
~400 pkts/10 ms: rate similar to iperf

1-Way Delay with Packet Drop

Route: LAN gig8-gig1
Ping 188 us
10,000 messages; message size 1448 bytes; wait time 0 μs
Drop 1 in 1000
[Plot: 1-way delay vs message number, 5 ms scale marker; annotations: 800 us, 28 ms]


TCP on the 630 Mbit Link

Jodrell – UKLight – JIVE

TCP Throughput on 630 Mbit UKLight

Manchester gig7 - JBO 606
4 Mbyte TCP buffer
[Plots for test 0 (annotated: Dup ACKs seen; other reductions), test 1 and test 2: "TCPAchive Mbit/s" vs time (s), with series InstaneousBW and CurCwnd (Value)]


Any Questions?

More Information: Some URLs 1

UKLight web site: http://www.uklight.ac.uk
MB-NG project web site: http://www.mb-ng.net/
DataTAG project web site: http://www.datatag.org/
UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
Motherboard and NIC tests: http://www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/
“Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards”, FGCS Special Issue 2004, http://www.hep.man.ac.uk/~rich/
TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html
TCP stack comparisons: “Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks”, Journal of Grid Computing 2004
PFLDnet: http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
Dante PERT: http://www.geant2.net/server/show/nav.00d00h002

More Information: Some URLs 2

Lectures, tutorials etc. on TCP/IP:
  www.nv.cc.va.us/home/joney/tcp_ip.htm
  www.cs.pdx.edu/~jrb/tcpip.lectures.html
  www.raleigh.ibm.com/cgi-bin/bookmgr/BOOKS/EZ306200/CCONTENTS
  www.cisco.com/univercd/cc/td/doc/product/iaabu/centri4/user/scf4ap1.htm
  www.cis.ohio-state.edu/htbin/rfc/rfc1180.html
  www.jbmelectronics.com/tcp.htm
Encyclopaedia: http://www.freesoft.org/CIE/index.htm
TCP/IP resources: www.private.org.il/tcpip_rl.html
Understanding IP addresses: http://www.3com.com/solutions/en_US/ncs/501302.html
Configuring TCP (RFC 1122): ftp://nic.merit.edu/internet/documents/rfc/rfc1122.txt
Assigned protocols, ports etc. (RFC 1010): http://www.es.net/pub/rfcs/rfc1010.txt & /etc/protocols


Backup Slides

Latency Measurements

UDP/IP packets sent between back-to-back systems
  Processed in a similar manner to TCP/IP
  Not subject to flow control & congestion avoidance algorithms
  Used UDPmon test program
Latency
  Round trip times measured using Request-Response UDP frames
  Latency as a function of frame size
    Slope s is given by: $s = \sum_{\text{data paths}} \frac{1}{db/dt}$
    Data paths: mem-mem copy(s) + pci + Gig Ethernet + pci + mem-mem copy(s)
    Intercept indicates: processing times + HW latencies
  Histograms of ‘singleton’ measurements
Tells us about:
  Behavior of the IP stack
  The way the HW operates
  Interrupt coalescence
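A rough illustration of the slope calculation, reading the slope as the sum over the data paths of the inverse transfer rate, 1/(db/dt). The sketch covers only the pci + Gig Ethernet + pci part of the path; the bus type (64 bit, 66 MHz PCI, at its theoretical peak rate) is an assumption, and the memory-to-memory copies are left out because no copy rate is given in the talk.

```python
def slope_us_per_byte(rates_bytes_per_s):
    """Latency-vs-frame-size slope: sum of 1/(db/dt) over the data paths."""
    return sum(1.0 / rate for rate in rates_bytes_per_s) * 1e6


# Theoretical peak transfer rates in bytes per second (assumed hardware).
PCI_64BIT_66MHZ = 8 * 66e6      # 8 bytes per clock at 66 MHz
GIGABIT_ETHERNET = 1e9 / 8      # 1 Gbit/s expressed in bytes/s

if __name__ == "__main__":
    paths = [PCI_64BIT_66MHZ, GIGABIT_ETHERNET, PCI_64BIT_66MHZ]
    print("expected slope: %.4f us/byte" % slope_us_per_byte(paths))
```

A measured slope noticeably larger than this bound would point to the omitted memory copies or to a bus running below its peak rate.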

Throughput Measurements

UDP Throughput
Send a controlled stream of UDP frames spaced at regular intervals
[Diagram: Sender/Receiver time line; a given number of packets of n bytes, each separated by the wait time]
Test sequence:
  Zero stats (OK done)
  Send data frames at regular intervals; record time to send, time to receive and inter-packet time (histogram)
  Signal end of test (OK done)
  Get remote statistics; send statistics: no. received, no. lost + loss pattern, no. out-of-order, CPU load & no. int, 1-way delay
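The relation between frame spacing and offered load can be sketched as below. It assumes Gigabit Ethernet with standard framing and counts every byte that occupies the wire (preamble, headers, checksum and inter-frame gap); whether UDPmon's "wire rate" counts exactly these overheads is an assumption, and the function names are illustrative.

```python
IP_UDP_HEADERS = 20 + 8            # IPv4 header + UDP header, bytes
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12  # preamble, MAC header, FCS, inter-frame gap


def wire_bytes(udp_payload_bytes):
    """Bytes of wire time used by one UDP frame with the given payload."""
    return udp_payload_bytes + IP_UDP_HEADERS + ETHERNET_OVERHEAD


def min_spacing_us(udp_payload_bytes, link_mbps=1000.0):
    """Smallest frame spacing the link can carry (back-to-back frames)."""
    return wire_bytes(udp_payload_bytes) * 8 / link_mbps


def offered_rate_mbps(udp_payload_bytes, spacing_us, link_mbps=1000.0):
    """Wire rate for one frame every spacing_us, capped at the link speed."""
    rate = wire_bytes(udp_payload_bytes) * 8 / spacing_us
    return min(rate, link_mbps)


if __name__ == "__main__":
    print("1472 B payload back-to-back: %.3f us per frame" % min_spacing_us(1472))
    for spacing in (5, 12, 20, 40):
        print("spacing %2d us: %.1f Mbit/s" % (spacing, offered_rate_mbps(1472, spacing)))
```

For 1472-byte payloads the curve is flat at line rate below about 12.3 us spacing and falls off as 1/spacing beyond that, which is the general shape of the throughput plots in the backup slides.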

PCI Bus & Gigabit Ethernet Activity

PCI activity measured with a logic analyzer:
  PCI probe cards in sending PC
  Gigabit Ethernet fiber probe card
  PCI probe cards in receiving PC
[Diagram "Possible Bottlenecks": sending and receiving PCs (CPU, mem, chipset, NIC on the PCI bus) with a Gigabit Ethernet probe between them, all feeding the logic analyser display]

“Server Quality” Motherboards

SuperMicro P4DP8-2G (P4DP6)
Dual Xeon
400/522 MHz front side bus
6 PCI / PCI-X slots
4 independent PCI buses: 64 bit 66 MHz PCI, 100 MHz PCI-X, 133 MHz PCI-X
Dual Gigabit Ethernet
Adaptec AIC-7899W dual channel SCSI
UDMA/100 bus master/EIDE channels: data transfer rates of 100 MB/sec burst

“Server Quality” Motherboards

Boston/Supermicro H8DAR
Two dual-core Opterons
200 MHz DDR memory: theory BW 6.4 Gbit
HyperTransport
2 independent PCI buses: 133 MHz PCI-X
2 Gigabit Ethernet
SATA
(PCI-e)

Network Switch Limits Behaviour

End2end UDP packets from udpmon
Only 700 Mbit/s throughput
Lots of packet loss
Packet loss distribution shows throughput limited
[Plots w05gva-gig6_29May04_UDP: Recv wire rate (Mbit/s) and % packet loss vs spacing between frames (us), for packet sizes from 50 to 1472 bytes; 1-way delay (us) vs packet no. at wait 12 us, for packets 0-600 and in detail for packets 500-550]

10 Gigabit Ethernet: UDP Throughput

1500 byte MTU gives ~2 Gbit/s
Used 16144 byte MTU, max user length 16080
DataTAG Supermicro PCs: dual 2.2 GHz Xeon CPU, FSB 400 MHz, PCI-X mmrbc 512 bytes; wire rate throughput of 2.9 Gbit/s
CERN OpenLab HP Itanium PCs: dual 1.0 GHz 64 bit Itanium CPU, FSB 400 MHz, PCI-X mmrbc 4096 bytes; wire rate of 5.7 Gbit/s
SLAC Dell PCs: dual 3.0 GHz Xeon CPU, FSB 533 MHz, PCI-X mmrbc 4096 bytes; wire rate of 5.4 Gbit/s
[Plot "an-al 10GE Xsum 512kbuf MTU16114 27Oct03": Recv wire rate (Mbit/s) vs spacing between frames (us), for packet sizes from 1472 to 16080 bytes]

10 Gigabit Ethernet: Tuning PCI-X

16080 byte packets every 200 µs
Intel PRO/10GbE LR Adapter
PCI-X bus occupancy vs mmrbc
  Measured times
  Times based on PCI-X times from the logic analyser
  Expected throughput ~7 Gbit/s
  Measured 5.7 Gbit/s
[Logic analyser traces for mmrbc 512, 1024, 2048 and 4096 bytes (5.7 Gbit/s at 4096 bytes), showing the PCI-X sequence: CSR access, data transfer, interrupt & CSR update]
[Plots "Kernel 2.6.1#17 HP Itanium Intel10GE Feb04" and "DataTAG Xeon 2.2 GHz": PCI-X transfer time (us) vs max memory read byte count, with measured rate (Gbit/s), rate from expected time (Gbit/s) and max throughput PCI-X]
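The effect of mmrbc (maximum memory read byte count) can be illustrated by counting how many read bursts are needed to move one packet across the bus. The sketch assumes each burst moves at most mmrbc bytes and that the bus is 64 bit, 133 MHz PCI-X (the bus width and clock are assumptions); protocol headers added to the 16080 bytes of user data are ignored, and the function names are illustrative.

```python
import math


def pcix_peak_gbps(bus_bits=64, clock_mhz=133):
    """Theoretical peak PCI-X transfer rate in Gbit/s."""
    return bus_bits * clock_mhz / 1000.0


def bursts_per_packet(packet_bytes, mmrbc):
    """Number of memory-read bursts needed to move one packet."""
    return math.ceil(packet_bytes / mmrbc)


if __name__ == "__main__":
    print("PCI-X peak: %.3f Gbit/s" % pcix_peak_gbps())
    for mmrbc in (512, 1024, 2048, 4096):
        print("mmrbc %4d bytes: %2d bursts per 16080 byte packet"
              % (mmrbc, bursts_per_packet(16080, mmrbc)))
```

Fewer, longer bursts mean fewer per-burst set-up gaps on the bus, which is consistent with the higher throughput reported for mmrbc 4096 than for mmrbc 512.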

Tests on the UKLight switched light-path Manchester : Dwingeloo

Throughput as a function of inter-packet spacing (2.4 GHz dual Xeon machines)
Packet loss for small packet size
Maximum size packets can reach full line rate with no loss, and there was no re-ordering (plot not shown)
[Plots gig03-jiveg1_UKL_25Jun05: Recv wire rate (Mbit/s) and % packet loss (log scale) vs spacing between frames (us), for packet sizes from 50 to 1472 bytes]


UKLight using Mk5 recording terminals