Post on 19-Dec-2015
Diagnosing Wireless TCP Performance Problems:
A Case Study
Tianbo Kuang, Fang Xiao, and Carey Williamson
University of Calgary
Agenda
Motivation
Background
  TCP
  IEEE 802.11b Wireless LAN (WLAN)
  Universal Serial Bus (USB)
Experimental Methodology
Results
Motivation
TCP performance often degrades over wireless networks; the reasons are “well-known”
Solutions to improve TCP performance over wireless links exist, but how well do they work in a real wireless LAN environment?
How do link-layer mechanisms interact with TCP and affect the overall performance?
Where is the bottleneck in the network protocol processing path, and why?
Background - TCP
Widely used on the Internet (e.g. Web)
Connection-oriented, reliable byte stream
Window-based flow control
Slow start and congestion avoidance
Fast retransmission, fast recovery
Other extensions, including TCP SACK
Many different versions in use
Background – IEEE 802.11b
An “Ethernet-like” LAN standard (11 Mbps)
Infrastructure mode and ad hoc mode
Carrier-sense multiple access with collision avoidance (CSMA/CA) to reduce collisions
MAC layer: positive acknowledgment and retransmissions (to recover from channel errors)
Dynamic rate adaptation: can choose data transmission rate of 1, 2, 5.5, or 11 Mbps
Background – USB
Widely used industry standard for connecting a computer to its peripherals (bus topology)
Lots of USB-based (wireless) network cards
Data transfers managed by Host Controller (HC)
Synchronous bus: 1 msec slots for transfers
Transfer requests are handled using vertical and horizontal linked-list data structures
Two processing modes for HC: Breadth-First or Depth-First
High Speed Bandwidth Reclamation (HSBR)
Background – USB (cont’d)
Queued mode (keep HC busy)
Transfer size: 64 – 1023 bytes each
[Figure: HC processing in DF and BF modes]
[Figure: HC processing with FSBR]
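As a rough illustration of why keeping the HC busy matters on a bus with 1 msec slots, the sketch below computes a throughput ceiling under a simplifying assumption that is ours, not a measurement from the study: a fixed number of transfers (`transfers_per_frame`, a hypothetical parameter) completes in each frame. The function name is also hypothetical.

```python
FRAMES_PER_SECOND = 1000  # 1 msec slots, as stated on the USB background slide


def usb_ceiling_mbps(transfer_bytes, transfers_per_frame=1):
    """Upper bound on throughput if a fixed number of transfers finish per frame.

    Assumption for illustration only: real host controllers may complete
    more or fewer transfers per frame depending on mode and driver settings.
    """
    bits_per_frame = transfer_bytes * 8 * transfers_per_frame
    return bits_per_frame * FRAMES_PER_SECOND / 1e6


if __name__ == "__main__":
    for size in (64, 1023):  # the transfer-size range given on the slide
        print(size, "bytes per frame ->", round(usb_ceiling_mbps(size), 3), "Mbps")
```

Under this assumption, a single small transfer per frame caps throughput well below the 11 Mbps nominal rate of the wireless link, which is consistent with the idea that the USB bus can become the bottleneck when it is poorly configured.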
Experimental Methodology
Experimental Setup (HW/SW)
  Laptop – Compaq Evo 719c with multiport USB wireless card (Linux 2.4)
  Access point – Lucent RG-1000
  Stationary host on Ethernet LAN – SunOS 5.8
  Run netperf on laptop and netserver on wired host
  SnifferPro 4.6 wireless “sniffer” and tcpdump
Experimental Factors
  USB mode, driver settings
  Wireless channel (distance) between laptop and AP
[Figure: experimental setup showing netserver, Ethernet, AP, and netperf, with data and ACK flows, the Sniffer, and tcpdump on the laptops]
Initial Result
Windows 2000 implementation of TCP is more than 3 times faster than Linux TCP!
Reason: Linux driver bug (2 Mbps vs 11 Mbps)

OS              Throughput
Linux           1.52 Mbps
Windows 2000    5.11 Mbps
Results – USB Experiments
With FSBR disabled, USB is the bottleneck
With FSBR enabled (the default in Linux), the wireless network is the bottleneck
Queued mode makes no difference with FSBR on, but helps when FSBR is turned off
Queued mode (even with FSBR turned on) may be very important when a higher-speed wireless link is used (e.g. IEEE 802.11a)
Results – TCP Problems
The “ack holding” problem
  A bug in the NIC firmware or interrupt driver of Linux OS causes excessive delays (> 100 ms)
  This leads to a spurious TCP timeout
The retransmission of previously acked data!
  Actually just an artifact of tcpdump observation
The lack of a TCP “fast retransmit” after receiving three duplicate Acks
  A deliberate (but not well-known) feature of TCP
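To make the link between “ack holding” and a spurious timeout concrete, here is a minimal sketch of a retransmission timer in the style of RFC 6298 (smoothed RTT plus four times the RTT variance). It is not the Linux 2.4 code. The class name, the helper `is_spurious_timeout`, and the `min_rto` floor of 0.2 s are assumptions chosen for illustration.

```python
class RtoEstimator:
    """RFC 6298-style retransmission timeout estimator (illustrative sketch)."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance

    def __init__(self, min_rto=0.2):
        # min_rto (seconds) is an assumed floor; real stacks differ on it.
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None

    def sample(self, rtt):
        """Feed one RTT measurement, in seconds."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    @property
    def rto(self):
        if self.srtt is None:
            return max(self.min_rto, 1.0)  # initial RTO before any sample
        return max(self.min_rto, self.srtt + 4 * self.rttvar)


def is_spurious_timeout(estimator, ack_delay):
    """True if an ACK held back for ack_delay seconds arrives after the timer fires.

    The data was delivered, so a timeout in this case is spurious.
    """
    return ack_delay > estimator.rto


if __name__ == "__main__":
    est = RtoEstimator()
    for _ in range(20):
        est.sample(0.005)  # steady 5 ms RTT on a short LAN path (assumed)
    print("RTO:", est.rto)
    print("0.15 s hold times out?", is_spurious_timeout(est, 0.15))
    print("0.35 s hold times out?", is_spurious_timeout(est, 0.35))
```

With a steady, small RTT the estimate converges to a tight timer, so a sudden hold that is long compared with normal RTTs can outlast the RTO even though nothing was lost. Whether a given hold crosses the threshold depends on the floor the stack applies.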
Results – TCP “ack holding”
[Figure: trace plots with panels labeled (laptop), (wired), (sniffer), and (kernel)]
Results – TCP “repeated data”
The spurious TCP timeout was not properly detected
Caused by initialization bug in Linux TCP implementation
The “repeated data” problem is an artifact induced by the presence of a link-layer buffer
Results – TCP “suppressed FR”
This is a deliberate feature to prevent a false fast retransmit after a timeout
This situation is quite likely to occur in a wireless environment
It’s not a bug, but a feature! (correct)
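The suppression described above can be sketched as follows. This is a simplified model in the spirit of the NewReno rule that remembers how far the sender had transmitted when a timeout fired, and ignores a third duplicate ACK that does not move past that point. It is not the Linux implementation: the class, the field names, and the exact boundary comparison are assumptions for illustration.

```python
DUP_ACK_THRESHOLD = 3


class Sender:
    """Toy TCP sender that tracks only what is needed for this rule."""

    def __init__(self):
        self.snd_una = 0    # oldest unacknowledged sequence number
        self.snd_nxt = 0    # next sequence number to be sent
        self.dup_acks = 0   # consecutive duplicate ACKs seen
        self.recover = -1   # snd_nxt recorded when the last timeout fired

    def on_timeout(self):
        # Remember how far we had sent; dup ACKs below this point may just
        # be echoes of segments retransmitted after the timeout.
        self.recover = self.snd_nxt
        self.dup_acks = 0

    def on_ack(self, ack):
        if ack > self.snd_una:
            self.snd_una = ack
            self.dup_acks = 0
            return "new_ack"
        self.dup_acks += 1
        if self.dup_acks == DUP_ACK_THRESHOLD:
            if ack <= self.recover:
                return "suppressed"        # no fast retransmit after a timeout
            return "fast_retransmit"
        return "dup_ack"


if __name__ == "__main__":
    s = Sender()
    s.snd_nxt = 10
    s.on_timeout()
    print([s.on_ack(0) for _ in range(3)])
```

In this model, three duplicate ACKs trigger a fast retransmit in normal operation, but the same three duplicate ACKs arriving right after a timeout are ignored until the cumulative ACK advances beyond the recorded point.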
Results – Wireless Problems
We observed unusually high collision rates on the wireless channel for TCP transfers, which we call the TCP data/ACK collision problem
Scenario: laptop and AP are 1 m apart
  For TCP, MAC-layer retransmit rate: 4.58-4.73%
  For UDP, MAC-layer retransmit rate: 0.47-0.98%
  In general, a retransmission rate of 1.75%-7.2% has been seen for other vendor HW/SW (N = 1)
  For TCP, disabling MAC-layer retransmission degrades throughput by 23%
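For intuition only, the sketch below computes how often two stations collide if both always have a frame queued and each draws a fresh, uniform backoff slot from the initial 802.11b contention window (slots 0 to 31). With TCP, the data sender and the ACK sender can both be contending; with a one-way UDP flow there is mostly a single contender. The model and the function name are our simplification, not the analysis from the study, and it is not meant to reproduce the measured retransmit rates.

```python
from fractions import Fraction

CW_MIN_SLOTS_80211B = 32  # initial contention window: backoff slots 0..31


def same_slot_probability(contention_window_slots):
    """Chance that two stations drawing independent uniform slots pick the same one."""
    return Fraction(1, contention_window_slots)


if __name__ == "__main__":
    p = same_slot_probability(CW_MIN_SLOTS_80211B)
    print("collision probability per contention round:", float(p))
```

The point of the sketch is only that bidirectional traffic gives the MAC layer two contenders where one-way traffic has one, so some baseline collision rate is expected for TCP even on a clean channel.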
Results – Wireless Problems (TCP data/ACK collisions)
Results – Wireless Problems
The MAC-layer rate adaptation problem
Scenario: laptop and AP are 100 m apart
  Lousy TCP throughput, lots of retransmits
  Reason: the multiplicative increase and multiplicative decrease (MIMD) bandwidth probing mechanism causes network thrashing and wastes battery power
  The small congestion window causes temporary deadlock if the TCP receiver uses delayed Ack
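The thrashing behaviour can be illustrated with a toy rate adapter. The policy below (step up one rate after `up_after` consecutive successes, step down one rate after any failure) and the channel model (every frame above `max_ok_mbps` is lost) are hypothetical; the card's actual algorithm is not given in the slides.

```python
RATES_MBPS = [1, 2, 5.5, 11]  # 802.11b data rates


def simulate_rate_adaptation(max_ok_mbps, frames, up_after=10):
    """Return (rate_trace, failures) for a hypothetical step-up/step-down adapter.

    max_ok_mbps: highest rate the channel can carry (assumed fixed).
    up_after: consecutive successes needed before probing a higher rate.
    """
    idx = len(RATES_MBPS) - 1  # start at the highest rate
    streak = 0
    failures = 0
    trace = []
    for _ in range(frames):
        rate = RATES_MBPS[idx]
        trace.append(rate)
        if rate <= max_ok_mbps:
            # Frame delivered; after enough successes, probe upward.
            streak += 1
            if streak >= up_after and idx < len(RATES_MBPS) - 1:
                idx += 1
                streak = 0
        else:
            # Frame lost; fall back one rate immediately.
            failures += 1
            streak = 0
            idx = max(idx - 1, 0)
    return trace, failures


if __name__ == "__main__":
    trace, failures = simulate_rate_adaptation(max_ok_mbps=2, frames=100)
    print("failed frames:", failures)
    print("rates used after settling:", sorted(set(trace[2:])))
```

When the channel cannot sustain the next higher rate, the adapter keeps probing it and losing a frame each time, so the link oscillates between two rates instead of settling. Each lost frame costs a MAC-layer retransmission and transmit energy.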
Results – Wireless Problems (MAC-layer rate adaptation)
Conclusions
TCP performance on WLAN can be wacky! (at least for the Compaq Multiport 802.11b USB wireless card under Linux 2.4)
Several factors can affect overall performance
  A poorly configured USB bus could be the bottleneck
  A Linux TCP implementation bug makes TCP unable to recognize the first spurious timeout
  A poor MAC-layer rate adaptation algorithm can cause a “network thrashing” problem
  TCP’s data/ACK structure may induce excessive collisions at the MAC layer on wireless LANs
Questions?