TRANSCRIPT
Less-than-Best-Effort Service for Community Wireless Networks:
Challenges at Three Layers
GAIA meeting, 89th IETF Meeting, 6.3.2014, Michael Welzl
Context: the LCDNet vision
From http://publicaccesswifi.org/the-research/paws-approach/
Layer 2 LBE
• Outside: not under control
• Inside: under control
• Can control:
  – Both ways: PHY rates, PCF? 802.11e QoS?
  – Download: RA algorithm, DCF parameters like CW
• RA algorithm impact (upload vs. download)
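The CW knob mentioned above can be illustrated with a toy model. The sketch below is a deliberately idealized view of DCF contention (uniform backoff draws per round, ties counted for neither station, no binary exponential backoff); the function name and parameters are invented for illustration. It shows why giving an LBE station a much larger CWmin starves it of channel access relative to a best-effort station:

```python
import random

def dcf_share(cw_a, cw_b, rounds=20000, seed=1):
    """Idealized DCF contention: each round, both stations draw a
    backoff counter uniformly from [0, CWmin]; the smaller counter
    wins the slot. Ties model collisions and count for neither.
    Returns each station's share of the won slots."""
    rng = random.Random(seed)
    wins_a = wins_b = 0
    for _ in range(rounds):
        a = rng.randint(0, cw_a)
        b = rng.randint(0, cw_b)
        if a < b:
            wins_a += 1
        elif b < a:
            wins_b += 1
    total = wins_a + wins_b
    return wins_a / total, wins_b / total

# A best-effort station (CWmin=15) vs. an LBE station (CWmin=255):
be, lbe = dcf_share(15, 255)
```

Under this model the CWmin=15 station wins well over 90% of the contended slots, which is the basic intuition behind using DCF parameters as a layer-2 LBE mechanism.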
4
[Figure: Fig. 4. Rate distribution of TCP data packets. Panels: (a) AMRR, (b) SampleRate, (c) Minstrel. Y-axis: Upload/Download PHY Rate Distribution (0% to 100%); x-axis: PHY rates 1, 2, 5.5, 6, 9, 11, 12, 18, 24, 36, 48 and 54 Mbps.]
(256 KB) to provide the opportunity for cwnd to grow with no restriction. The Maximum Segment Size (MSS) was set to 1448 bytes (MTU = 1500 bytes). To gather the rate statistics we have used horst [27] to monitor the wireless traffic in addition to the information dumped by the madwifi kernel modules and tools. The experiments were conducted in a typical wired-cum-wireless scenario where wireless nodes upload/download data to/from a server which is connected to the AP via a 100 Mbps wired link (Figure 1). All nodes were synchronized with each other using ntp.
A. Mixed-mode Traffic
We have begun our experiments by evaluating different rate adaptation mechanisms under a mixed-traffic scenario where different numbers of uploading/downloading stations participate in contention. Figure 6 shows the impact of
TABLE II: TEST-BED SETUP

Test-bed          Emulab                NDlab
PC                Pentium III 600 MHz   Dell OptiPlex GX620
Memory            256 MB                1 GB
802.11 device     D-Link DWL-AG530      D-Link DWL-G520
Chipset           AR5212                AR5001X
tx queue length   200 pkts (both)
Driver (madwifi)  0.9.3.3, 0.9.4        0.9.4
OS                FC4, FC6              FC14
Linux kernel      2.6.18.6, 2.6.20.6    2.6.35.11
Node numbers      6-26                  10
Fig. 5. NDlab test-bed
[Figure: Fig. 6. Mixed upload/download traffic using TCP SACK. Y-axis: aggregate throughput (b/s), 0 to 3.5e+07; x-axis: number of uploading/downloading stations (8DL, 6DL/2UP, 4DL/4UP, 2DL/6UP, 8UP); curves: 54M fixed rate, AMRR, SampleRate, Minstrel.]
the upload/download ratio on the system's aggregate throughput. While both AMRR and SampleRate degrade the system's aggregate throughput extensively as the up/down ratio increases, Minstrel is able to keep the throughput level almost as high as when a 54 Mbps fixed rate is used, for all values of the up/down ratio. We carried out this experiment for the two most commonly used TCP variants, SACK and CUBIC; since the behavior was similar, we only show SACK.
The result in Figure 6 matches our expectation: the adverse impact of RA on the overall performance tends to grow with the number of uploading stations. Since it seems to us that combining uploads and downloads does not add any relevant information to our results, for the rest of the paper we focus only on pure download and upload scenarios for the sake of clarity. These scenarios constitute the best and worst case.
B. Uplink vs. Downlink
The overall uplink and downlink throughput of TCP SACK is shown in Figure 7 for different RA mechanisms when 8 wireless nodes are contending to access the medium. AMRR, SampleRate and Minstrel achieve 66%-84%, 91% and 98%-100% of the 54 Mbps fixed-rate throughput in the downlink scenario, respectively, while AMRR's and SampleRate's throughput drop to 2%-6% and 38%-44% of the 54 Mbps fixed rate in the uplink scenario. However, Minstrel is still able to keep this ratio at 93%.
We have also evaluated the performance of other TCP variants to see if the above phenomenon is observable for different TCP flavours. These are CUBIC [28], HSTCP [29] and Westwood [30]. The reason for this choice is that CUBIC is the default congestion control mechanism in the Linux
From: Naeem Khademi, Michael Welzl, Stein Gjessing: "Experimental Evaluation of TCP Performance in Multi-rate 802.11 WLANs", IEEE WoWMoM 2012.
Layer 3 LBE
• Shortest path routing is THE routing method
  – This overloads the shortest path
  – If there are multiple paths, LBE traffic could take a detour :-)
  – Tools exist: e.g. MPLS TE, ECMP... but typically still shortest paths used
• Multi-topology (MT) routing, meant for IP fast reroute, could be a good fit
• Small packets could be costly (power consumption)
  – May want to aggregate + tunnel them (TCMTF)
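The "aggregate + tunnel" idea for small packets can be sketched in a few lines. This is a hypothetical simplification of the TCMTF-style approach, not the actual protocol: the function name, the 2-byte multiplexing prefix, and the bundle limit are all assumptions made for illustration.

```python
def aggregate(packets, max_bundle=1400):
    """Greedily pack small payloads into tunnel bundles: each bundle's
    combined size (plus a 2-byte length prefix per packet, standing in
    for a multiplexing header) stays under max_bundle bytes, so many
    small packets share one tunnel header and one transmission."""
    bundles, current, size = [], [], 0
    for p in packets:
        need = len(p) + 2  # payload plus per-packet length prefix
        if current and size + need > max_bundle:
            bundles.append(current)  # bundle full: flush it
            current, size = [], 0
        current.append(p)
        size += need
    if current:
        bundles.append(current)
    return bundles

# 40 VoIP-sized payloads of 60 bytes each collapse into 2 bundles:
bundles = aggregate([b"x" * 60] * 40)
```

With 62 bytes per multiplexed packet, 22 fit per 1400-byte bundle, so 40 small packets cost 2 link-layer transmissions instead of 40, which is where the power saving would come from.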
Layer 4 LBE
• End systems: LEDBAT isn't great
  – Much more stuff exists
  – LBE != LBE...
• Routers / AP / modem:
  – AQM can differentiate (early mark/drop)
  – Queue scheduling can differentiate (CBQ etc.)
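The "AQM can differentiate" bullet can be made concrete with a minimal sketch, assuming a single FIFO with class-dependent drop thresholds. This is not a real AQM like RED or CoDel; the class name and thresholds are invented for illustration:

```python
from collections import deque

class TwoClassQueue:
    """Single FIFO where LBE packets are dropped (or could be
    ECN-marked) once the queue exceeds a much lower threshold than
    best-effort packets, so LBE traffic yields under load."""
    def __init__(self, limit=100, lbe_threshold=10):
        self.q = deque()
        self.limit = limit                  # tail-drop limit for best effort
        self.lbe_threshold = lbe_threshold  # early-drop point for LBE

    def enqueue(self, pkt, lbe=False):
        threshold = self.lbe_threshold if lbe else self.limit
        if len(self.q) >= threshold:
            return False  # early drop for LBE, tail drop for best effort
        self.q.append(pkt)
        return True

q = TwoClassQueue()
for i in range(15):          # fill past the LBE threshold with BE packets
    q.enqueue(i)
accepted_be = q.enqueue("be")             # best effort: still accepted
accepted_lbe = q.enqueue("lbe", lbe=True) # LBE: dropped early
```

The same differentiation could instead be done by scheduling (e.g. CBQ with a small weight for the LBE class); the early-drop variant shown here has the advantage of needing only one queue.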
careful tuning [21]. LEDBAT has been shown to be prone to various sorts of problems, among them the well-known "late-comer advantage" [22]. Figure 3 shows another problem (described in detail in [23]): because it uses a minimum ("base") delay in its calculation and repeatedly measures this minimum delay while continuing to send traffic, it considers its self-caused delay as the new minimum, making the overall delay of LEDBAT grow larger and larger when it is used long enough.
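The base-delay drift described above can be reproduced with a toy model. This is a deliberately simplified caricature of LEDBAT, not its actual control law: it assumes the flow always sits at its steady state (queueing exactly TARGET ms above the estimated base) and that the base estimate is the minimum over a short sliding window of samples taken while sending; all parameter values are invented for illustration.

```python
from collections import deque

def ledbat_drift(rounds=10, window=5, prop=10.0, target=100.0):
    """Toy model of LEDBAT's base-delay drift: every delay sample is
    taken while the flow itself keeps TARGET ms of queue, so the
    self-induced delay eventually becomes the window minimum and the
    total delay ratchets upward by roughly TARGET per window."""
    history = deque(maxlen=window)
    base_est = prop          # initially correct base (propagation) delay
    delays = []
    for _ in range(rounds * window):
        measured = base_est + target  # steady state: delay = base + TARGET
        history.append(measured)
        base_est = min(history)       # base refreshed from recent samples
        delays.append(measured)
    return delays

d = ledbat_drift()
# d[0] is prop + TARGET (110 ms); the delay then grows without bound
```

Even this crude model shows the qualitative behavior in Figure 3: the delay never settles, because the standing queue keeps being absorbed into the base-delay estimate.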
[Figure: Fig. 6. CDF of the total transfer time for web-like TCP flows, with and without a "crossed" LEDBAT flow. Panels: (a) RTT for the TCP flows: 25 ms; (b) RTT for the TCP flows: 100 ms. X-axis: total transfer time (s); y-axis: CDF; curves: with LEDBAT flow, no LEDBAT flow. The dotted vertical lines indicate average transfer times.]

[Figure: Fig. 7. Ping delay across an Ethernet link that was set to 500 kbit/s / 50 ms with netem, with libutp's "utp_test" program running.]
series of chunks starts. The size of chunks is drawn from a uniform distribution between 8 and 12 1500-byte packets. The short interval between two consecutive chunks in a "train" of 10 close chunks is drawn from a uniform distribution, between 0 and 5 ms.
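The traffic model just described can be sketched directly from those distributions. The function name and structure are assumptions; only the numeric parameters (8-12 packets of 1500 bytes, trains of 10 chunks, 0-5 ms gaps) come from the text:

```python
import random

def make_train(rng, n_chunks=10, mtu=1500):
    """Generate one 'train' of web-like chunks: each chunk is 8-12
    MTU-sized packets, and consecutive chunks within the train are
    separated by a gap drawn uniformly from 0-5 ms."""
    chunks = [rng.randint(8, 12) * mtu for _ in range(n_chunks)]
    gaps_ms = [rng.uniform(0.0, 5.0) for _ in range(n_chunks - 1)]
    return chunks, gaps_ms

rng = random.Random(42)
chunks, gaps = make_train(rng)  # chunk sizes in bytes, gaps in ms
```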
The total transfer time measures the interval between starting to send the first chunk in a given series and the end of reception of the last ACK for the last chunk in that series. The distribution of transfer times, depicted in Fig. 5, clearly shows how the overall latency drastically increases in the presence of a long-lived LEDBAT flow. Naturally, the impact of the added TARGET ms of queue is relatively larger when the TCP flow has a lower minimum RTT.
Contrary to Fig. 5, in Fig. 6 web mice go in the downstream direction, while the LEDBAT flow still goes upstream. There is not as much competition with LEDBAT for the bottleneck buffer in the uplink (only TCP ACKs, not full-sized data packets). Still, by contributing to an increased RTT, the LEDBAT flow noticeably augments the overall latency.
III. CONCLUSIONS
The results presented here seem alarming. We only carried out simulations, and our models could be criticized (e.g., there are various ways to generate web-like traffic). Given that we have sketched situations in which LEDBAT does not work well, our results can however be regarded as an "existence proof", and we note that there is probably reason to worry even if only a subset of them materializes in a practical setting. As mentioned earlier, other authors have found other issues with the mechanism. The oft-cited successful use of a LEDBAT variant in BitTorrent may not suffice to justify using LEDBAT for purposes such as backups or software updates: assuming default parameters, the latter involve continuous data transfers that may cause the base delay to be refreshed multiple times, whereas further investigations would be needed to understand if the effects presented here may also appear with BitTorrent. To check if BitTorrent's underlying implementation itself may behave as in our simulations, we carried out a simple test in a LAN with BitTorrent's libutp (footnote 6), shown in Figure 7, where we can see a growing ping delay akin to the delay in Figure 1.
Stopping a transfer for at least TARGET ms before updating the base delay, and choosing a smaller value for TARGET, could serve as simple temporary fixes to the problems that we have identified in this paper. We are not the first to say that 100 ms is a large value; we can however not make a more concrete recommendation at this stage. Picking the right value is a trade-off, the result of which should be evaluated in comparison with the related work in [5].
IV. ACKNOWLEDGEMENTS
D. Ros would like to express his gratitude to Prof. Stein Gjessing and the Department of Informatics (IFI) at the University of Oslo for their kind support.
REFERENCES
[1] A. Abu and S. Gordon, "Impact of delay variability on LEDBAT performance," in Proc. 2011 IEEE AINA.
[2] G. Carofiglio, L. Muscariello, D. Rossi, and S. Valenti, "The quest for LEDBAT fairness," in Proc. 2010 IEEE GLOBECOM.
[3] J. Gettys, "Bufferbloat: dark buffers in the Internet," IEEE Internet Computing, vol. 15, no. 3, pp. 95–96, 2011.
[4] R. Jesup, "Issues with LEDBAT in wide deployment," in 2012 84th IETF Meeting.
[5] D. Ros and M. Welzl, "Less-than-best-effort service: a survey of end-to-end approaches," IEEE Commun. Surveys and Tutorials, 2012, accepted for publication, to appear.
[6] D. Rossi, C. Testa, S. Valenti, and L. Muscariello, "LEDBAT: the new BitTorrent congestion control protocol," in Proc. 2010 ICCCN.
[7] J. Schneider, J. Wagner, R. Winter, and H.-J. Kolbe, "Out of my way—evaluating Low Extra Delay Background Transport in an ADSL access network," in Proc. 2010 ITC.
[8] S. Shalunov, G. Hazel, J. Iyengar, and M. Kuehlewind, "Low Extra Delay Background Transport (LEDBAT)," RFC 6817, IETF, Dec. 2012.
[9] D. X. Wei and P. Cao, "NS-2 TCP-Linux: an NS-2 TCP implementation with congestion control algorithms from Linux," in Proc. 2006 WNS2.
[10] D. Wischik, "Short messages," 2007 Royal Society Workshop on Networks: Modelling and Control.
Footnote 6: Downloaded from https://github.com/bittorrent/libutp on Dec. 3, 2012. The only code change was to multiply the variable "g_send_limit" by 10 in the file utp_test.cpp to transfer long enough.
Fig. 3. Ping delay across an Ethernet link that was set to 500 kbit/s / 50 ms with netem, with the libutp utp_test program (BitTorrent's LEDBAT implementation) running. Figure taken from [23].
LEDBAT is by no means the only end-to-end approach for LBE at the transport layer; a survey is given in [24]. Not all LBE mechanisms might have the same problems as LEDBAT; e.g. Delay-Gradient TCP [25] only considers the gradient of the delay signal and hence does not produce a standing queue. On the other hand, Delay-Gradient TCP bases its decision on the Round-Trip Time (RTT), which can be affected by noise on the return link. To get a more precise idea of the state of the forward queue, LEDBAT uses changes in One-Way Delay (OWD). This requires the sender to timestamp packets, giving the receiver the necessary OWD information, which it would then have to feed back to the sender, where the rate is calculated. Hence, unlike Delay-Gradient TCP, which has been available in FreeBSD since version 9.0 as a pluggable congestion control mechanism for TCP, LEDBAT can probably not be correctly implemented as a one-sided change to TCP.
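The gradient-vs-level distinction above can be illustrated with a minimal decision rule. This is a hypothetical simplification, not the actual CDG algorithm from [25]: the function, threshold, and string labels are invented, and it only captures the core idea of reacting to the change in delay rather than its absolute level.

```python
def gradient_backoff(rtts, threshold=0.0):
    """Delay-gradient sketch: emit one decision per RTT sample after
    the first, backing off only when delay is rising. Because a flat
    (even if high) delay triggers no back-off, no fixed standing queue
    is maintained, unlike a target-delay scheme such as LEDBAT."""
    decisions = []
    for prev, cur in zip(rtts, rtts[1:]):
        gradient = cur - prev
        decisions.append("back_off" if gradient > threshold else "increase")
    return decisions

print(gradient_backoff([50, 50, 60, 70, 70, 65]))
# → ['increase', 'back_off', 'back_off', 'increase', 'increase']
```

Note how the samples at a constant 70 ms produce no back-off: the scheme holds no delay target, which is exactly why it avoids LEDBAT's standing-queue problem, at the cost of using the noisier RTT signal.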
Such deployment considerations are important for the LCD-Net LBE scenario because the LBE user's stack is not under the control of the system. Here, we basically have two choices for realizing an LBE service at the transport layer:
1) Split TCP connections [26] at the LBE AP such that the AP acts like a receiver towards the remote Internet node and a sender towards the LBE node for downstream traffic, and like a receiver towards the LBE node and a sender towards the remote Internet node for upstream traffic. Then, whenever the AP acts like a sender, sender-based LBE schemes could be used, and whenever it acts like a receiver, receiver-based LBE schemes could be used.
2) Do not split TCP connections, but monitor their delay and change the TCP Receiver Window (rwnd) in packets when needed, to force the sender to slow down. The Receiver Window is used by several of the receiver-based schemes in [24].
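The rwnd-rewriting option can be sketched at the byte level. This is a sketch only, with an invented function name: a real middlebox would also have to recompute the TCP checksum after the rewrite and account for any window scaling negotiated in the SYN options, both of which are deliberately omitted here.

```python
import struct

def clamp_rwnd(tcp_header: bytes, max_window: int) -> bytes:
    """Rewrite the 16-bit Receive Window field (bytes 14-15 of the
    TCP header, network byte order) down to at most max_window, as
    an LBE router applying option 2 might do to slow the sender."""
    (window,) = struct.unpack("!H", tcp_header[14:16])
    if window <= max_window:
        return tcp_header  # already small enough: leave packet alone
    return tcp_header[:14] + struct.pack("!H", max_window) + tcp_header[16:]

# A minimal 20-byte TCP header advertising window = 65535:
hdr = struct.pack("!HHIIBBHHH", 12345, 80, 0, 0, 5 << 4, 0x10, 65535, 0, 0)
clamped = clamp_rwnd(hdr, 4096)  # window field now reads 4096
```

Since the sender may never have more than rwnd bytes outstanding, clamping this field bounds its sending rate to roughly rwnd per RTT without terminating or splitting the connection.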
This only considers "normal" TCP connections between the remote Internet node and the LBE node, not UDP or other relatively common transports such as Multipath TCP [27] or SCTP [28]. As with TCP, it should also be possible to intercept such other traffic at the LBE router and affect it accordingly.
Note that it may be too cautious to always impose LBE-oriented congestion control onto the LBE user: it is really only necessary when the paying user is active, and even then, it is actually only necessary when the two users share the same bottleneck. Shared bottlenecks can be detected using active or passive methods (cf. [29] and references therein), but such schemes are not yet in widespread use, as they have historically been regarded as either too hard to use (too computationally intensive) or not reliable enough. This is perhaps going to change now, as the RTP Media Congestion Avoidance Techniques (RMCAT) IETF group is planning on standardizing a shared bottleneck detection method for the sake of WebRTC communication.
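One family of passive shared-bottleneck detection methods surveyed in [29] correlates the delay variation of candidate flows; the sketch below is a hypothetical simplification of that idea (function names, threshold, and the simulated queue are all invented), showing that two flows traversing the same queue exhibit strongly correlated one-way-delay series while an independent flow does not:

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def shared_bottleneck(owd_a, owd_b, threshold=0.8):
    """Declare a shared bottleneck when the flows' one-way-delay
    variations are strongly correlated (toy passive detector)."""
    return pearson(owd_a, owd_b) > threshold

# Two flows through the same simulated queue, plus an independent one:
queue = [0, 5, 12, 20, 15, 8, 3, 0]           # shared queueing delay (ms)
flow_a = [10 + q for q in queue]              # base OWD 10 ms + queue
flow_b = [25 + q for q in queue]              # different path, same queue
flow_c = [18, 17, 19, 18, 17, 18, 19, 18]     # flow on an unrelated path
```

In this toy setup flow_a and flow_b are flagged as sharing a bottleneck while flow_a and flow_c are not; real detectors must additionally cope with measurement noise, clock offset, and partially overlapping paths.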
V. CONCLUSIONS
The LCD-Net paradigm – making LBE-style Internet access possible for everyone – has a noble goal. However, in practice, it seems that very little well-functioning LBE technology is in place in today's networks. In particular, having no interference whatsoever with the paying customer's traffic may be hard to achieve; instead, it may be worth trying to quantify the degree of such interference. As we have seen in this paper, problems exist at several layers, quite possibly also the physical and application layer in addition to the three layers that we have focused on. These challenges are not necessarily obstacles that cannot be overcome, but it seems clear that a significant amount of research is needed before all the right LBE mechanisms can be put in place.
REFERENCES
[1] A. Sathiaseelan and J. Crowcroft, "LCD-Net: Lowest Cost Denominator Networking," SIGCOMM Comput. Commun. Rev., vol. 43, no. 2, pp. 52–57, Apr. 2013. [Online]. Available: http://doi.acm.org/10.1145/2479957.2479966
[2] A. Khalaj, N. Yazdani, and M. Rahgozar, "Effect of the contention window size on performance and fairness of the IEEE 802.11 standard," Wireless Personal Communications, vol. 43, no. 4, pp. 1267–1278, 2007. [Online]. Available: http://dx.doi.org/10.1007/s11277-007-9300-5
[3] IEEE, "Standard for Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Area Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications," IEEE Std 802.11-1997, pp. i–445, 2007.
[4] M. Lacage, M. H. Manshaei, and T. Turletti, "IEEE 802.11 rate adaptation: a practical approach," in Proceedings of the 7th ACM International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems, ser. MSWiM '04. New York, NY, USA: ACM, 2004, pp. 126–134. [Online]. Available: http://doi.acm.org/10.1145/1023663.1023687
[5] R. T. Morris and J. C. Bicket, "Bit-rate selection in wireless networks," Master's thesis, MIT, 2005.
[6] D. Xia, J. Hart, and Q. Fu, "Evaluation of the Minstrel rate adaptation algorithm in IEEE 802.11g WLANs," in Communications (ICC), 2013 IEEE International Conference on, June 2013, pp. 2223–2228.
[7] N. Khademi, M. Welzl, and S. Gjessing, "Experimental evaluation of TCP performance in multi-rate 802.11 WLANs," in 13th IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (IEEE WoWMoM 2012), San Francisco, California, USA, Jun. 2012.
From: David Ros, Michael Welzl: "Assessing LEDBAT's Delay Impact", IEEE Communications Letters 17(5), pp. 1044–1047, 2013.
Conclusion
• LEDBAT isn’t enough, it’s a larger story – Research needed
Thank you!
Questions?