Reflections on data plane performance, iptables and ipsets
![Page 1: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/1.jpg)
Reflections on data plane performance, iptables and ipsets
Neil Jerram – Metaswitch & Project Calico
@neiljerram www.projectcalico.org
![Page 2: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/2.jpg)
Who am I?
• Free software hacker since 1990s
• Metaswitch (previously Data Connection) since 1995
; line+.el
;
; version 1.1
;
; This has not (yet) been accepted by the Emacs Lisp archive,
; but if it is the archive entry will probably be something like this:
;; line+|Neil Jerram|[email protected]|
;; Line Numbering & Interrupt Driven Actions|
;; 1993-02-18|1.1|<archive pathname of line+.el>|
; Mished and mashed by Neil Jerram <[email protected]>,
; Monday 21 December 1992.
![Page 3: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/3.jpg)
Free software work
• Emacs
• Guile
• Openmoko and GTA04 smartphones
![Page 4: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/4.jpg)
Metaswitch and Project Calico
• 30+ year provider of high quality networking software, but mostly proprietary
• Software -> hardware -> and now back again!
• Now also leading projects as open source
• Project Clearwater
• Project Calico
![Page 5: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/5.jpg)
So, Calico?
• Connectivity and security for workloads (aka endpoints, aka micro-services, aka containers or VMs) in an elastic computing environment
• e.g. a data center
• Emphasis on simplicity and scalability
• Based on standard Linux features
• routing, iptables
• and Internet protocols (BGP)
• Mainline case L3 only
![Page 6: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/6.jpg)
Old, zone-based security
![Page 7: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/7.jpg)
Services in an elastic environment
![Page 8: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/8.jpg)
Distributed firewall security
![Page 9: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/9.jpg)
Calico architecture
![Page 10: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/10.jpg)
Data plane performance questions
• Can we get same bandwidth between endpoints as between those endpoints’ hosts?
• What is CPU cost, and how does it compare with other networking approaches?
• What are the effects of our iptables and ipset programming?
![Page 11: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/11.jpg)
Testing methodology
• Two hosts, directly connected by 10Gb link
• 8 core
• 64GB RAM
• 3.13 kernel
• No tuning
• qperf, using TCP
• Measure CPU usage, raw throughput and packet latency
![Page 12: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/12.jpg)
Configurations
• Bare metal, i.e. host to host
• Between OpenStack VMs
• ‘TAP’ interface between VM and host
• Between containers
• veth pair between container namespace and host namespace
• Between OpenStack VMs using Open vSwitch (OVS) and VXLAN
• MTU 1500, send sizes 20000 and 500
![Page 13: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/13.jpg)
Data plane throughput
• Saturation for 20k-byte messages … (red bars)
• … but not for 500-byte messages (blue bars)
• Why?
• OpenStack better than bare metal?
• OVS case reaches >8Gb/s if MTU is increased to 9000
![Page 14: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/14.jpg)
CPU usage
• CPU-limited for small messages
• OpenStack cases can use more cores
• Extra CPU cost for virtualization
• Namespace
• TAP or veth interface
• Routing in guest as well as host
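One way to see why the small-message case becomes CPU-limited is to count how many sends per second are needed to fill the link at each send size. The sketch below is back-of-envelope arithmetic only: it ignores TCP/IP header overhead and any coalescing by the stack, and the function name is made up for this illustration.

```python
def messages_per_second(link_gbps: float, message_bytes: int) -> float:
    """Sends per second needed to fill the link, ignoring protocol headers."""
    bits_per_message = message_bytes * 8
    return link_gbps * 1e9 / bits_per_message


# The two send sizes used in the tests, on a 10 Gb/s link.
for size in (20000, 500):
    rate = messages_per_second(10, size)
    print(f"{size:>6}-byte sends: {rate:,.0f} messages/s to fill 10 Gb/s")
```

Filling the link with 500-byte sends takes 40 times as many sends per second as with 20000-byte sends, so per-send CPU cost dominates long before the link does.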
![Page 15: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/15.jpg)
CPU usage per throughput
• CPU required to drive each Gb/s of throughput
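The metric on this slide is a simple ratio. A minimal sketch follows; the function name is hypothetical and the sample inputs are made-up numbers for illustration, not measurements from the talk.

```python
def cpu_per_gbps(cpu_percent: float, throughput_gbps: float) -> float:
    """CPU (in percent of one core) spent per Gb/s of throughput."""
    if throughput_gbps <= 0:
        raise ValueError("throughput must be positive")
    return cpu_percent / throughput_gbps


# Made-up inputs, for illustration only (not results from these tests).
print(cpu_per_gbps(cpu_percent=50.0, throughput_gbps=10.0))
```

Normalizing this way lets configurations that reach different throughputs be compared on cost rather than on raw CPU usage.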
![Page 16: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/16.jpg)
Latency
• Tiny extra latency for containers
• More for VMs
• But acceptable
• Note microseconds
• Not milli!
![Page 17: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/17.jpg)
Security rules
![Page 18: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/18.jpg)
iptables and ipsets
• iptables on a given host should be the composition of many logical security rules
• Will this impact data plane performance?
• Actually, no
-A felix-FORWARD -i tap+ -j felix-FROM-ENDPOINT
-A felix-FORWARD -o tap+ -j felix-TO-ENDPOINT
-A felix-FORWARD -i tap+ -j ACCEPT
-A felix-FORWARD -o tap+ -j ACCEPT
-A felix-FROM-ENDPOINT -i tap7f470881-51 -g felix-from-7f470881-51
-A felix-FROM-ENDPOINT -j DROP
-A felix-INPUT -i tap+ -j felix-FROM-ENDPOINT
-A felix-INPUT -i tap+ -j ACCEPT
-A felix-TO-ENDPOINT -o tap7f470881-51 -g felix-to-7f470881-51
-A felix-TO-ENDPOINT -j DROP
-A felix-from-7f470881-51 -m conntrack --ctstate INVALID -j DROP
-A felix-from-7f470881-51 -m conntrack --ctstate RELATED,ESTABLISHED -j RETURN
-A felix-from-7f470881-51 -p udp -m udp --sport 68 --dport 67 -j RETURN
-A felix-from-7f470881-51 -s 10.28.0.40/32 -m mac --mac-source FA:16:3E:4E:7A:0E -g felix-p-_6b340324948a39b-o
-A felix-from-7f470881-51 -m comment --comment "Anti-spoof DROP (endpoint 7f470881-5156-47ce-a67d-b971ef5e5cde):" -j DROP
-A felix-p-_6b340324948a39b-i -p icmp -m set --match-set felix-v4-_6b340324948a39b src -j RETURN
-A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p tcp -m multiport --dports 22 -j RETURN
-A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p udp -m multiport --dports 5060 -j RETURN
-A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p tcp -m multiport --dports 80 -j RETURN
-A felix-p-_6b340324948a39b-i -m comment --comment "Default DROP rule (72d696a9-f715-495f-9152-7f5e6a69fd0f):" -j DROP
![Page 19: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/19.jpg)
What saves us?
• conntrack
• ipsets scale well, thanks to hash table implementation
• Nested design for source/destination interface mapping
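The first two points can be illustrated with a toy model of chain traversal. This is a sketch under stated assumptions, not how the kernel or Felix works internally: `Packet`, `Rule` and `evaluate` are names invented here, the addresses are arbitrary, and the only thing counted is how many rules are examined before a packet gets a verdict.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Packet:
    src: str
    ct_state: str  # "NEW", "ESTABLISHED" or "INVALID"


@dataclass
class Rule:
    matches: Callable[[Packet], bool]
    verdict: str


def evaluate(chain: List[Rule], packet: Packet) -> Tuple[str, int]:
    """Walk the chain in order; return (verdict, number of rules examined)."""
    for examined, rule in enumerate(chain, start=1):
        if rule.matches(packet):
            return rule.verdict, examined
    # Falling off the end of a user-defined chain returns to the caller.
    return "RETURN", len(chain)


# 1000 arbitrary allowed source addresses (illustrative only).
allowed = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
allowed_set = set(allowed)

# Conntrack rules come first, as in the per-endpoint chain shown earlier.
conntrack_rules = [
    Rule(lambda p: p.ct_state == "INVALID", "DROP"),
    Rule(lambda p: p.ct_state == "ESTABLISHED", "RETURN"),
]
default_drop = [Rule(lambda p: True, "DROP")]

# Variant 1: one rule per allowed address (linear scan).
per_address = [Rule(lambda p, a=a: p.src == a, "RETURN") for a in allowed]
linear_chain = conntrack_rules + per_address + default_drop

# Variant 2: a single rule backed by a hash set, standing in for an ipset.
set_rule = [Rule(lambda p: p.src in allowed_set, "RETURN")]
ipset_chain = conntrack_rules + set_rule + default_drop

last = allowed[-1]
print(evaluate(linear_chain, Packet(last, "ESTABLISHED")))  # established flow
print(evaluate(linear_chain, Packet(last, "NEW")))          # first packet, linear rules
print(evaluate(ipset_chain, Packet(last, "NEW")))           # first packet, set match
```

In this model a packet on an established flow is decided at the second rule however long the chain is, and a first packet needs 1002 rule checks with per-address rules but only 3 with a set match.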
![Page 20: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/20.jpg)
Arjan Schaaf’s measurements
![Page 21: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/21.jpg)
What is happening here?
• http://www.slideshare.net/ArjanSchaaf/docker-network-performance-in-the-public-cloud
• Various approaches to networking between containers on AWS hosts
• For this case Calico uses IP-in-IP between the hosts
• Calico bandwidth less than half of native
• We set up the same system, got same results as Arjan
• For t2.micro bandwidth = 65.3 MB/sec compared with native = 125 MB/sec
• For m4.xlarge bandwidth = 108 MB/sec compared with native = 267 MB/sec
• Why?
![Page 22: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/22.jpg)
It’s all about the MTU
• Calico in a public cloud uses IP-in-IP, with tunnel MTU = 1440
• 1440 was optimised for GCE, which has an MTU of 1460 on its VM interfaces
• But AWS instances have an MTU = 9001!
• So native tests were using jumbo frames, and the Calico test was using 1440.
• If Calico’s tunnel MTU is increased to 8980
• For t2.micro, Calico bandwidth = 114 MB/sec
• For m4.xlarge, Calico bandwidth = 266 MB/sec
• Problem solved – Calico throughput is now close to native
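The MTU figures follow from the encapsulation overhead: IP-in-IP adds one outer IPv4 header, which is 20 bytes when it carries no options. The sketch below checks that arithmetic and works out Calico's share of native bandwidth from the figures quoted in these slides; the function names are made up for this illustration.

```python
IPV4_HEADER_BYTES = 20  # outer IPv4 header added by IP-in-IP, no options


def max_tunnel_mtu(interface_mtu: int) -> int:
    """Largest tunnel MTU that still fits in one frame on the interface."""
    return interface_mtu - IPV4_HEADER_BYTES


def fraction_of_native(calico_mb_s: float, native_mb_s: float) -> float:
    """Calico bandwidth as a fraction of native bandwidth."""
    return calico_mb_s / native_mb_s


print("GCE:", max_tunnel_mtu(1460))  # interface MTU 1460
print("AWS:", max_tunnel_mtu(9001))  # interface MTU 9001

# (instance type, Calico MB/sec, native MB/sec), as quoted in the slides.
results = {
    "tunnel MTU 1440": [("t2.micro", 65.3, 125), ("m4.xlarge", 108, 267)],
    "tunnel MTU 8980": [("t2.micro", 114, 125), ("m4.xlarge", 266, 267)],
}
for setting, rows in results.items():
    for instance, calico, native in rows:
        share = fraction_of_native(calico, native)
        print(f"{setting}, {instance}: {share:.1%} of native")
```

For GCE this gives 1460 - 20 = 1440, the default tunnel MTU. For AWS it gives 9001 - 20 = 8981, so a tunnel MTU of 8980 fits within the jumbo frame.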
![Page 23: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/23.jpg)
So what have we learned?
• With Calico connectivity, VMs or containers can saturate a 10Gb link between hosts, just as the hosts themselves can
• There is a CPU cost to virtualization
• But mostly inevitable if you want virtualization at all (non-accelerated)
• Calico does not add any significant extra cost
• Conntrack largely saves us from the effects of complex iptables
• ipsets and clever programming design also help
• Be humble about performance comparisons
![Page 24: Reflections on data plane performance, iptables and ipsets](https://reader031.vdocuments.us/reader031/viewer/2022022415/586485491a28ab0e3093b387/html5/thumbnails/24.jpg)
Further information, and thanks!
• Project Calico
• http://www.projectcalico.org/
• http://docs.projectcalico.org/en/latest/
• https://github.com/projectcalico
• Blog on Calico data plane performance
• http://www.projectcalico.org/calico-dataplane-performance/
• Thanks!
• @neiljerram
• @projectcalico
• www.metaswitch.com