Container Networking Meetup, March 31, 2016
TRANSCRIPT (slides)
Project Calico is sponsored by @projectcalico
Networking in a Containerized Data Center: the Gotchas!
Microservices for Enterprises Meetup
Andy Randall | @andrew_randall | Palo Alto, March 31, 2016
gotcha (n.), North American: “an instance of publicly tricking someone or exposing them to ridicule, especially by means of an elaborate deception.”
Calico’s Adventures in Containerland
Run anywhere. Simple. Lightweight. Standard. Speed. Cloud. Efficient.
Gotcha #1: all containers on a machine share the same IP address
The original “container approach” to networking.
[Diagram: a proxy on the host maps port 8080 to WWW1:80 and port 8081 to WWW2:80]
Most container deployments still use this method!
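The port-mapping scheme above can be sketched as a lookup table; this is a toy model of the idea, not how any particular proxy is implemented, and the container names follow the slide’s diagram:

```python
# Toy model of gotcha #1: every container shares the host's IP, so a
# front-end proxy must map distinct host ports onto each container's
# port 80. Container names and ports follow the slide's diagram.
PORT_MAP = {
    8080: ("WWW1", 80),
    8081: ("WWW2", 80),
}

def route(host_port):
    """Return the container endpoint a host port forwards to."""
    container, port = PORT_MAP[host_port]
    return f"{container}:{port}"

print(route(8080))  # WWW1:80
print(route(8081))  # WWW2:80
```

The gotcha follows directly: every service needs its own host port, and clients must know the mapping rather than just the service’s address.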
The world is moving to “IP per container”:
- Container Network Interface (CNI)
- Container Network Model (libnetwork, 0.19)
- net-modules (0.26) (future: CNI?)
We’ve solved “IP per VM” before…
[Diagram, repeated for two hosts: VM1, VM2, and VM3 each attached to a Virtual Switch]
Consequences for containers (gotcha #2): scale
Hundreds of servers with low churn become millions of containers with high churn.
Consequences for containers (gotcha #3): layering
Packets are double encap’d!
[Diagram: containers A–C (in VM1 on pHost 1) and D–F (in VM2 on pHost 2) attach via veth0–veth2 to a virtual switch inside each VM (first encapsulation); each VM’s vNIC attaches to another virtual switch on the physical host (second encapsulation); the hosts’ pNICs connect to the physical switch]
Consequences for containers (gotcha #4): walled gardens
[Diagram: containers A–C sit behind two layers of virtual switching and encapsulation inside VM1 on pHost 1, while a legacy app attached directly to the physical switch has no easy path into the containers’ network]
“Any intelligent fool can make things bigger, more complex… It takes a touch of genius – and a lot of courage – to move in the opposite direction.”
A Saner Approach: just route IP from the container
[Diagram: containers A–C (in VM1 on pHost 1) and D–F (in VM2 on pHost 2) attach via veth0–veth2 to the Linux kernel in each VM, which routes packets with no encapsulation; traffic flows through each vNIC and pNIC over a virtual underlay to the physical underlay]
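The routing model above can be sketched as a longest-prefix-match lookup, the same operation the Linux kernel’s forwarding table performs; all addresses and interface names here are hypothetical:

```python
import ipaddress

# Hypothetical routes held by the kernel in pHost 1's VM: /32 routes
# via veth devices for local containers, plus a prefix route via the
# next-hop host for containers on pHost 2. No encapsulation anywhere.
ROUTES = [
    ("10.1.0.1/32", "dev veth0"),       # ContainerA, local
    ("10.1.0.2/32", "dev veth1"),       # ContainerB, local
    ("10.1.0.3/32", "dev veth2"),       # ContainerC, local
    ("10.1.1.0/24", "via 172.16.0.2"),  # ContainerD-F on pHost 2
]

def lookup(dst):
    """Longest-prefix match over the route table."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, hop in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None

print(lookup("10.1.0.2"))  # dev veth1  (local delivery)
print(lookup("10.1.1.7"))  # via 172.16.0.2  (routed toward pHost 2)
```

Because these are plain IP routes, the same table reaches containers, VMs, and bare-metal workloads alike, which is what dissolves the walled garden of gotcha #4.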
Variant: one VM per host, no virtual underlay, straight-up IP
[Diagram: same as above but with a single VM per physical host and no virtual underlay; the Linux kernel in each VM routes container traffic, unencapsulated, straight onto the physical underlay]
Results: bare metal performance from virtual networks
[Charts: throughput (Gbps, 0–10 scale) and CPU % per Gbps (0–120 scale) for bare metal, Calico, and OVS+VXLAN]
Source: https://www.projectcalico.org/calico-dataplane-performance/
Gotcha #5: IP per container not yet universally supported
Some container frameworks still assume port mapping, e.g. the Marathon load-balancer service (but this is being fixed…).
Some PaaSes do not yet support IP per container, but several are moving to build on Kubernetes and will likely pick it up.
Gotcha #6: running on public cloud
You can easily get your configuration wrong and see sub-optimal performance, e.g.:
- select the wrong Flannel back-end for your fabric
- forget to turn off AWS src/dest IP checks
- get the MTU size wrong for the underlay…
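As a rough illustration of the MTU pitfall, the inner MTU left after encapsulation can be computed from the standard fixed header sizes (real deployments vary, e.g. with IPv6, VLAN tags, or IP options):

```python
# Bytes of payload left for the inner packet on a 1500-byte underlay,
# using standard fixed header sizes for each encapsulation.
UNDERLAY_MTU = 1500

OVERHEAD = {
    "pure routing (no encap)": 0,
    "IP-in-IP": 20,                # one extra IPv4 header
    "VXLAN": 20 + 8 + 8 + 14,      # outer IPv4 + UDP + VXLAN + inner Ethernet
}

for name, bytes_ in OVERHEAD.items():
    print(f"{name}: inner MTU = {UNDERLAY_MTU - bytes_}")
# pure routing (no encap): inner MTU = 1500
# IP-in-IP: inner MTU = 1480
# VXLAN: inner MTU = 1450
```

If the interfaces inside the containers are still configured for 1500 bytes while the effective path MTU is smaller, packets get fragmented or silently dropped, which is exactly the bandwidth cliff the next slide shows.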
Consequences of MTU size…
[Chart: qperf bandwidth on t2.micro and m4.xlarge instances, bare metal vs. Calico]
Consequences of MTU size…
[Chart: qperf bandwidth on t2.micro and m4.xlarge instances, bare metal vs. Calico with MTU=1440 vs. Calico with MTU=8980]
Gotcha #7: IP addresses aren’t infinite
Suppose we assign a /24 per Kubernetes node (=> 254 pods), run 10 VMs per server (each a Kubernetes node), with 40 servers per rack, 20 racks per data center, and 4 data centers. You now need a /15 per rack, a /10 per data center, and the entire 10/8 RFC 1918 range to cover the 4 data centers… and hope your business doesn’t expand to need a 5th data center!
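The subnet arithmetic above can be checked with Python’s standard ipaddress module (the 10.x prefixes are just for illustration):

```python
import ipaddress

# One /24 per Kubernetes node => 256 addresses, 254 usable for pods.
node = ipaddress.ip_network("10.0.0.0/24")
print(node.num_addresses - 2)  # 254

# 10 nodes/server * 40 servers/rack = 400 /24s per rack.
# A /15 holds 2**(24 - 15) = 512 /24s, the smallest power-of-two fit.
print(2 ** (24 - 15) >= 10 * 40)  # True

# 20 racks * 512 = 10240 /24s per data center; a /10 holds 2**14 = 16384.
print(2 ** (24 - 10) >= 20 * 512)  # True

# 4 data centers consume all four /10s of 10.0.0.0/8 -- no room for a 5th.
dcs = list(ipaddress.ip_network("10.0.0.0/8").subnets(new_prefix=10))
print(len(dcs))  # 4
```

Note the waste at every level: each power-of-two allocation rounds up, so most of the address space is reserved but never used.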
Gotcha #8: orchestration platform support still evolving
Kubernetes: CNI is fairly stable; fine-grained policy is being added and will move from alpha (annotation-based) to a first-class API.
Mesos offers multiple ways to network your container:
- net-modules, but it only supports the Mesos containerizer
- Docker networking, but then it is not fully integrated, e.g. into Mesos-DNS
- CNI, a possible future, but not here today
- roll-your-own orchestrator/network co-ordination, the approach some of our users have taken
Docker Swarm / Docker Datacenter: still early; how will libnetwork evolve? What about policy?
Gotcha #9: Docker libnetwork is “special”
Docker libnetwork provides limited functionality and visibility to plug-ins; e.g. the network name you specify as a user is NOT passed to the underlying SDN.
Consequences:
- diagnostics are hard to correlate
- it is hard to enable “side-loaded” commands referring to networks created on the Docker command line (e.g. Calico advanced policy)
- it is hard to network between the Docker virtual network domain and non-containerized workloads
Gotcha #10: at cloud scale, nothing ever converges
“Can you write a function that tells me when all nodes have caught up to the global state?”
Sure…
function is_converged():
    return false