Container Networking: the Gotchas (Mesos London Meetup, 11 May 2016)



@projectcalico Project Calico is sponsored by Tigera, Inc. | www.tigera.io

Networking in a Containerized Data Center: the Gotchas!
MESOS LONDON MEETUP

Andy Randall | @andrew_randall May 11, 2016


Background


Calico’s Adventures in Containerland


Run anywhere | Simple | Lightweight | Standard | Speed | Cloud | Efficient


The original “container approach” to networking

Gotcha #1: all containers on a machine share the same IP address

[Diagram: containers WWW1 and WWW2 each listen on port 80; a proxy on the host maps host ports 8080 and 8081 to them]

Most container deployments still use this method!
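As a sketch of what the proxy in this picture does (container names and ports are from the slide; the host IP, routing table, and code are illustrative assumptions): with one shared host IP, each container's port 80 must be remapped to a unique host port, and the proxy owns the translation table.

```python
# Hypothetical illustration of gotcha #1: both containers want port 80,
# so the host proxy disambiguates them by host port.
HOST_IP = "203.0.113.10"  # made-up example address

# host port -> (container, container port); the mapping itself is the slide's,
# the data structure is an assumption for illustration
port_map = {
    8080: ("WWW1", 80),
    8081: ("WWW2", 80),
}

def route(host_port):
    """Resolve which container a connection to HOST_IP:host_port reaches."""
    container, cport = port_map[host_port]
    return f"{container}:{cport}"

print(route(8080))  # WWW1:80
print(route(8081))  # WWW2:80
```

The cost of this scheme is that every service's "real" port is hidden behind an arbitrary host port, which every client and load balancer then has to be told about.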


The world is moving to “IP per container”

- Container Network Interface (CNI)
- Container Network Model (CNM) (libnetwork, Docker 1.9)
- net-modules (Mesos 0.26) (future: CNI?)


We’ve solved “IP per VM” before…

[Diagram: VM1, VM2 and VM3 each attached to a virtual switch on their host; the follow-on slide shows two such hosts, each with its own VMs and virtual switch]


Consequences for containers (gotcha #2): scale

VM world: hundreds of servers, low churn. Container world: millions of containers, high churn.


Consequences for containers (gotcha #3): layering

Packets are double encap’d!

[Diagram: on each physical host (pHost 1, pHost 2), VMs attach via vNICs to a virtual switch that encapsulates traffic onto the pNIC; inside each VM, the containers (A–C on VM1, D–F on VM2) attach via veth0–veth2 to a second virtual switch that encapsulates again; the hosts are joined by a physical switch]
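The layering tax can be quantified with a back-of-envelope sketch (my numbers, not the deck's): each VXLAN-style encapsulation layer adds roughly 50 bytes of headers (outer Ethernet 14 + IP 20 + UDP 8 + VXLAN 8), so encapsulating a container overlay inside a VM overlay pays that tax twice.

```python
# Hedged overhead estimate; 50 bytes is the commonly cited VXLAN
# per-layer overhead, and 1500 is an assumed physical MTU.
VXLAN_OVERHEAD = 14 + 20 + 8 + 8  # bytes per encapsulation layer
PHYSICAL_MTU = 1500

def effective_mtu(mtu, encap_layers):
    """Payload bytes left for the innermost packet after encapsulation."""
    return mtu - encap_layers * VXLAN_OVERHEAD

print(effective_mtu(PHYSICAL_MTU, 1))  # 1450: VM-level overlay only
print(effective_mtu(PHYSICAL_MTU, 2))  # 1400: container overlay inside VM overlay
```

Beyond the lost payload bytes, every layer also costs CPU for the encap/decap work on both ends.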


Consequences for containers (gotcha #4): walled gardens

[Diagram: containers A–C inside VM1 sit behind two layers of virtual switch / encapsulation, while a legacy app runs directly on pHost 1 outside the overlay; traffic between the containers and the legacy app cannot simply cross the physical switch]


“Any intelligent fool can make things bigger, more complex… It takes a touch of genius – and a lot of courage – to move in the opposite direction.”


A saner approach: just route IP from the container

[Diagram: the same two-host topology, but inside each VM the containers’ veth0–veth2 interfaces are connected by plain Linux kernel routing (no encapsulation); the VMs sit on a virtual underlay over the physical underlay]
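A minimal sketch of the routing idea (addresses, interface names, and the table itself are made-up RFC 1918 examples, not from the slides): each host's kernel simply holds /32 routes for its local containers plus one aggregate route per peer host, and forwards with ordinary longest-prefix match.

```python
# Toy longest-prefix-match lookup, standing in for the Linux kernel FIB.
import ipaddress

routes = {
    ipaddress.ip_network("10.0.1.2/32"): "veth0",              # local ContainerA
    ipaddress.ip_network("10.0.1.3/32"): "veth1",              # local ContainerB
    ipaddress.ip_network("10.0.2.0/24"): "eth0 via pHost2",    # peer host's block
}

def lookup(dst):
    """Return the next hop for dst using longest-prefix match."""
    dst = ipaddress.ip_address(dst)
    matches = [net for net in routes if dst in net]
    return routes[max(matches, key=lambda net: net.prefixlen)]

print(lookup("10.0.1.3"))  # veth1 (delivered locally, no encapsulation)
print(lookup("10.0.2.9"))  # eth0 via pHost2 (plain IP to the peer host)
```

No tunnel headers are added anywhere; the "overlay" is nothing more than routes.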


Variant: one VM per host, no virtual underlay, straight-up IP

[Diagram: as before, but with a single VM per physical host and no virtual underlay; container traffic is routed by the Linux kernel straight onto the physical underlay]


Results: bare-metal performance from virtual networks

[Bar charts: throughput in Gbps, and CPU % per Gbps, comparing bare metal, Calico, and OVS+VXLAN]

Source: https://www.projectcalico.org/calico-dataplane-performance/


Gotcha #5: IP per container not yet universally supported

- Some container frameworks still assume port mapping, e.g. the Marathon load-balancer service (but this is being fixed…)
- Some PaaSes don’t yet support IP per container, but several are moving to build on Kubernetes and will likely pick it up


Gotcha #6: running on public cloud

It’s easy to get the configuration wrong and end up with sub-optimal performance, e.g.:
- selecting the wrong Flannel back-end for your fabric
- forgetting to turn off AWS src/dest IP checks
- getting the MTU size wrong for the underlay…


Consequences of MTU size…

[Bar chart: qperf bandwidth on AWS t2.micro and m4.xlarge, bare metal vs Calico]


Consequences of MTU size…

[Bar chart: qperf bandwidth on t2.micro and m4.xlarge, comparing bare metal, Calico (MTU=1440), and Calico (MTU=8980)]
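A hedged sketch of where MTU figures like 1440 and 8980 can come from (the 20-byte IP-in-IP overhead and the link MTUs below are my assumptions, not stated on the slides): the tunnel MTU is just the link MTU minus the outer header, and AWS instances that support jumbo frames have a 9001-byte link MTU.

```python
# Illustrative arithmetic only; exact values depend on the fabric.
def tunnel_mtu(link_mtu, overhead=20):
    """Largest inner packet that fits in one IP-in-IP encapsulated frame."""
    return link_mtu - overhead

print(tunnel_mtu(1460))  # 1440: a conservative default sized for a 1460 link
print(tunnel_mtu(9001))  # 8981: close to the 8980 used on the slide
```

Leaving the tunnel MTU at the conservative default on a jumbo-frame fabric wastes most of the available frame, which is what the bandwidth gap in the chart reflects.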


Gotcha #7: IP addresses aren’t infinite

Suppose we assign a /24 per Kubernetes node (=> 254 pods), run 10 VMs per server (each a Kubernetes node), with 40 servers per rack, 20 racks per data center, and 4 data centers. We now need a /15 per rack, a /10 per data center, and the entire 10/8 RFC 1918 range to cover the 4 data centers… and hope the business doesn’t expand to need a 5th data center!
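The slide's arithmetic can be checked mechanically (a sketch; `aggregate` is a made-up helper that finds the smallest power-of-two block holding the required number of aligned child blocks):

```python
import math

def aggregate(child_prefix, count):
    """Smallest prefix length whose block holds `count` aligned child blocks."""
    return child_prefix - math.ceil(math.log2(count))

node = 24                          # a /24 per Kubernetes node (254 pods)
rack = aggregate(node, 10 * 40)    # 10 nodes/server x 40 servers = 400 /24s
dc   = aggregate(rack, 20)         # 20 racks per data center
site = aggregate(dc, 4)            # 4 data centers

print(rack, dc, site)  # 15 10 8
```

Note the hierarchy matters: 8000 /24s would fit in a /11 if packed flat, but allocating whole /15s per rack and whole /10s per data center consumes the entire 10/8.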


Gotcha #8: orchestration platform support is still evolving

DC/OS / Mesos – multiple ways to network your container:
- net-modules – but it only supports the Mesos containerizer
- Docker networking – but then not fully integrated, e.g. into MesosDNS
- CNI – a possible future, but not here today
- roll-your-own orchestrator–network coordination – the approach some of our users have taken

Kubernetes – CNI fairly stable; fine-grained policy being added, moving from alpha (annotation-based) to beta (first-class citizen API) in 1.3

Docker Swarm / Docker Datacenter – still early; libnetwork evolution? policy?


Gotcha #9: Docker libnetwork is “special”

Docker libnetwork provides limited functionality and visibility to plug-ins; e.g. the network name you specify as a user is NOT passed to the underlying SDN.

Consequences:
- diagnostics are hard to correlate
- it’s hard to enable “side-loaded” commands referring to networks created on the Docker command line (e.g. Calico advanced policy)
- it’s hard to network between the Docker virtual network domain and non-containerized workloads


Gotcha #10: at cloud scale, nothing ever converges

“Can you write a function that tells me when all nodes have caught up to the global state?”

Sure…

function is_converged()
    return false


The Future of Cloud Networking

Flat routed IP networking with fine-grained policy

Broad set of overlay options

De facto industry standard for policy-driven networking for cloud native applications


Check it out – Calico is in the Mesosphere Universe!

https://www.projectcalico.org/calico-dcos-demo-security-speed-and-no-more-port-forwarding/
