The Messy Underlay Dilemma
Lessons Learned Securing K8s
Rob Hirschfeld, @zehicle
Hang on tight! We’re going deep.
To automated live encryption key rotation
Is Operating Kubernetes
HARD?
No.
But underlay is hard.
From http://www.slideshare.net/rhirschfeld/Containers, Orchestration and Security, Oh My!
Underlay vs Overlay
Platform Overlay
Infrastructure Underlay
0. Ready State (Ready)
1. Prerequisites (Prereq)
2. Cluster API & Control Services (Control)
3. Worker Nodes (Nodes)
4. Cluster Add-ons (Add-Ons)
5. User Applications (Apps)
Application Overlay
Underlay = Crust
Overlay = Filling
App = Topping
Underlay components are the operational integrations and prerequisites that go into building a system before we can install a platform.
Why is Underlay Hard?
It’s Sequential, Multi-node & Environment Specific
Unlike development environments, production cannot overlook integration points
HA/LB: Highly Available & Load Balanced
PKI: Public Key Infrastructure
DNS: Domain Name Servers
SDN: Software Defined Networks
IPAM: IP Address Management
BMC: Out of Band Management
RAID: Drive Arrays
BIOS: Firmware
Even in cloud-only deployments, these critical components for production platforms and applications require a different level of systems thinking.
Strong underlay builds an IT foundation.
Platform & Infrastructure Underlay
0. Ready State (Ready)
1. Prerequisites (Prereq)
2. Cluster API & Control Services (Control)
3. Worker Nodes (Nodes)
4. Cluster Add-ons (Add-Ons)
5. User Applications (Apps)
Application Overlay
DevOps Is Struggling
Developers don’t want to do this infrastructure-specific stuff
Companies are turning to containers and application platforms (like Kubernetes) to abstract the messy underlay.
While platforms hide complexity from developers, the issues still need to be addressed by Ops.
What makes underlay hard?
Let’s look at Internal PKI
Protection via Transport Layer Security (TLS)
This is pretty complex stuff…
At a very basic level:
1. Send public key to client
2. Client encrypts token with public key
3. Client returns encrypted package
4. Server decrypts token with private key
5. Server uses token to encrypt tunnel
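The five steps above can be sketched in Python. This is a toy model, not real TLS: it assumes textbook RSA over two fixed Mersenne primes and a hash-based XOR keystream so that it runs on the standard library alone. The helper names (`keystream_xor`, `tunnel_key`) and the sample message are hypothetical.

```python
import hashlib
import secrets

# Toy textbook RSA (no padding, fixed primes): for illustration only, NOT secure.
P, Q = 2**89 - 1, 2**107 - 1          # two known Mersenne primes
N, E = P * Q, 65537                   # public key (N, E)
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent, kept by the server


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def tunnel_key(token: int) -> bytes:
    """Derive the symmetric tunnel key from the shared token."""
    return hashlib.sha256(token.to_bytes(32, "big")).digest()


# 1. Server sends its public key (N, E) to the client.
# 2. Client picks a random token and encrypts it with the public key.
token = secrets.randbelow(N - 2) + 2
package = pow(token, E, N)

# 3. Client returns the encrypted package.
# 4. Server decrypts the token with its private key.
server_token = pow(package, D, N)

# 5. Both sides now hold the same token and use it to encrypt the tunnel.
ciphertext = keystream_xor(tunnel_key(token), b"GET /healthz")
plaintext = keystream_xor(tunnel_key(server_token), ciphertext)
```

Only the holder of the private exponent can recover the token, which is why the server's private key never leaves the server.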
[Diagram: Server (Private Key) and Client (Public) exchange the Token in steps 1-5 to set up TLS]
[Diagram: Trusted 3rd Party, Trust Anchor, Trust]
Chain of Trust in Public Key Infrastructure (PKI)
PKI is doing something amazing!
It establishes asynchronous trust
By relying on strong encryption
And Trust Anchors.
[Diagram: Server (Private Key), Cert Auth (Root, Digital Signature), Client (Public), TLS]
Half of all Internet Traffic is encrypted! HTTPS > 50%
That’s great for public traffic where trust is anchored / embedded into clients
What about the internal traffic?
We want a “narrow trust domain” so there’s no embedded trust mechanism and we maintain full control.
We also want to protect both sides.
[Diagram: Public End Points, End User Clients, North-South Traffic, Back End Services, East-West Traffic]
[Diagram: Server, Client, Trust]
Shared Root in Public Key Infrastructure (PKI)
Self-signed keys are not considered secure.
Internal PKI uses a shared root strategy
The private root of the CA must not be known to the parties in the trust relationship.
Members of the Trust Domain rely on the CA to verify membership and identity.
External Trust Anchors are not desirable because we want an exclusive Trust Domain.
[Diagram: Site CA (Root), Private Key, Digital Signature, Public, TLS]
“Narrow” Trust Domain Limits Shared Roots
[Diagram: Master Node 1, Master Node 2+, Worker Nodes; components: etcd, API Server, Controller, Scheduler, Kubelet, Proxy; User]
Illustration from Slideshare Rob Hirschfeld
Shared Roots Create Trust Zones
[Diagram: Master Node 1, Master Node 2+, Worker Nodes, each marked with a Root; components: etcd, API Server, Controller, Scheduler, Kubelet, Proxy; User]
[Diagram: Services, Files, Config, App]
How do we automate this?
Mix of Service and Configuration
1. Run a Root CA Service
2. Create a unique Root
3. Generate Key Pair Certificate for Server
4. Generate Digital Signature with Public Key for Client(s)
5. Configure Server with Certificate
6. Configure Client with Signature
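The six steps above can be sketched as follows. This is a toy model under stated assumptions, not a real CA: it uses textbook RSA over fixed Mersenne primes so that it runs on the Python standard library alone, and the names `make_key`, `issue`, `verify`, and the config dictionaries are hypothetical. It also assumes that the "signature" handed to clients in step 4 is the root's public verification material.

```python
import hashlib

E = 65537  # public exponent shared by every toy key


def make_key(p: int, q: int) -> dict:
    """Build a toy RSA key from two primes (textbook RSA, NOT secure)."""
    n = p * q
    return {"n": n, "d": pow(E, -1, (p - 1) * (q - 1))}


def digest(message: bytes, n: int) -> int:
    """Hash a message down to an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def issue(root: dict, subject: str, subject_n: int) -> dict:
    """Steps 1 and 3: the CA service signs a subject's public key with the root."""
    body = f"CN={subject};n={subject_n}".encode()
    return {"body": body, "sig": pow(digest(body, root["n"]), root["d"], root["n"])}


def verify(cert: dict, trusted_roots: list) -> bool:
    """A client accepts a certificate signed by any root it trusts."""
    return any(pow(cert["sig"], E, n) == digest(cert["body"], n) for n in trusted_roots)


# 2. Create a unique root for this site.
site_root = make_key(2**89 - 1, 2**107 - 1)

# 3. Generate a key pair and certificate for the server.
server_key = make_key(2**61 - 1, 2**127 - 1)
server_cert = issue(site_root, "api-server", server_key["n"])

# 4. Export the root's public half for the client(s).
root_public = site_root["n"]

# 5. and 6. Configure the server with its certificate and the client with the root.
server_config = {"key": server_key, "cert": server_cert}
client_config = {"trusted_roots": [root_public]}

# A certificate issued by any other root falls outside the trust domain.
other_root = make_key(2**31 - 1, 2**19 - 1)
rogue_cert = issue(other_root, "api-server", server_key["n"])
```

In this sketch the root's private exponent never appears in `server_config` or `client_config`; only the CA service holds it.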
[Diagram: Site CA (Root) issues the Server Certificate (Private Key; steps 3 and 5) and the Client Certificate (Digital Sig, Public; steps 4 and 6)]
Root Rotation protects Trust Zone - Do it daily?!
By design, root rotation breaks cluster communications!
Like an in-place upgrade, rotation can break APIs.
We need to change the keys without breaking communication between components.
[Diagram: Old Root, New Root, Cluster Trust Zone, Previous Cluster Member; keys labeled Old and New]
Step 1: Root Rotation without Downtime
Relies on the client supporting multiple digital signatures for the server.
Create a new root and propagate new certificates in the cluster.
Update the client configurations to use either signature.
[Diagram: Old Root, New Root, Cluster Trust Zone, Previous Cluster Member; keys labeled Old and New]
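Step 1 can be sketched as a small simulation. It models only the trust logic (which root signed a server's certificate and which roots each client accepts), with no real cryptography. The class names, component names, and the root labels "old" and "new" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    cert_root: str  # which root signed the certificate this server presents


@dataclass
class Client:
    name: str
    trusted_roots: set = field(default_factory=set)

    def can_connect(self, server: Server) -> bool:
        return server.cert_root in self.trusted_roots


def cluster_healthy(clients, servers) -> bool:
    """True when every client can still reach every server."""
    return all(c.can_connect(s) for c in clients for s in servers)


servers = [Server("api-server", "old"), Server("etcd", "old")]
clients = [Client("kubelet", {"old"}), Client("scheduler", {"old"})]

# Step 1: create the new root and let every client accept either one.
for client in clients:
    client.trusted_roots.add("new")

# Servers still present certificates from the old root, so nothing breaks.
still_healthy = cluster_healthy(clients, servers)

# Every client is also ready for a server that switches to the new root.
ready_for_new = all(c.can_connect(Server("probe", "new")) for c in clients)

# Contrast: a client that was skipped would lose the server after the switch.
skipped_client = Client("proxy", {"old"})
skipped_ok = skipped_client.can_connect(Server("api-server", "new"))
```

The ordering is the point of the sketch: widening what clients accept is harmless on its own, so it can be rolled out and checked before any server key changes.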
Step 2: Root Rotation without Downtime
Ensure all the desired clients have the new signature.
Replace the server private key with the new value.
Old keys will no longer work.
In a daily rotation, leave both old and new signatures in place.
[Diagram: Old Root, New Root, Cluster Trust Zone, Previous Cluster Member; keys labeled Old and New]
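Step 2 and the daily rotation can be sketched together. The model below assumes each client keeps a sliding window of its two most recent roots, which is one way to "leave both old and new signatures in place"; it simulates trust decisions only, and the names `Client`, `rotate`, and the `dayN` root labels are hypothetical.

```python
from collections import deque


class Client:
    """Holds a sliding window of trusted roots (hypothetical model)."""

    def __init__(self, roots, keep=2):
        self.trusted = deque(roots, maxlen=keep)

    def trust(self, root):
        if root not in self.trusted:
            self.trusted.append(root)  # the oldest root drops out when the window is full

    def accepts(self, server_root):
        return server_root in self.trusted


def rotate(clients, server_roots, new_root):
    """One rotation: clients learn the new root first, then servers switch."""
    for client in clients:
        client.trust(new_root)
    # Step 2 guard: do not touch servers until every client has the new root.
    if not all(c.accepts(new_root) for c in clients):
        raise RuntimeError("a client is missing the new root")
    healthy = True
    for name in server_roots:
        server_roots[name] = new_root  # replace the server key with the new value
        healthy = healthy and all(
            c.accepts(root) for c in clients for root in server_roots.values()
        )
    return healthy


clients = [Client(["day0"]), Client(["day0"])]
servers = {"api-server": "day0", "etcd": "day0"}

# Three daily rotations; the cluster stays reachable after every server switch.
results = [rotate(clients, servers, f"day{n}") for n in (1, 2, 3)]

# A member that left before the rotations still holds only the original root.
previous_member = Client(["day0"])
```

After the third rotation each client trusts only the two newest roots, so a previous cluster member holding the original root can no longer talk to the cluster.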
Happily, this is a repeatable pattern for underlay automation.
Questions?
Rob Hirschfeld
@zehicle
RackN.com
Rebar.Digital