Managing Short-Lived Kubernetes (Production) Deployments

Managing Short-Lived Kubernetes Deployments

Martin Danielsson (@donmartin76)

Solution Architect @ Haufe Group

$ whoami

C:\> WINDOWS.EXE

C/C++/C# Background

15+ years

$ docker ps

Containers & Kubernetes

For ~2 years

wicked.haufe.io maintainer

OSS API Management

“Solution Architect”

Developer

since 2006

Setting The Scene

Strategic move to containers

Modular Architecture

Without Container Experience

Current Occupation – A Cloud Journey

Hosted with Hoster

Long Release cycles

(LOTS of) Manual Work for Releases

Little Operations Insight

Error tracking very difficult

Non-Parity Dev/Test/Prod (Cost!)

Legacy Web App (Java based)

Solution – Let’s go DevOps in the cloud!

A Process Pattern

Enabling CI/CD

Automatic Provisioning

Full Insight

Minimize Ops

Top Priorities

Chosen Solution Outline

Kubernetes

Azure Container Services

Azure as IaaS provider

Alternative Solution Outline

Kubernetes

kops (kubernetes operations)

AWS as IaaS provider

Steps to DevOps Happiness (for us)

Provision

Deploy CI/CD

Weekly for Production, Daily for Dev/Test

Ship when ready!

But… Why?

Target “No-Ops”

No long-running systems

Enable validation of 3rd Party component upgrades

Incremental changes

Practice Disaster Recovery Daily

100% Reproducible Deployments

On-demand Production-Identical Environments

The attentive listener may have noticed…

Stateless Components

Stateful Components

Adding State (Persistence)

Full Provisioning

Create backup

Provision new infrastructure

• From backups

• Same as disaster recovery!

Deploy components

• Using deployment pipelines

• Partly parallelized

Top level DNS switch

• Using DNS traffic manager

Destroy old infrastructure

• If tests succeed
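The steps above can be sketched as one orchestration function. This is only an illustration of the ordering, not the actual pipeline: `Environment`, `full_provisioning` and the `tests_pass` callback are hypothetical names, every step merely records what it would do, and switching DNS back when tests fail is an assumption that the slides do not state.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Environment:
    """Hypothetical handle for one complete infrastructure instance."""
    name: str
    log: List[str] = field(default_factory=list)


def full_provisioning(
    old: Environment,
    new_name: str,
    tests_pass: Callable[[Environment], bool],
) -> Environment:
    """Replace `old` with a freshly provisioned environment.

    Each step only appends a log entry; a real implementation would call
    the cloud provider, the deployment pipelines and the DNS service.
    Returns the environment that serves traffic afterwards.
    """
    # 1. Create a backup of the running system's state.
    backup_id = f"backup-of-{old.name}"
    old.log.append(f"created {backup_id}")

    # 2. Provision new infrastructure from the backup
    #    (the same path as disaster recovery).
    new = Environment(new_name)
    new.log.append(f"provisioned from {backup_id}")

    # 3. Deploy components using the deployment pipelines.
    new.log.append("deployed components")

    # 4. Top level DNS switch to the new environment.
    new.log.append("DNS switched to new")
    active = new

    # 5. Destroy the old infrastructure only if tests succeed.
    #    Assumption: on failure, traffic is switched back to the old one.
    if tests_pass(new):
        old.log.append("destroyed")
    else:
        new.log.append("tests failed; DNS switched back")
        active = old
    return active
```

Because the old environment is only destroyed after validation, a failed run leaves the previous system untouched and still serving traffic.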

Persistence Options

Roll your own persistence

• Self managed VMs (incl. NFS)

• Gluster/Ceph FS (cluster)

Persistence “as a service”

• Managed Disks (AWS EBS, Azure Managed Disks)

• DBaaS (many options)

• Files as a service (AWS EFS, Azure Files)

Persistence Requirements

A) Backup on demand (or auto)

B) Restore to other instance

AB) Clone on demand
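One way to read these requirements is as a small interface: A) and B) are two separate operations, and AB) is a clone that can be composed from them when a service offers no direct clone. The sketch below assumes that reading; `PersistenceBackend`, `InMemoryBackend` and `clone` are hypothetical names and the in-memory store only stands in for a real storage service.

```python
from typing import Dict, Protocol


class PersistenceBackend(Protocol):
    """Hypothetical interface covering requirements A) and B)."""

    def backup(self, instance: str) -> str:
        """A) Back up `instance` on demand and return a backup id."""
        ...

    def restore(self, backup_id: str, target_instance: str) -> None:
        """B) Restore a backup into another instance."""
        ...


class InMemoryBackend:
    """Toy backend that keeps instance data in dicts, for illustration only."""

    def __init__(self) -> None:
        self.instances: Dict[str, dict] = {}
        self.backups: Dict[str, dict] = {}

    def backup(self, instance: str) -> str:
        backup_id = f"{instance}-backup-{len(self.backups) + 1}"
        # Copy the data so later changes to the instance do not alter the backup.
        self.backups[backup_id] = dict(self.instances[instance])
        return backup_id

    def restore(self, backup_id: str, target_instance: str) -> None:
        self.instances[target_instance] = dict(self.backups[backup_id])


def clone(backend: PersistenceBackend, source: str, target: str) -> None:
    """AB) Clone on demand, composed here from A) backup and B) restore."""
    backend.restore(backend.backup(source), target)
```

For example, `clone(backend, "prod", "test")` would give a test environment its own copy of production data without touching the source.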

On-demand Environments

Prod

Dev/Test

Load Testing…

Example – SQL Schema Update

Create backup

Provision new infrastructure

Deploy components

Top level DNS switch

Destroy old infrastructure

Test/Validate

Advantages

On-Demand Dev/Test Envs

Enables Test of Risky Updates

Built-In Disaster Recovery

-as-a-Service

Less Complex

No Operations Overhead

Supports A+B, or AB?

If not: Can I live without Prod Data in Dev/Test Envs?

Do I trust Service Provider to live up to SLA?

In case of

What can I do?

Limitations

Possible Constraints

Implementation Effort

SLA Requirements (Downtime)

Data Size

Backup/Restore Time

Team Size

Ops Skills Needed
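How data size, backup/restore time and the downtime allowed by the SLA interact can be estimated with simple arithmetic. The sketch below is a rough model only: the function names are hypothetical, it assumes a constant sustained throughput, and the figures in the usage note are made up for illustration.

```python
def transfer_minutes(data_size_gb: float, throughput_mb_per_s: float) -> float:
    """Minutes needed to copy `data_size_gb` at a sustained throughput."""
    seconds = data_size_gb * 1024 / throughput_mb_per_s
    return seconds / 60


def fits_downtime_budget(
    data_size_gb: float,
    backup_mb_per_s: float,
    restore_mb_per_s: float,
    budget_minutes: float,
) -> bool:
    """True if backup plus restore completes within the allowed downtime."""
    total = (
        transfer_minutes(data_size_gb, backup_mb_per_s)
        + transfer_minutes(data_size_gb, restore_mb_per_s)
    )
    return total <= budget_minutes
```

With assumed figures of 60 GB, 128 MB/s for backup and 64 MB/s for restore, the backup takes 8 minutes and the restore 16 minutes, so a 30 minute downtime budget is met while a 20 minute budget is not.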

Our Solution Vector

Resource Group

Kubernetes Cluster

Solution Architecture (Infrastructure)

k8s Master

k8s Agent

k8s Agent n

NFS VM(s)

Postgres VM(s)

Database as a Service

NFS Storage/Postgres Storage

• Backup – Cloning disks from running system

• Restore – Cloning from backups

• Very much a transient technology!

• But it works…

• Moving to DBaaS (e.g. Cosmos DB) over time

Endless Variants…

Conclusion and Takeaways

k8s Ops possible as a Team

Requires full (test) automation

Team dedication

Rethinking ops is challenging

No Silver Bullet

Assess your requirements

Thanks!

Twitter donmartin76

GitHub donmartin76

linkedin.com/in/martindanielsson/

www.haufegroup.com

work.haufegroup.io

wicked.haufe.io

Backup Slides

Persistence problems and possible solutions

Data Type       Solution Technology     Backup/Restore     Complexity

Plain Files     NFS                     AB                 Low

                CephFS/GlusterFS        A+B                High

SQL Database    Azure SQL Server        A+B                Medium

                Azure Postgres-aaS      AB                 Low

                AWS RDS for Postgres    AB                 Low

NoSQL           Azure Cosmos DB         A+B                Medium

                AWS DynamoDB            A+B (via tools)    Medium

Building blocks of CI/CD pipelines

Integration & e2e Test

Build & Unit Test

Docker Image

Deploy

• E.g., Blue/Green

• Rolling Updates

• Also used for initial deployment

Incremental Frontend Deployment

Merge feature to master

• After code review

• Including test suite changes

Build master branch

• Includes unit testing

• First integration tests

Deploy to integration system

• Run integration tests

• Rollback if failing

Deploy to Production

• Run e2e integration tests

• Rollback if failing
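The deploy, test and rollback flow of the last two stages can be sketched as follows. This is a minimal illustration, not the actual pipeline: `deploy_with_rollback`, `release`, the `running` dict and the `tests_pass` callback are hypothetical, and a real pipeline would call the CI system and the cluster instead of updating a dict.

```python
from typing import Callable, Dict


def deploy_with_rollback(
    stage: str,
    new_version: str,
    running: Dict[str, str],
    tests_pass: Callable[[str, str], bool],
) -> bool:
    """Deploy `new_version` to `stage` and roll back if its tests fail."""
    previous = running.get(stage)
    running[stage] = new_version
    if tests_pass(stage, new_version):
        return True
    # Rollback if failing: put the previously running version back.
    if previous is None:
        del running[stage]
    else:
        running[stage] = previous
    return False


def release(
    new_version: str,
    running: Dict[str, str],
    tests_pass: Callable[[str, str], bool],
) -> str:
    """Promote a build through integration, then production."""
    for stage in ("integration", "production"):
        if not deploy_with_rollback(stage, new_version, running, tests_pass):
            return f"rolled back at {stage}"
    return "released"
```

A failure in production rolls back production only; in this sketch the integration system keeps the new version so the problem can be investigated there.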

Incremental Backend Deployment

Merge feature to master

• After code review

• Including test suite changes

Build master branch

• Includes unit testing

• First integration tests

Deploy to integration system

• Blue/Green with integration tests

Deploy to Production

• Blue/Green with integration tests
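A Blue/Green step with integration tests can be sketched like this. It assumes the common reading of Blue/Green: the new version is deployed next to the live one, tested there, and traffic is switched only on success. The `state` dict and `blue_green_deploy` are hypothetical names, not part of any real tool.

```python
from typing import Callable, Dict


def blue_green_deploy(
    state: Dict[str, str],
    new_version: str,
    tests_pass: Callable[[str], bool],
) -> bool:
    """Deploy to the idle color and switch traffic only if tests pass.

    `state` holds the version running on "blue" and "green", plus "live",
    which names the color currently serving traffic.
    """
    idle = "green" if state["live"] == "blue" else "blue"
    # Deploy the new version next to the live one.
    state[idle] = new_version
    # Run integration tests against the idle color before any traffic hits it.
    if not tests_pass(new_version):
        return False  # the live color was never touched
    # Switch traffic to the freshly tested color.
    state["live"] = idle
    return True
```

Unlike a rollback, a failed Blue/Green deployment needs no repair step, because the live side keeps serving the old version throughout.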
