Deep Dive: The Value of Running Kubernetes on vSphere
Frank Denneman, VMware, Inc. (@FrankDenneman)
Michael Gasch, VMware, Inc. (@embano1)
CNA1553BE · #vmworld #CNA1553BE
VMworld 2018 Content: Not for publication or distribution


Disclaimer
This presentation may contain product features or functionality that are currently under development.
This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new features/functionality/technology discussed or presented have not been determined.

Agenda
• Kubernetes Primer
• Customer Scenario – Making the Case for Bare Metal
• Experience Report – Kubernetes on Bare Metal vs. vSphere
• Q&A

Kubernetes Primer

The Origin of Kubernetes
Google Search (late 1990s)

The Origin of Kubernetes: Google Search
Revolutionizing the Way we build Distributed (cloud-native) Applications today.
Google Search Pillars:
• Commodity
• Fault-Tolerant Software
• Fraction of the Cost of High-End Servers
Source: https://ai.google/research/pubs/pub49

Platform Engineering Responsibilities

The Origin of Kubernetes: The Datacenter as a Computer
“We must treat the Datacenter itself as one massive Warehouse-scale Computer.”
[Diagram label: SSH]

The Origin of Kubernetes
Google Search (late 1990s) → Borg (~2003) → Cgroups (2007) → Omega (~2012)

The Origin of Kubernetes: Containers become Mainstream
In Search of a Common Language

The Origin of Kubernetes: So what is a Container, really?
User Mode: Docker Engine → ContainerCreate()
Syscall: syscall.Exec(ENTRYPOINT/CMD)*
Kernel Mode: Cgroups, Namespaces, Security Capabilities, Scheduler
task_struct → Scheduling Entity (se) → sched_class: fair.c (CFS) → “running”
A Structure in Kernel Memory. The Kernel has no Notion of a “Container”. It’s yet another Executable.
* After Container Sandbox Initialization (nsenter.go/nsexec.c)
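Since the slide stresses that a container is an ordinary process with cgroup and namespace attributes attached, a small sketch can make that visible. The Python below parses text in the `/proc/<pid>/cgroup` format (`hierarchy-ID:controller-list:cgroup-path`, as described in the Linux cgroups man page). The function name, the record type and the sample content are illustrative assumptions, not something shown in the talk.

```python
from typing import List, NamedTuple, Tuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: Tuple[str, ...]
    path: str


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the content of /proc/<pid>/cgroup into CgroupEntry records."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the cgroup path may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries


# Hypothetical sample, shaped like what a containerized process might report.
SAMPLE = """\
4:memory:/docker/abc123
3:cpu,cpuacct:/docker/abc123
0::/docker/abc123
"""

if __name__ == "__main__":
    for entry in parse_proc_cgroup(SAMPLE):
        print(entry)
```

On a Linux host, passing the content of `/proc/self/cgroup` for a shell inside and outside a container should differ mainly in the cgroup path, which fits the slide's point that the kernel only sees another task.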

The Origin of Kubernetes
Google Search (late 1990s) → Borg (~2003) → Cgroups (2007) → Omega (~2012) → Docker (2013) → Kubernetes (2014)

The Origin of Kubernetes: Container Orchestration
"Kubernetes is an open-source System for automating Deployment, Scaling, and Management of containerized Applications."

Kubernetes: High-Level Architecture
[Diagram labels: Kubernetes Cluster; Control Plane; Worker; API; Pods; Kubernetes Cloud Provider; Infrastructure (Compute, Storage, Networking)]

Customer Scenario: Making the Case for Bare Metal

Meet ABC Inc.
• ABC Inc. is the Leader in Manufacturing Wayback Machines
• Enterprise IT Organization with separate Infrastructure, Linux/Middleware and Development Teams (Silos)
• >90% standardized and virtualized on VMware vSphere
• Going through Digital Transformation to become more Customer and Feedback driven
  • Need to develop (iterate) faster with an agile Approach
• Technical Vehicle: Containers and “cloud-native” Application Architectures
  • Kubernetes as the Framework to build and run these new Applications
  • Embrace and contribute to Open Source Software

ABC Inc.’s Decision to go Bare Metal
• Linux Team at ABC Inc. decided to deploy Kubernetes on Bare Metal
• Justification:
  • New (cloud-native) Applications don’t need vSphere Features like HA and vMotion
  • Containers are more lightweight, replacing VMs and the Hypervisor
  • Kubernetes provides Hypervisor Functionality, e.g. Resource Management and HA
  • Virtualization reduces Performance of containerized Applications
  • Reduce Complexity and Costs by eliminating the Hypervisor from the Stack
  • IT Infrastructure not agile enough (no Self-Service)
• ABC Inc.’s vSphere Team reached out to VMware for Help

Back to 2005
(Image: merchoid.com)

Experience Report: Kubernetes on Bare Metal vs. vSphere

Terminology
• Day 0: Planning and “Green Lights”
• Day 1: Experiences with first Deployments
• Day 2: Container and Cluster Sprawl
• Day 3: Maintenance & Availability

Day 0: Planning and “Green Lights”
[Diagram labels: Kubernetes Cluster; Infrastructure (Compute, Storage, Networking); Cloud Provider (Custom); Custom Integrations; Label: AZ=AZ-1]
External Dependencies: DNS, DBs, IPAM, Images, CA, Auth, Secrets, Monitoring, Logging

Day 0 Realization: Managing Bare Metal Systems is hard

How vSphere Can Help

Average Time to get Hardware in Data Center: Unpredictable Process
Average Time to get Hardware in Data Center: 86 Days

Hardware Compatibility: Decoupling the OS from the Hardware reduces operational Overhead
• A simple NIC revision change can directly impact the Kubernetes host.
• Configuration management is done at the physical layer (firmware, drivers, etc.). (Drift?) (Supported?)
• Virtualized hardware decouples the OS from the underlying hardware.
• Hardware abstraction reduces operational overhead for supported firmware versions of components.

Non-Disruptive Patching: vMotion Workload away for Hardware, Firmware or Driver Update
• Need to patch: vMotion the workload, no disruption
• Need to patch: kill the workload

Kill doesn’t matter for Stateless Workloads
ETCD isn’t stateless, and what about the top 10 Workloads in Containers today?

Strong Security Isolation: Strong Isolation between Workloads with efficient Resource Usage
• VMs provide strong isolation between guests, allowing multi-tenancy and efficient use of resources.
• Containers are processes in the Linux kernel; security concerns can lead to reduced resource utilization.
[Diagram labels: Tenant A, Tenant B]

Functional Use of Hardware: General vs Dedicated
• Modern DCs operate various workloads in different packaging formats
• vSphere provides a unified platform for these workloads
• Use your current tools and skillset to manage these workloads
• Focus on creating value
• General-purpose hardware allows mixed workloads and superior resource utilization
• Dedicating hardware to a particular function hinders resource optimization

Day 1: Experiences with first Deployments

Day 1: In the old Bare Metal Days (Pre-Virtualization Era)
[Diagram labels: Physical Host; App (threads M, G, G, W, W, W); Bins/Libs; Kernel; NUMA, sysctl, nice, IRQ-Balance; Hardware: 16 Cores, 128GB, RAID, NIC-Teaming]
Legend: M = OS Thread: main(); G = OS Thread: GC; W = OS Thread: Worker/Pool
• Almost exclusive Access to Host Resources for this App (1:1)
• Best Practices for this Deployment Type were developed
• App uses Host Information for Runtime Tuning
• Downside: Utilization & Agility

Day 1: Containers and Kubernetes on Bare Metal to the Rescue?
[Diagram labels: Physical Host; Kubelet; Container Runtime; Kernel; NUMA, sysctl, Cgroups, IRQ-Balance; Hardware: 64 Cores (HT), 384GB, RAID, NIC-Teaming; kube-scheduler]
Node: BM001
Capacity: cpus: 64, memory: 384GB
Allocatable: cpus: 60, memory: 360GB
• How much to reserve vs. Waste?
• Not all Runtimes are Cgroup-aware!
• How to tune per Workload?
• Resource Contention and Workload Interference!
• Isolation (Security)?
• Utilization & Agility
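The gap between Capacity and Allocatable on node BM001 can be reproduced with the formula Kubernetes documents for node allocatable: capacity minus kube-reserved, system-reserved and the eviction threshold. The sketch below is only an illustration. The capacity figures come from the slide, while the split of the 4 CPUs and 24GB across the three reservation buckets is a hypothetical assumption chosen to match the slide's totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    memory_gb: float

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus - other.cpus, self.memory_gb - other.memory_gb)


def allocatable(capacity: Resources, kube_reserved: Resources,
                system_reserved: Resources, eviction_threshold: Resources) -> Resources:
    """Allocatable = capacity - kube-reserved - system-reserved - eviction threshold."""
    result = (capacity.minus(kube_reserved)
                      .minus(system_reserved)
                      .minus(eviction_threshold))
    if result.cpus < 0 or result.memory_gb < 0:
        raise ValueError("reservations exceed node capacity")
    return result


# Capacity as shown on the slide for node BM001.
BM001 = Resources(cpus=64, memory_gb=384)
# Hypothetical split of the reservation; only the totals (4 CPUs, 24GB)
# are implied by the slide.
KUBE_RESERVED = Resources(cpus=2, memory_gb=8)
SYSTEM_RESERVED = Resources(cpus=2, memory_gb=12)
EVICTION_THRESHOLD = Resources(cpus=0, memory_gb=4)

if __name__ == "__main__":
    print(allocatable(BM001, KUBE_RESERVED, SYSTEM_RESERVED, EVICTION_THRESHOLD))
```

Everything reserved this way is unavailable to pods, which is the "reserve vs. waste" trade-off the slide asks about.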

How vSphere Can Help

Hyperthreading
Hyperthreading in vSphere

Consistent performance is obtained by avoiding NUMA boundaries
NUMA Architecture
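One way to read "avoiding NUMA boundaries" is as a sizing rule of thumb: keep a VM's vCPU count and memory within what a single NUMA node offers, so that memory access stays local. The sketch below only illustrates that rule. It does not model how the ESXi NUMA scheduler places VMs, and the host layout (two nodes of 16 cores and 192GB each) is a hypothetical assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    cores: int
    memory_gb: int


def fits_in_one_numa_node(vcpus: int, memory_gb: int, node: NumaNode) -> bool:
    """True if the VM can be served from a single NUMA node (rule of thumb only)."""
    return vcpus <= node.cores and memory_gb <= node.memory_gb


# Hypothetical host: two identical NUMA nodes.
NODE = NumaNode(cores=16, memory_gb=192)

if __name__ == "__main__":
    for vcpus, mem in [(16, 128), (24, 128), (8, 256)]:
        verdict = "fits" if fits_in_one_numa_node(vcpus, mem, NODE) else "spans nodes"
        print(f"{vcpus} vCPU / {mem}GB: {verdict}")
```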

Day 2: Container and Cluster Sprawl
[Chart: axes Organization vs. Environment]
[Chart: axes Imbalance vs. Cost]

How vSphere Can Help

Multi-Tenancy
General-purpose hardware allows mixed workloads and superior resource utilization
[Diagram: vSphere Cluster with VMs labeled K8S Prod and K8S Test]
Multi-Tenancy: Increase in Utilization: Scale out & redistribute
[Diagram: vSphere Cluster with VMs labeled K8S Prod and K8S Test]

Day 3: Maintenance & Availability (Control Plane)
• Admission Control and Failover Capacity (MTTR)?
• Proactive HA?
• Impact of Host Maintenance/Failure on Control Plane?

Day 3: Maintenance & Availability (Workloads)
[Diagram: Kubernetes Control Plane (Controller Manager, Scheduler) and three 4-CPU nodes running “Dev”, “QA” and “Prod” pods of 1 or 2 CPUs each; pending pods wait in a queue (“QA”, “Dev”, “Prod”) from which the scheduler picks]
• pod-eviction-timeout: 5min*
• ReconcilerMaxWaitForUnmountDuration: 6min**
* Default, configurable
** Fixed
• Only considering beta/stable and in-tree Kubernetes Features
• Disruptive Pod Priority & Preemption, incl. Priority Queue, beta in v1.11
• Example assumes shared File System for Persistent Volumes
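The two timers on this slide can be added up to get a rough worst-case delay before a pod from a failed node is running elsewhere. The sketch below assumes the timers run back to back and that the unmount wait only applies to pods with attached volumes. Both are simplifying assumptions, and `detection_delay` is a hypothetical parameter for whatever time passes before the node is marked unhealthy.

```python
from datetime import timedelta

# Values as shown on the slide.
POD_EVICTION_TIMEOUT = timedelta(minutes=5)   # default, configurable
MAX_WAIT_FOR_UNMOUNT = timedelta(minutes=6)   # fixed


def worst_case_recovery(stateful: bool,
                        detection_delay: timedelta = timedelta(0)) -> timedelta:
    """Rough upper bound on the time until a pod is rescheduled after node failure."""
    total = detection_delay + POD_EVICTION_TIMEOUT
    if stateful:
        # Pods with attached volumes also wait for the unmount/detach timeout.
        total += MAX_WAIT_FOR_UNMOUNT
    return total


if __name__ == "__main__":
    print("stateless:", worst_case_recovery(stateful=False))
    print("stateful: ", worst_case_recovery(stateful=True))
```

Under these assumptions a stateful pod can be unavailable for about eleven minutes, which is the kind of MTTR question the Day 3 slides raise.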

How vSphere Can Help

Multi Cluster Configuration: Priority
[Diagram: vSphere Cluster with VMs labeled K8S Prod and K8S Test, tagged by priority: “More Important”, “Important”, “Meh”]

HA Restart Priority: Ensure “Prod” Systems get restarted first

Restart Dependency (or “HA Orchestrated Restart”, as it is also called)
• Works based on VM to VM rules
• Only 1 level, so for A-B-C create two rules:
  • VM Group B depends on A
  • VM Group C depends on B
  • Example: ETCD-Masters-Workers
• Specify when to start next batch:
  • Resources allocated
  • Powered On
  • Guest Heartbeat
  • App Heartbeat
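The "one level only" constraint means a chain such as ETCD-Masters-Workers has to be expressed as pairwise rules. The sketch below shows that decomposition and the restart batches that follow from it. It is a plain Python model for illustration, not a call into any vSphere API, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]  # (dependent group, group it depends on)


def pairwise_rules(chain: List[str]) -> List[Rule]:
    """Turn an ordered chain A-B-C into one-level rules: B depends on A, C on B."""
    return [(later, earlier) for earlier, later in zip(chain, chain[1:])]


def restart_batches(rules: List[Rule]) -> List[List[str]]:
    """Group VM groups into batches; a batch starts once its prerequisites are up."""
    depends_on: Dict[str, Set[str]] = {}
    groups: Set[str] = set()
    for dependent, prerequisite in rules:
        depends_on.setdefault(dependent, set()).add(prerequisite)
        groups.update((dependent, prerequisite))
    batches: List[List[str]] = []
    started: Set[str] = set()
    while len(started) < len(groups):
        ready = sorted(g for g in groups - started
                       if depends_on.get(g, set()) <= started)
        if not ready:
            raise ValueError("circular restart dependency")
        batches.append(ready)
        started.update(ready)
    return batches


if __name__ == "__main__":
    rules = pairwise_rules(["ETCD", "Masters", "Workers"])
    print(rules)
    print(restart_batches(rules))
```

Each batch boundary corresponds to the "when to start next batch" condition on the slide, for example waiting for the guest heartbeat of the previous group.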

Kubernetes Cluster Node Roles: Control Plane (Masters) and Workers
[Diagram: vSphere Cluster with VMs labeled K8S Prod and K8S Test; K8S Prod VMs are marked (Master) or (Worker)]

Multi-Tenancy: DRS Affinity Rules
[Diagram: vSphere Cluster with VMs labeled K8S Prod and K8S Test; K8S Prod VMs are marked (Master) or (Worker)]

Multiple Fault Domains: Quorum dictates Design
[Diagram: K8S Prod and K8S Test VMs spread across Fault Domain A and Fault Domain B; K8S Prod VMs are marked (Master) or (Worker); rules shown: VM Anti-Affinity, Host-VM Rules]
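"Quorum dictates Design" can be checked with simple arithmetic: a majority-based store such as etcd needs floor(n/2) + 1 members alive, so the placement of masters across fault domains decides whether losing one domain is survivable. The sketch below is an illustration with hypothetical placements. The function names are assumptions and the numbers are not taken from the slide.

```python
from typing import Dict


def quorum(members: int) -> int:
    """Majority needed by a quorum-based cluster with this many members."""
    return members // 2 + 1


def survives_any_single_domain_failure(members_per_domain: Dict[str, int]) -> bool:
    """True if quorum is kept no matter which single fault domain is lost."""
    total = sum(members_per_domain.values())
    needed = quorum(total)
    return all(total - lost >= needed for lost in members_per_domain.values())


if __name__ == "__main__":
    # Hypothetical placements of three masters.
    for placement in ({"A": 2, "B": 1}, {"A": 1, "B": 1, "C": 1}):
        print(placement, survives_any_single_domain_failure(placement))
```

With only two fault domains, one of them always holds a majority of the members, so losing that domain loses quorum. An odd member count spread over at least three domains avoids this.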

Proactive HA: Moving Workloads away at first Signs of Trouble
• DRS proactively avoids the need to use HA
  • Integrates with server vendor’s monitoring software
  • Health states are passed to DRS
  • DRS reacts based on health state of hardware
• None of the DRS affinity/anti-affinity rules are violated
• Quarantine Mode accepts workloads if performance degradation is imminent

VM Latency Sensitivity: CPU Core Isolation

THANK YOU!


More Sessions on Kubernetes
Tuesday, Nov 6
• CNA1656BE Put a Lid on It: Securing Containers and Kubernetes on vSphere
• CNA1634BE Container Portfolio at VMware
• CNA1816BE Container and Kubernetes 101 for Admins
Wednesday, Nov 7
• CNA2755BE Architecting PKS for Production Lessons Learned from PKS Deployments
• CNA2084BE Intro to VMware Kubernetes Engine – Managed K8s Service on Public Cloud
• DC3845KE Cloud and Developer Keynote: Public Clouds and Kubernetes at Scale
• CNA1674BE Deep Dive: Run Kubernetes in Production with PKS
Thursday, Nov 8
• CNA3124BE Deep Dive: VMware Kubernetes Engine – Kubernetes as a Service on Public Cloud
• CNA2009BE Run Stateful Apps on Kubernetes with PKS: Highlight WebLogic Server
Try HOL
• 1932 VMware Kubernetes Engine – Getting Started
• 1931 VMware Pivotal Container Service and Kubernetes – Getting Started
• 1935 VMware Pivotal Container Service on VMware NSX-T
VMware Cloud-Native Apps: Follow Us
• https://blogs.vmware.com/cloudnative
• https://www.youtube.com/c/VMwareCloudNativeApps
• @cloudnativeapps