Kubernetes: Love at first sight?
15 February 2018
Joost Hofman (Lead Developer @ Albert Heijn IT Online)
Milo van der Zee (Senior Developer @ Albert Heijn IT Online)
Agenda
Kubernetes
Why at AH?
How?
Relational problems
Is it real love?
Questions
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
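As a taste of what "automating deployment and scaling" means in practice, a minimal Deployment manifest is all it takes to get replicated, self-healing containers. This is a generic sketch, not one of our services; the names and image are illustrative, and on the 1.8 clusters shown later the apiVersion would be apps/v1beta2 rather than apps/v1:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello                  # hypothetical example service
spec:
  replicas: 3                  # Kubernetes keeps three pods running and replaces failed ones
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: hello
        image: nginx:1.13      # any container image
        ports:
        - containerPort: 80

Deploying and scaling are then single commands: kubectl apply -f hello.yaml and kubectl scale deployment hello --replicas=5.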
Kubernetes - Searches
Kubernetes
[Diagram: a Service routing traffic to 1 … n Pods]
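The wiring in that picture is nothing more than a label selector. A minimal sketch of a ClusterIP Service fronting any number of pods that carry the matching label (names and port are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: hello
spec:
  selector:
    app: hello        # every running pod with this label becomes an endpoint
  ports:
  - port: 8080        # the stable port clients connect to
    targetPort: 8080  # the port the pods actually listen on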
Kubernetes
[Diagram: Kubernetes architecture]
Operator / Developer → Kubernetes Master: API Server, Controller Manager, Scheduler, etcd
Kubernetes Nodes (up to 5000 per cluster): Kubelet, cAdvisor, kube-proxy, running the Pods
Users reach the pods through the network plugin – Calico
user@host $ kubectl get nodes
NAME         STATUS                     ROLES    AGE   VERSION
k8snode2098  Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
k8snode2099  Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
k8snode2100  Ready,SchedulingDisabled   master   12d   v1.8.4+coreos.0
k8snode2101  Ready                      node     12d   v1.8.4+coreos.0
k8snode2102  Ready                      node     12d   v1.8.4+coreos.0
k8snode2103  Ready                      node     12d   v1.8.4+coreos.0
k8snode2104  Ready                      node     12d   v1.8.4+coreos.0
k8snode2105  Ready                      node     12d   v1.8.4+coreos.0
k8snode2107  Ready                      node     12d   v1.8.4+coreos.0
k8snode2108  Ready                      node     12d   v1.8.4+coreos.0
k8snode2109  Ready                      node     12d   v1.8.4+coreos.0
k8snode2110  Ready                      node     12d   v1.8.4+coreos.0
k8snode2111  Ready                      node     12d   v1.8.4+coreos.0
Kubernetes
user@host $ kubectl get pods -o wide
NAME                                   READY  STATUS   IP              NODE
shoppinglist-widget-3162246403-q7c1x   1/1    Running  10.233.106.55   k8snode1657
subscription-service-8cc4c97fb-dh9zz   1/1    Running  10.233.87.218   k8snode1656
subscription-service-8cc4c97fb-t7wrj   1/1    Running  10.233.73.169   k8snode1651
taxonomy-neo4j-neo4j-core-0            1/1    Running  10.233.124.123  k8snode1814
taxonomy-neo4j-neo4j-core-1            1/1    Running  10.233.73.147   k8snode1651
taxonomy-neo4j-neo4j-core-2            1/1    Running  10.233.79.109   k8snode1813
taxonomy-service-7b4fb7f8d5-c6mvb      1/1    Running  10.233.79.105   k8snode1813
taxonomy-service-7b4fb7f8d5-h2hjk      1/1    Running  10.233.68.145   k8snode1655
gateway-3060515939-57r22               1/1    Running  10.233.124.98   k8snode1814
gateway-3060515939-9lqzk               1/1    Running  10.233.68.185   k8snode1655
gateway-3060515939-fkt9k               1/1    Running  10.233.71.29    k8snode1654
gateway-3060515939-ls9pv               1/1    Running  10.233.79.101   k8snode1813
Kubernetes
# kubectl -n online-prd describe pod gateway-3060515939-57r22
Name: gateway-3060515939-57r22
Namespace: online-prd
Node: k8snode1814/150.83.153.243
Start Time: Wed, 14 Feb 2018 13:12:03 +0100
Labels: name=gateway
Status: Running
IP: 10.233.124.98
Containers:
gateway:
Image: registry-docker.online.ah.nl:443/ah-open-api-gateway:0.1.2
Port: <none>
Pods – kubectl describe pod api gateway
# kubectl describe svc gateway
Name:              gateway
Namespace:         online-prd
Labels:            run=gateway
Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"run":"gateway"},"name":"gateway","namespace":"online-prd"},"spec":{"ports":...
Selector:          run=gateway
Type:              ClusterIP
IP:                10.233.52.234
Port:              <unset>  8080/TCP
TargetPort:        8080/TCP
Endpoints:         10.233.124.98:8080,10.233.68.185:8080,10.233.71.29:8080 + 1 more...
Session Affinity:  None
Events:            <none>
Service - kubectl describe svc api gateway
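The pod addresses behind that Service can also be read straight from its Endpoints object; a quick check, with the output shape sketched from the endpoints listed above:

# kubectl -n online-prd get endpoints gateway
NAME      ENDPOINTS                                                              AGE
gateway   10.233.124.98:8080,10.233.68.185:8080,10.233.71.29:8080 + 1 more...    12d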
-A KUBE-SERVICES -d 10.233.52.234/32 -p tcp -m tcp --dport 443 -j SVC-JFMNS
-A SVC-JFMNS --mode random --probability 0.25 -j KUBE-SEP-JPX2Q
-A SVC-JFMNS --mode random --probability 0.33 -j KUBE-SEP-KUJYT
-A SVC-JFMNS --mode random --probability 0.5 -j KUBE-SEP-HTGFR
-A SVC-JFMNS --mode random -j KUBE-SEP-JP5GT
-A SEP-JPX2Q -p tcp -m recent -j DNAT --to-destination 143.54.22.4:6443
api service – iptables
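Those rules are how kube-proxy load-balances a ClusterIP: each service gets its own chain, and the iptables statistic module picks an endpoint at random. With four endpoints the probabilities work out evenly: 0.25 of all connections, then 0.33 of the remaining 75%, then 0.5 of the remaining 50%, then whatever is left, so each backend receives roughly one connection in four. The lines on the slide are abbreviated; the full form of such rules on a node looks roughly like this (chain hashes, addresses and port are illustrative):

-A KUBE-SERVICES -d 10.233.52.234/32 -p tcp -m tcp --dport 8080 -j KUBE-SVC-JFMNS
-A KUBE-SVC-JFMNS -m statistic --mode random --probability 0.25 -j KUBE-SEP-JPX2Q
-A KUBE-SEP-JPX2Q -p tcp -m tcp -j DNAT --to-destination 10.233.124.98:8080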
Why @ Albert Heijn?
2015
Monolith
Binary coupling
Scalability problems
Growth issues
CI/CD impossible
Downtime
NOW and future
Scalable
Decoupling
Rolling updates
Services
CI/CD to the max
Isolation of code
Zero downtime
Technology agnostic
Why @ Albert Heijn?
… on a modern, scalable, automated platform
Scalable architecture and technology
Commodity hardware: manual, within months
Virtualization / virtual hardware: semi-automated, within weeks
Container management platform / containers: fully automated, within minutes
On-premise vs. Cloud
No cloud options in 2016 and 2017
How?
An HTTP call to appietoday.nl
[Diagram: request flow]
Users → Loadbalancer → Nginx Ingress → Frontend (service → pod) → API Gateway (service → pod) → API (service → pod) → IDP (service → pod)
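The only part of that chain we describe ourselves in Kubernetes is the Ingress rule that maps the hostname to the frontend Service; a minimal sketch using the pre-1.14 apiVersion that matches our 1.8 clusters (host, names and port are illustrative):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend
  annotations:
    kubernetes.io/ingress.class: nginx   # picked up by the Nginx ingress controller
spec:
  rules:
  - host: appietoday.nl
    http:
      paths:
      - path: /
        backend:
          serviceName: frontend   # the frontend Service from the diagram
          servicePort: 8080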
Our setup?
Platform
Services
API Gateway
Frontend
25+ services
5 Clusters
40+ nodes
650+ Docker containers
Continuous delivery
Continuous delivery – automated from development to production
Authorization
Authentication
Throttling
Routing
Automate platform deployment with Ansible
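Per service the delivery step itself is small; a sketch of the rolling update a pipeline performs (manifest path and deployment name are illustrative):

kubectl apply -f k8s/gateway-deployment.yaml   # apply the manifest with the new image tag
kubectl rollout status deployment/gateway      # block until the rolling update has finished
kubectl rollout undo deployment/gateway        # roll back if the release misbehaves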
Relational problems: Communication.
Relational problems: Storage.
On premise Storage
vSphere volumes
Host path
NFS
Relational problems: Storage.
On premise Storage
GlusterFS
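For reference, a GlusterFS volume is handed to pods as an ordinary PersistentVolume that points at the Gluster endpoints; a sketch with illustrative names and size:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany                  # Gluster volumes can be mounted read-write by many pods
  glusterfs:
    endpoints: glusterfs-cluster   # an Endpoints object listing the Gluster servers
    path: data-volume              # the Gluster volume name
    readOnly: false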
Relational problems: Postgres on Gluster.
pg_restore: [archiver (db)] Error from TOC entry 53398; 0 16503 TABLE DATA l1aaux_sci sdmcleod
pg_restore: [archiver (db)] COPY failed for table "l1aaux_sci": ERROR: unexpected data beyond EOF in block 9391 of relation base/16386/17043
HINT: This has been seen to occur with buggy kernels; consider updating your system.
CONTEXT: COPY l1aaux_sci, line 319329: "1854661 \N 1.05156717906094999 1378796678.44843268 2012-02-01 07:04:39.5+00 2012-02-01 07:04:38.4484..."
pg_restore: [archiver (db)] Error from TOC entry 53399; 0 16528 TABLE DATA l1afts_dbl sdmcleod
pg_restore: [archiver (db)] COPY failed for table "l1afts_dbl": ERROR: unexpected data beyond EOF in block 10097 of relation base/16386/17068
HINT: This has been seen to occur with buggy kernels; consider updating your system.
Relational problems: Postgres on Gluster.
Postgres source code: src/backend/storage/buffer/bufmgr.c
/*
 * We get here only in the corner case where we are trying to extend
 * the relation but we found a pre-existing buffer marked BM_VALID.
 * This can happen because mdread doesn't complain about reads beyond
 * EOF (when zero_damaged_pages is ON) and so a previous attempt to
 * read a block beyond EOF could have left a "valid" zero-filled
 * buffer. Unfortunately, we have also seen this case occurring
 * because of buggy Linux kernels that sometimes return an
 * lseek(SEEK_END) result that doesn't account for a recent write. In
 * that situation, the pre-existing buffer would contain valid data
 * that we don't want to overwrite. Since the legitimate case should
 * always have left a zero-filled buffer, complain if not PageIsNew.
 */
bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);

if (!PageIsNew((Page) bufBlock))
    ereport(ERROR,
            (errmsg("unexpected data beyond EOF in block %u of relation %s",
                    blockNum, relpath(smgr->smgr_rnode, forkNum)),
             errhint("This has been seen to occur with buggy kernels; consider updating your system.")));
Relational problems: Communication.
Nodes can’t reach each other anymore
KubeProxy can’t reach API
iptables are broken
Network interface changes
Flannel and Docker subnet mismatch (magically)
Relational problems: Communication.
Nodes can’t reach each other anymore
Migration from Flannel to Calico resulted in a small downtime but a very stable network afterwards
Created a Network test DaemonSet, as our own relation therapist
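A sketch of what such a relation therapist can look like: a DaemonSet that puts one test pod on every node (tolerating master taints) and keeps resolving the cluster DNS name. The image and check are illustrative; our actual test pod does more than this:

apiVersion: apps/v1             # apps/v1beta2 on the 1.8 clusters
kind: DaemonSet
metadata:
  name: network-test
spec:
  selector:
    matchLabels:
      name: network-test
  template:
    metadata:
      labels:
        name: network-test
    spec:
      tolerations:
      - operator: Exists        # also schedule on tainted master nodes
      containers:
      - name: network-test
        image: busybox:1.28
        command: ["sh", "-c", "while true; do nslookup kubernetes.default.svc.cluster.local || echo DNS FAILED; sleep 10; done"]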
Relational problems: Communication.
[prd-node1:root@k8snode1650 ~]# bridge fdb | grep cali
33:33:00:00:00:01 dev calif8b8ce32fae self permanent
01:00:5e:00:00:01 dev calif8b8ce32fae self permanent
...

[prd-node1:root@k8snode1650 ~]# ip -d link show calif8b8ce32fae
8: calif8b8ce32fae@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> state UP mode DEFAULT
    link/ether 7e:3f:ee:5e:d4:ed brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0
    veth addrgenmode eui64

[prd-node1:pnlmv17y@k8snode1650 ~]$ route -n | grep cali
10.233.65.153  0.0.0.0  255.255.255.255  UH  0  calif8b8ce32fae
10.233.65.155  0.0.0.0  255.255.255.255  UH  0  cali2b5d60cd0be
10.233.65.156  0.0.0.0  255.255.255.255  UH  0  cali9fa8da37832
10.233.65.158  0.0.0.0  255.255.255.255  UH  0  cali4c2e295795a
10.233.65.159  0.0.0.0  255.255.255.255  UH  0  cali5c975203c3b
Relational problems: Communication.
[pnlmv17y@k8snode2110 ~]$ ip addr
...
13: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN qlen 1
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.233.65.133/32 scope global tunl0
       valid_lft forever preferred_lft forever
...

[pnlmv17y@k8snode2101 ~]$ route -n | grep tunl0
10.233.76.192  162.53.123.117  255.255.255.192  UG  0  0  0  tunl0
10.233.78.192  162.53.123.110  255.255.255.192  UG  0  0  0  tunl0
10.233.85.192  162.53.123.115  255.255.255.192  UG  0  0  0  tunl0
10.233.88.64   162.53.123.111  255.255.255.192  UG  0  0  0  tunl0
Relational problems: Communication.
Not much knowledge about Calico... And that is a good thing. It just works.
We know a lot more about Flannel and that also says enough...
Relational problems: Containers drop
Relational problems: Communication.
[Diagram: Network Test DaemonSet]
A Network Test pod runs as a DaemonSet on every Kubernetes node and master, continuously checking the Kube DNS service and its pods.
Kubernetes gives more benefits than doubts on premise
A lot of open source tools around
Helm packages
Fast delivery of software
Auto healing
Very very stable (Only got called out of bed once at night in 2017)
Happy developers
Enabler for DevOps
Etc..
Open source tools that boost our relationship
Projects that boost our relationship
Kubespray saved months of work setting up Kubernetes on premise.
Easily deploying production-ready Kubernetes clusters.
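Bringing up a cluster with Kubespray is essentially one Ansible run; a sketch of the steps (the repository location and inventory layout vary per Kubespray version):

git clone https://github.com/kubernetes-incubator/kubespray.git   # now kubernetes-sigs/kubespray
cd kubespray
cp -r inventory/sample inventory/mycluster
# list the master and node machines in inventory/mycluster/hosts.ini
ansible-playbook -i inventory/mycluster/hosts.ini --become --become-user=root cluster.yml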
Projects that boost our relationship
Helm makes upgrading and maintaining our applications predictable and super easy.
Package manager for Kubernetes
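In day-to-day use that means a chart per application and a couple of commands per release; a sketch in the Helm 2 syntax of the time, with illustrative chart and release names:

helm install --name gateway ./charts/gateway                   # first install of a release
helm upgrade gateway ./charts/gateway --set image.tag=0.1.3    # upgrade to a new image tag
helm rollback gateway 1                                         # back to revision 1 if needed
helm ls                                                         # list releases in the cluster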
Love
Joost Milo
Questions?