IT Minds Mindblown networking event 2016
Post on 22-Jan-2018
_ SOFTWARE IS EATING THE WORLD
Martin Jensen @mrjensens
Kasper Nissen @phennex
…original quote by Marc Andreessen, 2011
[Word cloud: Virtualization, DevOps, Cloud Native, Docker, Microservices, Unikernels, CentOS, Amazon Web Services, Google Cloud Platform, Kubernetes, Rancher, CoreOS, Mesos, Borg, Prometheus, Site Reliability Engineering, Linux, cgroups, Namespaces, VPC, GKE, Cloud Native Computing Foundation, gRPC, Spanner, Borgmon, Service Discovery, Orchestration, Open Container Initiative, Azure, Swarm, Chaos Monkey, Chaos Engineering, Containers, Rocket, Resilience, Tectonic, OpenShift, OpenStack, IaaS]
Photo: Lars Kruse, Aarhus University
Pervasive Systems group, Section of Electrical and Computer Engineering, Department of Engineering, Aarhus University
Who are we?
Mindblown networking meeting, Thursday 13 October 2016
Martin Jensen
• Bachelor
  • B. Eng. ICT
• Master
  • M. Eng. Computer Engineering (Distributed Systems and Software Engineering)
• Experience
  • Software Developer @ IT Minds (3 1/2 years)
• Interests
  • Cloud computing/architecture, technology evolution, distributed systems, mobile development
Kasper Nissen
• Bachelor
  • B. Eng. ICT
• Master
  • M. Eng. Computer Engineering (Distributed Systems and Software Engineering)
• Experience
  • Software Developer @ IT Minds (3 1/2 years)
  • Partner @ Drivelogger
• Interests
  • Cloud computing/architecture, technology evolution, distributed systems, mobile development
_ KUBECLOUD A SMALL-SCALE CLOUD COMPUTING ENVIRONMENT
KubeCloud: A Small-Scale Tangible Cloud Computing Environment
Master’s Thesis in Computer Engineering, Aarhus University, Department of Engineering
Authors:
Kasper Nissen - KN87372
Martin Jensen - 20106561
Supervisor:
Christian Fischer Pedersen - cfp@eng.au.dk
Date: June 6th, 2016
KubeCloud: A Small-Scale Cloud Computing Environment
• Investigation of how universities can teach cloud computing using a small-scale cluster
• Designing a learning activity around microservices, containers, and cluster management
• Designing and building a small-scale cloud computing cluster
• More information: www.kubecloud.io
kubecloud.io
Podcast: KubeCloud: Tangible Cloud Computing with Kasper Nissen and Martin Jensen
http://bit.ly/29k9Adr
_ CLOUD COMPUTING
A definition
“Cloud computing refers to applications and services that run on a distributed network using virtualized resources and accessed by common Internet protocols and networking standards.”
- B. Sosinsky, Cloud Computing Bible (2011)
Cloud Computing - key enabling concepts
• Abstraction
• Virtualization
Cloud Computing - Service models
Stack layers: Networking, Storage, Servers, Virtualization, O/S, Middleware, Runtime, Data, Applications
Models compared: On premises | Infrastructure as a Service (IaaS) | Platform as a Service (PaaS) | Software as a Service (SaaS)
[Figure: for each model, the stack is divided into the layers "You manage" and the layers "Others manage"]
Source: The Cloud Computing Bible, page 10
Cloud Computing - Deployment models
• Public
• Private
• Hybrid
Source: The Cloud Computing Bible, page 7
Cloud Computing - Providers
Source: http://amzn.to/2cg3Obk
Cloud Computing - Benefits
• On-demand self-service
• Broad network access
• Resource pooling
• Rapid elasticity
• Measured service
• Lower cost
• Ease of utilization
• Quality of Service
• Reliability
• Outsource IT Management
Source: The Cloud Computing Bible, page 17
Highlighted benefits: Rapid elasticity, Resource pooling, Outsourced IT Management
Cloud Computing - Drawbacks
• In 2011
  • More suitable for large organizations
  • Less customizable
  • Latency
  • Privacy and security
• In 2016
  • Privacy and security?
Source: The Cloud Computing Bible, page 17
_ SOFTWARE IS EATING THE WORLD
“In the new world, it is not the big fish which eats the small fish, it’s the fast fish which eats the slow fish.”
- Professor Klaus Schwab, Founder and Executive Chairman of the World Economic Forum
In the last 15 years, 52% of the Fortune 500 companies have disappeared
1955: 75 years life expectancy
2016: 15 years life expectancy (… and falling)
Anybody remember Blockbuster?
_ MICRO SERVICES
Monoliths vs Microservices
Source: http://martinfowler.com/articles/microservices.html
Pick-and-mix candy or whole bags?
Source: http://martinfowler.com/articles/microservices.html
Microservices - what?
“Small autonomous services that work together, modeled around a business domain” - Sam Newman (2015)
Source: http://martinfowler.com/articles/microservices.html
Microservices - characteristics?
• Componentization via Services
• Organized around Business Capabilities
• Products not Projects
• Smart endpoints and dumb pipes
• Decentralized Governance
• Decentralized Data Management
• Infrastructure Automation
• Design for failure
• Evolutionary Design
Source: http://martinfowler.com/articles/microservices.html
Conway’s Law
“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.”
- Conway’s Law (1967)
Source: http://martinfowler.com/articles/microservices.html
Organized around Business Capabilities
Source: http://martinfowler.com/articles/microservices.html
Decentralized Data Management
Source: http://martinfowler.com/articles/microservices.html
Infrastructure Automation
Infrastructure as Code (IaC)
“The enabling idea of infrastructure as code is that the systems and devices which are used to run software can be treated as if they, themselves, are software”
- Kief Morris, Cloud Specialist
Infrastructure as Code (IaC) - How?
• Many tools: Vagrant, Ansible, Chef, Puppet, etc.
• AWS, GCP, etc. have APIs for communicating with their services
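One way to picture infrastructure as code is as a comparison between a declared, desired set of resources and what currently exists. The sketch below is only an illustration of that idea: the `Server` type, the `plan` function, and the action strings are hypothetical and do not correspond to the API of Ansible, AWS, GCP, or any other tool named on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical resource description, not tied to any real provider API."""
    name: str
    size: str


def plan(desired: list[Server], actual: list[Server]) -> list[str]:
    """Return the actions needed to move the actual state to the desired state."""
    want = {s.name: s for s in desired}
    have = {s.name: s for s in actual}
    actions = []
    # Resources that are declared but do not exist yet.
    for name in sorted(want.keys() - have.keys()):
        actions.append(f"create {name} ({want[name].size})")
    # Resources that exist but differ from the declaration.
    for name in sorted(want.keys() & have.keys()):
        if want[name] != have[name]:
            actions.append(f"resize {name} {have[name].size} -> {want[name].size}")
    # Resources that exist but are no longer declared.
    for name in sorted(have.keys() - want.keys()):
        actions.append(f"delete {name}")
    return actions
```

Running `plan` twice against an unchanged environment yields an empty action list the second time, which is the idempotency that makes such descriptions safe to re-apply.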
Demo Ansible
Building datacenters in the cloud - an example
Scenario: Public and Private Networks
Scenario: Public and Private Networks and Hardware VPN Access
_ DEV OPS
Development / Operations
Source: http://bit.ly/2dqtp3Q
Development / Operations
Deadline for delivery
Development: Need this version of X
Ops: OK
Development: Need this version of Y
Ops: OK
Development: Need this version of Z
Ops: OK
Development: Damn, the version of Z needs another version of X
Ops: WAT? We gave you what you asked for and 2 minutes before release, you are telling me we have to do it all over?
DevOps
DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support.
Source: http://valueflowit.com.au/it-operations-only-does-4-things/
The Three Ways
Source: http://itrevolution.com/the-three-ways-principles-underpinning-devops/
DevOps simplified
Source: http://itrevolution.com/the-three-ways-principles-underpinning-devops/
The Phoenix Project
Source: The Phoenix Project
The Phoenix Project: Key takeaways
• The four work types of IT Operations
  • Business Projects
  • Internal Projects
  • Operational Change
  • Unplanned Work (the killer)
Source: http://valueflowit.com.au/it-operations-only-does-4-things/
[Figure: Business units, IT Development/Operations, and Release, with unplanned work marked as "the killer"]
Continuous Delivery
• “Reduce the cost, time, and risk of delivering incremental changes to users” - Jez Humble (2012)
• Why?
  • Build the right thing (short feedback cycles)
  • Reduce risk of release (if it hurts, do it often)
Source: http://itrevolution.com/the-three-ways-principles-underpinning-devops/
_ SITE RELIABILITY ENGINEERING
Site Reliability Engineering
• Google's lessons learned
  • Essays
  • War stories
• Internal role at Google - SRE
• Conflict of interest
  • Launch anything at any time without hindrance
  • Don't change anything when the system works
• "SRE is what happens when you ask a software engineer to design an operations team."
Source: http://martinfowler.com/articles/microservices.html
Site Reliability Engineering
• SRE
  • Maximum 50% "ops" work
  • Remaining time used on development/automation
  • Post-mortem
"Hope is not a strategy" - Traditional SRE saying
Fire-fighting vs. Automation
• System administrator approach
  • More errors -> More system administrators
  • Scales linearly
• Recurring errors
  • Why fix the same error many times?
"Software engineering has this in common with having children: the labor before birth is painful and difficult, but the labor after the birth is where you actually spend most of your effort. Yet software engineering as a discipline spends much more time talking about the first period as opposed to the second, despite estimates that 40-90% of the total costs of a system are incurred after birth."
– Beyer et al (Google - SRE)
QUESTIONS?
BREAK 5 min!
_ CONTAINERS
Source: http://bit.ly/2djCXMv
[…] The value of this utilitarian object lies not in what it is, but in how it is used. The container is at the core of a highly automated system for moving goods from anywhere, to anywhere, with a minimum of cost and complication on the way.
–Marc Levinson
Containers - Cargo transportation before 1960
Source: http://bit.ly/2c76bQ7
Containers - Intermodal shipping container
Source: http://bit.ly/2c76bQ7
Containers - what is a container?
• Abstraction for applications with dependencies
• Standard format
• Decouple application and server
• Sandboxed environment
• Consist of filesystem layers
• Lightweight (compared to VMs)
Containers - Software counterpart
Source: http://bit.ly/2c76bQ7
Containers for software
Source: http://bit.ly/2c76bQ7
(New) Challenges
• Containers in production
• Avoiding nursing VMs
• Orchestration
• Interconnecting containers
• Storage
• Service models
• Cloud native challenges
Choosing a Service Model
[Figure labels: PaaS, IaaS, CaaS]
Defining Cloud Native
• Cloud Native Computing Foundation (CNCF) defines
  1. Container packaged
  2. Dynamically managed
  3. Micro-services oriented
Open Container Initiative
Rumors of the Docker fork
_ CLUSTER MANAGEMENT
Cluster management concepts
• Orchestration
• Resource optimization
  • Bin packing
  • Scheduling
• Consensus algorithm
• Resilience
• High Availability
• Scalability
• Service discovery
• Secret management
• Storage orchestration
History - Google and Containers
• Running containers 10+ years using Borg
  • Borg paper: http://bit.ly/2cnaYO4
• Everything at Google runs in containers
  • Gmail, Web search, MapReduce, batch, GFS, …
  • Google’s Cloud Platform
    • (Even VMs)
• Launches over 2 billion containers per week (2014)
Source: http://bit.ly/2cvcQ7W
Configuration Options
Google's Cluster Management - Borg
“Over time it became clear that the benefits of containerization go beyond merely enabling higher levels of utilization. Containerization transforms the data center from being machine-oriented to being application-oriented”
–Burns et al., Borg, Omega, and Kubernetes 2016
Source: http://bit.ly/2cDZDqA
Bin-packing
• Machine shapes
• Workload shapes
• Bin packing problem
Example workload shape: 2.5GB, 1 core
Source: bit.ly/1PrkwoN
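To make the bin-packing problem concrete, here is a minimal sketch of one common heuristic, first-fit decreasing, applied to workloads with a memory and a core requirement. This is an illustration only and not how Borg or any scheduler mentioned in this talk is implemented; the `Machine` class, the function name, and the machine shape in the usage note are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A machine with remaining free memory (GB) and cores."""
    mem_gb: float
    cores: float
    tasks: list = field(default_factory=list)

    def fits(self, mem_gb, cores):
        return mem_gb <= self.mem_gb and cores <= self.cores

    def place(self, name, mem_gb, cores):
        self.mem_gb -= mem_gb
        self.cores -= cores
        self.tasks.append(name)


def first_fit_decreasing(tasks, machine_mem_gb, machine_cores):
    """Pack (name, mem_gb, cores) tasks onto as few identical machines as the heuristic finds."""
    machines = []
    # Place the largest workloads first; each goes on the first machine with room.
    for name, mem, cores in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        if mem > machine_mem_gb or cores > machine_cores:
            raise ValueError(f"task {name} does not fit on an empty machine")
        target = next((m for m in machines if m.fits(mem, cores)), None)
        if target is None:
            # No existing machine has room, so start a new one.
            target = Machine(machine_mem_gb, machine_cores)
            machines.append(target)
        target.place(name, mem, cores)
    return machines
```

For example, three workloads of 2.5GB and 1 core plus one of 1GB and 1 core, packed onto hypothetical machines of 8GB and 4 cores, need two machines: memory runs out on the first machine before cores do.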
Tetris
Compute resources at eBay
Cluster management solutions
• Orchestration
  • Kubernetes
  • Mesos
  • Docker Swarm Mode
• Enterprise services
  • Rancher
  • Tectonic (Kubernetes)
  • OpenShift (Kubernetes)
  • Mesosphere (Mesos)
Demo Rancher
Kubernetes
• Based on Google's 10 years of experience with Borg
• Designed for containers
• Declarative definition of desired state
• Modular
• Built with fault tolerance and resilience in mind
• Run by CNCF (Cloud Native Computing Foundation)
• Companies
  • E.g. eBay, Wikimedia Foundation, Viacom, SoundCloud, Box, The New York Times
[Figure: Kubernetes master (API Server, Scheduler) and Nodes, each with a Kubelet and application containers]
Demo minikube & GKE
Apache Mesos
• Distributed systems kernel
• Scales to 10,000+ slave nodes
• Well-configured for data processing
  • E.g. Hadoop, Kafka, Spark
• Two-phase scheduling
  • 1) Mesos Master Scheduler
  • 2) Framework Scheduler
• Run containers with Marathon framework
• Run by Apache
• Companies
  • E.g. Twitter, Airbnb, and Apple
[Figure: ZooKeeper (coordination & configurations), Mesos Master, Mesos Slaves, and the Marathon framework running long running tasks (services) and jobs]
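The two-phase scheduling idea can be sketched roughly as follows: in phase one the master offers free resources, and in phase two the framework's own scheduler decides which of its tasks to run on each offer. The sketch is a simplification with a single framework; the class and function names are hypothetical and are not the Mesos API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Free resources on one agent, as offered by the master (hypothetical shape)."""
    agent: str
    cpus: float
    mem_mb: int


class FrameworkScheduler:
    """Phase 2: the framework picks which pending tasks to launch on an offer."""

    def __init__(self, pending):
        self.pending = list(pending)  # (task_name, cpus, mem_mb)

    def resource_offer(self, offer):
        accepted, remaining = [], []
        cpus, mem = offer.cpus, offer.mem_mb
        for name, t_cpus, t_mem in self.pending:
            if t_cpus <= cpus and t_mem <= mem:
                accepted.append(name)
                cpus -= t_cpus
                mem -= t_mem
            else:
                remaining.append((name, t_cpus, t_mem))
        self.pending = remaining
        return accepted


def master_offer_round(offers, framework):
    """Phase 1: the master hands each agent's free resources to the framework."""
    launched = {}
    for offer in offers:
        tasks = framework.resource_offer(offer)
        if tasks:
            launched[offer.agent] = tasks
    return launched
```

The point of the split is that the master only needs to know about resources, while placement policy stays inside each framework (Marathon, in the slide's example).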
Demo minimesos
Docker Swarm Mode
• Built into Docker Engine (v1.12)
• Nodes
  • Manager(s)
  • Workers
• Services
  • Replicas
• Run by Docker (the company)
_ RESILIENCE
Resilience
• Definition
  • a) The capacity to recover quickly from difficulties; toughness
  • b) The ability of a substance or object to spring back into shape; elasticity
Source: http://bit.ly/2djKZVO
Resilience triangle
Source: http://bit.ly/2drlGSc
Rocky's Fault Tolerance
"But it ain't how hard you're hit; it's about how hard you can get hit, and keep moving forward. How much you can take, and keep moving forward. That's how winning is done."
-> Fault tolerance
Source: Jonas Bonér, “Without Resilience Nothing Else Matters”
Release It!
• Anti patterns
  • Integration points
  • Chain reactions
  • Cascading failures
  • Users
  • Blocked threads
• Patterns
  • Timeouts
  • Circuit breakers
  • Bulkheads
  • Fail fast
Source: Michael T. Nygard - Release It!
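As an illustration of the circuit breaker and fail fast patterns, the following is a minimal sketch, assuming a simple consecutive-failure threshold and a fixed reset period. The class name, parameters, and defaults are hypothetical and not taken from Release It! or from any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of waiting on a broken dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Reset period elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                # Trip (or re-trip after a failed trial call).
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping calls to an integration point this way keeps callers from piling up blocked threads behind a dependency that is already failing, which is one route to the cascading failures listed above.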
Demo resilience
_ .NET and Azure
.NET and Azure
• Open sourced
  • .NET Core
  • ASP.NET Core
  • Working with Linux infrastructure
• Service Fabric
  • "Born in the cloud"
  • "About thinking in microservices"
  • Utilized for some of Azure's own services
• Windows Containers
GitHub ~ Sep 2016
Source: http://read.bi/2duWWfz
_ MONITORING
Monitoring
• Why monitor?
  • Analyzing long-term trends
  • Comparing over time or experiment groups
  • Alerting
  • Building dashboards
  • Conducting ad hoc retrospective analysis (i.e. debugging)
• Purpose: What’s broken? And why?
What to monitor?
• Hosts
  • CPU, Memory, I/O, Network, Filesystem
• Containers
  • CPU, Memory, restarts, throttling
• Applications
  • Throughput, latency
What to monitor?: The Four Golden Signals
• Latency
• Traffic
• Errors
• Saturation
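As a rough illustration, the four signals could be derived from a window of request records plus one resource gauge. Everything in this sketch is an assumption made for the example: the `Request` shape, the choice of the 95th percentile for latency, counting HTTP 5xx responses as errors, and expressing saturation as used capacity over total capacity.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    status: int


def golden_signals(requests, window_s, in_use, capacity):
    """Summarize latency, traffic, errors, and saturation for one time window."""
    durations = sorted(r.duration_ms for r in requests)
    n = len(durations)
    # Nearest-rank 95th percentile, computed with integers to avoid float rounding.
    rank = (95 * n + 99) // 100
    p95 = durations[rank - 1] if durations else 0.0
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": p95,
        "traffic_rps": n / window_s,
        "error_ratio": errors / n if n else 0.0,
        "saturation": in_use / capacity,
    }
```

In practice these values would come from a monitoring system rather than an in-memory list; the sketch only shows how the four numbers relate to raw observations.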
Prometheus
• Heavily inspired by Borgmon
• Built by ex-Googlers at SoundCloud
• Comes with a powerful query language
• Pull-based (scrapes at regular intervals)
• Many integrations
  • CloudWatch exporter
  • Node exporter
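Pull-based means the monitored service only has to expose its current metric values as text, and the monitoring server fetches them on its own schedule. The sketch below renders a counter in the style of the Prometheus text exposition format as I understand it (`# HELP`, `# TYPE`, then one line per labeled sample). The function name and the metric are made up for the example, and label-value escaping is deliberately left out.

```python
def render_counter(name, help_text, samples):
    """Render a counter as exposition-style text.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        suffix = "{" + label_str + "}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"
```

A service would typically serve this text over HTTP, and exporters such as the ones listed above play the same role for systems that cannot expose metrics themselves.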
Monitoring - Prometheus at SoundCloud
Alerting
• Use symptom-based alerting
  • Only page if something needs immediate human intervention
• Prevent alert fatigue (alert grouping, easy silencing, dependencies, avoiding static thresholds)
• Use ticketing systems (avoid email spam)
  • Warnings are tasks, like new features
• Provide runbooks
  • Keep them concise
  • Explanation, hints, links
  • Dynamic - include recent observations
• Practice outages
  • “Fire drills”, “Game days” - repeat regularly
Chaos Engineering
• “Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production” - principlesofchaos.org
• We need to identify weaknesses before they manifest in system-wide, aberrant behaviors. Systemic weaknesses could take the form of:
  • Improper fallback settings when a service is unavailable
  • Retry storms from improperly tuned timeouts
  • Outages when a downstream dependency receives too much traffic
  • Cascading failures when a single point of failure crashes
Source: http://principlesofchaos.org
Chaos Engineering
• Chaos Monkey: A virtual monkey let loose in the production environment. It is run on a daily basis during normal work hours.
• Chaos Gorilla: A tool that can be used to simulate an availability zone outage, to ensure that the rest of the zones can handle the extra load.
• Chaos Kong: A tool that can take out an entire region.
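The Chaos Monkey idea can be sketched as "pick one random instance to terminate, but only while engineers are at work to respond". This is not Netflix's implementation: the function names, the weekday 09:00 to 17:00 window, and the instance identifiers are all assumptions, and the sketch only selects a victim rather than terminating anything.

```python
import random
from datetime import datetime


def in_work_hours(now):
    """Assumed working window: Monday to Friday, 09:00 to 16:59."""
    return now.weekday() < 5 and 9 <= now.hour < 17


def pick_victim(instances, now, rng=random):
    """Return one instance to terminate, or None outside work hours."""
    if not instances or not in_work_hours(now):
        return None
    # Sort first so a seeded rng gives a reproducible choice.
    return rng.choice(sorted(instances))
```

Restricting the experiment to work hours reflects the point on the slide: failures are injected when people are available to observe and fix the weaknesses they expose.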
Intuition Engineering
• Instead of numerical information, Netflix built a tool that surfaces relevant information to a human, for situations where creating a heuristic would be too onerous. These situations require an intuition that cannot be codified.
• A tool that gives an intuitive understanding of the system.
Source: http://techblog.netflix.com/2015/10/flux-new-approach-to-system-intuition.html
Why should you care?
• “Backbone” for many new things
• Big data
  • More insight and personalized content
• IoT
  • Cisco expects 25 billion devices in 2020
• Mobile services
  • 4.61 billion devices and rising
Thank you!