A Journey Through Wonderland - Jimdo

A Journey Through Wonderland Paul Seiffert Mathias Lafeldt

Post on 15-Jan-2017

TRANSCRIPT

Page 1: A Journey Through Wonderland - Jimdo

A Journey Through Wonderland

Paul Seiffert, Mathias Lafeldt

Page 2: A Journey Through Wonderland - Jimdo

The Purpose of Wonderland

Page 3: A Journey Through Wonderland - Jimdo
Page 4: A Journey Through Wonderland - Jimdo

How we got here

● It took Jimdo 5 years to migrate its core infrastructure from bare metal to AWS
● Teams started to love the cloud
● Many experiments in different AWS accounts
● “Reinvented” production stacks

Page 5: A Journey Through Wonderland - Jimdo

Werkzeugschmiede Team

● Founded to solve common infrastructure problems of Jimdo teams
● Provides a standard platform that is reliable and simple to use: Wonderland
● Allows Jimdo developers to focus on product development

Page 6: A Journey Through Wonderland - Jimdo

Wonderland’s History

Page 7: A Journey Through Wonderland - Jimdo

Wonderland 101

Page 8: A Journey Through Wonderland - Jimdo

A PaaS that allows Jimdo developers to run their Dockerized applications

Page 9: A Journey Through Wonderland - Jimdo

Features

● Long-running stateless services
  ○ DNS, load balancing, health checks, auto scaling, …
● One-off tasks and cron jobs
● Centralized logging and metrics collection via external providers

Page 10: A Journey Through Wonderland - Jimdo

Interfaces

● APIs
● CLI tool wl
● Chatbot Alice
● Docker registry
● Vault
● No SSH access

Page 11: A Journey Through Wonderland - Jimdo

Internal service provider

● SLA
● Status page
● Documentation
● Workshops
● Use-case-driven development

Page 12: A Journey Through Wonderland - Jimdo

Wonderland Internals

Page 13: A Journey Through Wonderland - Jimdo

We run...

● AWS infrastructure
● Services providing our APIs

Page 14: A Journey Through Wonderland - Jimdo

AWS Infrastructure

● Networking
● Cluster of EC2 instances
● Jenkins
● Route 53, DynamoDB, S3, SQS, SNS, ...

Page 15: A Journey Through Wonderland - Jimdo

“Crims” Cluster

● Runs user applications + system services
● EC2 auto-scaling group
● Provides resources to ECS
● CoreOS

Page 16: A Journey Through Wonderland - Jimdo

AWS ECS

AWS EC2 Auto-Scaling Group

Page 17: A Journey Through Wonderland - Jimdo
Page 18: A Journey Through Wonderland - Jimdo

Two-Dimensional Auto-Scaling

● Services (based on resource consumption)
● Cluster (based on available slots)
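
The cluster dimension can be illustrated with a small sketch. The slides do not say how a "slot" is defined or how the cluster size is derived from it, so everything here is an assumption for illustration: a slot is taken to be a fixed amount of CPU and memory, and the names `Instance`, `available_slots`, and `desired_cluster_size_delta` as well as the thresholds are hypothetical, not Wonderland's actual implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    free_cpu: int      # unreserved CPU units on the instance (assumed unit)
    free_memory: int   # unreserved memory in MiB (assumed unit)


def available_slots(instances, slot_cpu, slot_memory):
    """Count how many slot-sized tasks still fit across all instances."""
    return sum(
        min(i.free_cpu // slot_cpu, i.free_memory // slot_memory)
        for i in instances
    )


def desired_cluster_size_delta(instances, slot_cpu, slot_memory,
                               min_free_slots, slots_per_instance):
    """Return how many instances to add (positive) or remove (negative)."""
    free = available_slots(instances, slot_cpu, slot_memory)
    if free < min_free_slots:
        # Too few free slots: add enough instances to cover the shortfall.
        return math.ceil((min_free_slots - free) / slots_per_instance)
    if free - min_free_slots >= slots_per_instance:
        # A whole instance worth of spare slots: shrink by one, conservatively.
        return -1
    return 0
```

Scaling in by at most one instance per evaluation is a deliberately cautious choice in this sketch; a real autoscaler would also have to account for the tasks still running on the instance it removes.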

Page 19: A Journey Through Wonderland - Jimdo
Page 20: A Journey Through Wonderland - Jimdo

[Chart: the metrics AWS/AutoScaling GroupDesiredCapacity and Wonderland/ECS DesiredClusterSizeDelta, plotted over 1 week]

Page 21: A Journey Through Wonderland - Jimdo

A Crims Cluster Instance

[Diagram labels: ECS Agent, Log Forwarder, Datadog Agent; AWS ECS; Service A, Service B, Service C; two ELBs; ports HTTP :80, HTTPS :443, HTTP :11411, TCP :1234, TCP :11412]

Page 22: A Journey Through Wonderland - Jimdo

Infrastructure Development

● Infrastructure as code
● CloudFormation and Ansible
● Applied by a Central State Enforcer
● Workflow based on GitHub pull requests
● Automated rollout to production

Page 23: A Journey Through Wonderland - Jimdo

QA

● We test everything
● Unit, integration, and system tests
● Tests in staging environment
● Staging is set up from scratch every week
● Periodic GameDays

Page 24: A Journey Through Wonderland - Jimdo

Our Services

● provide APIs
● deploy other services
● are Wonderland services

Page 25: A Journey Through Wonderland - Jimdo

[Architecture diagram labels: WL (CLI Tool), Alice (Chatbot), Deployer API, SQS Queue, Deployer Worker, Service AutoScaler, StatusCheck, (Dash-)Boards, Oraculum (Logs), Notifications; AWS Route 53, AWS Application Auto Scaling, AWS SNS, AWS S3]

Page 26: A Journey Through Wonderland - Jimdo

Service Configuration

$ cat wonderland-autoscaler/wonderland.yaml
---
scale: 2
components:
  - name: autoscaler
    image: registry.example.com/wonderland-autoscaler:v1.0.3
    env:
      DYNAMODB_TABLE_NAME: wonderland-autoscaling-configs
endpoint:
  domain: autoscaler.example.com
  load-balancer:
    healthcheck:
      path: /v1/health
    ports:
      - port: 443
        protocol: HTTPS
  component: autoscaler
  port: 80

Page 27: A Journey Through Wonderland - Jimdo

Deploy it!

$ wl deploy autoscaler -f wonderland-autoscaler/wonderland.yaml
autoscaler/1466583476 This is try 1
autoscaler/1466583476 Updating ELB autoscaler-1466437217
autoscaler/1466583476 Configuring health check HTTP:11011/v1/health
autoscaler/1466583476 Enabling cross-zone load balancing
autoscaler/1466583476 Configuring connection draining with a timeout of 180s
autoscaler/1466583476 Not enabling access log
autoscaler/1466583476 Letting autoscaler.example.com point to autoscaler-1363526915.eu-west-1.elb.amazonaws.com
autoscaler/1466583476 Registered new ECS TaskDefinition (autoscaler:58) for service autoscaler
autoscaler/1466583476 Updating ECS service autoscaler-1466437217
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 180s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 170s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 160s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 150s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 140s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 130s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 120s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 110s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 100s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 90s)
autoscaler/1466583476 Waiting for service autoscaler-1466437217 to complete rolling update (timeout in 80s)
autoscaler/1466583476 Rolling update completed successfully.
autoscaler/1466583476 Waiting for ELB to have at least one healthy instance
autoscaler/1466583476 Deleting old ECS Task Definition service-autoscaler:57
autoscaler/1466583476 Marking deployment autoscaler/1466583476 active
autoscaler/1466583476 [Boards] Creating Board for Service [werkzeugschmiede] autoscaler
autoscaler/1466583476 [Datadog] Creating Deployment Event
autoscaler/1466583476 [Notifications] Notification channel is /v1/teams/werkzeugschmiede/channels/autoscaler
autoscaler/1466583476 [StatusCheck] CheckID is f85ded4d-9ad0-4375-81b4-5989964e8ed5
autoscaler/1466583476 Deployment successful
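
The countdown in the log above (180s, 170s, ... in steps of 10s) suggests a polling loop with a fixed interval and an overall timeout. A minimal sketch of such a loop follows. It is not the deployer's actual code: the function name `wait_for_rolling_update`, its parameters, and the `is_complete` callback (which would stand in for a check against the ECS service) are hypothetical.

```python
import time


def wait_for_rolling_update(is_complete, timeout=180, interval=10,
                            sleep=time.sleep, log=print):
    """Poll is_complete() until it returns True or the timeout is used up.

    is_complete is a caller-supplied function (hypothetical here) that
    reports whether the rolling update has finished.
    """
    remaining = timeout
    while True:
        if is_complete():
            log("Rolling update completed successfully.")
            return True
        if remaining <= 0:
            # Time budget exhausted without the update completing.
            return False
        log(f"Waiting for service to complete rolling update "
            f"(timeout in {remaining}s)")
        sleep(interval)
        remaining -= interval
```

Passing `sleep` and `log` as parameters keeps the loop easy to exercise without real waiting; by default it sleeps with `time.sleep` and prints each message.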

Page 28: A Journey Through Wonderland - Jimdo

Monitor it!

$ wl status autoscaler
Current deployment: 1466583491
Desired scale: 2

Machine     Component   Status   Started               Deployment  ELB
-------     ---------   ------   -------               ----------  ---
i-7db992f7  autoscaler  RUNNING  22 Jun 16 11:14 CEST  1466583491  InService
i-fb2f5b77  autoscaler  RUNNING  24 Jun 16 01:13 CEST  1466583491  InService

$ wl logs -f autoscaler
...
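
As an illustration of how the status output above could be checked by a script, here is a small sketch that parses the table rows and compares the number of healthy tasks with the desired scale. It assumes that each data row sits on its own line, starts with an EC2 instance ID, and is whitespace-separated; `TaskStatus`, `parse_status_rows`, and `is_healthy` are hypothetical helpers, not part of the `wl` tool.

```python
from dataclasses import dataclass

# Sample rows copied from the slide's `wl status autoscaler` output.
SAMPLE = """\
Machine     Component   Status   Started               Deployment  ELB
-------     ---------   ------   -------               ----------  ---
i-7db992f7  autoscaler  RUNNING  22 Jun 16 11:14 CEST  1466583491  InService
i-fb2f5b77  autoscaler  RUNNING  24 Jun 16 01:13 CEST  1466583491  InService
"""


@dataclass
class TaskStatus:
    machine: str
    component: str
    status: str
    started: str
    deployment: str
    elb: str


def parse_status_rows(text):
    """Parse the data rows of the status table into TaskStatus records."""
    tasks = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the separator, and anything that is not a data row.
        if len(fields) < 6 or not fields[0].startswith("i-"):
            continue
        tasks.append(TaskStatus(
            machine=fields[0],
            component=fields[1],
            status=fields[2],
            started=" ".join(fields[3:-2]),  # the date spans several tokens
            deployment=fields[-2],
            elb=fields[-1],
        ))
    return tasks


def is_healthy(tasks, desired_scale):
    """True if at least desired_scale tasks are RUNNING and InService."""
    ok = [t for t in tasks if t.status == "RUNNING" and t.elb == "InService"]
    return len(ok) >= desired_scale
```

For example, `is_healthy(parse_status_rows(SAMPLE), desired_scale=2)` evaluates the two rows shown on the slide against the desired scale of 2.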

Page 29: A Journey Through Wonderland - Jimdo
Page 30: A Journey Through Wonderland - Jimdo

The Future

Page 31: A Journey Through Wonderland - Jimdo

Improvements

● Persistent disk storage
● Dynamic load balancing
● Long-running / memory-hungry jobs
● Speed up ECS cluster rotation
● Make crons more reliable
● Outsource Docker registry

Page 32: A Journey Through Wonderland - Jimdo

Twitter: @seiffertp / @mlafeldt

https://medium.com/production-ready

Questions?

Thank you.