terraform at scale

Terraform at Scale, HashiConf. Calvin French-Owen, Co-Founder of Segment (@calvinfo). September 7, 2016

Upload: calvin-french-owen

Post on 06-Jan-2017


TRANSCRIPT

Page 1: Terraform at Scale

Terraform at Scale
HashiConf

Calvin French-Owen
Co-Founder of Segment

@calvinfo

September 7, 2016

Page 2: Terraform at Scale
Page 3: Terraform at Scale
Page 4: Terraform at Scale

💖

Page 5: Terraform at Scale
Page 6: Terraform at Scale

Scaling vectors

Page 7: Terraform at Scale

Complexity

Page 8: Terraform at Scale

People

Complexity

Page 9: Terraform at Scale

People

Complexity ❌

Page 10: Terraform at Scale

People

Complexity

✅

Page 11: Terraform at Scale

How do we move nimbly while adding people?

Page 12: Terraform at Scale

This talk
- Terraform at Segment
- What makes “good” Terraform
- What’s next

Page 13: Terraform at Scale

Terraform at Segment

Page 14: Terraform at Scale

By the numbers
- 16 developers working with Terraform
- 94 microservices
- thousands of AWS resources

Page 15: Terraform at Scale

A year with Terraform
December 2012 – Launch day
April 2015 – Terraform first attempt (v1)
November 2015 – Terraform “redux” (v2)

Page 16: Terraform at Scale

Before Terraform

Page 17: Terraform at Scale
Page 18: Terraform at Scale

😱

Page 19: Terraform at Scale

Terraform

Page 20: Terraform at Scale
Page 21: Terraform at Scale

Migrating to Terraform
April 2015

Page 22: Terraform at Scale
Page 23: Terraform at Scale
Page 24: Terraform at Scale

Migrating to Terraform

Page 25: Terraform at Scale

Migrating to Terraform
1. AWS accounts per environment

Page 26: Terraform at Scale

[Diagram: dev, stage, prod, old prod; vpc peering]

Page 27: Terraform at Scale

[Diagram: dev, stage, prod, old prod; vpc peering; managed by Terraform]

Page 28: Terraform at Scale

Separate accounts
- confidence to apply ‘at will’
- test the waters without screwing up the old account
- any sort of ‘global’ configs are okay
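
One way to get that ‘apply at will’ confidence is to pin each environment's configuration to its own AWS account. The following is a minimal sketch, assuming one configuration directory per environment. The file path, variable names, profile name, and region are illustrative and do not come from the talk. It relies on the AWS provider's `allowed_account_ids` argument, which makes Terraform refuse to run against any account not in the list.

```hcl
# environments/stage/provider.tf (hypothetical path)

# Account ID of the stage AWS account, supplied through terraform.tfvars.
variable "account_id" {}

# Named credentials profile for the stage account (illustrative name).
variable "profile" {
  default = "stage"
}

provider "aws" {
  region  = "us-west-2" # assumed region
  profile = "${var.profile}"

  # Abort if the credentials resolve to any account other than stage.
  allowed_account_ids = ["${var.account_id}"]
}
```

With a guard like this in each environment directory, a plan run with production credentials from the stage directory fails early instead of touching production resources.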

Page 29: Terraform at Scale

Migrating to Terraform
1. AWS accounts per environment
2. Docker and ECS

Page 30: Terraform at Scale
Page 31: Terraform at Scale

Terraform: First Attempt

Page 32: Terraform at Scale

Terraform (our first attempt)
├── Makefile
├── README.md
└── environments
    ├── dev
    ├── production
    └── stage

Page 33: Terraform at Scale


Page 34: Terraform at Scale

Terraform (our first attempt)
environments/stage
├── api.tf
├── bastion.tf
├── dns.tf
├── elasticache.tf
├── elbs.tf
├── iam.tf
├── outputs.tf
├── redis.tf
├── s3.tf
├── terraform.tfstate
├── terraform.tfvars
└── vpc.tf

Page 35: Terraform at Scale

Terraform (our first attempt)

resource "aws_ecs_task_definition" "app" {
  family = "app"

  container_definitions = <<EOF
[
  {
    "cpu": 1024,
    "memory": 768,
    "environment": [
      { "name": "NODE_ENV", "value": "stage" }
    ],
    "image": "segment/app:1.54.14",
    "name": "app",
    "portMappings": [
      { "containerPort": 8000, "hostPort": 8000 }
    ]
  }
]
EOF
}

Page 36: Terraform at Scale

Life was better

Page 37: Terraform at Scale

Life was better!


Page 38: Terraform at Scale

Life was better!

but not good.

Page 39: Terraform at Scale

1. environment drift

Page 40: Terraform at Scale

Terraform first attempt
├── Makefile
├── README.md
└── environments
    ├── ops
    ├── production
    └── stage

Page 41: Terraform at Scale


Page 42: Terraform at Scale

stage:

resource "aws_ecs_task_definition" "app" {
  family = "app"

  container_definitions = <<EOF
[
  {
    "cpu": 1024,
    "memory": 768,
    "environment": [
      { "name": "NODE_ENV", "value": "stage" }
    ],
    "image": "segment/app:1.54.14",
    "name": "app",
    "portMappings": [
      { "containerPort": 8000, "hostPort": 8000 }
    ]
  }
]
EOF
}

prod:

resource "aws_ecs_task_definition" "app" {
  family = "app"

  container_definitions = <<EOF
[
  {
    "cpu": 1024,
    "memory": 3072,
    "environment": [
      { "name": "NODE_ENV", "value": "production" }
    ],
    "image": "segment/app:1.54.17",
    "name": "app",
    "portMappings": [
      { "containerPort": 8000, "hostPort": 3000 }
    ]
  }
]
EOF
}

Page 43: Terraform at Scale

2. one massive local state

Page 44: Terraform at Scale
Page 45: Terraform at Scale
Page 46: Terraform at Scale

3. production drift

Page 47: Terraform at Scale

$ terraform plan -target=aws_elb.feels_so_easy

Page 48: Terraform at Scale

$ terraform plan -target=aws_elb.oh_no_what_have_we_done

Page 49: Terraform at Scale

Terraform Redux (v2)

Page 50: Terraform at Scale

Terraform v1 Problems
1. massive shared state
2. locally stored state
3. drift between environments

Page 51: Terraform at Scale

Terraform v1 Problems
1. massive shared state: split states
2. locally stored state: remote state
3. drift between environments: modules

Page 52: Terraform at Scale

v2: state management

Page 53: Terraform at Scale

core (vpc, networking, security groups, asgs)

services: auth, api, site, db, cdn

Page 54: Terraform at Scale

core (vpc, networking, security groups, asgs)

services: auth, api, site, db, cdn

→ read only →

Page 55: Terraform at Scale

/**
 * Remote state.
 */

resource "terraform_remote_state" "state" {
  backend = "s3"

  config {
    bucket = "segment-ops"
    key    = "terraform/${var.environment}/terraform.tfstate"
  }
}

data "template_file" "test" {
  template = "${file("${path.module}/init.tpl")}"

  vars {
    zone_id = "${terraform_remote_state.state.zone_id}"
  }
}

Page 56: Terraform at Scale


Page 57: Terraform at Scale

/**
 * Remote state.
 */

resource "terraform_remote_state" "state" {
  backend = "s3"

  config {
    bucket = "segment-ops"
    key    = "terraform/${var.environment}/terraform.tfstate"
  }
}

data "template_file" "test" {
  template = "${file("${path.module}/init.tpl")}"

  vars {
    zone_id = "${terraform_remote_state.state.zone_id}"
  }
}

read only!

reference
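
For a service to read a value such as `zone_id` from remote state, the core configuration has to export it as an output. The following is a minimal sketch of the core side, assuming a Route 53 zone resource named `main` and a placeholder domain. The file path, resource name, and domain are illustrative and do not appear on the slides.

```hcl
# core/dns.tf (hypothetical file in the core configuration)

resource "aws_route53_zone" "main" {
  name = "example.com" # placeholder domain
}

# Outputs are written into the core state file, where service
# configurations can read them through remote state.
output "zone_id" {
  value = "${aws_route53_zone.main.zone_id}"
}
```

Only outputs of the root configuration are exposed this way, so a value produced inside a module has to be re-exported at the top level before a service can reference it.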

Page 58: Terraform at Scale
Page 59: Terraform at Scale

v2: modules

Page 60: Terraform at Scale

Modules enforce configuration parity.
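
The parity idea can be sketched by moving the task definition from the earlier stage and prod slides into one module and leaving only the intended differences as inputs. This is an illustration under assumptions, not Segment's actual module: the directory layout and the variable names `environment`, `image_tag`, and `memory` are hypothetical, while the values are taken from the stage and prod examples shown earlier.

```hcl
# modules/app/main.tf (hypothetical path)

variable "environment" {}

variable "image_tag" {}

variable "memory" {
  default = 768
}

resource "aws_ecs_task_definition" "app" {
  family = "app"

  # Everything not exposed as a variable is identical in every environment.
  container_definitions = <<EOF
[
  {
    "cpu": 1024,
    "memory": ${var.memory},
    "environment": [
      { "name": "NODE_ENV", "value": "${var.environment}" }
    ],
    "image": "segment/app:${var.image_tag}",
    "name": "app",
    "portMappings": [
      { "containerPort": 8000, "hostPort": 8000 }
    ]
  }
]
EOF
}

# environments/stage/app.tf (hypothetical path)
module "app" {
  source      = "../../modules/app"
  environment = "stage"
  image_tag   = "1.54.14"
}

# environments/production/app.tf (hypothetical path, separate configuration)
module "app" {
  source      = "../../modules/app"
  environment = "production"
  image_tag   = "1.54.17"
  memory      = 3072
}
```

In this sketch the host port can no longer differ by accident between stage and production, because it is not an input. Any difference between environments has to be declared as a variable first.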

Page 61: Terraform at Scale
Page 62: Terraform at Scale
Page 63: Terraform at Scale
Page 64: Terraform at Scale
Page 65: Terraform at Scale

What makes good* Terraform?

*for some definitions of good

Page 66: Terraform at Scale
Page 67: Terraform at Scale

Docker AMIs by Packer

Page 68: Terraform at Scale

Service Config by Terraform

Page 69: Terraform at Scale

1. Variables
2. Composition
3. State
4. Versioning

Page 70: Terraform at Scale

1. Variables
- anything a user might want to override should be a variable
- use defaults liberally

Page 71: Terraform at Scale


Page 72: Terraform at Scale

1. Variables

resource "aws_instance" "bastion" {
  ami                    = "${module.ami.ami_id}"
  source_dest_check      = false
  instance_type          = "${var.instance_type}"
  subnet_id              = "${var.subnet_id}"
  key_name               = "${var.key_name}"
  vpc_security_group_ids = ["${split(",",var.security_groups)}"]
  monitoring             = true

  tags {
    Name        = "bastion"
    Environment = "${var.environment}"
  }
}

[Slide annotations: five arguments marked "configurable", three marked "non-configurable"]

Page 73: Terraform at Scale

1. Variables

resource "aws_instance" "bastion" {
  ami                    = "${module.ami.ami_id}"
  source_dest_check      = "${var.source_dest_check}"
  instance_type          = "${var.instance_type}"
  subnet_id              = "${var.subnet_id}"
  key_name               = "${var.key_name}"
  vpc_security_group_ids = ["${split(",",var.security_groups)}"]
  monitoring             = "${var.monitoring}"

  tags {
    Name        = "bastion"
    Environment = "${var.environment}"
  }
}
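
The matching variable declarations might look like the sketch below, which applies the "use defaults liberally" advice. The descriptions and the `instance_type` default are assumptions for illustration. The defaults for `source_dest_check` and `monitoring` repeat the values that were hard-coded on the earlier slide. Booleans are written as quoted strings here, which is the form Terraform releases of that era recommended for variables.

```hcl
# variables.tf for the bastion module (hypothetical file name)

variable "instance_type" {
  description = "EC2 instance type for the bastion host"
  default     = "t2.micro" # illustrative default, not taken from the talk
}

variable "source_dest_check" {
  description = "Whether the instance enforces source/destination checking"
  default     = "false" # the value that used to be hard-coded
}

variable "monitoring" {
  description = "Whether detailed monitoring is enabled"
  default     = "true" # the value that used to be hard-coded
}

# No default: every caller has to state which environment it targets.
variable "environment" {
  description = "Environment tag, for example stage or production"
}
```

Callers that are happy with the defaults pass nothing extra, and a caller with unusual needs can override a single value without forking the module.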

Page 74: Terraform at Scale

2. Composition
- build modules as you need them
- it’s okay if not everything fits the abstraction

Page 75: Terraform at Scale

2. Composition – “full stack”

module "stack" {
  source      = "github.com/segmentio/stack"
  name        = "my-stack"
  environment = "production"
}

Page 76: Terraform at Scale

2. Composition – inside stack

module "vpc" {
  source = "./vpc"
}

module "security_groups" {
  source = "./security-groups"
}

module "bastion" {
  source = "./bastion"
}

module "dhcp" {
  source = "./dhcp"
}

Page 77: Terraform at Scale

2. Composition – byo edition

module "cluster" {
  source = "github.com/segmentio/stack//ecs-cluster"

  environment = "prod"
  name        = "cdn"
  vpc_id      = "vpc-eff2eada"
  image_id    = "ami-204faaf3"
}

Page 78: Terraform at Scale

3. State management
- separate core from services
- states per service
- use atlas or s3
- use binary plans
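
"States per service" on S3 can be sketched as each service configuration pointing at its own key in the state bucket. The example below uses the `backend` block introduced in Terraform 0.9, which postdates this talk; at the time, remote state was configured from the command line. The bucket name comes from the remote state slide, while the file path, key layout, and region are assumptions.

```hcl
# services/auth/backend.tf (hypothetical path)

terraform {
  backend "s3" {
    bucket = "segment-ops"

    # One key per service and environment, so a change to auth never
    # rewrites the state of api, site, db, or cdn (hypothetical layout).
    key = "terraform/stage/services/auth/terraform.tfstate"

    region = "us-west-2" # assumed region
  }
}
```

Keeping each state small also limits how much a single mistaken apply can affect.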

Page 79: Terraform at Scale

core (vpc, networking, security groups, asgs)

services: auth, api, site, db, cdn

→ read only →

Page 80: Terraform at Scale

4. Versioning

module "stack" {
  source = "github.com/segmentio/stack?ref=v1.x"
}
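
A fuller sketch of pinning, using Terraform's generic Git source form with a `ref` query parameter. The tag `v1.2.0` is a placeholder and not a release named in the talk. The `name` and `environment` arguments are copied from the "full stack" slide.

```hcl
module "stack" {
  # "v1.2.0" is a placeholder tag; substitute a tag that exists in the repository.
  source = "git::https://github.com/segmentio/stack.git?ref=v1.2.0"

  name        = "my-stack"
  environment = "production"
}
```

Pinning to a tag means an upstream change to the module reaches an environment only when the `ref` is bumped, which allows stage to move to a new version before production does.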

Page 81: Terraform at Scale

What’s next

Page 82: Terraform at Scale

What’s next
- Applying in CI
- Atlas
- Data sources
- Terraform generation

Page 83: Terraform at Scale

People

Complexity

✅

Page 84: Terraform at Scale

Fin

Page 85: Terraform at Scale

Prior Art
Stack: github.com/segmentio/stack
Atlas Examples: github.com/hashicorp/atlas-examples