Netflix OSS Meetup Season 5 Episode 1

Upload: aspyker

Post on 28-Jan-2018

Page 1: Netflix OSS Meetup Season 5 Episode 1

Netflix Open Source

Netflix Open Source - @NetflixOSS

Season 5, Episode 1

Page 2: Netflix OSS Meetup Season 5 Episode 1

Agenda

6:00-7:00 Registration, Food/Drink, Networking

7:00-8:00 Talks:

• RepoKid - Travis McPeak and Patrick Kelley, Netflix

• BetterTLS - Ian Haken, Netflix

• Authorization at Netflix - Manish Mehta, Netflix

• Open Policy Agent - Torin Sandall, OPA project

• PADME - Kamil Pawlowski, PADME project

8:00-9:00 Demos, Networking

Page 3: Netflix OSS Meetup Season 5 Episode 1

Rightsizing Permissions @Scale

Patrick Kelley

9-27-2017

Page 4: Netflix OSS Meetup Season 5 Episode 1

The Antagonist

Page 5: Netflix OSS Meetup Season 5 Episode 1

Set Builder: (Me)

● Name: Patrick Kelley @monkeysecurity

● ~ 5 years @ Netflix

● Decent trampoline jumper

● OSS Fan

○ SecurityMonkey

○ CloudAux

○ PolicyUniverse

○ Aardvark

○ Repokid

○ SWAG

Page 6: Netflix OSS Meetup Season 5 Episode 1

You are Entitled to Nothing

Permissions granted to new apps:

● Permissions are automatically granted to applications on deploy.

● Apps start with a small base-set of permissions.

● Manual interaction with the security team is limited.

Eventually:

● Default permission set is empty. We peek inside your AMI to build policies.

● Library owners define required permissions.*

Page 7: Netflix OSS Meetup Season 5 Episode 1

Remove Unused Permissions

Repokid gathers data from multiple plugins and determines which permissions may be removed. After sending notifications, repokid will “repo” unused permissions. If something goes wrong, repokid allows for easy rollback.

https://github.com/Netflix/repokid

https://github.com/Netflix-Skunkworks/aardvark

Page 8: Netflix OSS Meetup Season 5 Episode 1

AWS Policy Anatomy

{
  "Action": "s3:GetObject",
  "Resource": "arn:aws:s3:::test-bucket-*",
  "Effect": "Allow"
}

Service: Access Advisor

Event: CloudTrail

Resource: S3 Access Logs
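
The idea of removing unused permissions can be illustrated at the service level with a minimal sketch. This is not Repokid's actual code or API: the function names, the 90-day threshold, and the sample data are all assumptions made for illustration. It treats an action as removable when its service has no recorded use, or none recent enough, in Access Advisor style data.

```python
from datetime import datetime, timedelta


def service_of(action):
    """Return the service prefix of an IAM action, e.g. 's3' for 's3:GetObject'."""
    return action.split(":", 1)[0].lower()


def unused_actions(actions, last_used, now, max_age_days=90):
    """Return the actions whose service was not used within max_age_days.

    last_used maps a service prefix to the datetime it was last accessed.
    A missing entry means the service was never accessed.
    The 90-day default is a hypothetical threshold, not a Repokid setting.
    """
    cutoff = now - timedelta(days=max_age_days)
    removable = []
    for action in actions:
        seen = last_used.get(service_of(action))
        if seen is None or seen < cutoff:
            removable.append(action)
    return removable


# Hypothetical role: three granted actions, usage data for two services.
now = datetime(2017, 9, 27)
granted = ["s3:GetObject", "sqs:SendMessage", "dynamodb:GetItem"]
last_used = {"s3": datetime(2017, 9, 20), "sqs": datetime(2017, 1, 5)}

print(unused_actions(granted, last_used, now))
```

Service-level data cannot tell which individual actions or resources were used, which is why the slide also lists event-level and resource-level sources.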

Page 9: Netflix OSS Meetup Season 5 Episode 1

Thank You !

Netflix Open Source - @NetflixOSS

Page 10: Netflix OSS Meetup Season 5 Episode 1

BetterTLS

Netflix Open Source - @NetflixOSS

A test suite for HTTPS clients implementing verification of the Name Constraints certificate extension

Page 11: Netflix OSS Meetup Season 5 Episode 1

How Does Web PKI Work?

[Diagram: certificate for google.com issued by Verisign; server 172.317.5.110; trusted CAs: Symantec, Digicert, Verisign; site: google.com]

Page 12: Netflix OSS Meetup Season 5 Episode 1

On Trusting Your Truststore

[Diagram: certificate for nsa.gov issued by WoSign China; server 23.210.7.329; trusted CAs: Verisign, Digicert, WoSign China; site: nsa.gov]

Page 13: Netflix OSS Meetup Season 5 Episode 1

Another Use Case

[Diagram: certificate for passwordreset.acme.internal issued by ACME Root CA; server 74.304.23.58; site: passwordreset.acme.internal; trusted CA: ACME Root CA]

Page 14: Netflix OSS Meetup Season 5 Episode 1

Responsibility, Risk, and Transparency

[Diagram: certificate for bankofamerica.com issued by ACME Root CA; server 17.59.228.350; trusted CA: ACME Root CA; site: bankofamerica.com]

Page 15: Netflix OSS Meetup Season 5 Episode 1

We want to apply authorization rules to CAs.

Is ACME Root CA authorized to create a certificate for bankofamerica.com?

Page 16: Netflix OSS Meetup Season 5 Episode 1

The Name Constraints X.509 Extension

● RFC 5280 (May 2008)

● Applies only to CA certificates. Specifies:

○ Type of name to which it applies (DNS, IP, etc.)

○ Subtree (DNS suffix or IP range)

○ Whitelisted or blacklisted

● Constraints on CA hierarchy can be nested! Implementations should “intersect” the constraints.

○ The ACME Root CA can be whitelisted for *.internal

○ The ACME Test Environment CA can be blacklisted for *.prod.internal
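
How nested constraints "intersect" can be sketched for DNS names only. This is an illustrative simplification, not BetterTLS code and not a complete RFC 5280 implementation (it ignores IP ranges and other name types). The function names and the dictionary layout are assumptions, and subtrees are written as plain domains ("internal") rather than the "*.internal" shorthand used on the slide.

```python
def in_subtree(name, subtree):
    """True if name equals subtree or is formed by adding labels to its left."""
    name = name.lower().rstrip(".")
    subtree = subtree.lower().strip(".")
    return name == subtree or name.endswith("." + subtree)


def allowed_by_ca(name, permitted, excluded):
    """Apply one CA's constraints.

    An excluded (blacklisted) subtree always rejects. An empty permitted
    (whitelist) list means this CA imposes no whitelist.
    """
    if any(in_subtree(name, s) for s in excluded):
        return False
    return not permitted or any(in_subtree(name, s) for s in permitted)


def allowed_by_chain(name, chain):
    """A name is acceptable only if every CA in the chain allows it."""
    return all(
        allowed_by_ca(name, ca.get("permitted", []), ca.get("excluded", []))
        for ca in chain
    )


# Hypothetical chain mirroring the slide's two examples.
chain = [
    {"permitted": ["internal"]},       # ACME Root CA
    {"excluded": ["prod.internal"]},   # ACME Test Environment CA
]

print(allowed_by_chain("passwordreset.acme.internal", chain))
print(allowed_by_chain("bankofamerica.com", chain))
print(allowed_by_chain("db.prod.internal", chain))
```

Requiring every CA in the chain to accept the name is what makes the result an intersection: a subordinate CA can narrow what its parent allows but cannot widen it.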

Page 17: Netflix OSS Meetup Season 5 Episode 1

How Name Constraints Works

ACME Root CA → ACME Internal CA (NC: *.internal) → passwordreset.acme.internal

ACME Root CA → ACME Internal CA (NC: *.internal) → bankofamerica.com ×

Page 18: Netflix OSS Meetup Season 5 Episode 1

The Name Constraints extension is only useful if clients implement it.

Page 19: Netflix OSS Meetup Season 5 Episode 1
Page 20: Netflix OSS Meetup Season 5 Episode 1
Page 21: Netflix OSS Meetup Season 5 Episode 1
Page 22: Netflix OSS Meetup Season 5 Episode 1
Page 23: Netflix OSS Meetup Season 5 Episode 1

The Name Constraints extension is only useful if clients implement it... correctly.

Page 24: Netflix OSS Meetup Season 5 Episode 1

Let’s Test! Thoroughly!

● Put the server name in both CN and SAN

● Use both DNS names and IP names

○ Use both valid and invalid names

● Use both NC whitelisting and blacklisting

○ Use both passing and non-passing whitelists/blacklists

● Mix and match all of these

○ Computers are really good at brute forcing all combinations of things

● Let’s contact vendors about any issues we find

● And let’s make it public!
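
The "mix and match" step can be sketched as a Cartesian product over the dimensions listed above. The dimension names and values below are taken from the bullets, but the code itself is an assumption for illustration and does not reproduce the actual BetterTLS test matrix, which may differ in size and detail.

```python
from itertools import product

# Dimensions taken from the bullets above; labels are illustrative.
DIMENSIONS = {
    "name_location": ["CN", "SAN"],
    "name_type": ["DNS", "IP"],
    "name_validity": ["valid", "invalid"],
    "constraint_kind": ["whitelist", "blacklist"],
    "constraint_result": ["passing", "non-passing"],
}


def enumerate_cases(dimensions=DIMENSIONS):
    """Yield one dict per combination of dimension values."""
    keys = list(dimensions)
    for values in product(*(dimensions[k] for k in keys)):
        yield dict(zip(keys, values))


cases = list(enumerate_cases())
print(len(cases))
print(cases[0])
```

Each generated case would correspond to one certificate chain to serve to the client under test.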

Page 25: Netflix OSS Meetup Season 5 Episode 1

Introducing BetterTLS.com

Page 26: Netflix OSS Meetup Season 5 Episode 1

Making TLS Better

● Chrome now has 100% pass on Windows and Linux

○ Chrome on OSX still has some blacklist failures because of unfixed bugs in Apple’s proprietary TLS implementation. :(

● Go found a bug in their NC verification

○ They’ve fixed it and included a bettertls certificate in their own test suite!

● Java has fixed bugs in their NC verification

○ Release including the fix is pending

Page 27: Netflix OSS Meetup Season 5 Episode 1

What Should I Do?

● If you use TLS in your project, consider utilizing the bettertls.com test suite.

● Contribute!

○ Help us extend BetterTLS with other (e.g. more specific) Name Constraints tests

○ Submit additional client test results

○ Invent another TLS extension suite (HPKP, HSTS, …)

● If you manage any sort of CA, use name constraints to reduce risk to your users, to reduce your own liability, and to increase transparency!

Page 28: Netflix OSS Meetup Season 5 Episode 1

Thank You !

Netflix Open Source - @NetflixOSS

Page 29: Netflix OSS Meetup Season 5 Episode 1

Authorization at Netflix

Netflix Open Source - @NetflixOSS

Netflix’s architecture for implementing Authorization at scale

Page 30: Netflix OSS Meetup Season 5 Episode 1

Background - Definitions

Me → My Bank: Transfer $1000 from Account X to Account Y

1. Verify the identity of the requester (Authentication or AuthN)

2. Verify that the requester is authorized to perform the requested operation (Authorization or AuthZ)

These 2 steps do not need to be tied together!

Page 31: Netflix OSS Meetup Season 5 Episode 1

Background – Netflix Architecture

Page 32: Netflix OSS Meetup Season 5 Episode 1

AuthZ Problem

A way to define and enforce rules that read:

Identity I can/cannot perform Operation O on Resource R

for ALL combinations of I, O, and R in the ecosystem.
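
The shape of the problem can be sketched as a lookup over (identity, operation, resource) triples. This is only a toy model, assuming a deny-by-default stance so that every combination has an answer; the identities, operations, and resources below are hypothetical and are not taken from Netflix's system.

```python
# Hypothetical rules: True means "can", False means "cannot".
RULES = {
    ("service-a", "read", "kafka-topic:orders"): True,
    ("service-a", "write", "kafka-topic:orders"): False,
    ("contractor-bob", "ssh", "host:prod-db"): False,
}


def is_authorized(identity, operation, resource, rules=RULES):
    """Deny by default: only an explicit 'can' rule grants access."""
    return rules.get((identity, operation, resource), False)


print(is_authorized("service-a", "read", "kafka-topic:orders"))
print(is_authorized("service-b", "read", "kafka-topic:orders"))
```

A flat table like this does not scale to all combinations in a large ecosystem, which is one motivation for expressing rules in a policy language instead.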

Page 33: Netflix OSS Meetup Season 5 Episode 1

Design Considerations

● Resource types

● Identity types

● Underlying Protocols

● Implementation Languages

● Latency

● Flexibility of Rules

● Company Culture

● Capture Intent

Page 34: Netflix OSS Meetup Season 5 Episode 1

Result

[Architecture diagram. Components: Policy Portal, Policy DB, Other Data Sources, Aggregator, Distributors, and AuthZ Agents paired with App Code (Service A, Service B) and SSH.]

Page 35: Netflix OSS Meetup Season 5 Episode 1

Zooming In

AuthZ Agent

API Stager

Open Policy Agent Engine

Updater

Periodic updates on policies and associated data

Page 36: Netflix OSS Meetup Season 5 Episode 1

Did it work?

Resource types: REST, SSH, Keys, Kafka Topics

Identity types: VM/Container Services, Batch Jobs, FTEs, Contractors

Underlying Protocols: HTTP, gRPC, Kafka Protocol

Implementation Languages: Java, Node.js, Ruby, Python

Latency: < 0.5 ms for basic policies

Flexibility of Rules: OPA Policy Engine

Company Culture: Policy Portal

Capture Intent: Policy Portal UI hides Policy text for most use cases

Page 37: Netflix OSS Meetup Season 5 Episode 1

Take Away

● AuthZ is a fundamental security problem

● Seek a comprehensive solution for better Control and Visibility

● Get there faster with Open Source Tools (e.g. OPA)

● Get involved in communities (e.g. PADME)

Page 38: Netflix OSS Meetup Season 5 Episode 1

Thank You !

Netflix Open Source - @NetflixOSS

Page 39: Netflix OSS Meetup Season 5 Episode 1

Open Policy Agent

Netflix Open Source - @NetflixOSS

An open source, general-purpose policy engine

www.openpolicyagent.org

Page 40: Netflix OSS Meetup Season 5 Episode 1

Policy

Why it’s important

Page 41: Netflix OSS Meetup Season 5 Episode 1

The Policy Problem

Application: landing_page, ratings, details, comments

Platform: master, nodes, nodes

Infrastructure: instance-976, elb-east, bucket-acme, lambda-xyz, keypair-foo

Page 42: Netflix OSS Meetup Season 5 Episode 1

The Policy Problem

Can user X do operation Y on resource Z?

Page 43: Netflix OSS Meetup Season 5 Episode 1

The Policy Problem

Which cluster should this workload be deployed on?

Can user X do operation Y on resource Z?

Page 44: Netflix OSS Meetup Season 5 Episode 1

The Policy Problem

Which cluster should this workload be deployed on?

Which resources are not tagged correctly?

Can user X do operation Y on resource Z?

Page 45: Netflix OSS Meetup Season 5 Episode 1

Writing Policy Is Hard!

Application

http.body: null
http.method: GET
http.path:
- salary
- bob
http.query_params: {}
protocol.scheme: https
service.source:
  ipv4: 10.0.0.128
  namespace: production
  port: 32757
  service: landing_page
service.target:
  ip: 10.0.1.95
  namespace: production
  port: 8080
  service: details
ingress.user: alice

Platform

kind: Pod
metadata:
  labels:
    app: nginx
  name: nginx-1493591563-bvl8q
  namespace: production
spec:
  containers:
  - image: nginx
    imagePullPolicy: Always
    name: nginx
    securityContext:
      privileged: true
  nodeName: minikube
status:
  containerStatuses:
  - name: nginx
    ready: true
    restartCount: 0
  hostIP: 192.168.99.100
  phase: Running
  podIP: 172.17.0.4
  startTime: 2017-08-01T06:34:13Z

Infrastructure

aws_autoscaling_group.lamb:
  availability_zones#: '1'
  availability_zones.3205: us-west-1a
  desired_capacity: '4'
  destroy: false
  health_check_grace_period: '300'
  launch_configuration: kitten
  wait_for_capacity_timeout: 10m
aws_instance.puppy:
  ami: ami-09b4b74c
  instance_type: t2.micro
  source_dest_check: 'true'
aws_launch_configuration.kitten:
  associate_public_ip_addr: 'false'
  destroy: false
  image_id: ami-09b4b74c
  instance_type: t2.micro
  name: kitten

Page 46: Netflix OSS Meetup Season 5 Episode 1

Writing Policy Is Hard!

Context Dependent

Page 47: Netflix OSS Meetup Season 5 Episode 1

Writing Policy Is Hard!

Context Dependent

Complex Data

Page 48: Netflix OSS Meetup Season 5 Episode 1

Writing Policy Is Hard!

Context Dependent

Complex Data

Search and Aggregation

Page 49: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

Application: “Employees can access their own salary data. Managers can access their subordinates’ salary data.”

Platform: “Workloads that require EU jurisdiction must be deployed on clusters in European zones.”

Infrastructure: “Allow plans without deletes unless the number of new resources exceeds 100.”

[Diagram: a Service sends a Policy Query to OPA and receives a Policy Decision; OPA evaluates Policy (Rego) against Data (JSON).]

Page 50: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

“Employees can access their own salary data. Managers can access their subordinates’ salary data.”

# Employees can read their own salary.
allow {
    input.path = ["salary", employee_id]
    input.user = employee_id
}

# Managers can read the salary of an employee they manage.
allow {
    input.path = ["salary", employee_id]
    input.user = data.manager_of[employee_id]
}
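
For readers unfamiliar with Rego, the two rules can be paraphrased in Python. This is a rough analogue written for illustration, not how OPA evaluates policy: the function signature and the sample documents are assumptions, with input_doc standing in for Rego's input and data for Rego's data.

```python
def allow(input_doc, data):
    """Return True if either salary rule matches the request."""
    path = input_doc.get("path", [])
    # Pattern match: the path must look like ["salary", <employee_id>].
    if len(path) != 2 or path[0] != "salary":
        return False
    employee_id = path[1]
    user = input_doc.get("user")
    if user is None:
        return False
    # Rule 1: employees can read their own salary.
    if user == employee_id:
        return True
    # Rule 2: the manager of that employee can read it (context lookup).
    return data.get("manager_of", {}).get(employee_id) == user


# Hypothetical context data: alice manages bob.
data = {"manager_of": {"bob": "alice"}}

print(allow({"path": ["salary", "bob"], "user": "bob"}, data))
print(allow({"path": ["salary", "bob"], "user": "alice"}, data))
print(allow({"path": ["salary", "bob"], "user": "charlie"}, data))
```

The Rego version needs no explicit branching because multiple rules with the same name are combined with OR, and a rule that fails to match is simply undefined.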

Page 51: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

Context

Pattern Matching

Page 52: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

“Workloads that require EU jurisdiction must be deployed on clusters in European zones.”

placement[cluster.name] {
    input.metadata.labels["requires-eu-jurisdiction"]
    cluster = data.clusters[_]
    startswith(cluster.status.region, "eu-")
}
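
The placement rule builds a set of cluster names by searching the data document. A Python paraphrase, offered as an illustrative sketch rather than OPA's evaluation model, might look like the following; the function signature and the sample cluster data are assumptions.

```python
def placement(input_doc, data):
    """Return the names of clusters eligible for an EU-restricted workload.

    As in the rule above, a workload without the label yields an empty set.
    """
    labels = input_doc.get("metadata", {}).get("labels", {})
    flag = labels.get("requires-eu-jurisdiction")
    # Rego treats the expression as satisfied when the label is defined
    # and not false.
    if flag is None or flag is False:
        return set()
    return {
        cluster["name"]
        for cluster in data.get("clusters", [])   # search: data.clusters[_]
        if cluster["status"]["region"].startswith("eu-")
    }


# Hypothetical inventory of clusters.
data = {
    "clusters": [
        {"name": "prod-eu", "status": {"region": "eu-west-1"}},
        {"name": "prod-us", "status": {"region": "us-east-1"}},
    ]
}
workload = {"metadata": {"labels": {"requires-eu-jurisdiction": "true"}}}

print(placement(workload, data))
```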

Page 53: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

References Search

Page 54: Netflix OSS Meetup Season 5 Episode 1

OPA: Unified, Declarative, Context-aware

“Allow plans without deletes unless the number of new resources exceeds 100.”

deny { score > 100 }

weights = {"create": 1, "modify": 0, "delete": 1000}

score = s {
    sum([weights[op] | input.plan[_] = [op, _]], s)
}

Aggregation

Composition
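
The weighting trick is easier to see with numbers: a single delete scores 1000 and is denied on its own, while creates only trip the limit after 100 of them. The Python below is a paraphrase for illustration, assuming the plan is a list of [operation, resource] pairs as the Rego pattern suggests; the resource names are hypothetical.

```python
WEIGHTS = {"create": 1, "modify": 0, "delete": 1000}


def score(plan):
    """Sum the weight of every operation in the plan.

    Operations without a weight are skipped, matching the comprehension
    above, where weights[op] is undefined for unknown operations.
    """
    return sum(WEIGHTS[op] for op, _resource in plan if op in WEIGHTS)


def deny(plan):
    """Deny when the aggregate score exceeds 100."""
    return score(plan) > 100


# Hypothetical plans.
small_plan = [["create", "aws_instance.puppy"], ["modify", "aws_instance.kitten"]]
delete_plan = [["delete", "aws_instance.puppy"]]
big_plan = [["create", f"aws_instance.n{i}"] for i in range(101)]

print(score(small_plan), deny(small_plan))
print(score(delete_plan), deny(delete_plan))
print(score(big_plan), deny(big_plan))
```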

Page 55: Netflix OSS Meetup Season 5 Episode 1

The Open Policy Agent Project

● Declarative Language

● Document-oriented

● Daemon, Library

● Policy, Query, Data APIs

● Tooling (REPL, Tracing, Testing)

● Apache License 2.0

Data

(JSON)

Policy

(Rego)
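
When OPA runs as a daemon, a service can ask for a decision over HTTP through the Data API by POSTing an input document to a path under /v1/data. The sketch below only builds such a request with the Python standard library and does not send it. The base URL, the example/allow rule path, and the input document are assumptions for illustration.

```python
import json
from urllib.request import Request


def build_opa_query(base_url, rule_path, input_doc):
    """Build a POST request asking OPA to evaluate rule_path for input_doc."""
    url = f"{base_url.rstrip('/')}/v1/data/{rule_path.strip('/')}"
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local OPA instance and policy package named "example".
req = build_opa_query(
    "http://localhost:8181",
    "example/allow",
    {"path": ["salary", "bob"], "user": "alice"},
)

print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen against a running OPA would return a JSON document whose result field carries the decision.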

Page 56: Netflix OSS Meetup Season 5 Episode 1

Thank You !

Netflix Open Source - @NetflixOSS

Page 57: Netflix OSS Meetup Season 5 Episode 1

PADME

Netflix Open Source - @NetflixOSS

Access Control In a Distributed World

www.padme.io

Page 58: Netflix OSS Meetup Season 5 Episode 1

Goals

• Provable, Composable, Security

• Simplicity (ease of use)

• Well Defined Behavior in a Distributed Environment

Page 59: Netflix OSS Meetup Season 5 Episode 1

The Problem

Configuring Access Policies is Hard

• Every component is different (heterogeneity)

• Web servers, networking gear, etc

• Services evolve, and policies need to change with them (temporality)

• Policies don’t understand the CAP Theorem (temporality)

Page 60: Netflix OSS Meetup Season 5 Episode 1

Current State

• Recruited Core Team

• Use cases

• Skeletal Reference Architecture

Page 61: Netflix OSS Meetup Season 5 Episode 1

How You Can Help!

• Looking for design partners to validate use cases

[email protected]

Page 62: Netflix OSS Meetup Season 5 Episode 1

Thank You !

Netflix Open Source - @NetflixOSS

Page 63: Netflix OSS Meetup Season 5 Episode 1

Demo Stations

Open Policy Agent

Stethoscope

HubCommander

Titus