AWS re:Invent 2016: Global Traffic Management with Amazon Route 53 Traffic Flow (NET302)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Sergey Royt, SDM, Amazon Route 53

11/30/2016

NET302: Managing Global Traffic with Amazon Route 53 Traffic Flow

What to Expect from the Session

• Concepts for managing global traffic

• Introducing Amazon Route 53 traffic flow

• Using traffic flow for traffic management

• Case study: Amazon VPN endpoint selection

What is traffic management?

• Connecting clients to servers

  • End users

  • Programmatic clients

  • Internal clients (components of your systems)

How to manage traffic

• Load balancing: network proxy to servers

• Service discovery: let clients decide

• Content delivery network (CDN): fully managed distribution

• DNS level: more flexibility than CDN; multiple origins

What are typical application architectures?

• One typical progression

• Growing demands lead to increasing levels of sophistication

Single server

Start with a single server

Single server

Start with a single server (modern version)

[Diagram: Auto Scaling group containing a single instance]

Load balancing

Fleet of servers behind a load balancer

[Diagram: Elastic Load Balancing in front of instances in an Auto Scaling group]

Expanded footprint

Multiple load balancers

[Diagram: load balancer A with its servers; load balancer B with its servers]

Multi-AZ service

Load balancers across multiple Availability Zones

[Diagram labels: Elastic Load Balancing; Auto Scaling group with instances in Availability Zone 1; Auto Scaling group with instances in Availability Zone 2]

Global service

Load balancers and/or servers across multiple regions globally

[Diagram: users connecting to Region A, Region B, and Region C]

Serving a global user base

• Get closer to users; latency matters

• In the old days: create separate “stacks” with domain names: mycorp.com (USA), mycorp.ca (Canada), mycorp.fr (France), etc.

• Can more advanced DNS help?

Domain Name System

What is DNS?

How does it work?

ONLY JOKING, LET’S GO ON

NET202: DNS Demystified

The role of DNS

• Point of entry to your service

• The one global piece of infrastructure

• Recognize geographical locations

• Recognize client networks

Evolution of DNS for a global service

Static records

• Need capabilities to route traffic

MX 10 mail.example.com.

MX 20 mail2.example.com.

mail A 10.0.1.2

svc A 10.0.1.3

www CNAME svc.example.com.

Evolution of DNS for a global service

Dynamic source-discriminating DNS records

• All these records -- what about management?

Evolution of DNS for a global service

Policy-based configuration

• More meaning, less overhead!

[Diagram: policy tree for mycorp.com. Europe: eu-central-1, Madrid. Americas: East Coast, California.]

Introduction to Route 53 traffic flow

• Traffic policy is a versioned document composed of rules and endpoints

• Versioning provides atomic roll back/roll forward

• Traffic policy is applied to an actual domain name, so all rules and endpoints apply to that domain name

• You can use the same traffic policy for more than one domain name

Traffic flow terminology

• Traffic policy – rules routing to endpoints

• Traffic policy record – domain name with an applied traffic policy version
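
The relationship between the two terms can be sketched as a small data model. This is an illustrative in-memory sketch, not the Route 53 API: `PolicyStore`, its methods, and the toy policy documents are all hypothetical names introduced here. It shows versions being immutable, one policy serving several domain names, and roll back being a single re-pointing of the record.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyStore:
    """Hypothetical in-memory store of versioned traffic policies."""
    versions: dict = field(default_factory=dict)  # (name, version) -> document
    records: dict = field(default_factory=dict)   # domain name -> (name, version)

    def add_version(self, name, document):
        """Store a new version of a policy and return its number (1, 2, ...)."""
        latest = max((v for (n, v) in self.versions if n == name), default=0)
        self.versions[(name, latest + 1)] = document
        return latest + 1

    def apply(self, domain, name, version):
        """Create or update a policy record by pointing it at one version."""
        if (name, version) not in self.versions:
            raise KeyError(f"no version {version} of policy {name!r}")
        # A single assignment: the record switches versions in one step.
        self.records[domain] = (name, version)

    def document_for(self, domain):
        """Return the policy document currently in effect for a domain name."""
        return self.versions[self.records[domain]]


store = PolicyStore()
v1 = store.add_version("web", {"endpoint": "10.0.1.3"})
v2 = store.add_version("web", {"endpoint": "10.0.1.4"})
store.apply("www.example.com", "web", v2)  # roll forward
store.apply("svc.example.com", "web", v2)  # same policy, second domain name
store.apply("www.example.com", "web", v1)  # roll back this record only
```

Because a record only references a version, rolling back never edits the policy itself; the earlier version is still there to point at.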

Traffic policy example

Create new policy for a web server

Traffic flow: endpoints

• Hybrid/low level infrastructure: IP address or CNAME

• ELB Classic Load Balancer / Application Load Balancer

• Amazon S3 website

• Amazon CloudFront distribution

• AWS Elastic Beanstalk environment

Traffic flow: quick overview of rules

• Failover

• Weighted

• Geolocation

• Latency

Traffic policy example

Applying the traffic policy

Traffic flow: basic rules

• Failover

  • Primary/secondary

• Weighted

  • Round robin across multiple items
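
The two basic rules can be sketched in a few lines. This is a minimal illustration of the idea only; the function names and the `draw` parameter are assumptions introduced here, and the slides do not describe how Route 53 implements the selection internally.

```python
def failover(primary, secondary, is_healthy):
    """Return the primary while it is healthy, otherwise the secondary."""
    return primary if is_healthy(primary) else secondary


def weighted(items, draw):
    """Pick an endpoint from (endpoint, weight) pairs.

    `draw` is a number in [0, 1). Passing it in keeps the function
    deterministic; in practice it would come from random.random().
    """
    if not 0 <= draw < 1:
        raise ValueError("draw must be in [0, 1)")
    total = sum(weight for _, weight in items)
    if total <= 0:
        raise ValueError("at least one item needs a positive weight")
    mark = draw * total
    running = 0
    for endpoint, weight in items:
        running += weight
        if mark < running:
            return endpoint
    # Floating-point rounding can leave mark equal to total.
    return [endpoint for endpoint, weight in items if weight > 0][-1]


servers = [("1.1.1.1", 3), ("2.2.2.2", 1)]  # hypothetical 3:1 weights
first = weighted(servers, 0.10)   # falls in the first server's share
second = weighted(servers, 0.90)  # falls in the second server's share
```

With equal weights every item receives the same share of answers, which is the round-robin case named on the slide.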

Traffic policy example

Editing a traffic policy record

Traffic flow: geo

• Routes traffic to endpoints based on location

• Location is [Continent [Country [Subdivision]]]

• Used for:

  • Specializing content

  • Balancing traffic distribution between data centers

  • Pre-optimizing network link selection

Under the hood: geo

• Identify request

  • DNS resolver address

  • EDNS0 client subnet option (if available)

• Check geo database for location – continent, country, subdivision

• Find the most specific entry in the rule items to match the location
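
The "most specific entry" step can be sketched as a longest-prefix match over the location hierarchy. This is an assumption-laden illustration: `match_geo`, the tuple encoding of locations, and the example rule items are all hypothetical, and the geo database lookup itself is taken as already done.

```python
def match_geo(location, items, default=None):
    """Return the target of the most specific rule item matching `location`.

    `location` is a (continent, country, subdivision) tuple; None marks a
    level the geo database could not determine. Each key in `items` is a
    prefix of such a tuple: ("EU",), ("EU", "ES"), or ("NA", "US", "CA").
    """
    known = []
    for part in location:
        if part is None:
            break
        known.append(part)
    # Try the longest prefix first: subdivision, then country, then continent.
    for length in range(len(known), 0, -1):
        key = tuple(known[:length])
        if key in items:
            return items[key]
    return default


# Hypothetical rule items for illustration.
rule_items = {
    ("EU",): "eu-central-1",
    ("EU", "ES"): "Madrid",
    ("NA",): "East Coast",
    ("NA", "US", "CA"): "California",
}
spain = match_geo(("EU", "ES", None), rule_items)
france = match_geo(("EU", "FR", None), rule_items)
```

A request located in Spain matches the country-level item, while one from France falls back to the continent-level item because no entry for France exists.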

Traffic flow: latency

• Routes traffic to closest AWS Region based on latency

• Good default choice for routing between endpoints in AWS Regions

Under the hood: latency

• Identify request

• Check latency database for preference order of all regions

• Go to top healthy choice among the rule items
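
The last two steps amount to walking a preference list and stopping at the first usable entry. The sketch below assumes the preference order has already been looked up; `pick_by_latency` and its parameters are hypothetical names, not part of any Route 53 interface.

```python
def pick_by_latency(preference, items, is_healthy):
    """Return the endpoint for the most preferred region that is usable.

    preference: region names, lowest expected latency first.
    items: region -> endpoint, the regions configured in the rule.
    is_healthy: callable taking an endpoint and returning True or False.
    """
    for region in preference:
        endpoint = items.get(region)
        if endpoint is not None and is_healthy(endpoint):
            return endpoint
    return None  # no configured region is both preferred and healthy


# Hypothetical rule items: us-west-1 is preferred but not configured.
preference = ["us-west-1", "us-east-1", "eu-central-1"]
items = {"us-east-1": "east-lb", "eu-central-1": "eu-lb"}
chosen = pick_by_latency(preference, items, lambda endpoint: True)
```

Regions that rank high in the preference order but have no rule item are simply skipped, so the rule only ever returns endpoints the policy actually lists.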

Where is the latency data from?

• Large scale experimentation system

• Measures web client latency to different AWS Regions

• Relates latency data to the DNS resolver used by the client

• Output: IP address -> region preference vector
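
One way to picture the output is as a table built from raw measurements. The sketch below is only an illustration of the "IP address -> region preference vector" shape: the use of the median as the aggregate, the function name, and the sample data are assumptions made here, since the slides do not say how the measurements are combined.

```python
from collections import defaultdict
from statistics import median


def preference_vectors(samples):
    """Map each resolver IP to its regions, ordered by median latency.

    samples: iterable of (resolver_ip, region, latency_ms) measurements.
    """
    by_pair = defaultdict(list)
    for resolver_ip, region, latency_ms in samples:
        by_pair[(resolver_ip, region)].append(latency_ms)

    by_resolver = defaultdict(list)
    for (resolver_ip, region), values in by_pair.items():
        by_resolver[resolver_ip].append((median(values), region))

    # Sort each resolver's regions from lowest to highest median latency.
    return {
        resolver_ip: [region for _, region in sorted(scored)]
        for resolver_ip, scored in by_resolver.items()
    }


# Hypothetical measurements using documentation-range addresses.
samples = [
    ("198.51.100.7", "us-east-1", 80), ("198.51.100.7", "us-east-1", 120),
    ("198.51.100.7", "us-west-1", 30), ("198.51.100.7", "us-west-1", 50),
    ("203.0.113.9", "us-east-1", 20), ("203.0.113.9", "us-west-1", 90),
]
vectors = preference_vectors(samples)
```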

Test record set

• Simple “what-if” testing for troubleshooting policy configurations

• Check a source based on resolver or client subnet

• Works for traffic policy records or regular records
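
A what-if query needs a record to test plus the simulated source. The sketch below only builds the request parameters and makes no network call. The parameter names follow my understanding of the Route 53 `TestDNSAnswer` operation and should be checked against the API reference; `build_test_query` and the hosted zone ID are hypothetical.

```python
import ipaddress


def build_test_query(hosted_zone_id, record_name, record_type="A",
                     resolver_ip=None, client_subnet=None):
    """Build keyword arguments for a what-if DNS query.

    resolver_ip: pretend the query arrives from this DNS resolver.
    client_subnet: CIDR string such as "203.0.113.0/24", standing in for
        the EDNS0 client subnet option.
    """
    params = {
        "HostedZoneId": hosted_zone_id,
        "RecordName": record_name,
        "RecordType": record_type,
    }
    if resolver_ip is not None:
        # ip_address() raises ValueError on malformed input.
        params["ResolverIP"] = str(ipaddress.ip_address(resolver_ip))
    if client_subnet is not None:
        network = ipaddress.ip_network(client_subnet, strict=True)
        params["EDNS0ClientSubnetIP"] = str(network.network_address)
        params["EDNS0ClientSubnetMask"] = str(network.prefixlen)
    return params


query = build_test_query(
    "Z0000000EXAMPLE",            # hypothetical hosted zone ID
    "www.example.com",
    resolver_ip="198.51.100.7",
    client_subnet="203.0.113.0/24",
)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("route53").test_dns_answer(**query)`; that call is left out here so the sketch runs without an AWS account.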

Case study: Amazon VPN endpoints

• AWS and Amazon retail are highly operationally focused

• VPN enables 24/7 support of all our services

• Need to manage endpoints for end users to connect to the VPN

• Multi-regional highly available service

The problem

• Manual region selection by users

• Overloaded VPN servers

• Single server faults require retry by users in order to switch to a healthy server

Implementing with traffic flow

• Model the desired flow based on available endpoints

• Convert the model into a traffic policy

[Diagram: layered traffic policy]

Country selection (geo): user’s country; South Africa; Romania; default: closest region

Region selection (latency): [AWS Region]; US East 1; US West 1

Server selection (weighted): server round robin

Server endpoint: 1.1.1.1; 2.2.2.2
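
The layered model can be written down as a traffic policy document and walked from the top. The sketch below is illustrative only. The field names follow my reading of the Route 53 traffic policy document format, and both names and value types should be checked against the format reference. The rule and endpoint IDs, the mapping of South Africa and Romania to a particular server group, and the 192.0.2.10 address are hypothetical; 1.1.1.1 and 2.2.2.2 are the slide's placeholders. Health checks are ignored to keep the walk short.

```python
POLICY = {
    "AWSPolicyFormatVersion": "2015-10-01",
    "RecordType": "A",
    "StartRule": "country",
    "Endpoints": {
        "east-1": {"Type": "value", "Value": "1.1.1.1"},
        "east-2": {"Type": "value", "Value": "2.2.2.2"},
        "west-1": {"Type": "value", "Value": "192.0.2.10"},  # hypothetical
    },
    "Rules": {
        "country": {  # country selection (geo)
            "RuleType": "geo",
            "Locations": [
                {"Country": "ZA", "RuleReference": "east-servers"},
                {"Country": "RO", "RuleReference": "east-servers"},
                {"IsDefault": True, "RuleReference": "region"},
            ],
        },
        "region": {  # region selection (latency)
            "RuleType": "latency",
            "Regions": [
                {"Region": "us-east-1", "RuleReference": "east-servers"},
                {"Region": "us-west-1", "RuleReference": "west-servers"},
            ],
        },
        "east-servers": {  # server selection (weighted)
            "RuleType": "weighted",
            "Items": [
                {"EndpointReference": "east-1", "Weight": "1"},
                {"EndpointReference": "east-2", "Weight": "1"},
            ],
        },
        "west-servers": {
            "RuleType": "weighted",
            "Items": [{"EndpointReference": "west-1", "Weight": "1"}],
        },
    },
}


def resolve(policy, country, region_preference, draw):
    """Walk the rule tree and return the selected endpoint value.

    region_preference: regions ordered lowest latency first; at least one
    must be configured in the latency rule. draw: number in [0, 1).
    """
    node = {"RuleReference": policy["StartRule"]}
    while "RuleReference" in node:
        rule = policy["Rules"][node["RuleReference"]]
        kind = rule["RuleType"]
        if kind == "geo":
            locations = rule["Locations"]
            matches = [loc for loc in locations
                       if "Country" in loc and loc["Country"] == country]
            defaults = [loc for loc in locations if loc.get("IsDefault")]
            node = (matches or defaults)[0]
        elif kind == "latency":
            by_region = {r["Region"]: r for r in rule["Regions"]}
            node = next(by_region[r] for r in region_preference if r in by_region)
        elif kind == "weighted":
            total = sum(int(item["Weight"]) for item in rule["Items"])
            mark, running = draw * total, 0
            for node in rule["Items"]:
                running += int(node["Weight"])
                if mark < running:
                    break
        else:
            raise ValueError(f"unsupported rule type {kind!r}")
    return policy["Endpoints"][node["EndpointReference"]]["Value"]


answer = resolve(POLICY, "US", ["us-west-1", "us-east-1"], 0.3)
```

Each layer hands off to the next by reference, so adding a server touches only one weighted rule and adding a country touches only the geo rule.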

Case study: endpoint selection improvement

“Our fleet consists of different types of hardware and Route 53 allows us to send more connections to VPN servers with higher capacity than to the ones with lower capacity”

Case study: endpoint health checking

Pre-create per-endpoint health checks for non-AWS endpoints

“With the use of Route 53 policies and health checks we have been able to avoid bad user experience in cases of downtime in the VPN servers”

How does DNS failover work?

Route 53 health checks provide highly available failover with a predictable window

Failover Time =

Interval (30s or 10s) * Failure Threshold (min: 1)

+ Health check aggregation time (10s)

+ Record TTL (60s is typical for dynamic domain names)

~= 70 to 90 seconds
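
The formula is easy to evaluate for a given configuration. The sketch below simply adds the three terms; `failover_window` and its parameter names are introduced here for illustration. Each term is taken at its full stated value, so the result reads as an upper bound: a failure that starts partway through a check interval, or a cached record that is already partly expired, shortens the window a client actually observes.

```python
def failover_window(interval_s, failure_threshold=1, aggregation_s=10, ttl_s=60):
    """Sum the terms of the failover-time formula, in seconds."""
    if failure_threshold < 1:
        raise ValueError("failure threshold must be at least 1")
    detection_s = interval_s * failure_threshold
    return detection_s + aggregation_s + ttl_s


fast = failover_window(10)      # 10 * 1 + 10 + 60
standard = failover_window(30)  # 30 * 1 + 10 + 60
```

The record TTL dominates at these settings, which is why a short TTL matters more than a faster check interval for records that need to fail over quickly.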

Failover considerations

For planned downtime, let traffic drain to avoid the failover window and accommodate non-conforming DNS resolvers:

• Update policy record to remove the endpoint from DNS

• Wait for server request metric to go down

• Deactivate the server
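
The three steps above can be sketched as a small procedure. Everything here is hypothetical scaffolding: the three callables stand in for whatever tooling updates the policy record, reads the server's request metric, and stops the server, and the threshold and timing defaults are arbitrary choices.

```python
import time


def drain_and_deactivate(remove_from_dns, request_rate, deactivate,
                         idle_below=1.0, poll_s=30, timeout_s=3600,
                         sleep=time.sleep):
    """Take a server out of service without relying on DNS failover.

    remove_from_dns, request_rate, and deactivate are operator-supplied
    hooks. Returns the number of seconds spent waiting for the drain.
    """
    remove_from_dns()                    # step 1: update the policy record
    waited = 0
    while request_rate() >= idle_below:  # step 2: wait for traffic to drain
        if waited >= timeout_s:
            raise TimeoutError("traffic did not drain; server left running")
        sleep(poll_s)
        waited += poll_s
    deactivate()                         # step 3: now safe to stop the server
    return waited
```

Waiting on the request metric rather than on the TTL alone is what accommodates resolvers that keep serving a cached answer past its expiry: the server stays up until they have actually stopped sending traffic.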

Note on circular dependencies

• When designing for high availability, don’t forget a fail-safe

• The VPN client includes an option that bypasses Route 53 as a way to break out of the dependency cycle for Route 53 operators

Case study: ongoing management

Adding a new endpoint is easy and roll back is quick

Case study: management benefits

“Route 53 has helped to make simple and easy the execution of operational tasks in our VPN fleet: with the use of traffic policies we are able to add or remove VPN servers from production before maintenance without user impact”

Related Sessions

• NET202 - DNS Demystified: Getting Started with Amazon Route 53, featuring Warner Bros. Entertainment

• NET203 - From EC2 to ECS: How Capital One uses Application Load Balancer Features to Serve Traffic at Scale

• NET403 - Elastic Load Balancing Deep Dive and Best Practices

Amazon Route 53 survey

Give us your feedback about Route 53’s features and usability at http://amzn.to/Route53_300

Meet the Route 53 team and get Route 53 swag at the Networking, Content Delivery, & Media Solutions booth.

Thank you!

Remember to complete your evaluations!