keeping the internet fast and resilient for you and your customers

Unreliable Internet Nick Wondra @ Cloudflare Martin J. Levy @ Cloudflare November 2017

Upload: cloudflare

Post on 23-Jan-2018


Page 1: Keeping the Internet Fast and Resilient for You and Your Customers

Unreliable Internet Nick Wondra @ Cloudflare

Martin J. Levy @ Cloudflare

November 2017

Page 2: Keeping the Internet Fast and Resilient for You and Your Customers

Today’s agenda

● Introduction (Tim Fong)

○ Why does the Internet sometimes “misbehave” when it comes to delivering applications?

○ What are some ways to solve this?

● Martin J. Levy (20 min)

○ The Internet and how it’s tied together

○ BGP and topology

○ Testing (example of tools and techniques)

● Nick Wondra (20 min)

○ Approaches to solve the problem

○ Examples of mechanisms in place

● Summary (5 min)

● Audience Q/A (10 min)

Page 3: Keeping the Internet Fast and Resilient for You and Your Customers

The Internet and how it’s tied together

Martin J. Levy

Page 4: Keeping the Internet Fast and Resilient for You and Your Customers

The Internet

● Technically – a somewhat complex subject

○ The Internet is a collection of networks

○ No network stands alone (all interconnected)

○ Robustness can be created

○ Multi-homing (more than one transit/path)

○ Peering between “like” networks

○ Diversity (physical and logical)

○ Nothing is static!

● Internet was developed for something different

● Many types of data (and data layers)

● TCP/UDP vs FTP/HTTP/SMTP vs TLS vs XML/JSON

Page 5: Keeping the Internet Fast and Resilient for You and Your Customers

The Internet - just how complex? (hint: very!)

This is the representation of a single network (a medium-sized telco) and its interconnections globally to various other backbone networks.

A full diagram would depict upwards of 60,000 independent networks, which is hard to follow.

Page 6: Keeping the Internet Fast and Resilient for You and Your Customers

Glueing the Internet together - BGP routing

● The IETF specified a protocol (BGP4) that can handle:

○ Massive routing tables

○ CIDR routing (ability to specify IP network address plus a network size)

○ IPv4 & IPv6

○ Rules for routing internally within a network

○ Rules for routing to an external network

○ Much more!

● BGP in real life is used by every network on the Internet

○ Every destination on the globe exists within the BGP global routing tables

○ Everything is public, visible, exposed, and recorded
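CIDR routing, as described above, pairs a network address with a size, and a router forwards each packet along the most specific (longest) matching prefix. The following is a minimal sketch of that lookup using Python's standard `ipaddress` module. The routing table, the prefixes (taken from documentation address ranges), and the next-hop labels are all hypothetical, and this shows only the forwarding lookup, not how BGP selects or exchanges routes.

```python
import ipaddress

# Hypothetical routing table: CIDR prefix -> next-hop label.
# The prefixes come from address ranges reserved for documentation.
ROUTES = {
    "0.0.0.0/0": "default-transit",
    "203.0.113.0/24": "peer-a",
    "203.0.113.128/25": "peer-b",
    "2001:db8::/32": "peer-c",
}


def lookup(destination: str) -> str:
    """Return the next hop for the longest prefix that contains destination."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, next_hop) of the most specific match so far
    for prefix, next_hop in ROUTES.items():
        network = ipaddress.ip_network(prefix)
        if network.version != addr.version:
            continue  # never compare IPv4 addresses against IPv6 prefixes
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]
```

With this table, an address inside 203.0.113.128/25 matches three prefixes (/0, /24, and /25), and the /25 wins because it is the most specific.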

Page 7: Keeping the Internet Fast and Resilient for You and Your Customers

What works? What breaks? What’s the fix?

● There’s no steady state within the Internet

○ The path from A to Z is forever changing, sometimes for the better

○ The BGP routing protocol has to address many factors:

■ Physical interruptions (a fiber break)

■ Planned maintenance (upgrades to facilities or services)

■ Increases in capabilities - for example, a new undersea cable

■ Third party “hiccups”

○ Commercial agreements (and disagreements)

■ Purchasing from a different Internet service provider

■ Ending contracts and changing service providers

● What we do know is that it keeps our network engineers busy!

Page 8: Keeping the Internet Fast and Resilient for You and Your Customers

The Internet - it keeps on growing

A new undersea cable is laid between the African coast and the Seychelles islands (replacing a satellite connection).

Page 9: Keeping the Internet Fast and Resilient for You and Your Customers

The Internet - When it breaks, it breaks!

This is an example of what happens each and every day all around the globe. The physical layer of the Internet is fragile.

All those bright spray-painted lines you see on a street (before someone digs it up) are meant to stop this from happening.

They don’t!

Page 10: Keeping the Internet Fast and Resilient for You and Your Customers

Protocol stack - what’s above the physical layer

● Layers provide capabilities

○ Application - the end user’s view

○ Transport - TCP & UDP (carrying HTTP & TLS)

○ Internet - IP and routing

○ Data link - that fiber in the ground

● Each layer has its possible failures

[Figure: the OSI Network Model (Physical, Data Link, Network, Transport, Session, Presentation, Application) alongside the TCP/IP Model (Data Link, Internet, Transport, Application)]

Page 11: Keeping the Internet Fast and Resilient for You and Your Customers

Distance, Latency, Variable Paths, and more

[Figure: example path latencies of 150 msec, 70 msec, 230 msec, and 400 msec]

● Speed of light

○ Very constant![1]

● Distance ~= Hops

○ Reliability decreases

● Variable paths

○ Redundancy vs. non-determinism

● Variable providers

○ Sometimes useful

[1] https://www.quora.com/What-is-precisely-the-speed-of-light-in-fiber-optics
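Because the speed of light is constant, distance sets a hard floor on latency. The sketch below estimates the minimum round-trip time over a fiber path of a given length. It assumes a refractive index of about 1.47 for silica fiber (so light travels at roughly two thirds of its vacuum speed); the function name and the example distance are illustrative, and real paths add routing detours, queuing, and equipment delay on top of this bound.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for silica fiber


def min_rtt_ms(path_km: float) -> float:
    """Lower bound on round-trip time in milliseconds over path_km of fiber."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    one_way_seconds = path_km / speed_in_fiber_km_s
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # A 10,000 km fiber path cannot do better than roughly 98 ms round trip.
    print(f"{min_rtt_ms(10_000):.1f} ms")
```

Any measured round-trip time far above this bound points to a longer physical path, extra hops, or congestion rather than to distance alone.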

Page 12: Keeping the Internet Fast and Resilient for You and Your Customers

Monitoring tools and more

● Beyond ping - or what’s really happening to your packets?

[1] http://bgp.he.net/

[2] http://atlas.ripe.net/

[3] http://stat.ripe.net/
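As one way to go "beyond ping", the sketch below times TCP connection establishment, which exercises the same port and path an application would use instead of ICMP. It uses only the Python standard library; the function names `tcp_connect_ms` and `probe`, the sample count, and the timeout are assumptions for illustration, and this does not use or represent the tools listed above.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Time one TCP three-way handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000


def probe(host: str, port: int, samples: int = 5) -> dict:
    """Collect several connect times and summarize their spread."""
    times = [tcp_connect_ms(host, port) for _ in range(samples)]
    return {
        "min_ms": min(times),
        "median_ms": statistics.median(times),
        "max_ms": max(times),
    }
```

A wide gap between the minimum and maximum across samples is often more informative than any single measurement, since it hints at variable paths or congestion.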

Page 13: Keeping the Internet Fast and Resilient for You and Your Customers

Approaches to solve the problem

Nick Wondra

Page 14: Keeping the Internet Fast and Resilient for You and Your Customers

Change the model of the Internet?

● Address the content, not the server

○ Content centric networking, et al

○ Route requests based on content location

○ Content is decentralized, moves through the network

● Requires changes deep in the protocol stack

○ … but lots of investment built into current infrastructure

Page 15: Keeping the Internet Fast and Resilient for You and Your Customers

Change the core Internet protocols?

● Can we build a better BGP?

○ Low-level distance and performance metrics may not translate to application performance

○ Many networks = many systems to change

● Can we build a better transport?

○ TCP and UDP are deeply ingrained in end-user systems and network middleboxes (firewalls, load balancers, WAN optimizers, etc.)

Page 16: Keeping the Internet Fast and Resilient for You and Your Customers

Evolution, not revolution

● Evolve new solutions on top of existing frameworks

○ Solve for problems in the malleable network layers

○ Example: TLS 1.3

■ More secure and faster (fewer RTTs)

○ Example: TCP+HTTP => UDP+QUIC

■ Speed: connection establishment, session multiplexing

■ Resilience: congestion control, forward error correction

■ Flexibility: connection migration

● The challenge is distribution

○ Clients and servers must opt-in
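The "fewer RTTs" point can be made concrete with simple arithmetic. The sketch below counts the round trips spent on connection setup before the first request can be sent, using the commonly cited figures: 1 RTT for the TCP handshake, 2 RTTs for a full TLS 1.2 handshake, 1 RTT for a full TLS 1.3 handshake, 1 RTT for a new QUIC connection, and 0 for QUIC 0-RTT resumption. The dictionary, the function name, and the 150 ms example are illustrative assumptions, and server processing time is ignored.

```python
# Round trips spent on setup before the first request can be sent.
HANDSHAKE_RTTS = {
    "TCP + TLS 1.2": 1 + 2,        # TCP handshake + full TLS 1.2 handshake
    "TCP + TLS 1.3": 1 + 1,        # TCP handshake + full TLS 1.3 handshake
    "QUIC (first contact)": 1,     # transport and crypto handshake combined
    "QUIC (0-RTT resumption)": 0,  # request rides along with the first flight
}


def time_to_first_byte_ms(stack: str, rtt_ms: float) -> float:
    """Setup round trips plus one round trip for the request and response."""
    return (HANDSHAKE_RTTS[stack] + 1) * rtt_ms


if __name__ == "__main__":
    for stack in HANDSHAKE_RTTS:
        print(f"{stack}: {time_to_first_byte_ms(stack, 150):.0f} ms")
```

On a 150 ms path, this simple model gives 600 ms for TCP with TLS 1.2, 450 ms for TCP with TLS 1.3, 300 ms for a new QUIC connection, and 150 ms for a resumed one, which is why handshake savings matter most on long paths.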

Page 17: Keeping the Internet Fast and Resilient for You and Your Customers

Value of a large global network

● Cloudflare has Points of Presence (PoPs) across the globe

○ PoPs close to every Internet user and server

○ Transit/peering with multiple networks at every PoP

○ Proxies 10% of all web requests

● Global Internet performance and reliability monitoring

○ Real-time feedback as data traverses the network

○ Can “test” network paths that BGP wouldn’t use

○ Use performance metrics that matter to web applications (TTFB, response time)
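To illustrate the application-level metrics mentioned above, the sketch below measures time to first byte (TTFB) and total response time for a single HTTP request using Python's standard `http.client`. The function name `measure_http` and its defaults are assumptions for illustration; it is not a description of how Cloudflare collects these metrics. TTFB here is approximated as the time until the status line and headers have been read, and it includes TCP connection setup.

```python
import http.client
import time


def measure_http(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> dict:
    """Return the status code, approximate TTFB, and total response time in ms."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # opens the TCP connection, then sends the request
        response = conn.getresponse()  # returns once the status line and headers are read
        ttfb = time.perf_counter() - start
        response.read()                # drain the body to time the full response
        total = time.perf_counter() - start
        return {
            "status": response.status,
            "ttfb_ms": ttfb * 1000,
            "total_ms": total * 1000,
        }
    finally:
        conn.close()
```

Unlike a hop count or a ping time, these two numbers reflect what a web client actually waits for, which is why they are better inputs for routing decisions aimed at application performance.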

Page 18: Keeping the Internet Fast and Resilient for You and Your Customers

Global footprint = path control

● Force routing paths by pinning to intermediate PoPs

[Figure: example paths with latencies of 150ms and 200ms]
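Pinning traffic to an intermediate PoP pays off when the two measured legs through that PoP add up to less than the default path. A minimal sketch of that comparison follows; the function name, the PoP labels, and all latency values are hypothetical, and a real system would weigh more than a single latency number.

```python
def best_path(direct_ms: float, via_pop: dict) -> tuple:
    """Pick the lowest-latency option among the direct path and pinned paths.

    via_pop maps a PoP label to (client_to_pop_ms, pop_to_origin_ms).
    """
    candidates = {"direct": direct_ms}
    for pop, (first_leg_ms, second_leg_ms) in via_pop.items():
        candidates[f"via {pop}"] = first_leg_ms + second_leg_ms
    name = min(candidates, key=candidates.get)
    return name, candidates[name]


if __name__ == "__main__":
    # Hypothetical measurements: the default path takes 200 ms, while two
    # intermediate PoPs offer pinned paths of 150 ms and 230 ms in total.
    print(best_path(200, {"pop-x": (60, 90), "pop-y": (120, 110)}))
```

In this example the path pinned through the hypothetical `pop-x` wins at 150 ms, even though the default route would have used the 200 ms path.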

Page 19: Keeping the Internet Fast and Resilient for You and Your Customers

Global footprint = distribution channel

● Evolution inside the network, transparent to client and server

[Figure: connection segments labeled TCP+HTTP, TCP+HTTP, and UDP+QUIC]

Page 20: Keeping the Internet Fast and Resilient for You and Your Customers

Summary

Nick Wondra & Martin J. Levy

Page 21: Keeping the Internet Fast and Resilient for You and Your Customers

Summary, Questions, and Thank You!

Martin J. Levy - Network Strategy

@Cloudflare

@mahtin

Nick Wondra - Systems Engineer

@Cloudflare

@nickwondra

Page 22: Keeping the Internet Fast and Resilient for You and Your Customers

Appendix

Page 23: Keeping the Internet Fast and Resilient for You and Your Customers

Additional Reading (via Cloudflare blog)

● Argo & Warp:

○ https://blog.cloudflare.com/argo/

○ https://blog.cloudflare.com/the-making-of-cloudflare-warp/

● Railgun:

○ https://blog.cloudflare.com/cacheing-the-uncacheable-cloudflares-railgun-73454/

● Load Balancing:

○ https://blog.cloudflare.com/introducing-load-balancing-intelligent-failover-with-cloudflare/

● TLS:

○ https://blog.cloudflare.com/introducing-tls-client-auth/