Scaling Systems: Architectures that Grow

Fundamental Patterns for scaling you can implement incrementally

Kendall Miller

Who Am I?

• Kendall Miller

• One of the Founders of Gibraltar Software

  • Small Independent Software Vendor Founded in 2008

  • Developers of VistaDB and Loupe

  • Engineers, not Sales People

• Enterprise Systems Architect & Developer since 1995

• BSE in Computer Engineering, University of Illinois Urbana-Champaign (UIUC)

What Do We Do?

Loupe: Advanced logging and analysis of errors, performance, and usage patterns for .NET web apps, desktop apps and services

VistaDB: The easy-to-deploy, SQL Server-compatible, pure .NET embedded database.

Fair Warning

What is Scale?

Scaling is the ability to cope and perform under an increasing workload.

What is Scale?

Scaling to a load = staying available while sustaining that load

What is Scale?

Being available is really about a request being completed in a period of time.

What is Scale?

• Requests per Unit Time

• Maximum Request Latency

What’s your Target?

[Chart: Average daily traffic in Visitors / Day, on a logarithmic axis from 1.00E+03 to 1.00E+08, for Microsoft.com, Twitter.com, Amazon.com, Target.com, Slashdot.org, DevExpress.com, Hanselman.com, and Gibraltar Software]

What’s your Target?

25,000 Visitors/Day = 125,000 Pages/Day

11 High Traffic Hours/Day = 12,000 Pages/Hour

12,000 Pages/Hour = 3.3 Pages/Second
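
The arithmetic on this slide can be reproduced as a rough sketch. Two things here are assumptions rather than statements from the slide: the figure of 5 pages per visit (implied by 125,000 / 25,000) and the choice to round the hourly figure up to the next thousand (125,000 / 11 is about 11,364, which the slide appears to round up to 12,000). The function name `target_load` is made up for illustration.

```python
import math

def target_load(visitors_per_day, pages_per_visit, high_traffic_hours):
    """Turn a daily visitor count into a pages-per-second target."""
    pages_per_day = visitors_per_day * pages_per_visit
    # Assume all traffic lands in the high-traffic hours (pessimistic on purpose).
    raw_pages_per_hour = pages_per_day / high_traffic_hours
    # Assumption: round up to the next thousand to leave some headroom.
    pages_per_hour = math.ceil(raw_pages_per_hour / 1000) * 1000
    pages_per_second = pages_per_hour / 3600
    return pages_per_day, pages_per_hour, pages_per_second

pages_per_day, pages_per_hour, pages_per_second = target_load(25_000, 5, 11)
print(pages_per_day, pages_per_hour, round(pages_per_second, 1))
```

Swapping in your own visitor count and page depth gives a concrete number to test against.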

Specific Architectures

• Gossip

• Map Reduce

• Tree of Responsibility

• Stream Processing

• Scalable Storage

• Publish/Subscribe

• Distributed Queues

• Load Balancers + Shared Nothing Units

• Load Balancers + Stateless Nodes + Scalable Storage

• Content Addressable Networks

• General Peer to Peer

ACD/C

• Async – Do the work whenever

• Caching – Don’t do any work you don’t have to

• Distribution – Get as many people to do the work as you can

• Consistency – We all agree on these key things

Async

• Decouple operations so you do the minimum amount of work in performance critical paths

• Queue work that can be completed later to smooth out load

• Speculative Execution

• Scheduled Requests (Nightly processes)
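
A minimal sketch of queuing work so the performance-critical path does as little as possible. It assumes an in-process queue and a single background worker thread; a real system would more likely use a durable, out-of-process queue. The names `handle_request` and `worker` are illustrative only.

```python
import queue
import threading

work_queue = queue.Queue()
processed = []

def handle_request(order_id):
    """Critical path: record the work and return right away."""
    work_queue.put(order_id)
    return "accepted"

def worker():
    """Background path: drain the queue at whatever pace it can sustain."""
    while True:
        order_id = work_queue.get()
        if order_id is None:  # sentinel: no more work
            work_queue.task_done()
            break
        processed.append(f"processed order {order_id}")
        work_queue.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_request(i) for i in range(3)]

work_queue.put(None)   # tell the worker to stop once the backlog is done
work_queue.join()      # wait until every queued item has been handled
thread.join()
```

The caller gets "accepted" immediately; the slow part happens later, which is what smooths out bursts of load.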

Caching

• Save results of earlier work nearby where they are handy to use again later

• Apply in front of anything that’s time consuming

• Easiest to apply from the left to the right

• Simple strategies can be really effective (EF Dump all on update)
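
A sketch of the "dump all on update" idea, assuming a plain dictionary stands in for the slow authoritative store. The class name `DumpAllCache` and its methods are hypothetical, not part of any library.

```python
class DumpAllCache:
    """Cache reads; throw the whole cache away whenever anything is updated."""

    def __init__(self, load):
        self._load = load      # function that fetches from the slow source
        self._entries = {}
        self.misses = 0

    def get(self, key):
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._load(key)
        return self._entries[key]

    def update(self, store, key, value):
        store[key] = value
        # No per-key bookkeeping: any write invalidates everything.
        self._entries.clear()

store = {"widget": 10, "gadget": 20}   # stand-in for the database
cache = DumpAllCache(lambda key: store[key])

first = cache.get("widget")    # miss, loads from the store
second = cache.get("widget")   # hit
cache.update(store, "widget", 11)
third = cache.get("widget")    # miss again, sees the new value
```

It wastes some cached entries on every write, but it cannot serve a stale value, and there is almost nothing to get wrong.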

Why Caching?

• Loading the world is impractical

• Apps ask a lot of repeating questions.

  • Stateless applications even more so

• Answers don’t change often

• Authoritative information is expensive

Distribution

• Distribute requests across multiple systems

• Classic web “Scale Out” approach

• The less state held, the easier to distribute work.

  • Distributed database = hard

  • Distributed static content server = easy

• Request routing for distribution can serve other availability purposes
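
A small sketch of why holding less state makes distribution easier. It assumes the nodes keep nothing between requests and all session data lives in a shared store (a dictionary here, standing in for an external service). The node names and `make_node` are made up for illustration.

```python
import itertools

session_store = {}  # shared, external state (stand-in for a real store)

def make_node(name):
    """Build a stateless handler: everything it needs comes from the store."""
    def handle(session_id):
        count = session_store.get(session_id, 0) + 1
        session_store[session_id] = count
        return name, count
    return handle

nodes = [make_node(f"web-{i}") for i in (1, 2, 3)]
next_node = itertools.cycle(nodes)   # simple round-robin distribution

# The same session is served by a different node each time.
results = [next(next_node)("session-abc") for _ in range(4)]
```

Because no node remembers anything, any node can take any request, and the count still comes out right.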

Consistency

• The degree to which all parties observe the same state of the system at the same time

• Scaling inevitably requires compromise

  • Absolute consistency forces one source of the truth and requires extensive locking to ensure parties agree

• The real world doesn’t require the consistency we tend to demand of our systems

Consistency Challenges

• Singleton Data Structures (Order numbers..)

• State held between the endpoints of a process

• Consistent results of queries across partitioned datasets

Typical Application

[Diagram: Client (Web Browser), Server (Web Server), and Storage (Database), annotated with Session State, SSL Session, Log Contention, Memory Allocation/GC, Network Sockets, Request Queue, Transaction Isolation, Reader/Writer Locks, and Singleton Data Structures]

Caching

[Diagram: Client (Web Browser), Server (Web Server), and Storage (Database), with a Browser Cache, Output Cache, Content Cache, and Query Cache, labeled 100%, 50%, 10%, and 1%]

Distribution

[Diagram: Clients (Web Browser), a Reverse Proxy, Servers (Web Server), and Storage (Database)]

Session State and Identity need to be factored out

Partition (Sticky Session) first, then stateless nodes
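
One way a reverse proxy might implement sticky sessions is to hash the session identifier to pick a server. This is a sketch under that assumption; the server names and the function `server_for_session` are hypothetical, and real proxies often use cookies or other affinity mechanisms instead.

```python
import hashlib

SERVERS = ["web-1", "web-2", "web-3"]  # hypothetical server pool

def server_for_session(session_id, servers=SERVERS):
    """Map a session to one server so its in-memory state stays reachable."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

first = server_for_session("session-abc")
again = server_for_session("session-abc")
```

The same session always lands on the same server, but changing the pool size remaps most sessions, which is one reason to move on to stateless nodes.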

Partitioned Storage Zones

[Diagram: two zones, Customer A and Customer B, each with Clients (Web Browser), Servers (Web Server), and Storage (Database)]
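
Partitioning by customer can be sketched as a lookup from customer to zone. The zone layout, server names, and `zone_for` below are assumptions for illustration; they are not taken from the slides.

```python
# Hypothetical zone directory: each customer gets its own servers and storage.
ZONES = {
    "customer-a": {"servers": ["a-web-1", "a-web-2"], "storage": "a-db"},
    "customer-b": {"servers": ["b-web-1", "b-web-2"], "storage": "b-db"},
}

def zone_for(customer_id):
    """Return the zone that owns this customer's traffic and data."""
    try:
        return ZONES[customer_id]
    except KeyError:
        raise LookupError(f"no zone configured for {customer_id!r}") from None

zone_a = zone_for("customer-a")
zone_b = zone_for("customer-b")
```

Nothing is shared between the two entries, so load or failure in one zone does not touch the other.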

Partitioned Storage Intra-Zone

[Diagram: Clients (Web Browser), Servers (Web Server), zones for Customer A and Customer B, and storage split into Orders, Products, and Inventory]

Asynchronous Processing

[Diagram: Servers (Web Server) with Orders, Products, and Inventory storage, plus an Order Queue and an Order Processing Server]

Fresh Problems

Fallacies of Distributed Computing

• The network is reliable

• Latency is zero

• Bandwidth is infinite

• The network is secure

• Topology doesn’t change

• There is one administrator

• Transport cost is zero

• The network is homogeneous

Fresh Problems: Partial Failures

[Diagram: Clients (Web Browser), Servers (Web Server), and Storage (Database)]

Fresh Problems: Partial Failures

• Break system into individual failure zones

• Monitor each instance of each zone for problems

• Route around bad instances

Without monitoring, redundancy is worthless
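
A sketch of monitoring instances and routing around the bad ones. It assumes some external health check reports a pass or fail per instance, and that an instance is treated as bad after three consecutive failures; that threshold, the `Router` class, and the instance names are all illustrative.

```python
class Router:
    """Round-robin over instances that have not failed too many checks."""

    def __init__(self, instances, max_failures=3):
        self.failures = {name: 0 for name in instances}
        self.max_failures = max_failures
        self._next = 0

    def record_check(self, name, ok):
        # A passing check resets the count; a failing one adds to it.
        self.failures[name] = 0 if ok else self.failures[name] + 1

    def healthy(self):
        return [n for n, f in self.failures.items() if f < self.max_failures]

    def route(self):
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy instances")
        choice = candidates[self._next % len(candidates)]
        self._next += 1
        return choice

router = Router(["web-1", "web-2"])
for _ in range(3):
    router.record_check("web-2", ok=False)   # web-2 keeps failing its check

routes_while_down = [router.route() for _ in range(4)]

router.record_check("web-2", ok=True)        # web-2 recovers
healthy_after_recovery = router.healthy()
```

Without the `record_check` calls the router would keep sending half the traffic to a dead instance, which is the point of the slide.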

Fresh Problems: Upgrades

[Diagram: two zones, Customer A and Customer B, each with Clients (Web Browser), Servers (Web Server), and Storage (Database)]

Fresh Problems: Upgrades

• Break system into individual upgrade zones

• Upgrade each zone – Drain & Stop, Upgrade, Verify.

• Cut traffic over to updated zones

Design for Software Update From the Start

• Don’t forget Data Schemas
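
The drain, upgrade, verify, cut-over sequence can be sketched as a loop over zones. This assumes each zone is a simple record with a version and an in-rotation flag, and that a zone failing verification is left out of rotation while the remaining zones keep serving. `rolling_upgrade` and the zone names are hypothetical.

```python
def rolling_upgrade(zones, upgrade, verify):
    """Upgrade one zone at a time so the others keep taking traffic."""
    for name, zone in zones.items():
        zone["in_rotation"] = False                 # drain & stop
        zone["version"] = upgrade(zone["version"])  # upgrade
        if not verify(zone):                        # verify
            # Assumption: stop here and leave the zone out of rotation.
            raise RuntimeError(f"zone {name!r} failed verification")
        zone["in_rotation"] = True                  # cut traffic over

zones = {
    "customer-a": {"version": 1, "in_rotation": True},
    "customer-b": {"version": 1, "in_rotation": True},
}

rolling_upgrade(
    zones,
    upgrade=lambda version: version + 1,
    verify=lambda zone: zone["version"] == 2,
)
```

A data schema change would need the same treatment: old and new versions run side by side for a while, so both must be able to read the data.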

Bring It All Home

Don’t worry, we got this.

Bringing Home the Bacon

Testing, Testing, Testing

Critical Lessons Learned

• ACD/C

• Clear Consistency Strategy

• Build in monitoring and management

Thanks!

Twitter: @KendallMiller

[email protected]

Blog: Rocksolid.GibraltarSoftware.com
