Scalability is not a Singleton
You have less than 5 seconds to deliver content.
Maintain availability
Maintain performance
Average cost of critical app failure per hour is $500,000 to $1 million.
Scale is achieved by optimally distributing load
Thresholds: TCP connections, response time, HTTP requests
Methods: least connections, fastest response, fastest app, round robin
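Two of the methods named above, round robin and least connections, can be sketched in a few lines. This is a minimal illustration only, not any particular load balancer's implementation; the class names and backend names are hypothetical.

```python
from itertools import cycle


class RoundRobin:
    """Hand out backends in a fixed, rotating order."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Pick the backend that currently has the fewest open connections."""

    def __init__(self, backends):
        self.open_connections = {name: 0 for name in backends}

    def pick(self):
        # min() returns the first minimum, so ties go to the earliest backend.
        name = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[name] += 1
        return name

    def release(self, name):
        # Call when a connection to this backend closes.
        self.open_connections[name] -= 1
```

Round robin ignores backend state entirely, while least connections needs the scaler to track open connections per backend, which is why it pairs naturally with a TCP connection threshold.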
Y-Axis
• Like routing
• Uses identifiable variables (usually from HTTP headers)
• Operates at layer 7 (HTTP)
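A rough sketch of Y-axis routing, assuming the identifiable variables are the request path and a version header. The header name `X-API-Version`, the URI prefixes, and the pool names are all hypothetical and chosen only for illustration.

```python
def route_y_axis(path, headers, default_pool="web-pool"):
    """Choose a pool by what is being requested (function), not by who asks."""
    # Hypothetical rule: a header value pins one API version to its own pool.
    # Real HTTP header names are case-insensitive; this sketch assumes the
    # caller has already normalized them to this exact spelling.
    if headers.get("X-API-Version") == "2":
        return "api-v2-pool"

    # Hypothetical rules: URI prefixes map to functional pools.
    routes = {"/images/": "image-pool", "/api/": "api-pool"}
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool
```

The decision needs the parsed request, which is why this kind of scaling happens at layer 7 and not on raw TCP connections.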
Z-Axis
• Like sharding
• Uses one or more identifiable variables (usually from HTTP headers)
• Operates at layer 7 (HTTP)
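A minimal sketch of Z-axis scaling, assuming the identifiable variable is a customer identifier carried in a header. The header name `X-Customer-ID` and the shard names are hypothetical, and hashing is only one possible way to map a variable to a shard.

```python
import hashlib


def shard_for(headers, shards, key_header="X-Customer-ID"):
    """Map the same key to the same shard on every request."""
    key = headers.get(key_header, "")
    # A stable hash keeps the mapping consistent across processes and restarts
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Unlike Y-axis routing, every shard here runs the same service; only the slice of users or data it is responsible for differs. Note that a plain modulo remaps most keys when the number of shards changes.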
APIs and Microservices
API as Façade
API as Endpoint
[Diagram: SERVICE and API boxes illustrating the two patterns]
API Façade: Interpolated in Network
Standard Y-axis Scaling Pattern
[Diagram: an API with Microservice A, Microservice B, and Microservice C]
Most often combined in 2-3 tiers
Y-axis routing pattern used to distribute API calls to X-axis scaled service clusters
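The two-tier combination can be sketched as follows, assuming API calls are distinguished by URI prefix and each service cluster is balanced with plain round robin. The class name, prefixes, and instance names are hypothetical.

```python
from itertools import cycle


class ApiTier:
    """Y-axis: pick the service by URI prefix. X-axis: rotate over its clones."""

    def __init__(self, clusters):
        # clusters maps a URI prefix to the identical instances of one service.
        self._rotations = {
            prefix: cycle(instances) for prefix, instances in clusters.items()
        }

    def dispatch(self, path):
        for prefix, rotation in self._rotations.items():
            if path.startswith(prefix):
                return next(rotation)
        raise LookupError(f"no service registered for {path}")
```

For example, `ApiTier({"/orders": ["orders-1", "orders-2"], "/users": ["users-1"]})` sends successive `/orders/...` calls to `orders-1`, then `orders-2`, while `/users/...` calls always reach `users-1`.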
Top Mistakes Regardless of Pattern
• Choosing the wrong thresholds: scaling too late leads to timeouts and performance degradation
• Not monitoring: services fail, users get angrified by timeouts
• Not scaling the scaler: overwhelming the upstream scaler is a Very Bad Thing™
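To illustrate the threshold point, here is a hedged sketch of a scaling decision driven by a monitored utilization figure. The 70% and 30% values are assumptions picked for the example, not recommendations from the slides; the right numbers depend on how long new capacity takes to come online.

```python
def scaling_decision(utilization, scale_out_at=0.70, scale_in_at=0.30):
    """Return an action for a utilization ratio between 0.0 and 1.0."""
    # Scale out well before saturation so new capacity is ready in time.
    if utilization >= scale_out_at:
        return "scale-out"
    # Keeping the two thresholds apart avoids flapping between actions.
    if utilization <= scale_in_at:
        return "scale-in"
    return "hold"
```

A scale-out threshold set too close to 1.0 is the "scaling too late" mistake: by the time it triggers, requests are already timing out.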