Supply Frame: High Availability in Web Content Delivery
Aleksandar Bilanovic, SRE at Supply Frame, Inc.
High Availability in
Web Content Delivery
14.10.2014
High availability means minimizing the risk associated with service failure and providing maximum uptime for the application.
To achieve it, we need to engineer and plan the data center, network, servers, OS, applications and people.
It is about eliminating single points of failure (SPOF), detecting errors, and building an automated, reliable crossover to backup infrastructure.
Pros: high uptime, faster web content delivery, satisfied users, dealing with capacity instead of service outages.
Cons: high price; risk that the HA architecture becomes unmaintainable due to complexity; high complexity can itself contribute to failure and downtime.
What High Availability is and how to achieve it
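The uptime goal and the effect of removing a single point of failure can be put into numbers. The following is a minimal sketch, assuming component failures are independent (real failures often are not). The function names and the 0.99 figure are illustrative and do not come from the talk.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability (0.0 to 1.0)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def serial(*availabilities: float) -> float:
    """Components that must all work: each one is a single point of failure."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def redundant(availability: float, copies: int) -> float:
    """Identical copies where one working copy is enough to serve."""
    return 1.0 - (1.0 - availability) ** copies


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one server
    print("one server:", downtime_hours_per_year(single), "hours down per year")
    print("two servers:", downtime_hours_per_year(redundant(single, 2)), "hours down per year")
    # Chaining components multiplies availabilities, so the chain is weaker
    # than its weakest link.
    print("server behind one switch:", serial(single, single))
```

Under these assumptions a second copy cuts expected downtime by roughly two orders of magnitude, while every extra component in a serial chain lowers the total. This is the arithmetic behind both the "eliminating SPOF" goal and the complexity warning in the cons.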
HA in Web Content Delivery: Supply Frame, Inc. Primer
Physical vs. virtual
Carrier neutral
Redundant power with industrial UPS / diesel generators
Climate control
Fire suppression
Physical access control
Backup data center with redundant dark fiber cross connect
HA in Web Content Delivery: Data center
Internet access: BGP routing with multiple IP transits
Local network: switch clusters for core/distribution/access layers
Primary and backup data center routing
Link aggregation
Redundant power supplies
HA in Web Content Delivery: Infrastructure Network
HA Network: normal operations
HA Network: IP transit failure
HA Network: router failure
HA Network: switch failure
HA Network: cross connect failure
HA Network: link aggregation failure
Server class machines only
Redundant power supplies
Redundant Array of Inexpensive / Independent Disks (RAID)
Remote server console (iDRAC)
HA in Web Content Delivery: Servers
OS performance tuning (TCP/IP, number of open files, various memory buffers, etc.)
Redundant databases / API backends
Protecting server / app performance from aggressive crawlers (iptables recent module on LBs)
OS / App services monitoring (Nagios, Riemann, Graphite, DynECT, Pingdom, New Relic)
Data backup (online and offline)
HA in Web Content Delivery: OS / App
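The crawler protection mentioned above boils down to tracking recent hits per source IP and refusing sources that exceed a limit. The sketch below illustrates that idea in Python. The class name, thresholds and per-request interface are hypothetical, and it is not an emulation of the exact semantics of the iptables recent module, which works on packets in the kernel.

```python
from collections import defaultdict, deque


class RecentHitLimiter:
    """Refuse a source IP that makes too many requests within a time window."""

    def __init__(self, max_hits: int, window_seconds: float) -> None:
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent hits

    def allow(self, ip: str, now: float) -> bool:
        """Record a hit at time `now`; return False if the IP is over the limit."""
        hits = self._hits[ip]
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        # Refused hits are recorded too, so a client that keeps hammering
        # stays refused until it slows down.
        hits.append(now)
        return len(hits) <= self.max_hits


if __name__ == "__main__":
    limiter = RecentHitLimiter(max_hits=3, window_seconds=10)  # assumed limits
    for t in (0, 1, 2, 3):
        print(t, limiter.allow("192.0.2.10", now=t))
```

Timestamps are passed in explicitly so the behavior is easy to test. A real deployment would keep this logic on the load balancers, as the slide describes, rather than in application code.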
DynECT / Akamai GTM probing services for server / service failures and DC failover
Pacemaker / Corosync cluster
Load balancing services using haproxy (HTTP/TCP load balancer)
• uninterrupted service during deploys
• high performance in web content delivery (the number of requests served scales almost linearly with the number of web nodes)
• eliminating SPOF
HA in Web Content Delivery: crossover to backup infrastructure
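The load balancing points above (uninterrupted deploys, near-linear scaling, no SPOF among web nodes) rest on one mechanism: requests are spread over several nodes, and a node can be taken out of rotation without the others noticing. The sketch below shows that mechanism as a plain round-robin selector. It is not haproxy configuration or haproxy's algorithm, and the class and node names are hypothetical.

```python
from typing import Dict, List


class RoundRobinBalancer:
    """Rotate requests over web nodes, skipping nodes that are out of rotation."""

    def __init__(self, nodes: List[str]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._in_rotation: Dict[str, bool] = {n: True for n in nodes}
        self._next = 0

    def set_in_rotation(self, node: str, enabled: bool) -> None:
        """Take a node out (failed check or deploy) or put it back."""
        if node not in self._in_rotation:
            raise KeyError(node)
        self._in_rotation[node] = enabled

    def pick(self) -> str:
        """Return the next node in rotation; raise if none is available."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if self._in_rotation[node]:
                return node
        raise RuntimeError("no web node available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])  # hypothetical names
    lb.set_in_rotation("web-2", False)  # e.g. while deploying to web-2
    print([lb.pick() for _ in range(4)])
    lb.set_in_rotation("web-2", True)   # deploy finished, node rejoins
```

A rolling deploy repeats the take-out, upgrade, put-back cycle one node at a time, so the remaining nodes keep serving throughout.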
HA OS/App: DynECT / Akamai GTM
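Probe-driven data center failover can be pictured as a small state machine: traffic stays on the primary data center until its probes fail several times in a row. The sketch below is an illustration only. The talk does not describe how DynECT or Akamai GTM decide, so the consecutive-failure threshold, the immediate failback on the first successful probe, and all names are assumptions.

```python
class DataCenterFailover:
    """Serve from the primary DC until it fails several probes in a row."""

    def __init__(self, primary: str, backup: str, fail_threshold: int = 3) -> None:
        self.primary = primary
        self.backup = backup
        self.fail_threshold = fail_threshold  # assumed value, not from the talk
        self._consecutive_failures = 0

    def record_probe(self, primary_ok: bool) -> str:
        """Feed one probe result for the primary; return the DC to serve from."""
        if primary_ok:
            self._consecutive_failures = 0
        else:
            self._consecutive_failures += 1
        if self._consecutive_failures >= self.fail_threshold:
            return self.backup
        return self.primary


if __name__ == "__main__":
    failover = DataCenterFailover("dc-primary", "dc-backup")  # hypothetical names
    for ok in (True, False, False, False, True):
        print(ok, failover.record_probe(ok))
```

Requiring several failed probes in a row trades a slower reaction for fewer spurious failovers caused by a single lost probe.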
HA OS/App: haproxy
HA OS/App: corosync / pacemaker LB cluster
(us-lax-1w-lb-00)
Human error: no HA architecture can predict that
HA people: Follow the Sun
Everyone has to know something about everything and everything about something (network, systems, application, automation, programming ...)
HA in Web Content Delivery: People