caching the uncacheable [long version]
DESCRIPTION
(Surge 2014) This is a longer version of our Velocity 2014 slides around caching dynamic content. Topic: In the past, CDNs have been used to cache and distribute static objects. But issues around invalidation, staleness, and lack of visibility have prevented us from using CDNs to fully leverage the benefits of caching when it comes to dynamic content. Today, using a real-time, modern CDN that provides instant cache invalidation and real-time analytics allows for instantaneous control over dynamic content caching.
TRANSCRIPT
Caching The Uncacheable
Leveraging Your CDN To Cache Dynamic Content
Hooman Beheshti, VP Technology
Dynamic Content Is Really Interesting!
What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML
  – Big deal because it’s blocking
  – And sometimes a large object → longer download
Blocking
What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML
  – Big deal because it’s blocking
  – And sometimes a large object → longer download
• Could be other things too
  – AJAX calls
  – API calls
• More…
Classically, with dynamic content…
Caching
Dynamic Content Caching Problems
• Serving stale pages
  – Lack of good invalidation framework
Caching vs. Invalidation
We tried…
Dynamic Content Caching Problems
• Serving stale pages
  – Lack of good invalidation framework
• Real-time visibility
  – Real-time analytics/stats
  – Real-time logging/tracking
So…
Caching
CDNs and Dynamic Content
• Generally, handling dynamic content has been a matter of transport
  – Optimize from-origin delivery
  – “DSA” (Dynamic Site Acceleration)
  – Middle mile optimizations
  – TCP tweaks
Dynamic Content, Traditionally
CDN Node
Client
Origin
Some TCP Tweaks
Dynamic Content, Traditionally
CDN Node CDN Node
Client
Origin
Lots of TCP Tweaks
Dynamic Content, Traditionally
• We sometimes do micro caching of HTML
  – Short TTL for HTML content via Cache-Control
  – Not foolproof
• Ex: news stories faux pas!
• ESI (Edge Side Includes)
  – Partial caching
  – Hard and onerous
Actually…
• Dynamic content is more cacheable than we think
• Static for shorter periods of time
• Unpredictable invalidation
  – Standard HTTP caching rules aren’t good enough
A Lot Better!
CDN Node CDN Node
Client
Origin
[Figure: waterfall charts labeled “No CDN”, “CDN (static)”, and “CDN (static + dynamic)”, with the “Blocking” request marked]
So Many Benefits!
• Performance
  – Faster time to first byte
  – Faster start render
  – Happy users!
• Offload
  – Less work for our servers
  – Less bandwidth at origin
What would make it better?
Programmatic Invalidation
• Invalidation API
• Granular
  – Specific URL
  – URL groups
  – All
  – Etc…
Purge
• As a page gets published/updated, a purge command also gets published
• Must be Instant!
• Big problem with classic CDNs
  – multi-minute purges!
  – Ex: News story, part 2
Instant?
• IS NOT:
  – 12 minutes!
  – quick acknowledgment!
• IS:
  – <1 sec
  – predictable and deterministic behavior
  – (has to also include propagation)
Power of the Purge!
• Purge dependencies
  – Surrogate Keys
  – Using tags/labels to purge entire chunks of content at once
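The tag-based purge described above can be sketched with a toy in-memory cache. This is only an illustration of the idea, not any CDN's implementation: the `EdgeCache` class, its method names, and the example URLs and keys are all hypothetical.

```python
from collections import defaultdict


class EdgeCache:
    """Toy in-memory cache that indexes objects by surrogate key (tag)."""

    def __init__(self):
        self.objects = {}                    # url -> cached body
        self.urls_by_key = defaultdict(set)  # surrogate key -> set of urls

    def store(self, url, body, surrogate_keys=()):
        """Cache a body and remember which tags it was stored under."""
        self.objects[url] = body
        for key in surrogate_keys:
            self.urls_by_key[key].add(url)

    def purge_url(self, url):
        """Purge one specific URL; returns how many objects were removed."""
        return 1 if self.objects.pop(url, None) is not None else 0

    def purge_key(self, key):
        """Purge every object tagged with `key`; returns how many were removed."""
        urls = self.urls_by_key.pop(key, set())
        return sum(self.purge_url(url) for url in urls)

    def purge_all(self):
        """Purge everything; returns how many objects were removed."""
        count = len(self.objects)
        self.objects.clear()
        self.urls_by_key.clear()
        return count


cache = EdgeCache()
cache.store("/story/1", "<html>story 1</html>", ["story-1", "section-news"])
cache.store("/section/news", "<html>news index</html>", ["section-news"])
cache.store("/about", "<html>about</html>", ["static-pages"])

# One purge call invalidates the story and the index page that lists it.
removed = cache.purge_key("section-news")
print(removed, sorted(cache.objects))
```

Tagging the story and the section index with the same key is what lets a single purge cover both, which is the "entire chunks of content at once" case.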
Not totally obsolete!
Example: CMS + Purge
WordPress: Before
CDN Node
Cache
WordPress: After
CDN Node
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 55,666
Cache-Control: Totally Long Time!
PURGE
(Has to be instantaneous!)
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 55,666
Cache-Control: Totally Long Time!
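A publish hook that issues the purge might look roughly like the sketch below. It assumes the cache accepts an HTTP `PURGE` request for the page URL, which is a common convention in Varnish-style caches but not something the slides specify; a real CDN will usually also require authentication. The function names and the example URL are hypothetical, and the request is only built here, not sent.

```python
import urllib.request


def build_purge_request(page_url):
    """Build (but do not send) an HTTP PURGE request for one cached page."""
    return urllib.request.Request(page_url, method="PURGE")


def on_page_published(page_url, send=None):
    """Hypothetical CMS hook: call this right after a page is saved."""
    request = build_purge_request(page_url)
    if send is not None:
        # In a real deployment `send` could be urllib.request.urlopen.
        send(request)
    return request


# Collect the request in a list instead of touching the network.
sent = []
req = on_page_published("http://www.example.com/2014/09/my-post/", send=sent.append)
print(req.get_method(), req.full_url)
```

Because the purge is tied to the publish event, the page can carry a very long TTL and still never be served stale for longer than the purge takes to propagate.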
Example: customer1.js
Before
CDN Node
Origin
http://www.3rdparty.com/customer1.js
(Referer: www.customer1.com)
HTTP/1.1 200 OK
Cache-Control: max-age=60
Last-Modified: Wed, 24 Sep 2014 19:51:30 GMT
Content-Type: application/javascript
Date: Thu, 25 Sep 2014 12:22:20 GMT
Server: Apache
Content-Length: 7835
(After 1 min)
http://www.3rdparty.com/customer1.js
VALIDATION
If-Modified-Since: Wed, 24 Sep 2014 19:51:30 GMT
304 Not Modified
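The validation round trip in the "Before" flow can be sketched as the decision an origin makes on a conditional GET. This is a simplified illustration that only looks at `If-Modified-Since`; the `revalidate` function and its parameters are hypothetical names, and the date is the one from the example headers.

```python
from email.utils import parsedate_to_datetime

LAST_MODIFIED = "Wed, 24 Sep 2014 19:51:30 GMT"


def revalidate(if_modified_since, last_modified=LAST_MODIFIED):
    """Return the status an origin would send for a conditional GET."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full object
    try:
        since = parsedate_to_datetime(if_modified_since)
        modified = parsedate_to_datetime(last_modified)
        unchanged = modified <= since
    except (TypeError, ValueError):
        return 200  # unusable date: ignore the condition
    return 304 if unchanged else 200


print(revalidate("Wed, 24 Sep 2014 19:51:30 GMT"))  # cached copy is current
print(revalidate("Tue, 23 Sep 2014 19:51:30 GMT"))  # cached copy is older
```

Even when the answer is 304, the CDN still had to go back to the origin once the 60 seconds ran out, which is the cost the "After" flow removes.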
After
CDN Node
Origin
http://www.3rdparty.com/customer1.js
HTTP/1.1 200 OK
Cache-Control: max-age=60, s-maxage=2592000
Last-Modified: Wed, 24 Sep 2014 19:51:30 GMT
Content-Type: application/javascript
Date: Thu, 25 Sep 2014 12:22:20 GMT
Server: Apache
Content-Length: 7835
(After 1 min)
http://www.3rdparty.com/customer1.js
This happens many many times! (many many happy visitors!)
Customer1 changes config
PURGE customer1.js
http://www.3rdparty.com/customer1.js
HTTP/1.1 200 OK
Cache-Control: max-age=60, s-maxage=2592000
Last-Modified: Wed, 24 Sep 2014 19:51:30 GMT
Content-Type: application/javascript
Date: Thu, 25 Sep 2014 12:22:20 GMT
Server: Apache
Content-Length: 7835
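The "After" header gives browsers and the CDN different lifetimes: under standard HTTP caching rules, `s-maxage` overrides `max-age` for shared caches only. The sketch below shows that split for the header from the example. It is a minimal parser that handles just these two directives (it ignores `no-store`, `private`, and the rest), and `freshness_lifetime` is a hypothetical helper name.

```python
def freshness_lifetime(cache_control, shared_cache):
    """Return the freshness lifetime in seconds, or None if none is given."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    # s-maxage applies only to shared caches such as a CDN node.
    if shared_cache and directives.get("s-maxage", "").isdigit():
        return int(directives["s-maxage"])
    if directives.get("max-age", "").isdigit():
        return int(directives["max-age"])
    return None


header = "max-age=60, s-maxage=2592000"
browser_ttl = freshness_lifetime(header, shared_cache=False)
cdn_ttl = freshness_lifetime(header, shared_cache=True)
print(browser_ttl, cdn_ttl, cdn_ttl // 86400)  # 60 2592000 30
```

So browsers still recheck every minute, but those rechecks are answered by the CDN, which holds the object for 30 days unless a purge arrives first.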
More than just Invalidation…
The Influence of Clouds
• The CDN is an extension of the app
• No longer a black box
• Real-time integration with the app
• Infrastructure as code
  – Your content => You need control
Control
• Programmability
  – Configuration API
  – Invalidation API
  – Instantaneous and real time
Caching Control
• Granular caching
  – Dynamic caching
  – flexible cache keys
  – etc
• Ex: Geo-based caching
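One way to picture a flexible cache key for geo-based caching is to add the visitor's country to the key for the URLs that need it. The sketch below assumes the edge already knows a country code for the client; the `cache_key` function, its parameters, and the key layout are hypothetical.

```python
def cache_key(host, path, country_code=None, vary_by_country=False):
    """Build a cache key, optionally keeping one variant per country."""
    parts = [host.lower(), path]
    if vary_by_country:
        # Unknown locations share one fallback variant.
        parts.append((country_code or "unknown").upper())
    return "|".join(parts)


us_key = cache_key("www.site.com", "/home", "us", vary_by_country=True)
de_key = cache_key("www.site.com", "/home", "de", vary_by_country=True)
plain_key = cache_key("www.site.com", "/home", "us")
print(us_key, de_key, plain_key)
```

Each country gets its own cached copy of `/home`, while URLs that do not vary by location keep a single shared entry.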
Control at the Edge
• Moving app logic to the edge
• VCL
  – Varnish Configuration Language
  – Script-like configuration for functionality at the edge
Visibility
• Real time analytics
  – Network stats
  – HTTP stats (status codes, etc)
  – Caching stats (hits, misses, etc)
  – Stats API
• Logging
  – Real time logs
  – Streaming to various log destinations
Example: Beacon Termination at the Edge
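As a rough illustration of what the HTTP and caching stats above add up to, the sketch below summarizes a handful of log records into status-code counts and a hit ratio. The record format (`status` and `cache` fields) and the `summarize` function are assumptions made for the example, not a real log or stats API.

```python
from collections import Counter


def summarize(log_records):
    """Count status codes and compute the cache hit ratio for some records."""
    statuses = Counter(record["status"] for record in log_records)
    cache = Counter(record["cache"] for record in log_records)
    lookups = cache["HIT"] + cache["MISS"]
    hit_ratio = cache["HIT"] / lookups if lookups else 0.0
    return {"statuses": dict(statuses), "hit_ratio": hit_ratio}


records = [
    {"status": 200, "cache": "HIT"},
    {"status": 200, "cache": "HIT"},
    {"status": 200, "cache": "MISS"},
    {"status": 304, "cache": "HIT"},
]
summary = summarize(records)
print(summary)
```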
Before
CDN Node
Origin
Log Analysis
http://collector.site.com/beacon.img?a=1&b=2&c=3
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Fri, 20 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35
After
CDN Node
Origin
http://collector.site.com/beacon.img?a=1&b=2&c=3
HTTP/1.1 200 OK
Pragma: no-cache
Expires: Wed, 19 Apr 2000 11:43:00 GMT
Cache-Control: no-cache, no-store
Last-Modified: Wed, 21 Jan 2004 19:51:30 GMT
Content-Type: image/gif
Date: Sat, 21 Jun 2014 12:22:20 GMT
Server: Apache
Content-Length: 35
HTTP/1.1 204 No Content
Date: Sat, 21 Jun 2014 23:21:12 GMT
Server: Awesome Server
Content-Length: 0
Syslog / S3 / FTP/etc
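Terminating the beacon at the edge comes down to two steps: answer with an empty 204 without contacting the origin, and write the query parameters to a log stream. The sketch below models that in plain Python rather than in the CDN's own configuration language; the `terminate_beacon` function and the JSON log format are hypothetical.

```python
import json
from urllib.parse import parse_qsl, urlsplit


def terminate_beacon(url):
    """Answer a beacon with 204 and return the line to ship to the log stream."""
    query = dict(parse_qsl(urlsplit(url).query))
    log_line = json.dumps(query, sort_keys=True)
    response = {"status": 204, "body": b""}  # nothing is fetched from origin
    return response, log_line


response, log_line = terminate_beacon(
    "http://collector.site.com/beacon.img?a=1&b=2&c=3"
)
print(response["status"], log_line)
```

The data the origin used to collect from its own access logs now arrives through the log stream instead, so the beacon traffic never reaches the origin at all.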
Example: Edge-generated Content
JSON Data Center ID
CDN Node
Origin
http://www.site.com/which_datacenter.js
{ "datacenter" : "SJC" }
VCL Snippet
JSON Geo IP
CDN Node
Origin
http://www.site.com/geo_ip.js
{ "city" : "New York", "state": "New York", "country": "United States", "ip": "173.18.14.237" }
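The two edge-generated responses above share one pattern: the edge node builds a small JSON body from values it already knows and never contacts the origin. The slides do this in VCL; the sketch below only models the logic in Python, and the function names and the shape of the `geo` dictionary are assumptions.

```python
import json


def datacenter_response(datacenter_id):
    """Synthesize the which_datacenter.js body from the node's own ID."""
    body = json.dumps({"datacenter": datacenter_id})
    return {"status": 200, "content_type": "application/json", "body": body}


def geo_ip_response(geo):
    """Synthesize the geo_ip.js body from geo data the edge already holds."""
    fields = ("city", "state", "country", "ip")
    body = json.dumps({name: geo.get(name) for name in fields})
    return {"status": 200, "content_type": "application/json", "body": body}


dc = datacenter_response("SJC")
geo = geo_ip_response({
    "city": "New York",
    "state": "New York",
    "country": "United States",
    "ip": "173.18.14.237",
})
print(dc["body"])
print(geo["body"])
```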
More Examples
• Caching with tracking cookies:
  – http://www.fastly.com/blog/how-to-cache-with-tracking-cookies
• API Caching:
  – http://www.fastly.com/blog/api-caching-part-iii (part 3, with links to previous two parts)
• Log Streaming:
  – http://www.fastly.com/blog/tips-for-streaming-logs
Let’s Sum Up!
Summary
• Dynamic content can be cached
  – We need instant purging
  – We need real-time logs and stats
• Real-time integration of our CDN with our app is cool!
  – Extensive/granular API to control the CDN
  – Control and visibility at the edge lets us be really creative
• Never use “Totally Long Time!” in a Cache-Control header!
Thank you!