Caching and tuning fun for high scalability
Wim Godden
Cu.be Solutions
Who am I ?
Wim Godden (@wimgtr)
Owner of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of OpenX
Zend Certified Engineer
Zend Framework Certified Engineer
MySQL Certified Developer
Who are you ?
Developers ?
System/network engineers ?
Managers ?
Caching experience ?
Goals of this tutorial
Everything about caching and tuning
A few techniques
How-to
How-NOT-to
→ Increase reliability, performance and scalability
5 visitors/day → 5 million visitors/day
(Don't expect miracle cure !)
LAMP
Architecture
Test page
3 DB queries
select firstname, lastname, email from user where user_id = 5;
select title, createddate, body from article order by createddate desc limit 5;
select title, createddate, body from article order by score desc limit 5;
Page just outputs result
Our base benchmark
Apachebench = useful enough
Result ?
                         Single webserver     Proxy
                         Static    PHP        Static    PHP
Apache + PHP             3900      17.5       6700      17.5
Limit : CPU, network or disk
Limit : database
Caching
What is caching ?
CACHE
x = 5, y = 2
n = 50
Same result
select *
from article
join user on article.user_id = user.id
order by created desc
limit 10
Doesn't change all the time
Theory of caching
GET /page
$data = get('key') (Cache) → false
if ($data === false)
select data from table (DB)
$data = returned result
set('key', $data) (Cache)
Page
Theory of caching
DB
Cache
HIT
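The miss-then-fill flow above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: `ArrayCache` is a hypothetical in-memory stand-in for a real cache server, and `cached()` and the loader closure are made-up names for this example.

```php
<?php
// Hypothetical in-memory stand-in for a cache server; get() returns false on a miss,
// mirroring the get('key') / set('key', $data) calls on the slide.
class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}

// Return the cached value, or run $loader (the "select data from table" step)
// and store its result for the next request.
function cached(ArrayCache $cache, string $key, callable $loader)
{
    $data = $cache->get($key);
    if ($data === false) {      // miss: go to the database
        $data = $loader();
        $cache->set($key, $data);
    }
    return $data;               // on a hit the database is never touched
}

$cache = new ArrayCache();
$dbCalls = 0;
$loader = function () use (&$dbCalls) {
    $dbCalls++;                 // counts simulated database queries
    return ['title' => 'Hello'];
};

$first  = cached($cache, 'page', $loader);  // miss, runs the loader
$second = cached($cache, 'page', $loader);  // hit, served from the cache
echo "DB queries: $dbCalls\n";
```

The strict `=== false` check matters: a cached `0` or empty string would otherwise be mistaken for a miss.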
Caching techniques
#1 : Store entire pages
#2 : Store part of a page (block)
#3 : Store data retrieval (SQL ?)
#4 : Store complex processing result
#? : Your call !
When you have data, think :
Creating time ?
Modification frequency ?
Retrieval frequency ?
How to find cacheable data
New projects : start from 'cache everything'
Existing projects :
Look at MySQL slow query log
Make a complete query log (don't forget to turn it off !)
Check page loading times
Caching storage - Disk
Data with few updates : good
Caching SQL queries : preferably not
DON'T use NFS or other network file systems
especially for sessions
locking issues !
high latency
Caching storage - Disk / ramdisk
Local
5 Webservers → 5 local caches
How will you keep them synchronized ?
→ Don't say NFS or rsync !
Caching storage - Memcache(d)
Facebook, Twitter, YouTube, … → need we say more ?
Distributed memory caching system
Multiple machines ↔ 1 big memory-based hash-table
Key-value storage system
Keys - max. 250 bytes
Values - max. 1 MByte
Extremely fast... non-blocking, optional UDP (!)
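How several machines can act as one big hash table: the client hashes each key and uses the result to pick a server. The sketch below assumes a plain hash-modulo scheme for illustration only; real client libraries use their own hashing strategies (often consistent hashing), and `serverFor()` and the key names are hypothetical.

```php
<?php
// Map a key to one server from the pool with a simple hash-modulo scheme.
// The mask keeps the hash non-negative on 32-bit builds of PHP.
function serverFor(string $key, array $servers): string
{
    $hash = crc32($key) & 0x7fffffff;
    return $servers[$hash % count($servers)];
}

$servers = ['172.16.0.1:11211', '172.16.0.2:11211'];
foreach (['user_5', 'latest_articles', 'top_articles'] as $key) {
    echo $key, ' → ', serverFor($key, $servers), "\n";
}
```

The same key always lands on the same server, so every webserver in the pool finds it again without any coordination between the cache machines.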
Memcache - where to install
Memcache - installation & running it
Installation
Distribution package
PECL
Windows : binaries
Running
No config-files
memcached -d -m <mem> -l <ip> -p <port>
ex. : memcached -d -m 2048 -l 172.16.1.91 -p 11211
Caching storage - Memcache - some notes
Not fault-tolerant
It's a cache !
Lose session data
Lose shopping cart data
…
Firewall your Memcache port !
Memcache in code
<?php
$memcache = new Memcache();
$memcache->addServer('172.16.0.1', 11211);
$memcache->addServer('172.16.0.2', 11211);

$myData = $memcache->get('myKey');
if ($myData === false) {
    $myData = GetMyDataFromDB();
    // Put it in Memcache as 'myKey', without compression, with no expiration
    $memcache->set('myKey', $myData, false, 0);
}
echo $myData;
Benchmark with Memcache
                         Single webserver     Proxy
                         Static    PHP        Static    PHP
Apache + PHP             3900      17.5       6700      17.5
Apache + PHP + MC        3900      55         6700      108
Memcache slabs
(or why Memcache says it's full when it's not)
Multiple slabs of different sizes :
Slab 1 : 400 bytes
Slab 2 : 480 bytes (400 * 1.2)
Slab 3 : 576 bytes (480 * 1.2) (and so on...)
Multiplier (1.2 here) can be configured
Store a lot of very large objects
→ Large slabs : full
→ Rest : free
→ Eviction of data !
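The slab arithmetic above can be reproduced with a few lines. This is a simplified sketch using the slide's numbers (base 400 bytes, multiplier 1.2); the real server also aligns sizes and accounts for item overhead, and `slabSizes()` and `slabFor()` are hypothetical helper names.

```php
<?php
// Build the list of slab sizes: start at $base bytes and multiply by $factor each step.
function slabSizes(int $base, float $factor, int $count): array
{
    $sizes = [];
    $size = (float) $base;
    for ($i = 0; $i < $count; $i++) {
        $sizes[] = (int) round($size);
        $size *= $factor;
    }
    return $sizes;
}

// Pick the smallest slab that fits the item, or null if no slab is large enough.
function slabFor(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $index => $size) {
        if ($itemBytes <= $size) {
            return $index + 1;  // slabs are numbered from 1 on the slide
        }
    }
    return null;
}

$sizes = slabSizes(400, 1.2, 5);
print_r($sizes);
echo "A 450-byte item goes to slab ", slabFor(450, $sizes), "\n";
```

An item can only live in the slab that fits it, so when the slabs for large objects are full, those objects get evicted even while the smaller slabs still have free memory.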
Memcache - Is it working ?
Connect to it using telnet
"stats" command :
STAT pid 2941
STAT uptime 10878
STAT time 1296074240
STAT version 1.4.5
STAT pointer_size 64
STAT rusage_user 20.089945
STAT rusage_system 58.499106
STAT curr_connections 16
STAT total_connections 276950
STAT connection_structures 96
STAT cmd_get 276931
STAT cmd_set 584148
STAT cmd_flush 0
STAT get_hits 211106
STAT get_misses 65825
STAT delete_misses 101
STAT delete_hits 276829
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 613193860
STAT bytes_written 553991373
STAT limit_maxbytes 268435456
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0
STAT bytes 20418140
STAT curr_items 65826
STAT total_items 553856
STAT evictions 0
STAT reclaimed 0
Use Cacti or other monitoring tools
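The counters worth watching in that stats output are the hit ratio, memory use and evictions. A small sketch of how they could be derived, using a few of the values shown above; `parseStats()` is a hypothetical helper, and in practice the text would come from the telnet session or a monitoring tool.

```php
<?php
// Parse "STAT name value" lines, as printed by the stats command, into an array.
function parseStats(string $raw): array
{
    $stats = [];
    foreach (preg_split('/\R/', trim($raw)) as $line) {
        $parts = explode(' ', trim($line), 3);
        if (count($parts) === 3 && $parts[0] === 'STAT') {
            $stats[$parts[1]] = $parts[2];
        }
    }
    return $stats;
}

// A few of the counters from the stats output.
$raw = <<<TXT
STAT cmd_get 276931
STAT get_hits 211106
STAT get_misses 65825
STAT evictions 0
STAT bytes 20418140
STAT limit_maxbytes 268435456
TXT;

$stats = parseStats($raw);
$hitRatio   = $stats['get_hits'] / $stats['cmd_get'];       // share of gets served from cache
$memoryUsed = $stats['bytes'] / $stats['limit_maxbytes'];   // share of the memory limit in use

printf("Hit ratio: %.1f%%\n", $hitRatio * 100);
printf("Memory used: %.1f%%\n", $memoryUsed * 100);
printf("Evictions: %d\n", $stats['evictions']);
```

A falling hit ratio or a rising eviction count is usually the first sign that the cache is too small or that keys are being invalidated too aggressively.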
Memcache - backing up
Memcache - tip
Page with multiple blocks ?
→ use Memcached::getMulti()
But : what if you get some hits and some misses ?
getMulti($array) Hashing algorithm
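One way to deal with a mix of hits and misses is to rebuild only the missing blocks and store them in a single call. The sketch below assumes a hypothetical `ArrayCache` that mimics how `Memcached::getMulti()` returns only the keys it found; `buildBlock()` and `getBlocks()` are illustrative names, not part of any library.

```php
<?php
// Hypothetical in-memory stand-in that mimics the getMulti()/setMulti() behaviour
// of the Memcached extension: getMulti() returns only the keys it found.
class ArrayCache
{
    private array $items = [];

    public function getMulti(array $keys): array
    {
        return array_intersect_key($this->items, array_flip($keys));
    }

    public function setMulti(array $items): void
    {
        $this->items = array_replace($this->items, $items);
    }
}

// Hypothetical block renderer; in a real page this would query the database.
function buildBlock(string $key): string
{
    return "<div>content of $key</div>";
}

function getBlocks(ArrayCache $cache, array $keys): array
{
    $found   = $cache->getMulti($keys);                  // one call for all hits
    $missing = array_diff($keys, array_keys($found));    // keys that were not cached

    $rebuilt = [];
    foreach ($missing as $key) {
        $rebuilt[$key] = buildBlock($key);
    }
    if ($rebuilt) {
        $cache->setMulti($rebuilt);                      // store all misses in one call
    }
    return $found + $rebuilt;
}

$cache = new ArrayCache();
$cache->setMulti(['header' => '<div>cached header</div>']);
$blocks = getBlocks($cache, ['header', 'nav', 'news']);
echo implode("\n", $blocks), "\n";
```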
Updating data
LCD_Popular_Product_List
Adding/updating data
$memcache->delete('LCD_Popular_Product_List');
Adding/updating data - Why it crashed
Cache stampeding
Memcache code ?
DB
Visitor interface Admin interface
Memcache code
Cache warmup scripts
Used to fill your cache when it's empty
Run it before starting Webserver !
2 ways :
Visit all URLs
Error-prone
Hard to maintain
Call all cache-updating methods
Make sure you have a warmup script !
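A warmup script built on cache-updating methods might look roughly like this. It is a sketch under assumptions: `ArrayCache` is a hypothetical in-memory stand-in for the cache, and `queryPopularLcdProducts()`, `refreshPopularLcdProducts()` and `warmup()` are made-up names. The updating method overwrites the key with fresh data instead of deleting it, so visitors never hit an empty key.

```php
<?php
// Hypothetical in-memory stand-in for the cache.
class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}

// Placeholder for the expensive database query.
function queryPopularLcdProducts(): array
{
    return ['LCD A', 'LCD B'];
}

// Cache-updating method: recompute the data and overwrite the key ("push").
// The admin interface would call this after a change instead of deleting the key.
function refreshPopularLcdProducts(ArrayCache $cache): void
{
    $cache->set('LCD_Popular_Product_List', queryPopularLcdProducts());
}

// Warmup script: call every cache-updating method once.
function warmup(ArrayCache $cache, array $refreshers): void
{
    foreach ($refreshers as $refresh) {
        $refresh($cache);
    }
}

$cache = new ArrayCache();
warmup($cache, ['refreshPopularLcdProducts']);
print_r($cache->get('LCD_Popular_Product_List'));
```

Because the warmup script only loops over the same methods the admin interface uses, it stays in sync with the code without a separate list of URLs to maintain.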
Cache stampeding - what about locking ?
Seems like a nice idea, but...
While lock in place
What if the process that created the lock fails ?
LAMP...
→ LAMMP
→ LNMMP
Nginx
Web server
Reverse proxy
Lightweight, fast
12.2% of all Websites
Nginx
No threads, event-driven
Uses epoll / kqueue
Low memory footprint
10000 active connections = normal
Nginx - Configuration
server {
    listen 80;
    server_name www.domain.ext *.domain.ext;
    index index.html;
    root /home/domain.ext/www;
}
server {
    listen 80;
    server_name photo.domain.ext;
    index index.html;
    root /home/domain.ext/photo;
}
Nginx with PHP-FPM
Since PHP 5.3.3
Runs on port 9000
Nginx connects using fastcgi method
location / {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param SCRIPT_FILENAME /home/www.4developers.pl/$fastcgi_script_name;
    fastcgi_param SERVER_NAME $host;
    fastcgi_intercept_errors on;
}
Nginx + PHP-FPM features
Graceful upgrade
Spawn new processes under high load
Chroot
Slow request log !
fastcgi_finish_request() → offline processing
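A rough idea of what offline processing with `fastcgi_finish_request()` looks like. The function only exists when PHP runs under FPM, hence the guard; `sendNewsletter()` is a hypothetical stand-in for any slow task.

```php
<?php
// Hypothetical slow task, e.g. sending mail or writing statistics.
function sendNewsletter(array &$log): void
{
    $log[] = 'newsletter sent';
}

$log = [];
echo "Thanks, your order is being processed.\n";

// Under PHP-FPM this flushes the response and closes the connection to the client.
// Other SAPIs (such as the CLI) do not provide the function, so check first.
if (function_exists('fastcgi_finish_request')) {
    fastcgi_finish_request();
}

// Everything below runs after the visitor already has the response.
sendNewsletter($log);
```

The worker process stays busy until the script ends, so this suits short follow-up tasks rather than long-running jobs.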
Nginx + PHP-FPM - performance ?
                         Single webserver     Proxy
                         Static    PHP        Static    PHP
Apache + PHP             3900      17.5       6700      17.5
Apache + PHP + MC        3900      55         6700      108
Nginx + PHP-FPM + MC     11700     57         11200     112
Limit : single-threaded Apachebench
Reverse proxy time...
Varnish
Not just a load balancer
Reverse proxy cache / http accelerator / …
Caches (parts of) pages in memory
Careful :
uses threads (like Apache)
Nginx usually scales better (but doesn't have VCL)
Varnish - backends + load balancing
backend server1 {
    .host = "192.168.0.10";
}
backend server2 {
    .host = "192.168.0.11";
}
director example_director round-robin {
    {
        .backend = server1;
    }
    {
        .backend = server2;
    }
}
Varnish - VCL
Varnish Configuration Language
DSL (Domain Specific Language)
→ compiled to C
Hooks into each request
Defines :
Backends (web servers)
ACLs
Load balancing strategy
Can be reloaded while running
Varnish - whatever you want
Real-time statistics (varnishtop, varnishhist, ...)
ESI
Varnish - ESI
Perfect for caching pages
Article content page :
Header (TTL : 60 min) /top
Navigation (TTL : 60 min) /nav
Latest news (TTL : 2 min) /news
Article content (TTL : 15 min) /article/732
In your article page output :
<esi:include src="/top"/>
<esi:include src="/nav"/>
<esi:include src="/news"/>
<esi:include src="/article/732"/>
In your Varnish config :
sub vcl_fetch {
    if (req.url == "/news") {
        esi; /* Do ESI processing */
        set obj.ttl = 2m;
    } elseif (req.url == "/nav") {
        esi;
        set obj.ttl = 60m; /* navigation block : TTL 60 min */
    } elseif ….
    ….
}
Varnish with ESI - hold on tight !
                         Single webserver     Proxy
                         Static    PHP        Static    PHP
Apache + PHP             3900      17.5       6700      17.5
Apache + PHP + MC        3900      55         6700      108
Nginx + PHP-FPM + MC     11700     57         11200     112
Varnish                  -         -          11200     4200
Varnish - what can/can't be cached ?
Can :
Static pages
Images, js, css
Pages or parts of pages that don't change often (ESI)
Can't :
POST requests
Very large files (it's not a file server !)
Requests with Set-Cookie
User-specific content
ESI → no caching on user-specific content ?
Logged in as : Wim Godden
5 messages
TTL = 5min
TTL = 1h
TTL = 0s ?
Coming soon...
Based on Nginx
Reduces load by 50 – 95%
Requires code changes !
Well-built project → few changes
Effect on webservers and database servers
What's the result ?
Figures
First customer :
No. of web servers : 18 → 4
No. of db servers : 6 → 2
Total : 24 → 6 (75% reduction !)
Second customer (already using Nginx + Memcache) :
No. of web servers : 72 → 8
No. of db servers : 15 → 4
Total : 87 → 12 (86% reduction !)
Availability
Stable at 2 customers
Still under heavy development
Beta : July 2012
Final : Sep 2012
PHP speed - some tips
Upgrade PHP - every minor release has 5-15% speed gain !
Use an opcode cache (APC, eAccelerator, XCache)
DB speed - some tips
Use same types for joins
i.e. don't join decimal with int
RAND() is evil !
count(*) is evil in InnoDB without a where clause !
Persistent connect is sort-of evil
Caching & Tuning @ frontend
http://www.websiteoptimization.com/speed/tweak/average-web-page/
Frontend tuning
1. You optimize backend
2. Frontend engineer messes up → havoc on backend
3. Don't forget : frontend sends requests to backend !
SO...
Care about frontend
Test frontend
Check what requests frontend sends to backend
Tuning frontend
Minimize requests
Combine CSS/JavaScript files
Use CSS Sprites
CSS Sprites
Tuning content - CSS sprites
11 images
11 HTTP requests
24 KByte
1 image
1 HTTP request
14 KByte
Tuning frontend
Minimize requests
Combine CSS/JavaScript files
Use CSS Sprites (horizontally if possible)
Put CSS at top
Put JavaScript at bottom
Max. no. of connections
Especially if JavaScript does Ajax (advertising-scripts, …) !
Avoid iFrames
Again : max. no. of connections
Don't scale images in HTML
Have a favicon.ico (don't 404 it !)
→ see my blog
What else can kill your site ?
Redirect loops
Multiple requests
More load on Webserver
More PHP to process
Additional latency for visitor
Try to avoid redirects anyway
→ In ZF : use $this->_forward instead of $this->_redirect
Watch your logs, but equally important...
Watch the logging process →
Logging = disk I/O → can kill your server !
Above all else... be prepared !
Have a monitoring system
Use a cache abstraction layer (disk → Memcache)
Don't install for the worst → prepare for the worst
Have a test-setup
Have fallbacks
→ Turn off non-critical functionality
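A cache abstraction layer can be as small as one interface with interchangeable backends. The sketch below is an assumption about how such a layer might look, not the speaker's implementation: the `Cache` interface, `FileCache`, `MemoryCache` and `latestNews()` are all hypothetical names. A Memcache-backed class would implement the same two methods.

```php
<?php
// Hypothetical interface; application code only ever talks to this.
interface Cache
{
    public function get(string $key);               // returns false on a miss
    public function set(string $key, $value): void;
}

// Disk-backed implementation: one serialized file per key.
class FileCache implements Cache
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
        if (!is_dir($dir)) {
            mkdir($dir, 0700, true);
        }
    }

    private function path(string $key): string
    {
        return $this->dir . '/' . sha1($key) . '.cache';
    }

    public function get(string $key)
    {
        $file = $this->path($key);
        return is_file($file) ? unserialize(file_get_contents($file)) : false;
    }

    public function set(string $key, $value): void
    {
        file_put_contents($this->path($key), serialize($value), LOCK_EX);
    }
}

// Memory-backed implementation, standing in for a Memcache-backed one.
class MemoryCache implements Cache
{
    private array $items = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}

// Application code depends on the interface only, so the backend can be swapped.
function latestNews(Cache $cache): array
{
    $news = $cache->get('latest_news');
    if ($news === false) {
        $news = ['Item 1', 'Item 2'];   // placeholder for the real query
        $cache->set('latest_news', $news);
    }
    return $news;
}

$disk   = new FileCache(sys_get_temp_dir() . '/cache_demo_' . getmypid());
$memory = new MemoryCache();
var_dump(latestNews($disk) === latestNews($memory));
```

Moving from disk to Memcache then means changing the one line that constructs the cache object, not every place that reads from it.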
So...
Cache
But : never delete, always push !
Have a warmup script
Monitor your cache
Have an abstraction layer
Apache = fine, Nginx = better
Static pages ? Use Varnish
Tune your frontend → impact on backend !
Questions ?
We're hiring !
Lots of challenges
Work with cutting-edge technology
Varied :
PHP development
System / network architecture
Scalability services
Build our own services
Work on Open Source
→ mail us : [email protected]
Contact
Twitter @wimgtr
Web http://techblog.wimgodden.be
Slides http://www.slideshare.net/wimg
E-mail [email protected]
Please...
Rate my talk : http://joind.in/6327
Thanks !