tulsa tech fest 2010 - web speed and scalability
Jason Ragsdale
Sr. Technical Yahoo
Yahoo!
Web Speed & Scalability
Friday, November 12, 2010
• How to build a bigger, faster, and more reliable website
• You will learn the concepts of Speed and Scalability
• Specific Examples of Caching, Load Balancing and testing tools.
Introduction
• What is Scalability?
• Avoiding Failure
• High Availability?!?!?
• Monitoring
• Release Cycles
• Fault Tolerance
• Load Balancing
• Static Content
• Caching
• YSlow & PageSpeed
Agenda
• Horizontal Scalability
• Capacity can be increased just by adding more hardware/software
• Best solution
• Does not guarantee that you are safe
• Up (Vertical) Scalability
• Capacity can be increased by adding more Disk Storage, RAM, Processors
• Expensive
• Should only be used if Horizontal will not work for you
What is Scalability?
• Capital investment will be made
• The system will be more complex
• Maintenance costs will increase
• Time will be required to act
Scalability Considerations
• Good Planning
• Have a plan for whatever you are about to do to your system, and most importantly, have a roll-back plan if and when things do not work the way you expected.
Avoiding Failure
• Functional and Unit Testing
• Automated tests do not catch everything that can go wrong, but they are very good at catching bugs introduced by changes elsewhere in your code base
• Unit Testing (PHPUnit, SimpleTest)
• Functional Testing (Selenium)
Avoiding Failure
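As a rough illustration of a unit test guarding against regressions, here is a minimal sketch using Python's built-in `unittest` module in place of the PHP tools named on the slide. The `cart_total` function and its tax rule are hypothetical and exist only for this example.

```python
import unittest


def cart_total(prices, tax_rate=0.0):
    """Hypothetical function under test: sum the prices and apply tax."""
    return round(sum(prices) * (1 + tax_rate), 2)


class CartTotalTest(unittest.TestCase):
    """If a change elsewhere alters cart_total, these tests fail loudly."""

    def test_empty_cart_costs_nothing(self):
        self.assertEqual(cart_total([]), 0)

    def test_tax_is_applied_to_the_sum(self):
        self.assertEqual(cart_total([10.0, 5.0], tax_rate=0.1), 16.5)
```

Saved to a file, the tests can be run with `python -m unittest <file>`, for example before every commit or release.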
• Control Change (Version Control)
• USE IT!!!! There is no better way even as a single developer to keep your codebase safe from bad changes
Avoiding Failure
• /trunk/
• Used for all mainline development
• /branches/
• Used to do development that needs to be separate from the trunk code
• /tags/
• Holds copies of production ready code
• Do not use Version Control as a backup solution; back up your VCS separately
Version Control in Action
• What is “five nines” 99.999%?
• Do the math: 60 seconds * 60 minutes * 24 hours * 365 days
• 31,536,000 seconds in a year
• (100% - 99.999%) * 31,536,000 = 315.36 seconds (about 5 minutes) of downtime allowed a year
High Availability?!?!?!
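The downtime budget is the unavailable fraction of the year multiplied by the number of seconds in a year. The sketch below reproduces the 315.36-second figure and shows a few other targets for comparison; the function name is made up for this example, and leap years are ignored as on the slide.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 31,536,000, ignoring leap years


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the yearly downtime budget for an availability target."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    seconds = allowed_downtime_seconds(target)
    print(f"{target}% -> {seconds:,.2f} s/year ({seconds / 60:,.1f} min)")
```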
• Understand the goodness of “Planned maintenance periods”
• There are things you will need to do to your systems on a periodic basis, e.g. database cleanup, disk defrag, software/hardware upgrades
High Availability?!?!?!
• You can stagger your maintenance periods if you have enough servers so you have no customer downtime, just a reduction in capacity
High Availability?!?!?!
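One way to picture staggered maintenance is as a series of small batches taken out of rotation one after another. The sketch below is only an illustration under that assumption; the server names and helper functions are hypothetical.

```python
def rolling_batches(servers, batch_size):
    """Split the servers into batches that are taken offline one at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [servers[i:i + batch_size] for i in range(0, len(servers), batch_size)]


def capacity_during_maintenance(total_servers, batch_size):
    """Fraction of capacity still serving while one batch is offline."""
    offline = min(batch_size, total_servers)
    return (total_servers - offline) / total_servers


servers = ["www-1-1", "www-1-2", "www-1-3", "www-1-4"]  # hypothetical cluster
for batch in rolling_batches(servers, batch_size=1):
    remaining = capacity_during_maintenance(len(servers), len(batch))
    print(f"maintaining {batch}: {remaining:.0%} capacity still online")
```

With four servers and one server per batch, customers always see at least three servers, so the cost is reduced capacity rather than downtime.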
• No matter how stable your code is or how reliable your hardware, you will have failure
Monitoring
• Top Down (Business Monitors)
• Monitor the application as the customer interacts with it
Monitoring Methods
Monitoring Methods
• Bottom Up (System Monitors)
• Most commonly used
• Monitors the base components of your application like
• Disk Space
• Network speed
• Database Statistics
• By no means bad, but without Business Monitoring you will not be able to catch all failures
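To make the difference between the two monitoring styles concrete, here is a small sketch. The bottom-up check looks at a system resource, while the top-down check looks at what a customer would actually see. The page marker text and thresholds are assumptions chosen for illustration, and fetching the page is left out so the example stays self-contained.

```python
import shutil


def disk_space_ok(path: str = "/", min_free_fraction: float = 0.10) -> bool:
    """Bottom-up (system) check: is enough disk space free?"""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


def checkout_page_ok(status_code: int, body: str) -> bool:
    """Top-down (business) check: does the page look right to a customer?

    "Order total" is a hypothetical marker that only appears when the
    checkout page rendered correctly.
    """
    return status_code == 200 and "Order total" in body


# A server can pass every system check and still serve a broken page.
print(disk_space_ok("."))
print(checkout_page_ok(200, "<h1>Fatal error: database unavailable</h1>"))
```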
• SNMP Support
• Can support most systems out there
• Extensibility
• Ability to plugin custom monitoring packages
• Flexible notifications
• Handle notifying operators and escalating issues if they are not looked into
• Custom reaction
• In the event of errors that cannot be diagnosed by computers, the system needs to be able to notify a human to do further investigation
Criteria For A Monitoring System
• Complex scheduling
• Ability to set the monitoring frequency and timing per monitoring item
• Maintenance scheduling
• Monitors should never be taken offline, they need to be smart enough to know when a maintenance period is in effect
• Event acknowledgment
• Ability to understand when an event needs to be paged to a human at 2am, and when it shouldn't
• Service dependencies
• You need to monitor all points between your monitoring system and the client. This includes Firewalls, Routers, Switches
Criteria For A Monitoring System
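The maintenance-scheduling and event-acknowledgment criteria can be sketched as a single paging decision. Everything below is an assumption made for illustration: the maintenance window, the severity names, and the working-hours rule are not taken from any particular monitoring product.

```python
from datetime import datetime, time

# Hypothetical nightly maintenance window, 02:00 to 04:00.
MAINTENANCE_WINDOWS = [(time(2, 0), time(4, 0))]


def in_maintenance(now: datetime) -> bool:
    """True while a planned maintenance period is in effect."""
    return any(start <= now.time() < end for start, end in MAINTENANCE_WINDOWS)


def should_page(now: datetime, severity: str, acknowledged: bool) -> bool:
    """Decide whether an event should wake up a human right now."""
    if acknowledged or in_maintenance(now):
        return False  # someone is already on it, or the outage is planned
    if severity == "critical":
        return True  # critical events page at any hour
    # Everything else waits for working hours.
    return time(9, 0) <= now.time() < time(17, 0)


print(should_page(datetime(2010, 11, 12, 5, 0), "critical", acknowledged=False))
```

The monitor keeps running during maintenance; it simply knows not to escalate.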
• Basic Release Cycle
• Development
• Things are expected to break
• Staging
• QA and bug fixing a build before release
• Production
• Only serious bug fixes are pushed
Release Cycles
• Keep in mind that reality has priority over “Best Practice”
• You can and will have to release from development… it happens
Release Cycles
[Diagram: network layout labeled Intertubes, router, switch, www-1-1, and www-1-2]
Fault Tolerance
• Load Balancing is NOT HA
• Balancing is meant to spread the workload of requests across the cluster
Load Balancing
• Round robin
• One request per server in a uniform rotation
Balancing Approaches
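Round robin can be sketched in a few lines. This is a toy in-process version, assuming a fixed server list; a real load balancer would also have to cope with servers joining and leaving. The class name is made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers one at a time in a fixed, repeating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["www-1-1", "www-1-2"])
first_four = [balancer.next_server() for _ in range(4)]
print(first_four)
```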
• Least connections
• The faster the machine processes requests the more it will receive
Balancing Approaches
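A minimal sketch of least connections, assuming the balancer is told when each request starts and finishes. The class and method names are hypothetical. A server that finishes requests quickly has fewer open connections, so it is chosen more often.

```python
class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick a server for a new request and count the connection."""
        server = min(self.active, key=self.active.get)  # ties: first listed wins
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when the request on this server has finished."""
        self.active[server] -= 1


balancer = LeastConnectionsBalancer(["fast", "slow"])
a = balancer.acquire()   # "fast" (both idle, first listed wins)
b = balancer.acquire()   # "slow"
balancer.release(a)      # the fast machine finishes first...
c = balancer.acquire()   # ...so it receives the next request
```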
• Predictive
• Usually based on Round robin or Least connections with some custom code
Balancing Approaches
• Available resources
• Not a good choice, bad performance
Balancing Approaches
• Random
• Pure random distribution of requests
• Weighted random
• Random with a preference to specific machines
Balancing Approaches
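Both random strategies are short enough to sketch with Python's standard `random` module. The weights below are invented for the example: the first machine is assumed to deserve three times the traffic of the second.

```python
import random


def pick_random(servers, rng=random):
    """Pure random: every server is equally likely."""
    return rng.choice(servers)


def pick_weighted(weights, rng=random):
    """Weighted random: a larger weight means the server is picked more often."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


# Hypothetical cluster: www-1-1 gets three times the share of www-1-2.
WEIGHTS = {"www-1-1": 3, "www-1-2": 1}
print(pick_weighted(WEIGHTS))
```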
• Static content examples
• Images
• CSS
• JS
• Any non dynamic element
Static Content
• Serving these items from a dedicated server frees up your web processes for actual dynamic code, in turn increasing your capacity and response speed
• On your static server you can use lighttpd, which is very quick at serving static content compared to Apache (although Apache 2.2.x is much better than 1.3.x)
Static Content
• Layered / Transport Cache
• “Transparent”
• Placed in front of your hardware; answers requests from cache before they hit your web server
Types of Caching
• Integrated (Look-Aside) Cache
• Computational Reuse technique
• Used where the cost of storing the results of a computation and later finding them again is lower than the cost of performing the computation again
Types of Caching
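A look-aside cache can be sketched as "check the cache first, compute only on a miss, then remember the result". The in-memory dictionary and the `render_profile` function below are stand-ins chosen for the example, not part of any specific caching product.

```python
class LookAsideCache:
    """The application consults the cache before doing the expensive work."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, key, compute):
        if key in self._store:      # cache hit: reuse the earlier result
            return self._store[key]
        value = compute(key)        # cache miss: do the expensive work once
        self._store[key] = value
        return value


calls = []


def render_profile(user_id):
    """Hypothetical expensive computation; records each real invocation."""
    calls.append(user_id)
    return f"<div>profile {user_id}</div>"


cache = LookAsideCache()
first = cache.get_or_compute(42, render_profile)
second = cache.get_or_compute(42, render_profile)  # served from the cache
```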
• Write-Thru Caches
• Application is responsible for updating the Cache and Datastore when changes are made
• Write-Back Caches
• All data changes are made to the cache
• Cache layer is responsible for modifying the backend datastore
Types of Caching
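The difference between the two write strategies is who updates the datastore and when. In this sketch plain dictionaries stand in for both the cache and the backend datastore, and the class names are invented for illustration.

```python
class WriteThroughApp:
    """Write-thru: the application updates the datastore and the cache together."""

    def __init__(self, datastore):
        self.datastore = datastore
        self.cache = {}

    def write(self, key, value):
        self.datastore[key] = value
        self.cache[key] = value


class WriteBackCache:
    """Write-back: writes land in the cache; the cache layer updates the datastore later."""

    def __init__(self, datastore):
        self.datastore = datastore
        self.cache = {}
        self.dirty = set()  # keys changed since the last flush

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        """Push every pending change to the backend datastore."""
        for key in self.dirty:
            self.datastore[key] = self.cache[key]
        self.dirty.clear()


backend = {}
write_back = WriteBackCache(backend)
write_back.write("views", 1)
write_back.write("views", 2)   # the backend has not been touched yet
pending = dict(backend)
write_back.flush()             # only the latest value reaches the backend
```

The trade-off is that a write-back cache absorbs repeated writes cheaply, but any changes not yet flushed are lost if the cache is lost.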
• Distributed Cache
• Using several machines to cache data, distributing the data and load
• Memcached can do this very simply
Types of Caching
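Distributing a cache means every client must agree on which machine holds which key. The sketch below uses a simple hash-modulo scheme to show the idea; it is an assumption for illustration and not necessarily what any given memcached client does (many use consistent hashing so that adding a server moves fewer keys). Only the first address comes from this talk; the other two are hypothetical.

```python
import hashlib

SERVERS = ["10.0.0.40:11211", "10.0.0.41:11211", "10.0.0.42:11211"]


def server_for_key(key: str, servers=SERVERS) -> str:
    """Map a cache key to one server; every client computes the same answer."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# The same key always lands on the same machine, and keys spread across all of them.
placement = {f"user:{n}": server_for_key(f"user:{n}") for n in range(300)}
```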
• It is a high-performance, distributed object caching system
Memcached
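• It is simple to set up and use
• # ./memcached -d -m 2048 -l 10.0.0.40 -p 11211
• (-d runs it as a daemon, -m sets the memory limit in megabytes, -l the listen address, -p the TCP port)
Memcached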
• It is not designed to be redundant
• If you lose cached data, the cache is repopulated as the data is accessed again
Memcached
• It provides no security to your cache
• “Memcached is the soft, doughy underbelly of your application. Part of what makes the clients and server lightweight is the complete lack of authentication. New connections are fast, and server configuration is nonexistent. If you wish to restrict access, you may use a firewall, or have memcached listen via unix domain sockets.”
Memcached
• Alternative PHP Cache
• The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. It was conceived of to provide a free, open, and robust framework for caching and optimizing PHP intermediate code.
• Just enabling APC will transparently cache your code as you use it, no code changes required on your side
• Provides a cheap caching layer that can be shared between all Apache processes on one machine
APC and why it’s your friend
• Based on 13 principles from http://developer.yahoo.com/performance/rules.html
• 1.) Make fewer HTTP requests
• 80% of the end-user response time is spent on the front-end. Most of this
YSlow
• Make fewer HTTP requests
• Use a CDN
• Add an Expires header
• Gzip components
YSlow
• Put CSS at the top
• Put JS at the bottom
• Avoid CSS expressions
• Make JS and CSS External
YSlow
• Reduce DNS lookups
• Minify JS
• Avoid redirects
• Remove duplicate scripts
• Configure ETags
• Make AJAX cacheable
YSlow
• http://code.google.com/speed/page-speed
• The Page Speed family consists of several products. Web developers can use the Page Speed extension for Firefox/Firebug to analyze performance issues while developing web pages. Apache web hosters can use mod_pagespeed, a module for the Apache™ HTTP Server that automatically optimizes web pages and their resources at serving time.
PageSpeed
• Adds client-side latency instrumentation.
• Improves cacheability.
• Removes unnecessary whitespace in HTML.
• Combines multiple <head> elements & CSS files into one.
• Moves CSS into the <head> element.
• Removes unnecessary attributes in HTML tags.
• Inlines small external CSS & Javascript files.
mod_pagespeed
• Moves large inline <style> & <script> tags into external files for cacheability.
• Removes unnecessary quotes in HTML tags.
• Removes HTML comments.
• Minifies CSS.
• Rescales, and compresses images; inlines small ones.
• Minifies Javascript.
mod_pagespeed
# enable expirations
ExpiresActive On
# expire images, CSS, and JavaScript in the client's cache a month (2592000 seconds) after access
ExpiresByType image/gif A2592000
ExpiresByType image/jpeg A2592000
ExpiresByType text/css A2592000
ExpiresByType application/x-javascript A2592000
# disable ETags
FileETag None
Example apache 2.x performance config
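The value `A2592000` means "access time plus 2,592,000 seconds", which is 30 days. As a rough sketch of what that produces, the helper below (a hypothetical name, written with Python's standard library rather than Apache itself) formats the resulting timestamp as an HTTP date.

```python
import time
from email.utils import formatdate

MONTH_SECONDS = 30 * 24 * 60 * 60  # 2,592,000 seconds, the "A2592000" value


def expires_header(access_time: float, lifetime: int = MONTH_SECONDS) -> str:
    """Return the Expires value for a file requested at access_time."""
    return formatdate(access_time + lifetime, usegmt=True)


print("Expires:", expires_header(time.time()))
```

Until that date passes, a browser can reuse its cached copy without asking the server again.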
# Gzip Compression
# Insert filter
SetOutputFilter DEFLATE
# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
# Don't compress images
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|mp3)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
Example apache 2.x performance config
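To see why the config compresses text but skips images and MP3s, the sketch below measures gzip savings with Python's standard `gzip` module. The sample markup is invented, and random bytes stand in for already-compressed media; actual savings depend on the content.

```python
import gzip
import os


def gzip_savings(body: bytes) -> float:
    """Fraction of bytes saved by gzip; negative if the output grew."""
    return 1 - len(gzip.compress(body)) / len(body)


# Repetitive markup, like most HTML, CSS, and JavaScript, shrinks a lot.
html = b"<li class='item'><a href='/page'>link</a></li>\n" * 200

# Random bytes stand in for JPEG/PNG/MP3 data, which is already compressed.
already_compressed = os.urandom(4096)

print(f"text:  {gzip_savings(html):.0%} saved")
print(f"media: {gzip_savings(already_compressed):.0%} saved")
```

Compressing media again costs CPU time for no gain, which is the reason for the `no-gzip` rule on those file extensions.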
Questions?
• YSlow: http://developer.yahoo.com/yslow/
• Rules: http://developer.yahoo.com/performance/rules.html
• Scalable Internet Architectures
• By Theo Schlossnagle
• APC: http://us3.php.net/apc
• Memcached: http://www.danga.com/memcached/
• Selenium: http://www.openqa.org/selenium/
• Simpletest: http://simpletest.org/
• PHPUnit: http://www.phpunit.de/
Links & Information
• http://joind.in/talk/view/2355
Please Complete An Evaluation Form