pycon 2011 scaling disqus
Post on 08-Sep-2014
DISQUS
Jason Yan @jasonyan
David Cramer @zeeg
Python at 500 million visitors
Got feedback? Use hashtag #sckrw
Sunday, March 13, 2011
Agenda
• What is DISQUS?
• An Overview of the Infrastructure
• Iterative Development and Deployment
• Why We Love Python
What is DISQUS?
dis·cuss • dĭ-skŭs'
We are a comment system with an emphasis on connecting communities
http://disqus.com/about/
Embeddable Comments
A Brief History
Startup-ish
• Founded just about 4 years ago
• 16 employees, 8 engineers
• Traffic increasing 15-20% a month
• Flat organizational structure, every engineer is a product manager
• Fast turnaround, new feature launches every week (sometimes daily)
Traffic
[Chart: number of visitors, March 2008 through March 2011; vertical axis from 0M to 500M]
DjangoCon 2010
• 17,000 requests/second peak
• 450,000 websites
• 15 million profiles
• 75 million comments
• 250 million visitors
Six Months Later
• 25,000 requests/second peak
• 700,000 websites
• 30 million profiles
• 170 million comments
• 500 million visitors
Six Months Later
• September 2010: 250 million uniques
• March 2011: 500 million uniques
• Handling over 2x the traffic
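As a rough check on these figures: doubling from 250 million to 500 million uniques in six months works out to about 12% compound growth per month. A minimal sketch of that arithmetic, assuming steady month-over-month growth (the function name is illustrative, not from the talk):

```python
def monthly_growth_rate(start, end, months):
    """Compound monthly growth rate that takes `start` to `end` in `months`."""
    return (end / start) ** (1.0 / months) - 1.0


# Uniques reported for September 2010 and March 2011, six months apart.
rate = monthly_growth_rate(250e6, 500e6, 6)
print("{:.1%} per month".format(rate))  # prints "12.2% per month"
```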
Six Months Later
• September 2010: ~100 servers
• March 2011: ~100 servers
• Scale diagonally
Scaling Diagonally
• We still rent hardware, so there is no “commodity hardware”
• Cheaper to upgrade
• Everything is redundant
• Partition data where you need to, scale partitions vertically
• Upgrade hardware (more RAM, more drives, more cores)
• Python apps tend to be CPU bound
Infrastructure
• 35% Web Servers (Apache + mod_wsgi)
• 15% Utility Servers (Python scripts, background workers)
• 20% Databases (PostgreSQL, Redis, Membase)
• 20% Load Balancing / High Availability (HAProxy + Heartbeat)
• 10% Caching servers (Memcached, Varnish)
• Half of our servers run Python
Python Web Servers
• Use what you’re comfortable with
• Apache + mod_wsgi vs nginx + uWSGI
• Bottleneck is in the application
[Charts: requests/sec (min, avg, max) and memory usage, mod_wsgi vs uWSGI]
Background Workers
• Lots of tasks that don’t need to be done in web application process:
• Crawling URLs
• Updating avatars
• Email notifications
• Analytics
• Counters
Background Workers (cont’d)
• Most jobs are I/O bound
• Slow external calls
• Twitter is slow
• Facebook is slow
• Could parallelize with multiple processes, but...
Background Workers (cont’d)
• Waste of memory
• Use non-blocking I/O
• Celery 2.2 adds support for gevent/eventlet
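To illustrate why non-blocking I/O suits these jobs, here is a minimal sketch in which several slow calls wait concurrently inside a single process. It uses the standard library's asyncio as a stand-in for gevent/eventlet, and the job names and delays are made up for illustration; it is not the worker code described in the talk.

```python
import asyncio
import time


async def slow_external_call(name, delay):
    # Stand-in for a slow network request (e.g. a third-party API).
    await asyncio.sleep(delay)
    return name


async def run_jobs(jobs):
    # All calls wait concurrently in one process, so the total time is
    # roughly that of the slowest call rather than the sum of all calls.
    return await asyncio.gather(*(slow_external_call(n, d) for n, d in jobs))


if __name__ == "__main__":
    # Hypothetical job names; each "call" takes 0.2 seconds.
    jobs = [("crawl-url", 0.2), ("update-avatar", 0.2), ("send-email", 0.2)]
    started = time.monotonic()
    results = asyncio.run(run_jobs(jobs))
    print(results, "in {:.2f}s".format(time.monotonic() - started))
```

Running one process per job would reach the same concurrency, but each process carries its own copy of the application in memory, which is the waste the slide refers to.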
Monitoring
• Application side: Graphite
• Real-time(ish) graphing
• Django front-end, Python backend
• Etsy’s StatsD proxy to Graphite
• UDP (fire and forget)
• Batches updates
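The "fire and forget" point can be sketched in a few lines: a StatsD counter is a single UDP datagram of the form `name:value|c`, so the application never waits on the metrics daemon. This is a minimal sketch, assuming a StatsD daemon on localhost at its default port 8125; the metric name and helper functions are hypothetical.

```python
import socket


def format_counter(name, value=1):
    # StatsD counter line format: "<metric name>:<value>|c"
    return "{}:{}|c".format(name, value).encode("ascii")


def send_counter(name, value=1, host="127.0.0.1", port=8125):
    # UDP is connectionless: sendto() does not wait for a reply, and the
    # packet is simply lost if no daemon is listening.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_counter(name, value), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name; count one newly created comment.
    send_counter("comments.created")
```

Because a lost packet only costs one data point, instrumenting a hot code path this way adds very little risk to the request itself.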
Monitoring
• Track application metrics
• Errors, exceptions
• New comments, users, sites, etc.
• Anything
Monitoring
• Check out Etsy’s posts:
• Measure Anything, Measure Everything http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
• Tracking Every Release http://codeascraft.etsy.com/2010/12/08/track-every-release/
What about the code?
Powered By Django
Which means...
• Largest Django-powered web application
• We fork, and even sometimes monkey patch to make it scale to our needs
• Fortunately, we don’t have to do too much (Yay, Django!)
• Unfortunately, we can’t use the whole of the Django internal components (and if we do, we do it in atypical ways)
Iterative Development
Release Early, Release Often
Iterating Quickly
• Abstracting our application environment
• Fewer dependencies locally
• Rely on CI for dependency coverage
• Heavy use of open source packages
• No NIH syndrome
• Deploy frequently, 3-7 times a day
• Lots of branches, but master is “stable”
• Realtime reporting on exceptions, metrics
• Our test suite is the main blocker (slow)
Dealing with Deploys
Gargoyle
Being users of our product, we actively use early versions of features before public release
Deploy features to portions of a user base at a time to ensure smooth, measurable releases
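The idea of releasing to a portion of the user base can be sketched as a percentage rollout. This is a minimal illustration and not Gargoyle's actual API: the function, feature name, and bucketing scheme are assumptions made for the example.

```python
import zlib


def in_rollout(feature, user_id, percent):
    """Return True if this user falls inside the first `percent` buckets."""
    # Hash feature and user together so each feature gets its own stable
    # slice of users; crc32 is deterministic across processes and restarts.
    key = "{}:{}".format(feature, user_id).encode("utf-8")
    bucket = zlib.crc32(key) % 100
    return bucket < percent


# Hypothetical feature name; count how many of 10,000 users fall in a 10% rollout.
enabled = sum(in_rollout("new-editor", uid, 10) for uid in range(10000))
print("{} of 10000 users see the feature".format(enabled))
```

Because the bucket depends only on the feature and user, raising the percentage keeps every user who already had the feature and only adds new ones, which keeps a gradual release measurable.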
The Deployment Problem
• Make some changes locally
• Run a subset of the test suite
• Push your commits
• CI server begins running tests
• ....
Waiting on the test suite...
Rinse and Repeat
• 30 minutes later tests fail, start over
• Finally, deploy to a subset of servers
• Open Sentry (our exception logger)
• Monitor Graphite
• Deploy to 35 servers (~8 minutes)
• Full rollback in < 30 seconds
Wait, Sentry?
Testing
Testing Code
• Test suite takes around 25 minutes usually
• “Stuck” with Hudson (or Jenkins)
• Most tightly integrated plugins are geared towards Java developers
• Which framework do we use?
• unittest(2), nose, doctests, LETTUCE?
• We use unittest and nose
• Need to report code coverage, speed of tests, pylint (or pyflakes)
We Love Python
Love-ish
• Many of us started with PHP or Rails
• Clean syntax, clear standards
• All languages need PEP8.py and PyFlakes
• Interpreted, fast... enough
• Very easy to learn
• We all started by learning Django first, then Python
Haters Gonna Hate
If you could choose one thing in Python to hate on...
Better package management
What can we do?
• Too many forks, too many frameworks
• We need fewer clones, and more combined effort
• Improving existing Python solutions
• More Python solutions for existing products
Python Rocks!
DISQUS
Questions?
psst, we’re hiring: jobs@disqus.com
References
• Sentry (our exception tracking tool): http://github.com/dcramer/django-sentry
• Gargoyle (feature switches): https://github.com/disqus/gargoyle
• Django DB Utils (collection of db helpers for Django): https://github.com/disqus/django-db-utils
• Jenkins CI: http://jenkins-ci.org/
code.disqus.com