Utopia Kingdoms scaling case: from 4 to 50k users
PyCon Ireland Talk 2011
● Scaling case. From 4 users to 90k+
● Jaime Buelta
● Soft. Developer at
The Game
Utopia Kingdoms
● Fantasy strategy game
● Build your own Kingdom
● Create armies and attack other Kingdoms
● Join other Kingdoms in an Alliance
● Manage resources
● Available on Facebook and Kongregate
http://www.facebook.com/UtopiaKigdomsGame
http://www.kongregate.com/games/JoltOnline/utopia-kingdoms
Technology stack
Technology Stack - Backend
Python
CherryPy framework
Amazon SimpleDB
Linux on Amazon EC2
Stack of technologies - Frontend
HTML (generated by Genshi templates)
jQuery
Some points of interest (will discuss them later)
● Your resources (population, gold, food, etc.) grow over time
● Your actions (build something, attack a player) typically take some time
● Players are ranked against the rest
● You can add friends and enemies
Do not guess
Measure
Measurement tools
● OS tools
  ● Task manager (top)
  ● IO monitor (iostat)
● Monitoring tools (Munin, Nagios)
● Logs
  ● Need to find a good compromise between detail and relevance
● Profiling
You've got to love profiling
● Generate profiles with the cProfile module
● Profile the whole application with:
    python -m cProfile -o file.prof my_app.py
  (not very useful in a web app)
● If you're using a framework, profile only your functions to reduce noise
Profile decorator (example)
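    import cProfile
    import functools
    import time

    def profile_this(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            prof = cProfile.Profile()
            # Run the call under the profiler, keeping its return value
            retval = prof.runcall(func, *args, **kwargs)
            filename = 'profile-{ts}.prof'.format(ts=time.time())
            prof.dump_stats(filename)
            return retval
        return wrapper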
Analyzing profile
● gprof2dot
  ● Using dot, convert to graph:
    gprof2dot -f pstats file.prof | dot -Tpng -o file.png
  ● Good for workflows
● RunSnakeRun
  ● Good for cumulative times
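When no graphical tool is at hand, the standard-library pstats module can load a saved profile and sort it by cumulative time. The following is a minimal sketch; `slow_sum` and the temporary file location are hypothetical and exist only to produce a profile to inspect.

```python
import cProfile
import io
import os
import pstats
import tempfile


def slow_sum(n):
    # Hypothetical workload, only here to have something to profile
    return sum(i * i for i in range(n))


def top_cumulative(prof_file, limit=5):
    """Return a text report of the functions with the highest cumulative time."""
    out = io.StringIO()
    stats = pstats.Stats(prof_file, stream=out)
    stats.sort_stats('cumulative').print_stats(limit)
    return out.getvalue()


# Generate a profile file, then read it back
prof = cProfile.Profile()
prof.runcall(slow_sum, 100000)
prof_path = os.path.join(tempfile.mkdtemp(), 'file.prof')
prof.dump_stats(prof_path)

report = top_cumulative(prof_path)
print(report)
```

The same `top_cumulative` call works on any `.prof` file written by `dump_stats`, including the ones produced by a profiling decorator.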
Example of RunSnakeRun
Example of gprof2dot
The power of cache
All static content should be out of Python
● Use a good web server to serve all static content (static HTML, CSS, JavaScript code)
● Some options
  ● Apache
  ● Nginx
  ● Cherokee
  ● Amazon S3
Use memcached (and share the cache between your servers)
Example
● Asking the DB for friends/enemies
  ● Costly request in SimpleDB (using SQL statement)
  ● On each request
● Cache the friends in memcached for 1 hour
● Invalidate the cache when adding/removing friends or enemies
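The pattern in this example (read through the cache, expire after one hour, invalidate on change) can be sketched as follows. This is an illustration under assumptions, not the game's code: `FakeCache` is an in-memory stand-in for a memcached client, and `FRIENDS_DB`, `load_friends_from_db`, `get_friends` and `add_friend` are hypothetical names.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client (get/set/delete with expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        value, expires = self._data.get(key, (None, 0))
        return value if time.time() < expires else None

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)

    def delete(self, key):
        self._data.pop(key, None)


FRIENDS_TTL = 60 * 60            # one hour, as in the slide
FRIENDS_DB = {'alice': ['bob']}  # hypothetical stand-in for the real DB
cache = FakeCache()
db_reads = []                    # records every costly DB read, for illustration


def load_friends_from_db(player_id):
    # Placeholder for the costly SimpleDB query
    db_reads.append(player_id)
    return list(FRIENDS_DB.get(player_id, []))


def get_friends(player_id):
    key = 'friends:%s' % player_id
    friends = cache.get(key)
    if friends is None:  # cache miss or expired entry
        friends = load_friends_from_db(player_id)
        cache.set(key, friends, FRIENDS_TTL)
    return friends


def add_friend(player_id, friend_id):
    FRIENDS_DB.setdefault(player_id, []).append(friend_id)
    cache.delete('friends:%s' % player_id)  # invalidate on change
```

With a real memcached client shared between servers, the invalidation in `add_friend` is what keeps every server from showing a stale friends list for up to an hour.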
Caching caveats
● Cache only after knowing there is a problem
● Do not rely on the cache for storage
● Keep an eye on the size of cached data
● Choosing a good cache time can be difficult / invalidating the cache can be complex
● Some data is too dynamic to be cached
Caching is not just memcached
● More options available:
  ● Load into memory at startup
  ● File cache
  ● Client-side cache
Parse templates just once
● The template rendering modules have options to parse the templates just once
  ● Be sure to activate it in production
  ● In development, you'll most likely want to parse them each time
● The same applies to regexes, especially complex ones
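For regular expressions, "parse just once" usually means compiling the pattern at import time and reusing the compiled object. A small sketch follows; the pattern and the function name are hypothetical examples, not taken from the game.

```python
import re

# Compiled once at import time and reused on every request.
# The pattern is a hypothetical rule for kingdom names.
KINGDOM_NAME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9 _-]{2,19}$')


def is_valid_kingdom_name(name):
    return KINGDOM_NAME_RE.match(name) is not None
```

Python's `re` module keeps a small internal cache of recently used patterns, so the gain is most visible with many or complex patterns, but the module-level constant also makes the intent explicit.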
More problems
Rankings
● Sorting players in the DB gets slow as the number of players grows
● Solution:
  ● Independent ranking server (operates just in memory)
  ● Works using binary trees
  ● Small Django project, communicates using XML-RPC
● Drawback:
  ● Data is not persistent; if the rankings server goes down, it needs time to reconstruct the rankings
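The idea of an in-memory ranking structure can be sketched as below. Note the substitution: the server described above uses binary trees, while this sketch keeps a sorted list with the standard-library bisect module, which is simpler to show but has O(n) inserts. The `Rankings` class and its method names are hypothetical.

```python
import bisect


class Rankings:
    """Keeps players sorted by score in memory (best score first)."""

    def __init__(self):
        self._scores = {}  # player -> current score
        self._sorted = []  # sorted list of (-score, player)

    def set_score(self, player, score):
        old = self._scores.get(player)
        if old is not None:
            # Remove the previous entry before inserting the new one
            idx = bisect.bisect_left(self._sorted, (-old, player))
            del self._sorted[idx]
        self._scores[player] = score
        bisect.insort(self._sorted, (-score, player))

    def rank(self, player):
        """1-based position of the player (1 is the best)."""
        entry = (-self._scores[player], player)
        return bisect.bisect_left(self._sorted, entry) + 1

    def top(self, n):
        return [(player, -neg) for neg, player in self._sorted[:n]]
```

Because nothing is persisted, a restart means calling `set_score` again for every player read from the database, which is the reconstruction cost mentioned in the drawback.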
Database polling - Resources
● There was a process just taking care of the growth of resources
  ● It goes element by element, increasing the values
  ● It polls the DB constantly, even when users have their values at maximum
● Instead, increment the resources of a user only the next time the user is accessed (by the user or by others)
  ● No usage of the DB while the user is not in use
  ● The request already reads the user from the DB
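Updating resources lazily amounts to storing the time of the last update and computing the growth when the user is next read. A minimal sketch, assuming a hypothetical user dictionary with `resources`, `growth_per_second`, `max` and `updated_at` fields (none of these names come from the game):

```python
def refresh_resources(user, now):
    """Bring a user's resources up to date at access time."""
    elapsed = now - user['updated_at']
    for name, rate in user['growth_per_second'].items():
        grown = user['resources'][name] + rate * elapsed
        # Never grow past the maximum for that resource
        user['resources'][name] = min(grown, user['max'][name])
    user['updated_at'] = now
    return user
```

For example, a user with 100 gold growing at 2 per second who is read again 60 seconds later ends up with 220 gold, without any background process touching the DB in between.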
Database polling - Actions
● Lots of actions are delayed: recruiting a unit, buildings, raids...
● A process checks, for each user, whether an action has to be done NOW
  ● Tons of reads just to check “not now”
  ● Long delays in some actions, as they are not executed on time
Database polling - Actions
● Implement a queue to execute the actions at the proper time:
  ● Beanstalk (allows deferred extraction)
  ● A process listens to this queue and performs the action, independently from the request servers
  ● The process can be launched on a different machine
  ● Multiple processes can extract actions faster
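Deferred extraction means a job only becomes visible to consumers once its delay has passed. The sketch below mimics that behaviour in memory with heapq to show the idea; it is not the Beanstalk client API, and `DelayedQueue`, `put` and `pop_due` are hypothetical names.

```python
import heapq
import itertools


class DelayedQueue:
    """In-memory queue that releases actions only when their time has come."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, action, run_at):
        heapq.heappush(self._heap, (run_at, next(self._counter), action))

    def pop_due(self, now):
        """Return the actions due at `now`, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

A worker process would call `pop_due` (or, with a real queue server, block on a reserve call) and execute each action, so no database reads are spent asking "is it time yet?".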
Database Issues
Amazon SimpleDB
● Key-value storage
● Capable of SQL queries
● Stores a dictionary (schemaless, multiple columns)
● All the values are strings
● Access through the boto module
● Pay per use
Problems with SimpleDB
● Lack of control
  ● Can't use local copy
    – In development, you must access Amazon servers (slow and costly)
  ● Can't backup except manually
  ● Can't analyze or change DB (e.g. can't define indexes)
  ● Can't monitor DB
Problems with SimpleDB
● Bad tool support
● Slow and high variability (especially on SQL queries)
  ● Sometimes the queries just time out and have to be repeated
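Repeating a query that timed out is typically wrapped in a small retry helper. A hedged sketch: `QueryTimeout` and `with_retries` are hypothetical names, and the real exception raised by the database client would need to be substituted.

```python
import time


class QueryTimeout(Exception):
    """Hypothetical error standing in for the client's timeout exception."""


def with_retries(query, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call query(), retrying on timeout with an increasing pause."""
    for attempt in range(attempts):
        try:
            return query()
        except QueryTimeout:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)  # back off before retrying
```

Retries hide the variability from the user but add latency, which is one more reason to measure how often they happen.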
Migrate to MongoDB
MongoDB
● NoSQL
● Schemaless
● Fast
● Allows complex queries
● Retain control (backups, measure queries, etc.)
● Previous experience using it from ChampMan
Requisites of the migration
● Low-level approach
● Objects are basically dictionaries
● Be able to save dirty fields (avoid saving unchanged values)
● Log queries to measure performance
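Dirty-field tracking can be illustrated with a dictionary subclass that records which keys changed since the last save. This is a sketch of the general idea only, not MongoSpell's implementation; `DirtyDict`, `changes` and `mark_saved` are hypothetical names.

```python
class DirtyDict(dict):
    """Dictionary that remembers which keys changed since the last save."""

    def __init__(self, *args, **kwargs):
        self.dirty = set()
        super().__init__(*args, **kwargs)
        self.dirty.clear()  # values loaded at creation are not dirty

    def __setitem__(self, key, value):
        if key not in self or self[key] != value:
            self.dirty.add(key)
        super().__setitem__(key, value)

    def changes(self):
        """Fields to send in a partial update, e.g. as a MongoDB $set document."""
        return {key: self[key] for key in self.dirty}

    def mark_saved(self):
        self.dirty.clear()
```

Only item assignment is tracked here; `dict.update()` and deletions bypass `__setitem__` and would need the same treatment in a complete version.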
MongoSpell
● Thin wrapper over pymongo
● Objects are just dictionary-like elements
● Minimal schema
● Fast!
● Able to log queries
● It will probably be released soon as Open Source
Definition of collections
    class Spell(Document):
        collection_name = 'spells'
        needed_fields = ['name',
                         'cost',
                         'duration']
        optional_fields = [
            'elemental',
        ]
        activate_dirty_fields = True
        indexes = ['name__unique', 'cost']
Querying from DB
    Spell.get_from_db(name='fireball')
    Spell.filter()
    Spell.filter(sort='name')
    Spell.filter(name__in=['fireball', 'magic missile'])
    Spell.filter(elemental__fire__gt=2)
    Spell.filter(duration__gt=2,
                 cost=3, hint='cost')
    Spell.filter(name='fireball', only='cost')
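One plausible way to read the double-underscore keywords above is as a shorthand for MongoDB query documents. The sketch below shows such a translation; how MongoSpell actually does it is not stated in the talk, so `build_query`, `OPERATORS` and `RESERVED` are assumptions made for illustration. The `$gt`, `$gte`, `$lt`, `$lte`, `$in` and `$ne` operators and the dotted notation for nested fields are standard MongoDB.

```python
OPERATORS = {'gt', 'gte', 'lt', 'lte', 'in', 'ne'}
RESERVED = {'sort', 'hint', 'only'}  # query options seen above, not field names


def build_query(**kwargs):
    """Translate field__op=value keywords into a MongoDB filter document."""
    query = {}
    for key, value in kwargs.items():
        if key in RESERVED:
            continue
        parts = key.split('__')
        if len(parts) > 1 and parts[-1] in OPERATORS:
            # e.g. elemental__fire__gt=2 -> {'elemental.fire': {'$gt': 2}}
            field = '.'.join(parts[:-1])
            query.setdefault(field, {})['$' + parts[-1]] = value
        else:
            # Plain equality, with nested fields joined by dots
            query['.'.join(parts)] = value
    return query
```

The resulting dictionary is the kind of filter that pymongo's `find` accepts, which keeps the wrapper thin.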
Some features
● Dirty fields
● No type checks
● Query logs
● 10x faster than SimpleDB!!!
Query logs
[07:46:06] - 2.6 ms - get_from_db - Reinforcement - Reinforcements.py(31)
[07:46:06] - 4.3 ms - get_from_db - Player - Player.py(876)
[07:46:10] - 0.1 ms - filter - Membership - AllianceMembership.py(110)
[07:46:10] - 1.3 ms - get_from_db - Reinforcement - Reinforcements.py(31)
[07:46:10] - 1.4 ms - get_from_db - Notifications - Notifications.py(56)
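Log lines in this shape can be produced by timing each query with a context manager. A minimal sketch, not MongoSpell's code: `logged_query` and `query_log` are hypothetical names, and the calling file and line number shown in the real logs are left out for brevity.

```python
import time
from contextlib import contextmanager

query_log = []  # in a real application this would go to a logger


@contextmanager
def logged_query(operation, collection):
    """Time the enclosed block and record one log line for it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        query_log.append('[%s] - %.1f ms - %s - %s' % (
            time.strftime('%H:%M:%S'), elapsed_ms, operation, collection))


# Usage: wrap the actual database call
with logged_query('get_from_db', 'Player'):
    pass  # the query would run here
```

Because the timing happens in the `finally` clause, failed queries are logged too, which helps when hunting for slow or timing-out requests.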
Scalability vs Efficiency
Scalable vs Efficient
Scalable
● Can support more users by adding more elements
Efficient
● Can support more users with the same elements
Work on both to achieve your goals
Keep measuring and improving! (and monitor production to be proactive)
Thank you for your interest!
Questions?
[email protected]
http://WrongSideOfMemphis.wordpress.com
http://www.joltonline.com