gevent at TellApart
Kevin Ballard kevin(at)tellapart(dot)com
Image ©2003-2012 `DivineError
TellApart’s Infrastructure Overview
• Millions of daily active users
• Page views across multiple sites
• Real-Time Bidding integration
  - Very high volume, low latency
  - Response time: 50th percentile 17 ms, 95th percentile 50 ms
• All requests require user data
• Entirely Amazon Web Services (AWS), in 2 parallel regions
What is gevent?
gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
• Essentially, allows normally synchronous code to run asynchronously
What is gevent?
lib·e·vent (ˈlib-i-ˈvent): efficient cross-platform library for executing callbacks when specific events occur or a timeout has been reached. Includes several networking libraries (e.g. DNS, HTTP)
green·let (ˈgrēn-lət): lightweight coroutines for in-process concurrent programming. Ported from Stackless Python as a library for the CPython interpreter
How does gevent work?
• One gevent “hub” per process
• Monkey-patch blocking libraries
  - socket, thread, select, etc.
• Use greenlets like threads
• Blocking calls switch to another (ready) greenlet
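The switching behavior described above can be sketched in a few lines. This is an illustration only: `worker` and `run_workers` are hypothetical names, and `gevent.sleep` stands in for a blocking network call so the example needs no monkey-patching.

```python
import gevent

# In a real service, `from gevent import monkey; monkey.patch_all()` would
# usually run first so that standard-library sockets yield the same way.

def worker(name, delay, events):
    events.append(f"{name} start")
    gevent.sleep(delay)  # cooperative "blocking" call: control returns to the hub
    events.append(f"{name} done")

def run_workers():
    events = []
    jobs = [
        gevent.spawn(worker, "slow", 0.05, events),
        gevent.spawn(worker, "fast", 0.01, events),
    ]
    gevent.joinall(jobs)  # wait until both greenlets have finished
    return events
```

Both greenlets start before either finishes, and the shorter wait completes first even though it was spawned second.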
Example Server
[Code listings labeled "mod_wsgi" and "gevent"; not captured in this transcript]
Example Server
• Server implementation is the same
• DB lookup blocks on network IO
• With gevent, greenlet gets swapped out so another request can be served
• When the DB request finishes, the greenlet will continue where it left off
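The slide's own listings are not available here, so the following is only a rough sketch of what a gevent-served WSGI application with a blocking lookup might look like. `lookup_user`, `application`, and `serve` are hypothetical names, and `gevent.sleep` stands in for the network round trip to the database.

```python
import gevent
from gevent.pywsgi import WSGIServer

def lookup_user(user_id):
    # Assumed stand-in for a DB call that blocks on network IO.
    gevent.sleep(0.01)
    return {"id": user_id, "name": "user-%s" % user_id}

def application(environ, start_response):
    # Plain WSGI: nothing here is gevent-specific except the lookup stand-in.
    user_id = environ.get("QUERY_STRING") or "anonymous"
    user = lookup_user(user_id)  # greenlet is swapped out while this waits
    body = ("Hello, %s\n" % user["name"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

def serve(host="127.0.0.1", port=8000):
    # Each incoming request is handled in its own greenlet.
    WSGIServer((host, port), application).serve_forever()
```

Because `application` is an ordinary WSGI callable, the same function could be mounted under a different WSGI server without changes.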
Advantages
• Write code as though it were synchronous (mostly)
  - No ‘callback spaghetti’ like with a callback framework
  - Exact same code can run synchronously (e.g. unit tests)
• Greenlets are very lightweight
  - 100s or 1000s can run concurrently
  - No context switch
    o Same order of magnitude as a function call
  - No GIL related performance issues
• Co-operative concurrency makes synchronization easy
  - Greenlets cannot be preempted
  - No need for in-process atomic locks
  - Often eliminates the need for synchronization
    o As long as there are no blocking calls in the critical section
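A small sketch of the lock-free point, under the assumption that the critical section contains no blocking call. `run_counter` and `increment` are illustrative names, not part of any TellApart code.

```python
import gevent

def run_counter(greenlets=50, increments=200):
    counter = {"value": 0}

    def increment():
        for _ in range(increments):
            # Critical section: read-modify-write with no blocking call,
            # so no other greenlet can run between the two statements.
            current = counter["value"]
            counter["value"] = current + 1
            # Yield outside the critical section so the greenlets interleave.
            gevent.sleep(0)

    jobs = [gevent.spawn(increment) for _ in range(greenlets)]
    gevent.joinall(jobs)
    return counter["value"]
```

If the `gevent.sleep(0)` were moved between the read and the write, updates could be lost, which is the caveat the last bullet describes.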
Advantages (continued)
• gevent is fast
  - Very thorough set of benchmarks by Nicholas Piël: http://nichol.as/benchmark-of-python-web-servers
“And then there is Gevent [...] if you want to dive into high performance websockets with lots of concurrent connections you really have to go with an asynchronous framework. Gevent seems like the perfect companion for that, at least that is what we are going to use.”
Problems
• Monkey-patching
  - Doesn’t play well with C extensions
    o Blocking code in C libraries will cause the process to block
  - Can confuse some libraries
    o e.g. thread-local storage
• Breaks analysis tools
  - cProfile produces garbage
  - Alternative tools available
    o gevent-profiler (Meebo)
    o gevent_request_profiler (TellApart)
• Co-operative scheduling
  - Rogue greenlets can tie up the entire process
    o e.g. CPU bound background worker
  - Long-running tasks have to periodically yield
Problems
• Same server as before
• Processing in the loop can take a long time
• Can hurt latency of other requests
• Add ‘gevent.sleep(0)’ to loop
• Allows other greenlets to run
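The fix described above might look roughly like this. `crunch`, `quick_request`, and the `yield_every` parameter are assumptions made for the illustration; how often to yield is a tuning choice.

```python
import gevent

def crunch(items, log, yield_every=1000):
    # CPU-bound loop that periodically gives other greenlets a turn.
    total = 0
    for i, item in enumerate(items, 1):
        total += item * item
        if i % yield_every == 0:
            gevent.sleep(0)  # yield to any greenlet that is ready to run
    log.append("crunch done")
    return total

def quick_request(log):
    log.append("request served")

def demo():
    log = []
    worker = gevent.spawn(crunch, range(10000), log)
    request = gevent.spawn(quick_request, log)
    gevent.joinall([worker, request])
    return log, worker.value
```

Without the `gevent.sleep(0)` call, the short request would have to wait until the whole loop had finished.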
Uses
• We use gevent everywhere we use Python
• TellApart Front End (TAFE)
  - gevent WSGI server with a micro-framework
  - One process per core
  - Nginx reverse-proxy in front
• Database Proxy (moxie)
  - Thrift service
  - Connection pooling across clients
  - Minimal additional latency (~2ms)
Case Study - Taba
• Taba is a distributed Event Aggregation Service
• Provides near real-time metrics from across a cluster
• At TellApart:
  - 10,000 individual Tabs
  - 100s of event source clients
  - 20,000,000 events / minute
  - 25 seconds latency from real-time
Case Study - Taba
• Implement timeouts very easily
• Function doesn’t need to know it’s being timed
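One way this can be done with gevent is the `gevent.Timeout` context manager. The sketch below is not Taba's code; `slow_lookup` and `call_with_deadline` are hypothetical names.

```python
import gevent

def slow_lookup(delay):
    # Knows nothing about timeouts; gevent.sleep stands in for network IO.
    gevent.sleep(delay)
    return "result"

def call_with_deadline(func, seconds, *args, default=None):
    # Passing False as the exception makes the context manager swallow
    # the timeout silently instead of raising it to the caller.
    with gevent.Timeout(seconds, False):
        return func(*args)
    return default  # reached only when the timeout fired
```

For example, `call_with_deadline(slow_lookup, 0.01, 0.2, default="timed out")` gives up after 10 ms and returns the default. The timeout can only interrupt the function at a point where it yields, so a CPU-bound function would not be cut short.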
Case Study – Taba
• Perform simultaneous lookups to a sharded database
• No thread pools
• No need for locking
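A minimal sketch of fanning out to shards with one greenlet per lookup. `lookup_shard` is a hypothetical stand-in for the real database client, and the `timeout` parameter is an assumption added for the example.

```python
import gevent

def lookup_shard(shard_id, key):
    gevent.sleep(0.01)  # stand-in for the network round trip to one shard
    return {"shard": shard_id, "key": key}

def lookup_all(shard_ids, key, timeout=1.0):
    # One greenlet per shard: the waits overlap instead of adding up.
    jobs = [gevent.spawn(lookup_shard, shard_id, key) for shard_id in shard_ids]
    gevent.joinall(jobs, timeout=timeout)
    # job.value is None for a lookup that failed or did not finish in time.
    return [job.value for job in jobs]
```

The results list is built after `joinall` returns, in the same order as `shard_ids`, so no lock is needed around it.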
Case Study – Taba
• Streaming from DB in batches
• No thread pool
• Trivial synchronization
• Process data while the next batch is retrieved
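The overlap of fetching and processing could be sketched with a bounded `gevent.queue.Queue`. This is an assumed structure rather than Taba's actual implementation; `fetch_batches` and `process_stream` are hypothetical names, and a list of integers plays the role of the database.

```python
import gevent
from gevent.queue import Queue

_DONE = object()  # sentinel marking the end of the stream

def fetch_batches(rows, batch_size, queue):
    for start in range(0, len(rows), batch_size):
        gevent.sleep(0.001)  # stand-in for one DB round trip
        queue.put(rows[start:start + batch_size])  # waits while the queue is full
    queue.put(_DONE)

def process_stream(rows, batch_size=3):
    queue = Queue(maxsize=1)  # at most one prefetched batch waiting
    gevent.spawn(fetch_batches, rows, batch_size, queue)
    total = 0
    while True:
        batch = queue.get()  # yields to the fetcher if nothing is ready yet
        if batch is _DONE:
            return total
        total += sum(batch)  # process this batch while the next is fetched
```

The queue is the only synchronization point, and its size bounds how far the fetcher can run ahead of the consumer.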
Thank you!
Kevin Ballard kevin(at)tellapart(dot)com