pushing python: building a high throughput, low latency system

Post on 25-May-2015

22.426 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Kevin  Ballard  SFpython.org  2014-­‐03-­‐12  

kevin@  tellapart.com  

Introductions

Taba • Distributed event aggregation service

import taba ... taba.RecordValue(‘winning_bid_price’, wincpm) ...

$ taba-cli aggregate winning_bid_price {“name”: “winning_bid_price”, “10m”: {“count”: 14709, “total”: 5836.4}, “percentiles”: [0.07 0.16 0.32 0.84 1.33 8.03]}

Taba +10,000,000  events/sec  

+50,000  metrics  

+1,000  clients  

+100  processors  

GET THE DATA MODEL RIGHT Lesson #1

Data Model

Data Model Event:  (‘bid_cpm’,  ‘Counter’,  time(),  0.233)      State:            Aggregate:  {“10m”:  43.9,  “1h”:  592.22}    

Data Model

Data Model

Data Model

STATE IS HARD Lesson #2

Centralizing State

GENERATORS + GREENLETS = AWESOME

Lesson #3

Asynchronous Iterator

• JIT processing • Automatically switches through I/O

CPYTHON SUFFERS FROM MEMORY FRAGMENTATION

Lesson #4

Fragmentation • Fragmentation is when a process’s heap is

inefficiently used.

• The GC may report a low memory footprint, but the OS reports a much larger RSS.

Fragmentation

Fragmentation

Fragmentation

Fragmentation

Fragmentation

Hybrid Memory Management • Use Cython to allocate page-sized blocks of

pointers into incoming chunk • Hand-off the whole thing to the CPython

memory manager • Whole thing gets deallocated at once

Hybrid Memory Management

Hybrid Memory Management

Hybrid Memory Management

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Ratcheting •  Ratcheting is a pathological case of Fragmentation,

caused by the fact that the heap must be contiguous*:

•  It’s a limitation of CPython that it cannot compact memory (mostly due to extensions).

Ratcheting • Avoid persistent objects • Sockets are common offenders

• Anything that has to be persistent should be created at application startup, before processing data

• Avoid letting the heap grow in the first place

fin.

github.com/tellapart/taba      

kevin@tellapart.com      |    @misterkgb    

We’re  Hiring!      tellapart.com/careers  

top related