
Why I still develop synchronous web in the asyncio era

April 7th, 2017

Giovanni Barillari - pycon otto - Firenze, Italy

Who am I?

I'm Gio! (pronounced as Joe)

trust me, I’m a physicist :)

code principally in Python, Ruby, Java, Javascript

work with web applications since 2012

CTO @ Sellf since 2016

I love Open Source

contributor of web2py framework since 2012

maintainer of pydal library since 2015

author of weppy framework since 2014

Disclaimer

This is quite a subjective talk

I take for granted the need for a database in web development

asynchronous IO

an approach used to achieve concurrency by allowing processing to continue while responses from IO operations are still being waited upon

the event loop

asyncio programming is heavily centered on the notion of an event loop, which in its most classic form uses callback functions that receive a call once their corresponding IO request has data available
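A minimal sketch of that idea using asyncio's low-level callback API (the callback name and the delays are just placeholders standing in for real IO completions):

import asyncio

def on_data_ready(name):
    # called by the loop once the corresponding "IO request" has data
    print(name, "has data available")

loop = asyncio.get_event_loop()
# pretend two IO requests complete at different times
loop.call_later(0.1, on_data_ready, "request-1")
loop.call_later(0.2, on_data_ready, "request-2")
loop.run_until_complete(asyncio.sleep(0.3))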

the asyncio era

started with javascript

Javascript was designed to be a client-side scripting language for browsers.

Browsers, like any other GUI app, are essentially event machines. All they do is respond to user-initiated events.

then came node

and Javascript became a server side language.

Reason for its success: it fully embraces the event-driven programming paradigm that client-side programmers are already well-versed in and comfortable with.

the non-blocking IO approach, appropriate for the classic case of lots of usually asleep or arbitrarily slow connections, became the de facto style in which all web-oriented software should be written.

we all hate threads

asynchronous programming is used to criticise multithreaded programming since:

threads are expensive to create and maintain in an application

threaded programming is difficult and non-deterministic

the python world

continued confusion over what the GIL does and does not do provided a fertile land for the async model to take root strongly

Since Python 3.3 we moved from the implicit async IO paradigm offered by eventlet and gevent to the futures and coroutines concepts of the asyncio module.
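A minimal sketch of that shift (sleep() stands in for a real IO wait, and gevent is assumed to be installed): with gevent the context switch happens implicitly inside a blocking-looking call, with asyncio it is spelled out with await.

import gevent

def fetch(n):
    gevent.sleep(1)  # implicit switch point: gevent jumps to another greenlet here
    return n

gevent.joinall([gevent.spawn(fetch, n) for n in range(3)])


import asyncio

async def fetch_async(n):
    await asyncio.sleep(1)  # explicit switch point, visible in the code
    return n

async def main():
    return await asyncio.gather(*(fetch_async(n) for n in range(3)))

loop = asyncio.get_event_loop()
loop.run_until_complete(main())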

are you sure?

the throughput myth

since you avoid the wait for I/O context switches, asynchronous programming styles are innately superior for concurrent performance in nearly all cases.

Are you sure your application is I/O bound?

I/O bound refers to a condition in which the time it takes to complete a computation is determined principally by the period spent waiting for input/output operations to be completed.

Are you sure context-switching is the bottleneck in your real world application?
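One rough way to answer that is to time the IO part and the CPU part of a request separately. A minimal sketch (the 5 ms sleep is a placeholder for a real database round trip):

import time

def fake_db_query():
    time.sleep(0.005)  # stands in for a real database round trip
    return list(range(20))

def handle_request():
    t0 = time.perf_counter()
    rows = fake_db_query()
    io_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    payload = [{"n": n, "square": n * n} for n in rows]  # the CPU-bound part
    cpu_time = time.perf_counter() - t0

    print("IO: %.2f ms, CPU: %.2f ms" % (io_time * 1000, cpu_time * 1000))
    return payload

handle_request()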

Python is slow

I mean REALLY slow

Web applications deal with databases

communication with the database takes up a majority of the time spent in a database-centric application

This is common wisdom in compiled languages, but Python is very slow compared to such systems.

image made by zzzeek

asyncio is slow

Insert into postgres a few million rows:

Python 2.7.8 threads (22k r/sec, 22k r/sec)

Python 3.4.1 threads (10k r/sec, 21k r/sec)

Python 2.7.8 gevent (18k r/sec, 19k r/sec)

Python 3.4.1 asyncio (8k r/sec, 10k r/sec)

benchmarks made by zzzeek

uvloop

benchmarks made by magicstack

but these are just TCP connections

dude wait, the magicstack guys also published asyncpg!

where the core parsing is written in Cython

and sadly it is not DBAPI v2 compatible (PEP 249)

web development is something more than networking

what are we benchmarking?

the benchmarks’ fairy dust

benchmarks from Falcon website

benchmarks made by TechEmpower

we should stop looking just at numbers

let’s make a pointless benchmark

i5-6600k - OSX 10.11.6 - docker 17.03.0 - python 3.6.0 - wrk -d 15 -c [8-128] -t 4

serialize {"message": "Hello, World!"} in json

let’s make a realistic benchmark

i5-6600k - OSX 10.11.6 - docker 17.03.0 - python 3.6.0 - wrk -d 15 -c [8-128] -t 4

load 20 records from postgres and serialize in json
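Roughly, the synchronous endpoint under test would look like this sketch (using psycopg2 and the standard json module; the DSN and the table are placeholders):

import json
import psycopg2

conn = psycopg2.connect("dbname=bench user=bench")  # placeholder connection

def handler():
    with conn.cursor() as cur:
        cur.execute("SELECT id, message FROM records LIMIT 20")
        rows = cur.fetchall()
    return json.dumps([{"id": r[0], "message": r[1]} for r in rows])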

when you do benchmarks, be sure of what you're actually benchmarking

the json serialization benchmark on sanic equals benchmarking:

MagicStack's httptools library vs gunicorn/uwsgi HTTP parsing

ujson vs the standard json library

which are faster independently of asyncio
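That part is easy to measure on its own, independently of any web framework; a minimal sketch with timeit (assuming ujson is installed):

import json
import timeit
import ujson

payload = {"message": "Hello, World!"}

std = timeit.timeit(lambda: json.dumps(payload), number=100000)
fast = timeit.timeit(lambda: ujson.dumps(payload), number=100000)
print("json: %.3fs  ujson: %.3fs" % (std, fast))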

the code simplicity

threads are bad. asyncio code is more explicit and you'll have fewer bugs in your program.

The principle is basically:

I want context switches syntactically explicit in my code. If they aren't, reasoning about it is exponentially harder.

In practice, you'll end up with so many yield from or await lines in your code that you reach the point of

well, I guess I could context switch just about anywhere

which is the problem you were trying to avoid in the first place.

def get_account_data():
    user = database.get_user()
    preferences = database.get_user_preferences(user)
    post_count = database.get_post_count(user)
    return locals()

async def get_account_data():
    user = await database.get_user()
    preferences = await database.get_user_preferences(user)
    post_count = await database.get_post_count(user)
    return locals()

forget about thread locals

from framework import request, response, session

def message(value):
    return {'message': value, 'page': request.page}

@app.route('/foo')
def foo():
    return message('foo')

@app.route('/bar')
def bar():
    return message('bar')

def message(value, request):
    return {'message': value, 'page': request.page}

@app.route('/foo')
async def foo(request, response, session):
    return message('foo', request)

@app.route('/bar')
async def bar(request, response, session):
    return message('bar', request)

your code is just less DRY

are we re-inventing the wheel?

Remember tornado?

aiohttp, muffin, Kyoukai, sanic..

can you use any of these in production to write a real app?

are we just moving the dust?

Before asyncio these were your 2 best friends

nginx

uwsgi

why?

nginx uses an event loop to process requests

and it is pure C

with asyncio you would use gunicorn or other WSGI/ASGI servers

and I still ask myself

is it better to put it behind nginx or not?

does it mean we’re moving the HTTP stack to python code?

you saying you don’t use asyncio?

Of course I use it.

when I do HTTP requests

@app.register('/oauth/{provider}')
def oauth(request):
    provider = request.match_info.get('provider')
    client, _ = yield from app.ps.oauth.login(provider, request)
    user, data = yield from client.user_info()
    url = app.cfg['MOON_HOST'] + '/v1/oauth/' + provider
    resp = yield from aiohttp.request(
        'POST', url, data=json.dumps({'user': user, 'data': data}))
    redir_url = app.cfg['APP_HOST'] + '/?'
    rv = yield from resp.json()
    redir_url += urlencode(rv)
    raise aiohttp.web.HTTPFound(redir_url)

summing up

asyncio is awesome compared to the nodejs world

uvloop is just amazing

with python 3.6 asyncio seems pretty stable

pypy started supporting async code in the latest RC

concurrency doesn’t mean things go faster

there’s no need to asyncify everything

avoid Hype Driven Development

async or not, the performance of your application highly depends on your application code

The future is bright, but we’re not there yet

I still see more cons than pros in turning web development async

Thank you.

Let’s discuss.
