Asynchronous programming done right - Node.js

Post on 02-Jul-2015

Category:

Technology

DESCRIPTION

Asynchronous programming done right. Without race conditions. Good practices in Node.js.

TRANSCRIPT

Asynchronous programming done right.

Without race conditions.

by Piotr Pelczar (Athlan)

About me

Piotr Pelczar

Freelancer for 8 years

PHP, Node.js, Java/Groovy

Zend Certified Engineer

IPIJ, Startups

Stay in touch

athlan.pl

me@athlan.pl

facebook.com/piotr.pelczar

github.com/athlan

slideshare.net/piotrpelczar

linkedin.com/in/ppelczar

Asynchronous programming

Asynchronous actions are actions executed in a non-blocking scheme, allowing the main program flow to continue processing.

How does software live in hardware?

Operating systems are process-based

Each process is assigned a processor, registers and memory

Processes run in parallel using threads (thread pools)

Switching the processor between processes or threads causes context switching

1. Context switching = wasted time

Sync programming

In the trivial, sequential approach, each operation is executed in order:

O(t) -> O(t+1)

If O(t) gets stuck, O(t+1) waits...

This is cool: the software flow is predictable. But not for high-throughput I/O.

I/O is costly because of the waiting time...

High-throughput I/O

High-throughput I/O doesn't mean:

Memory operations

Fast single-thread computing

High-throughput I/O means:

HTTP requests

Database connections

Queue system dispatching

HDD operations

2. Avoid I/O blocking
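A minimal sketch of the difference between a blocking and a non-blocking read, using only Node's built-in fs module. The scratch file name is made up for the illustration; nothing here comes from the original slides.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical scratch file, created only for this illustration.
var file = path.join(os.tmpdir(), 'async-demo-' + process.pid + '.txt');
fs.writeFileSync(file, 'hello');

var steps = [];

// Blocking: nothing else can run until the read has finished.
var text = fs.readFileSync(file, 'utf8');
steps.push('sync read: ' + text);

// Non-blocking: the read is handed off and the program keeps going.
fs.readFile(file, 'utf8', function (err, data) {
  if (err) throw err;
  steps.push('async read: ' + data);
  fs.unlinkSync(file); // clean up the scratch file
});

// This line runs before the readFile callback above.
steps.push('after readFile call');
```

With the blocking call, the result is available on the next line. With the non-blocking call, the result only exists inside the callback, which runs later.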

Single-threaded, event loop model

Imagine a man who has a task:

Walk around

When a bucket is full of water, just pour it into another bucket

Go to the next bucket

There is no sequence

In async programming, results do not arrive in sequence.

    operation1(); // will output "operation1 finished."
    operation2(); // will output "operation2 finished."
    operation3(); // will output "operation3 finished."

operation1() would be:

    var amqp = require("amqp");

    function operation1(callback) {
        var eventbus = amqp.createConnection();
        console.log("AMQP connecting...");

        // "ready" fires later, once the connection is established
        eventbus.on("ready", function() {
            console.log("AMQP connected...");
            return callback();
        });
    }

operation2() would be:

    var redis = require("redis");

    function operation2(callback) {
        // port, host, options and pass are assumed to be defined elsewhere
        var conn = redis.createClient(port, host, options);
        console.log("Redis connecting...");

        conn.auth(pass, function(err) {
            if (err)
                console.log("Redis failed...");
            else
                console.log("Redis connected...");
            return callback();
        });
    }

operation3() would be:

    var mongojs = require("mongojs");

    function operation3(callback) {
        console.log("Mongo connecting...");
        var conn = mongojs.connect(connectionString); // blocking operation
        console.log("Mongo connected...");

        return callback();
    }

Expectations?

    AMQP connecting...   // operation1()
    AMQP connected...    // operation1()
    Redis connecting...  // operation2()
    Redis failed...      // operation2()
    Mongo connecting...  // operation3(), blocking
    Mongo connected...   // operation3()

The result:

    AMQP connecting...   // operation1()
    Redis connecting...  // operation2()
    Mongo connecting...  // operation3(), blocking
    Mongo connected...   // operation3()
    Redis failed...      // operation2()
    AMQP connected...    // operation1()
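The same kind of ordering can be reproduced without any broker or database. A minimal sketch, with setTimeout standing in for network latency; the delays (100 ms and 50 ms), the log messages and the finished() helper are made up for the illustration.

```javascript
var log = [];

// Stand-in for the AMQP connection: slow, non-blocking.
function operation1(callback) {
  log.push('op1 connecting...');
  setTimeout(function () { log.push('op1 connected...'); callback(); }, 100);
}

// Stand-in for the Redis connection: faster, non-blocking.
function operation2(callback) {
  log.push('op2 connecting...');
  setTimeout(function () { log.push('op2 connected...'); callback(); }, 50);
}

// Stand-in for the blocking Mongo call: finishes before it returns.
function operation3(callback) {
  log.push('op3 connecting...');
  log.push('op3 connected...');
  callback();
}

// Prints the log once all three operations have reported back.
var pending = 3;
function finished() {
  pending -= 1;
  if (pending === 0) console.log(log.join('\n'));
}

operation1(finished);
operation2(finished);
operation3(finished);
```

All three "connecting" lines appear first, then the blocking operation completes, and the non-blocking ones complete in order of their latency rather than in the order they were started.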

So... what do functions return?

You can start future tasks inside a function, so what will be returned?

value123 will be returned right after the blocking code, without waiting for the non-blocking operations.

    function my_function() {
        operation1();
        operation2();
        operation3();

        return "value123";
    }
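A minimal sketch of that behaviour, with setTimeout standing in for a non-blocking operation. The names myFunction, done and events are hypothetical.

```javascript
function myFunction(done) {
  // Non-blocking part: the callback is only scheduled here.
  setTimeout(function () { done('async result'); }, 10);

  // Returned immediately, before the timer above has fired.
  return 'value123';
}

var events = [];

var returned = myFunction(function (result) {
  events.push('callback: ' + result);
});

// At this point only the return value exists; the callback has not run yet.
events.push('returned: ' + returned);
```

The return value is recorded first and the asynchronous result arrives afterwards through the callback.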

Assume: functions do NOT return values

The function body is executed immediately, from top to bottom. You cannot rely on the return value to carry an asynchronous result, because the function returns before that result exists.

Callbacks

A callback is a reference to a function.

    var callbackFunction = function(result) {
        console.log("Result: %s", result)
    }

When the operation is done, the callback function is executed.

    callbackFunction("test1") // "Result: test1" will be printed out

Since callbackFunction is a variable (its value is a reference), it can be passed as a function argument.

    var callbackFunction = function() { ... }
    someOtherFunction(callbackFunction);

    function someOtherFunction(callback) {
        callback(); // execute function from argument
    }

Functions can be defined as anonymous (closures):

    function someOtherFunction(callback) {
        var arg1 = "test";
        callback(arg1); // execute function from argument
    }

    someOtherFunction(function(arg1) {
        console.log('done... %s', arg1);
    })

Callbacks can be nested

Nesting callbacks makes code unreadable:

    var amqp = require('amqp');

    var connection = amqp.createConnection();

    connection.on('ready', function() {
        connection.exchange("ex1", function(exchange) {
            connection.queue('queue1', function(q) {
                q.bind(exchange, 'r1');

                q.subscribe(function(json, headers, info, m) {
                    console.log("msg: " + JSON.stringify(json));
                });
            });
        });
    });

Every extra step adds another level of nesting:

    var amqp = require('amqp');

    var connection = amqp.createConnection();

    connection.on('ready', function() {
        connection.exchange("ex1", function(exchange) {
            connection.queue('queue1', function(q) {
                q.bind(exchange, 'r1');

                q.subscribe(function(json, headers, info, m) {
                    console.log("msg: " + JSON.stringify(json));
                    // table, select and data are assumed to be defined elsewhere
                    table.update(select, data, function() {
                        table.find(select, function(err, rows) {
                            // inserted rows...
                        });
                    });
                });
            });
        });
    });

Asynchronous control flows

Promise design pattern

Libraries that manage callback references

Promise design pattern

1. The client calls a function that will deliver its result in the future, so what it gets back is a promise

2. The function returns the promise object immediately, before the non-blocking operations complete

3. The client registers callbacks

4. The callbacks are fired in the future, when the task is done

    var resultPromise = loader.loadData(sourceFile)

    resultPromise(function success(data) {
        // called when the operation succeeds
    }, function error(err) {
        // called on failure
    })

On the implementing side:

1. Create a deferred object

2. Return def.promise

3. Call resolve() or reject()

    var deferred = require('deferred');
    var spawn = require('child_process').spawn;

    var loadData = function(sourceFile) {
        var def = deferred()
          , proc = spawn('java', ['-jar', 'loadData.jar', sourceFile])

        // start from empty strings, so that += does not produce "null..."
        var commandProcessBuff = ''
          , commandProcessBuffError = '';

        proc.stdout.on('data', function (data) {
            commandProcessBuff += data
        })
        proc.stderr.on('data', function (data) {
            commandProcessBuffError += data
        })

        proc.on('close', function (code) {
            if ('' !== commandProcessBuffError)
                def.reject(commandProcessBuffError)
            else
                def.resolve(commandProcessBuff)
        })

        return def.promise
    }
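The same pattern can be sketched with the Promise built into current Node.js versions instead of a deferred library. This is an illustration under that assumption, not the code from the slides; the run() helper and the child command are hypothetical, and the exit code is used to decide between resolve and reject.

```javascript
var spawn = require('child_process').spawn;

// Runs a command and returns a promise for its standard output.
function run(command, args) {
  return new Promise(function (resolve, reject) {
    var proc = spawn(command, args);
    var out = '';
    var err = '';

    proc.stdout.on('data', function (data) { out += data; });
    proc.stderr.on('data', function (data) { err += data; });

    // Spawn failures (e.g. command not found) arrive as an 'error' event.
    proc.on('error', reject);

    proc.on('close', function (code) {
      if (code !== 0) return reject(new Error('exit code ' + code + ': ' + err));
      return resolve(out);
    });
  });
}

// Usage: run the current Node binary as a harmless child process.
var result = run(process.execPath, ['-e', 'console.log("loaded")']);

result.then(function (data) {
  console.log(data.trim());
}, function (error) {
  console.error(error.message);
});
```

The promise is returned immediately, and the callbacks passed to then() are fired later, once the child process has closed. Calling reject() a second time has no effect, which is one reason promises help against double callbacks.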

Async Node.js library

Provides control flows like:

Sequences (series)

Waterfalls (sequences with parameters passing)

Parallel (with limit)

Some/every conditions

While/until

Queue

Series

    async.series([
        function(callback) {
            // operation1
            callback()
        },
        function(callback) {
            // operation2
            callback()
        },
        function(callback) {
            // operation3
            callback()
        }
    ], function() {
        console.log('all operations done')
    })
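To show what a series control flow does, here is a minimal hand-rolled sketch. It is not the async library's implementation; the series() helper, the delays and the trace array are assumptions made for the illustration.

```javascript
// Runs tasks one after another; stops at the first error.
function series(tasks, done) {
  var results = [];

  function next(index) {
    if (index === tasks.length) return done(null, results);

    tasks[index](function (err, result) {
      if (err) return done(err, results);
      results.push(result);
      return next(index + 1);
    });
  }

  next(0);
}

var trace = [];

series([
  function (callback) {
    // slowest task, still finishes first because it is first in line
    setTimeout(function () { trace.push('operation1'); callback(null, 1); }, 30);
  },
  function (callback) {
    setTimeout(function () { trace.push('operation2'); callback(null, 2); }, 10);
  },
  function (callback) {
    trace.push('operation3');
    callback(null, 3);
  }
], function (err, results) {
  trace.push('all operations done');
  console.log(trace.join(', '), results);
});
```

Each task starts only after the previous one has called its callback, so the order is fixed regardless of how long each task takes.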

Parallel

    async.parallel([
        function(callback) {
            // operation1
            callback()
        },
        function(callback) {
            // operation2
            callback()
        },
        function(callback) {
            // operation3
            callback()
        }
    ], function() {
        console.log('all operations done')
    })

Parallel limit

    var tasks = [
        function(callback) {
            // operation1
            callback()
        },
        function(callback) {
            // operation2
            callback()
        },
        // ...
    ]

    // run at most 2 tasks at the same time
    async.parallelLimit(tasks, 2, function() {
        console.log('all operations done')
    })
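The idea behind a parallel limit can be sketched by hand as follows. This is an illustration only and not how the async library is implemented; the parallelLimit() and makeTask() helpers and the delays are hypothetical.

```javascript
// Runs tasks concurrently, but never more than `limit` at a time.
function parallelLimit(tasks, limit, done) {
  var results = new Array(tasks.length);
  var nextIndex = 0;
  var running = 0;
  var completed = 0;
  var failed = false;

  if (tasks.length === 0) return done(null, results);

  // Starts waiting tasks until the limit is reached.
  function launch() {
    while (!failed && running < limit && nextIndex < tasks.length) {
      start(nextIndex++);
    }
  }

  function start(index) {
    running += 1;
    tasks[index](function (err, result) {
      if (failed) return;
      running -= 1;
      if (err) { failed = true; return done(err); }
      results[index] = result;
      completed += 1;
      if (completed === tasks.length) return done(null, results);
      return launch();
    });
  }

  launch();
}

// Usage: four timer-based tasks, at most two in flight at once.
var active = 0;
var peak = 0;

function makeTask(name, delay) {
  return function (callback) {
    active += 1;
    peak = Math.max(peak, active);
    setTimeout(function () { active -= 1; callback(null, name); }, delay);
  };
}

var summary = null;

parallelLimit([makeTask('a', 30), makeTask('b', 10), makeTask('c', 10), makeTask('d', 10)], 2,
  function (err, results) {
    summary = { err: err, results: results, peak: peak };
    console.log(summary);
  });
```

Results are stored by task index, so they keep the order of the task list even though the tasks finish in a different order.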

Waterfall

    async.waterfall([
        function(callback) {
            // operation1
            callback(null, arg1, arg2)
        },
        function(arg1, arg2, callback) {
            // operation2
            callback(null, foo, bar)
        },
        function(foo, bar, callback) {
            // operation3
            callback(null)
        }
    ], function() {
        console.log('all operations done')
    })

Whilst

    async.doWhilst(
        function(done) {
            // operation1
            done(null, arg1, arg2)
        },
        function() {
            return pages < limit
        },
        function() {
            console.log('done')
        }
    )

Asynchronous programming traps

Dealing with callbacks may be tricky. Keep your code clean.

Unnamed callbacks

Keep your code clean: don't name every callback function "callback".

    function doSomething(callback) {
        return callback;
    }

With nesting, it quickly becomes unclear which callback is which:

    function doSomething(callback) {
        doAnotherThing(function(callback2) {
            doYetAnotherThing(function(callback3) {
                return callback();
            })
        })
    }

Instead of this, give your callbacks descriptive names:

    function doSomething(done) {
        doAnotherThing(function(doneFetchingFromApi) {
            doYetAnotherThing(function(doneWritingToDatabase) {
                return done();
            })
        })
    }

Double callbacks

    function doSomething(done) {
        doAnotherThing(function (err) {
            if (err) done(err);
            done(null, result);
        });
    }

When err is set, the callback is fired twice!

Fix: always prepend the callback execution with a return statement.

    function doSomething(done) {
        doAnotherThing(function (err) {
            if (err) return done(err);
            return done(null, result);
        });
    }

Normally, return ends function execution, so why not keep this rule in async code?

Double callbacks are very hard to debug.

A callback wrapper can be written that executes the callback only once. Without it, done is called twice here:

    setTimeout(function() {
        done('a')
    }, 200)
    setTimeout(function() {
        done('b')
    }, 500)

The wrapper:

    var CallbackOnce = function(callback) {
        this.isFired = false
        this.callback = callback
    }

    CallbackOnce.prototype.create = function() {
        var delegate = this
        return function() {
            if (delegate.isFired) return
            delegate.isFired = true
            delegate.callback.apply(null, arguments)
        }
    }

And its usage:

    var obj1 = new CallbackOnce(done)

    // decorate callback
    var safeDone = obj1.create() // safeDone() is a proxy function that passes arguments

    setTimeout(function() {
        safeDone('a') // safe now...
    }, 200)
    setTimeout(function() {
        safeDone('b') // safe now, this second call is ignored
    }, 500)

Unexpected callbacks

Never fire a callback until the task is done.

    function doSomething(done) {
        doAnotherThing(function () {
            if (condition) {
                var result = null
                // prepare result...
                done(result);
            }
            return done(null);
        });
    }

Without a return in the if branch, the final done(null) is fired even if the condition passes.

Make the branches explicit, each ending with its own return:

    function doSomething(done) {
        doAnotherThing(function () {
            if (condition) {
                var result = null
                // prepare result...
                return done(result);
            } else {
                return done(null);
            }
        });
    }

Never use a callback in a try clause!

    function (callback) {
        another_function(function (err, some_data) {
            if (err) return callback(err);
            try {
                callback(null, JSON.parse(some_data)); // error here
            } catch(err) {
                callback(new Error(some_data + ' is not a valid JSON'));
            }
        });
    }

If the callback throws an exception, the catch clause calls it again, so it is executed exactly twice!

Keep only the parsing inside the try clause and call the callback outside of it:

    function (callback) {
        another_function(function (err, some_data) {
            if (err) return callback(err);
            try {
                var parsed = JSON.parse(some_data)
            } catch(err) {
                return callback(new Error(some_data + ' is not a valid JSON'));
            }
            callback(null, parsed);
        });
    }

Take care of events

Read docs carefully. Really.

    var spawn = require('child_process').spawn;

    function doSomething(done) {
        var proc = spawn('java', ['-jar', 'loadData.jar', sourceFile]);
        var procBuff = '';

        proc.stdout.on('data', function (data) {
            procBuff += data;
        });

        // WAT?! 'data' on stderr can fire many times, and 'close' still follows
        proc.stderr.on('data', function (data) {
            done(new Error("An error occurred: " + data));
        });

        proc.on('close', function (code) {
            done(null, procBuff);
        });
    }

Collect the output in buffers and call done exactly once, when the process closes:

    var spawn = require('child_process').spawn;

    function doSomething(done) {
        var proc = spawn('java', ['-jar', 'loadData.jar', sourceFile]);
        var procBuff = '';
        var procBuffError = '';

        proc.stdout.on('data', function (data) {
            procBuff += data;
        });

        proc.stderr.on('data', function (data) {
            procBuffError += data;
        });

        proc.on('close', function (code) {
            if (code !== 0) {
                return done(new Error("An error occurred: " + procBuffError));
            } else {
                return done(null, procBuff)
            }
        });
    }

Unreadable logs

Keep in mind that asynchronous logs will interweave

They are not sequenced

Or there will be identical log strings

Logs without context are useless...

    // url is assumed to be defined elsewhere
    function getResults(keyword, done) {
        http.request(url, function(response) {
            console.log('Fetching from API')
            response.on('error', function(err) {
                console.log('API error')
            })
        });
    }

Attach a context object to every log entry:

    function getResults(keyword, done) {
        var logContext = { keyword: keyword }
        http.request(url, function(response) {
            console.log(logContext, 'Fetching from API')
            response.on('error', function(err) {
                console.log(logContext, 'API error')
            })
        });
    }

Centralize your logs - use logstash

And make them searchable - Elasticsearch + Kibana

Too many open background tasks

When tasks run in parallel to satisfy a "first result wins" algorithm, the remaining ones should be aborted.

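Provide a cancellation API:

    var events = require('events')
    var http = require('http')
    var deferred = require('deferred')

    // url is assumed to be built from keyword elsewhere
    function getResults(keyword) {
        var def = deferred()
        var eventbus = new events.EventEmitter()

        var req = http.request(url, function(response) {
            var error = null
              , content = ''

            response.on('data', function(chunk) {
                content += chunk;
            });
            response.on('error', function(err) {
                error = err
            })
            response.on('close', function() {
                if (error) return def.reject(error)
                else return def.resolve(content)
            })
        });

        // abort the pending request when the client asks for it
        eventbus.on('abort', function() {
            req.abort()
        })
        req.end()

        // return shape inferred from the usage: result promise + event bus
        return { result: def.promise, events: eventbus }
    }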

The client side of the cancellation API:

    var response = getResults('test')

    response.result(function success() {
        // ...
    }, function error() {
        // ...
    })

    // if we need to cancel
    response.events.emit('abort')
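A self-contained sketch of the same cancellation idea, using only Node built-ins. A timer stands in for the HTTP request, and the native Promise stands in for the deferred library; startTask(), the task names and the delays are all hypothetical.

```javascript
var EventEmitter = require('events').EventEmitter;

// Starts a slow task and returns its promise together with an event bus
// that the caller can use to cancel it.
function startTask(name, delay) {
  var events = new EventEmitter();

  var result = new Promise(function (resolve, reject) {
    var timer = setTimeout(function () { resolve(name + ' finished'); }, delay);

    events.once('abort', function () {
      clearTimeout(timer); // release the pending work
      reject(new Error(name + ' aborted'));
    });
  });

  return { result: result, events: events };
}

// Two tasks race; as soon as the first one succeeds, the other is aborted.
var fast = startTask('fast', 20);
var slow = startTask('slow', 5000);
var outcome = [];

slow.result.catch(function (err) { outcome.push(err.message); });

fast.result.then(function (value) {
  outcome.push(value);
  slow.events.emit('abort');
});
```

Because the abort handler clears the timer, the slow task does not keep the process alive after the winner has been found.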

Everything runs in parallel except your code.

While your own code is running (as opposed to waiting for I/O descriptors), the whole event loop is blocked.
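A minimal sketch of a blocked event loop, using only timers. The 10 ms and 100 ms figures are arbitrary values chosen for the illustration.

```javascript
var start = Date.now();
var firedAfter = null;

// Asked to run after 10 ms...
setTimeout(function () {
  firedAfter = Date.now() - start;
  console.log('timer fired after ' + firedAfter + ' ms');
}, 10);

// ...but this synchronous loop keeps the only thread busy for about 100 ms,
// so the timer callback cannot run until the loop has finished.
while (Date.now() - start < 100) {
  // busy wait, standing in for CPU-heavy work
}
```

The timer callback is delayed to at least 100 ms, because no callback can run while synchronous code holds the thread.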

THE END

by Piotr Pelczar

Q&A
