fail the right way - node.js in production

FAIL... THE RIGHT WAY

NODE.JS IN PRODUCTION

ssw2014.formidablelabs.com

@ryan_roemer formidablelabs.com

WELCOME TO PRODUCTION

Production can be a rough place for your Node.js apps. Things can go very wrong out in the wild.

FORMIDABLE LABS

3:00 AM

OUR FOCUS

Whether on PAAS, IAAS, or bare metal.

Design for Failure: Keep your Node.js apps up

Avoidance: Get yourself out of the failover business

Isolate: One failure at a time

Analyze: Debug and diagnose problems quickly

1. DESIGN FOR FAILURE

Fail and recover at multiple levels.

Let's look at failure from a system perspective.

SINGLE NODE.JS WORKER

Never ignore errors.

Have a strong bias for killing the worker.

Handle: uncaughtException

Listen: foo.on("error")

Domains
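The "kill the worker" bias above could be wired up roughly as follows. This is a minimal sketch, not the talk's own code: the helper names (`makeFatalHandler`, `installFatalHandlers`, `watchEmitter`) are hypothetical, and the exit and log functions are injected so the policy can be exercised without ending the process. Domains are left out because the `domain` module has since been deprecated in Node.js.

```javascript
// Sketch: log a fatal error, then kill the worker so a supervisor can restart it.
// All function names here are illustrative assumptions, not a library API.

// `exit` and `log` are injectable so the policy can be tested safely.
function makeFatalHandler(exit, log) {
  return function onFatal(err) {
    log("Fatal error, killing worker:", err && err.stack ? err.stack : err);
    exit(1); // Non-zero exit code tells the supervisor this was an abnormal death.
  };
}

// Handle: uncaughtException on a process-like object (normally the global `process`).
function installFatalHandlers(proc, onFatal) {
  proc.on("uncaughtException", onFatal);
  return proc;
}

// Listen: attach an "error" listener to any emitter (socket, stream, DB client).
// An EventEmitter with no "error" listener throws when "error" is emitted.
function watchEmitter(emitter, onFatal) {
  emitter.on("error", onFatal);
  return emitter;
}

// Typical wiring inside a worker (commented out so the sketch has no side effects):
// const onFatal = makeFatalHandler((code) => process.exit(code), console.error);
// installFatalHandlers(process, onFatal);
// watchEmitter(someSocket, onFatal); // `someSocket` is a placeholder
```

Exiting rather than continuing after an uncaught exception matches the slide's advice: the process state is unknown at that point, so the worker dies and the layer above replaces it.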

MULTIPLE NODE.JS WORKERS

Use cluster or recluster to multiplex CPUs and isolate errors.

Workers: die early on errors

Master: monitor and kill workers

MULTIPLE NODE.JS WORKERS

var recluster = require("recluster");
var cluster = recluster("./server.js");
cluster.run();

// Hot reload: kill -s SIGUSR2 CLUSTER_PID
process.on("SIGUSR2", function() {
  console.log("Got SIGUSR2, reloading cluster...");
  cluster.reload();
});
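For comparison, here is a rough sketch of the "master monitors, workers die early" pattern using only the built-in cluster module. The `supervise` helper is a hypothetical name, and the cluster object is passed in as a parameter so the logic can be tried with a stand-in; it makes no attempt at the hot reload that recluster provides.

```javascript
// Sketch: a master that forks a fixed number of workers and replaces any that die.
// `supervise` is an illustrative helper, not part of Node.js or recluster.
function supervise(cluster, workerCount) {
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }
  // The cluster "exit" event fires whenever a worker process dies, for any reason.
  cluster.on("exit", function (worker, code, signal) {
    console.error(
      "Worker %s died (code=%s, signal=%s); forking a replacement",
      worker.id, code, signal
    );
    cluster.fork();
  });
}

// Typical wiring (commented out so the sketch has no side effects):
// const os = require("os");
// const cluster = require("cluster");
// if (cluster.isMaster) {            // named `isPrimary` on newer Node.js releases
//   supervise(cluster, os.cpus().length);
// } else {
//   require("./server.js");          // each worker runs the actual app
// }
```

A production master would likely also rate-limit respawns so that a worker crashing on startup does not cause a tight fork loop.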

SERVER

Use or alternatives

Restart the Node.js master

SERVICE

Load-balancers

Heartbeat / ping monitors

Availability zones, etc.
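Load balancers and ping monitors need something to poll. A minimal sketch of such an endpoint follows; the `/health` path, the response fields, and the helper names are assumptions chosen for illustration, not something the talk specifies.

```javascript
const http = require("http");

// Hypothetical path that a load balancer or heartbeat monitor would poll.
const HEALTH_PATH = "/health";

// Answers health checks; returns true if the request was handled here.
function healthHandler(req, res) {
  if (req.url !== HEALTH_PATH) {
    return false; // let the app's own routing take over
  }
  const body = JSON.stringify({
    status: "ok",
    pid: process.pid,
    uptimeSec: process.uptime(),
  });
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(body);
  return true;
}

// Wraps an application handler so health checks are answered first.
function createServer(appHandler) {
  return http.createServer(function (req, res) {
    if (!healthHandler(req, res)) {
      appHandler(req, res);
    }
  });
}

// Usage (commented out so the sketch has no side effects):
// createServer(myApp).listen(3000); // `myApp` and the port are placeholders
```

A check this shallow only proves the process is accepting connections; a deeper check could also probe the datastores the app depends on.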

MAKE IT HOT

Everything up to this point should have hot failover.

DATACENTER

Hot failover across datacenters?

Typically very costly

But, the real deal if you're serious

DISASTER RECOVERY

"Business Continuity"

Don't let a technological problem end your business

Have a worst case, "lose some data" recovery plan

2. AVOID FAILURES

Get out of the business of failover when you don't have to do it yourself.

RESOURCES TO NOT SUPPORT

Don't rely on system / service resources you don't need to.

Disk: NAS, disks, SSDs.

Datastores: DB, cloud services.

... Load Balancers, DNS, etc.

HOW TO AVOID

Use SAAS wherever possible! (DB, LBs, storage).

Or PAAS for some Node.js apps.

Design stateless, fungible servers (no disk risks).
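One way to read "stateless, fungible servers" is that any state a request needs lives in an external service, so any server can handle any request and a dead server loses nothing. The sketch below assumes a hypothetical `store` client with async `get` and `set` methods, standing in for whatever hosted datastore is used; the in-memory fake exists only to exercise the sketch.

```javascript
// Sketch: keep per-user state in an external store, not in process memory or on local disk.
// `store` is a hypothetical client for a hosted datastore exposing async get/set.
function makeVisitCounter(store) {
  return async function recordVisit(userId) {
    const key = "visits:" + userId;
    const current = (await store.get(key)) || 0;
    const next = current + 1;
    // Read-modify-write is not atomic; a real store's atomic increment
    // would be the safer choice under concurrent requests.
    await store.set(key, next);
    return next;
  };
}

// In-memory stand-in for the external store, used only for illustration.
function makeFakeStore() {
  const data = new Map();
  return {
    get: async (key) => data.get(key),
    set: async (key, value) => { data.set(key, value); },
  };
}
```

Because the counter holds no state of its own, two servers built with `makeVisitCounter` against the same store see the same counts, and either one can be killed and replaced freely.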

3. ISOLATE FAILURES

Isolate failures you can't avoid.

RESOURCES TO SUPPORT

Look to resources you must depend on:

CPU/Load: Run out of this and it's over.

HTTP: Each different host you hit.

Datastores: Connections? Different Hosts?

... also, memory, I/O, etc. and combinations thereof

SOME ANECDOTES

Node.js apps can be bad neighbors.

DB (auto-suggest) vs. HTTP (vendor translations)

DB (CRUD app) vs. CPU/Load (co-located PHP app)

Read vs. Write DB operations.

HOW TO ISOLATE

Create "micro-services" that stand on their own.

Monitor for cross-pressure and respond. (Next section!)
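Where splitting into separate services is not yet possible, a related technique (not from the talk) is to cap in-flight work per dependency inside one process, sometimes called a bulkhead. The sketch below is an assumption-laden illustration: `makeBulkhead` and the resource names `db` and `translations` are hypothetical, the latter echoing the auto-suggest vs. vendor translations anecdote.

```javascript
// Sketch: cap in-flight work per dependency so one slow host cannot starve the others.
// `makeBulkhead` is an illustrative helper, not a library API.
function makeBulkhead(limits) {
  const inFlight = new Map();
  const hasLimit = (name) => Object.prototype.hasOwnProperty.call(limits, name);

  return {
    // Reserves a slot and returns true if `name` is under its limit.
    tryAcquire(name) {
      if (!hasLimit(name)) {
        throw new Error("Unknown resource: " + name);
      }
      const used = inFlight.get(name) || 0;
      if (used >= limits[name]) {
        return false; // shed load instead of queueing behind a slow dependency
      }
      inFlight.set(name, used + 1);
      return true;
    },
    // Frees a slot; call this when the request finishes or fails.
    release(name) {
      const used = inFlight.get(name) || 0;
      if (used > 0) {
        inFlight.set(name, used - 1);
      }
    },
    // Current number of reserved slots, useful as a monitoring signal.
    inUse(name) {
      return inFlight.get(name) || 0;
    },
  };
}

// Usage sketch:
// const bulkhead = makeBulkhead({ db: 20, translations: 5 });
// if (bulkhead.tryAcquire("translations")) { /* call vendor, then release */ }
```

The `inUse` counts double as a cross-pressure signal: a resource pinned at its limit is a candidate for its own service.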

4. ANALYZE EVERYTHING

Data drives problem discovery and action.

LOG, MONITOR, MINE

DECISIONS, GOALS

Things to look for in Node.js apps...

Identify

Resource pressure: CPU, I/O, memory, network

Performance: Throughput, latency

Errors/Bugs: Quantitative, qualitative
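Some of these signals can be sampled from inside the process using only Node.js built-ins. The sketch below is one possible starting point; the field names and the `snapshot` and `timed` helpers are assumptions, and a real deployment would ship the samples to whatever logging or monitoring service is in use.

```javascript
const os = require("os");

// Sketch: one sample of resource pressure for this process and host.
function snapshot() {
  const mem = process.memoryUsage();
  return {
    time: new Date().toISOString(),
    pid: process.pid,
    uptimeSec: process.uptime(),
    rssBytes: mem.rss,            // total resident memory of the process
    heapUsedBytes: mem.heapUsed,  // V8 heap currently in use
    loadAvg1m: os.loadavg()[0],   // 1-minute system load (always 0 on Windows)
  };
}

// Wraps a synchronous function and reports each call's latency in milliseconds.
function timed(fn, onSample) {
  return function (...args) {
    const start = process.hrtime();
    try {
      return fn.apply(this, args);
    } finally {
      const [sec, nanos] = process.hrtime(start);
      onSample(sec * 1e3 + nanos / 1e6);
    }
  };
}

// Usage sketch: log a snapshot once a minute.
// setInterval(() => console.log(JSON.stringify(snapshot())), 60 * 1000);
```

`timed` only covers synchronous functions; timing callbacks or promises would need the end-of-work hook moved into the completion path.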

Decide

Scale up, scale down?

Separate services?

RECAP

Design for failure

Isolate

Analyze

THANKS!

