node.js and containers: dispatches from the frontier

node.js and Containers: Dispatches from the Frontier

Bryan Cantrill

@bcantrill

Application architecture, circa 2000

• The late 1990s saw the rise of three-tier architectures consisting of presentation, application logic and data tiers

• Many names for roughly the same notion: “Service-oriented architecture”, “Model/View/Controller”, etc.

• The AJAX+REST revolution of the mid-2000s gave rise to true web applications in which application logic could live on the edge

• Led to some broader architectural questioning...

Post-AJAX questions

• Why should HTTP be restricted to the web?

• Why should REST be restricted to web apps?

• Instead of having one monolithic architecture, why not have a series of (smaller) services that merely did one thing well?

• In case this sounds vaguely familiar...

The Unix Philosophy

• The Unix philosophy, as articulated by Doug McIlroy:

• Write programs that do one thing and do it well

• Write programs to work together

• Write programs that handle text streams, because that is a universal interface

• The single most important revolution in software systems thinking

• Applying it to HTTP-based services...

Microservices

• Microservices do one thing, and strive to do it well

• Replace a small number of monoliths with many services that have well-documented, small HTTP-based APIs

• Larger systems can be composed of these smaller services

• While the trend it describes is real, the term “microservices” isn’t without its controversy...
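To make "one thing, behind a small HTTP-based API" concrete, here is a minimal sketch using only node.js built-ins. The service itself (hashing a string), the `/sha256/<text>` route, and the function names are hypothetical, chosen purely for illustration; they are not taken from any Joyent service.

```javascript
const http = require('http');
const crypto = require('crypto');

// The one thing this (hypothetical) service does: hash a string.
function sha256(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// Pure routing function: maps a request line to a status code and JSON body.
// Keeping it free of I/O makes the whole API testable without a socket.
function handle(method, url) {
  const match = /^\/sha256\/([^/?]+)$/.exec(url);
  if (method !== 'GET' || !match) {
    return { status: 404, body: { error: 'not found' } };
  }
  let input;
  try {
    input = decodeURIComponent(match[1]);
  } catch (err) {
    // decodeURIComponent throws URIError on malformed escapes.
    return { status: 400, body: { error: 'malformed percent-encoding' } };
  }
  return { status: 200, body: { input, sha256: sha256(input) } };
}

// Thin HTTP wrapper around handle(); the caller decides where to listen.
function createService() {
  return http.createServer((req, res) => {
    const { status, body } = handle(req.method, req.url);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

Starting it would look like `createService().listen(8080)` (the port is an arbitrary choice); larger systems then compose such services over HTTP.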

Developing microservices

• node.js is a perfect fit for microservices:

• Light memory footprint

• Purely asynchronous

• Expresses Unix programming model with respect to the operating system (processes, pipes, files, sockets, etc.)

• Module approach encourages libraries over frameworks

• node.js is itself an expression of the Unix philosophy
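As a rough illustration of how node.js streams mirror Unix pipes, the sketch below implements a grep-like filter as a `Transform` stream. The names `splitLines` and `lineFilter` are made up for this example, and it assumes UTF-8 text whose multi-byte characters are not split across chunk boundaries.

```javascript
const { Transform } = require('stream');

// Pure helper: split buffered text into complete lines plus a trailing
// partial line ("carry") that must wait for the next chunk.
function splitLines(carry, chunk) {
  const parts = (carry + chunk).split('\n');
  const rest = parts.pop(); // '' when the chunk ended with a newline
  return { lines: parts, carry: rest };
}

// A filter in the Unix sense: text in, matching lines out.
// Pass a RegExp without the "g" flag, since test() is stateful with it.
function lineFilter(pattern) {
  let carry = '';
  return new Transform({
    transform(chunk, _encoding, done) {
      const result = splitLines(carry, chunk.toString('utf8'));
      carry = result.carry;
      const hits = result.lines.filter((line) => pattern.test(line));
      // Passing undefined pushes nothing for this chunk.
      done(null, hits.length > 0 ? hits.join('\n') + '\n' : undefined);
    },
    flush(done) {
      // Emit a final unterminated line if it matches.
      done(null, carry !== '' && pattern.test(carry) ? carry + '\n' : undefined);
    },
  });
}
```

Used as `process.stdin.pipe(lineFilter(/error/)).pipe(process.stdout)`, the script behaves like any other filter in a shell pipeline.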

Deploying microservices

• Microservices are tautologically small

• One physical machine per service is clearly uneconomical…

• ...but deploying many orthogonal services on a single machine is a well-known operational nightmare (e.g. conflicting dependencies, shared fault domain)

• The key is to virtualize — but at what layer of the stack?

• Virtualization has ramifications with respect to performance and density — which is to say, economics

Hardware-level virtualization?

• The historical answer to virtualization — since the 1960s — has been to virtualize the hardware:

• A virtual machine is presented upon which each tenant runs an operating system that they choose (and must manage)

• There are as many operating systems on a machine as tenants!

• Can run entire legacy stacks unmodified...

• ...but operating systems are heavy and don’t play well with others with respect to resources like DRAM, CPU, I/O devices, etc.

• With microservices, overhead dominates!

Platform-level virtualization?

• Virtualizing at the application platform layer addresses the tenancy challenges of hardware virtualization, and presents a much more nimble (& developer friendly!) abstraction...

• ...but at the cost of dictating abstraction to the developer

• This is the “Google App Engine” problem: developers are in a straitjacket where toy programs are easy — but sophisticated applications are impossible

• Virtualizing at the application platform layer poses many other challenges with respect to security, containment, etc.

OS-level virtualization!

• Virtualizing at the operating system hits a sweet spot:

• A single operating system (i.e. a single kernel) allows for efficient use of hardware resources, maximizing tenancy and performance

• Disjoint instances are securely compartmentalized by the operating system

• Gives tenants what appears to be a virtual machine (albeit a very fast one) on which to run higher-level software: PaaS ease with IaaS generality

• Also: boots like a bandit!

• Model was pioneered by FreeBSD jails and taken to its logical extreme by Solaris zones — and then aped by Linux containers

OS-level virtualization at Joyent

• Joyent runs OS containers in the cloud via SmartOS — and we have run containers in multi-tenant production since ~2006

• Adding support for hardware-based virtualization circa 2011 strengthened our resolve with respect to OS-based virtualization

• This is especially true for microservices: as services get small, overhead and latency become increasingly important — and OS containers become a bigger and bigger win

• Our belief in containers for microservices comes from our own experience in developing cloud management software...

SmartDataCenter as microservices microcosm

• SmartDataCenter is our cloud orchestration system

• Reflects its times: born in ~2006 as a Rails app; by late 2011, consisted of some node.js microservices + a Ruby monstrosity

• In 2012/2013, we rewrote SmartDataCenter to be entirely node.js-based, microservices-based and container-based

• Open sourced in November 2014: https://github.com/joyent/sdc

• Learned many things along the way — some more surprising than others...

Microservices: State management

• While decentralization is an important tenet of microservices, be careful about applying this to canonical state

• State should be offered through its own microservice that can be rigorously developed, tightly controlled, closely managed, etc.

• To this end, we developed Moray, a node.js-based key/value store backed by replicated Postgres

• Moray’s Postgres replication is managed (and automated) with Manatee, a node.js + ZooKeeper-based system

• Essentially all of our services are backed by Moray + Manatee
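One way to picture "state behind its own service" is a narrow bucket/key interface with conditional writes, so that concurrent callers cannot silently overwrite each other. The sketch below is an in-memory stand-in written for illustration only: it is not Moray's API, the class and option names are assumptions, and a real service would sit on replicated storage as described above.

```javascript
// Thrown when a conditional write loses a race with another writer.
class EtagConflictError extends Error {}

class KeyValueStore {
  constructor() {
    this.buckets = new Map(); // bucket name -> Map(key -> { value, etag })
    this.version = 0;         // monotonically increasing source of etags
  }

  // Returns { value, etag } or undefined if the object does not exist.
  get(bucket, key) {
    const objects = this.buckets.get(bucket);
    return objects ? objects.get(key) : undefined;
  }

  // options.etag semantics (hypothetical):
  //   undefined -> unconditional write
  //   null      -> write only if the object does not exist yet
  //   string    -> write only if the stored etag still matches
  put(bucket, key, value, options = {}) {
    const current = this.get(bucket, key);
    if (options.etag !== undefined) {
      const currentEtag = current ? current.etag : null;
      if (currentEtag !== options.etag) {
        throw new EtagConflictError(
          `etag mismatch for ${bucket}/${key}: expected ${options.etag}, found ${currentEtag}`
        );
      }
    }
    if (!this.buckets.has(bucket)) {
      this.buckets.set(bucket, new Map());
    }
    this.version += 1;
    const record = { value, etag: String(this.version) };
    this.buckets.get(bucket).set(key, record);
    return record.etag;
  }
}
```

A caller reads an object, modifies it, and writes it back with the etag it read; a conflict means someone else wrote first, and the caller should re-read and retry.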

Microservices: Deployment + config

• With their larger numbers, microservices will exacerbate any deployment, configuration or image management pain

• To alleviate this, we developed our own services API (SAPI) — but the results were still dissatisfying...

• A little-known PaaS called dotCloud saw the same problem and developed a container-based solution for image management...

• Docker allows developers to encode deployment procedures via an image that can in turn be deployed as a container

• Docker will do to apt what apt did to tar

Microservices: CAP omnipresence

• Some discontent towards microservices may stem from the breathlessness of some of their proponents, especially with respect to resilience and reliability

• Microservices make your system more distributed — and therefore more vulnerable to its CAP tradeoffs

• More monolithic systems are able to hide from CAP — or are deliberately C+A systems

• It’s not just CAP: performance pathologies can be much more difficult to understand in a distributed system

Microservices: System topology

• Our system is historically an AMQP/HTTP hybrid

• The principles of AMQP are very attractive…

• ...but in practice, implementation and operational issues have made message brokers a single point of failure

• We still use AMQP for some kinds of broadcast traffic, but we try to go point-to-point and HTTP as quickly as we can

Microservices: Service discovery

• Moving from monolithic to microservices means moving from a tightly coupled system to a loosely federated one — and necessitates service discovery

• We developed Binder, a node.js-based DNS + ZooKeeper system

• While Binder has been sufficient for our needs, service discovery remains broadly a thorny problem — especially around rollbacks, service maintenance, etc.
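The core idea of service discovery (instances announce themselves under a name, and stale announcements age out) can be sketched in a few lines. This is not how Binder is implemented: the registry below is in-memory, the TTL-based expiry is only loosely analogous to ephemeral registrations in a coordination service, and every name and address in it is hypothetical.

```javascript
class ServiceRegistry {
  // "now" is injectable so expiry can be exercised without waiting.
  constructor({ ttlMs = 30000, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.services = new Map(); // service name -> Map(address -> expiry time)
  }

  // Registers an instance, or refreshes it (acts as a heartbeat).
  register(name, address) {
    if (!this.services.has(name)) {
      this.services.set(name, new Map());
    }
    this.services.get(name).set(address, this.now() + this.ttlMs);
  }

  // Explicit removal, e.g. when taking an instance out for maintenance.
  deregister(name, address) {
    const instances = this.services.get(name);
    if (instances) {
      instances.delete(address);
    }
  }

  // Returns the live addresses for a name, purging any that have expired.
  resolve(name) {
    const instances = this.services.get(name);
    if (!instances) {
      return [];
    }
    const current = this.now();
    for (const [address, expiry] of instances) {
      if (expiry <= current) {
        instances.delete(address);
      }
    }
    return [...instances.keys()].sort();
  }
}
```

An instance that crashes simply stops heartbeating and drops out of `resolve()` after one TTL; the harder cases named above (rollbacks, planned maintenance) are where `deregister` and careful TTL choices start to matter.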

Microservices: Interface primacy

• Microservices have many more interfaces — so interface-based pathologies will naturally become more acute

• ...and JSON-based systems can exhibit interface-based pathologies not seen in more rigid systems

• We use JSON Schema (v3) to add rigor without sacrificing agility

• We implement HTTP routes with Restify, a node.js module that includes support for interface versioning (+ DTrace support!)

• Postel’s Law (“...an implementation should be conservative in its sending behavior, and liberal in its receiving behavior”) remains helpful, but apply with moderation!
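A small sketch of what "apply with moderation" might look like in practice: be strict about what a service sends, tolerate unknown fields in what it receives, but never tolerate a missing or mistyped required field. The validator below is hand-rolled for illustration and is far simpler than JSON Schema; the schema shape, field names, and function names are all assumptions.

```javascript
// Schema shape (hypothetical): { fieldName: { type: 'string', required: true } }
// Only primitive typeof checks are supported in this sketch.
function validate(payload, schema, { allowUnknown }) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const present = Object.prototype.hasOwnProperty.call(payload, field);
    if (!present) {
      if (rule.required) {
        errors.push(`missing required field "${field}"`);
      }
      continue;
    }
    if (typeof payload[field] !== rule.type) {
      errors.push(`field "${field}" should be ${rule.type}`);
    }
  }
  if (!allowUnknown) {
    for (const field of Object.keys(payload)) {
      if (!Object.prototype.hasOwnProperty.call(schema, field)) {
        errors.push(`unknown field "${field}"`);
      }
    }
  }
  return errors;
}

// Conservative in sending: nothing outside the documented interface goes out.
const checkOutgoing = (payload, schema) => validate(payload, schema, { allowUnknown: false });

// Liberal in receiving, within limits: extra fields are ignored,
// but required fields and their types are still enforced.
const checkIncoming = (payload, schema) => validate(payload, schema, { allowUnknown: true });

const vmSchema = {
  uuid: { type: 'string', required: true },
  ram: { type: 'number', required: true },
  alias: { type: 'string', required: false },
};
```

With this split, a newer client that adds a field does not break an older server, while a client that sends `ram` as a string is rejected immediately instead of corrupting state downstream.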

Microservices: Pathological behavior

• Systems will misbehave — and distributed systems have much more surface area; hope is not a strategy!

• We have invested heavily in node.js-based infrastructure to allow us to quickly diagnose aberrant behavior in production:

• Bunyan is a logging facility for node.js (+ DTrace support!)

• SmartOS DTrace support for node.js profiling

• SmartOS support for JavaScript heap analysis from core files

• See @dapsays’ “Industrial-grade node.js” talk!
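Structured logs are a large part of what makes production diagnosis tractable: one JSON object per line can be filtered and correlated across many services by machine. The sketch below is a stripped-down logger in that style. It is not Bunyan's API; `createLogger` and its options are made up for this example, though the numeric levels follow Bunyan's documented convention (trace=10 through fatal=60).

```javascript
const os = require('os');

const LEVELS = { trace: 10, debug: 20, info: 30, warn: 40, error: 50, fatal: 60 };

// "write" and "now" are injectable so the logger can be tested and redirected.
function createLogger(name, options = {}) {
  const {
    level = 'info',
    write = (line) => process.stdout.write(line),
    now = () => new Date(),
  } = options;
  const threshold = LEVELS[level];
  const logger = {};
  for (const [levelName, value] of Object.entries(LEVELS)) {
    logger[levelName] = (fields, msg) => {
      if (value < threshold) {
        return; // below the configured level: drop the record
      }
      const record = {
        ...fields, // caller context first, so core fields below always win
        name,
        hostname: os.hostname(),
        pid: process.pid,
        level: value,
        msg,
        time: now().toISOString(),
      };
      write(JSON.stringify(record) + '\n');
    };
  }
  return logger;
}
```

A call such as `log.info({ req_id: 'abc' }, 'request handled')` emits a single line that carries both the message and the context needed to trace a request across service boundaries.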

Microservices: Containers

• OS containers have made our microservices approach possible

• They have proven essential for every aspect of our system: speed of deployment, robustness, latency, debuggability, etc.

• Historically, our approach has been limited to SmartOS...

• While SmartOS is Unix, it’s not Linux — and we understand that many (most?) have an established Linux binary footprint...

• We have implemented support for LX-branded zones in SmartOS that allow Linux binaries (and distributions!) to run out of the box in a secure OS container

Microservices, containers and node.js

• Microservices represent a real and important trend in systems

• We have found node.js to be the perfect fit for implementing these systems — especially given its production debugging support

• Containers have been an essential ingredient for our own deployment of microservices-based architectures

• LX-branded zones will allow our microservices approach to be applicable to a much broader audience!

Thank you!

• @mcavage, @pfmooney, @dapsays, @yunongx, @tumederanges for Moray/Manatee (https://github.com/joyent/moray)

• @dapsays and @tjfontaine for bringing humanity debugging support for node.js in production

• @trentmick for Bunyan (https://github.com/trentm/node-bunyan)

• @trevero, @notmatt for Binder (https://github.com/joyent/binder)

• @joshwilsdon for much of SDC — and for being right about AMQP

• Jerry Jelinek, @pfmooney, @jmclulow and @jperkin for their work on LX-branded zones
