Scalable Ruby Processing with EventMachine Mike Perham

Upload: mperham5749

Post on 18-Nov-2014


TRANSCRIPT

Page 1: Event Machine

Scalable Ruby Processing with EventMachine

Mike Perham

Page 2: Event Machine

And you are...?

Developer at OneSpot

memcache-client maintainer

data_fabric author

Page 3: Event Machine

Scalable Processing?

Map/Reduce (Hadoop)

Message Queues

Page 4: Event Machine

Efficient Processing!

Focus on maximizing machine utilization

Google tries for ~80% utilization

Page 5: Event Machine

Status Quo

Typical Message Queue processing in Ruby:

Single Threaded

200MB (or more!) to process one message/sec?!

Load Average: 0.10 0.12 0.09

Page 6: Event Machine

Blocking IO sucks

Page 7: Event Machine

Rule of Thumb

Your code will spend 90% of its time waiting for IO and 10% doing actual work

Page 8: Event Machine

Blocking

Why do you add indexes to a database table?

Why do you put data in memcached?

Page 9: Event Machine

Blocking IO

File

Database

memcached

Net::HTTP

DNS lookups

system()

Page 10: Event Machine

Solutions?

How do we maximize the blue?

Page 11: Event Machine

Threading?

Create 10 threads, each processing a message concurrently

10% CPU * 10 = 100% CPU!

Java: good at threading

Ruby: not so much...
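The arithmetic on this slide can be sketched with plain Ruby threads. This is only an illustration: `process_message` is a hypothetical handler, and `sleep` stands in for a blocking IO call. It overlaps the waits because MRI releases its interpreter lock while a thread sleeps or blocks on IO; CPU-bound work would not speed up the same way.

```ruby
# Hypothetical IO-bound handler: sleep stands in for a blocking call
# (network, database, memcached, ...).
def process_message(id)
  sleep 0.1
  "message #{id} done"
end

# Start one thread per message, then wait for all of them.
def process_concurrently(ids)
  threads = ids.map { |id| Thread.new { process_message(id) } }
  threads.map(&:value) # Thread#value joins the thread and returns its result
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = process_concurrently((1..10).to_a)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "#{results.size} messages in #{elapsed.round(2)}s"
```

Ten sequential calls would take about one second; the threaded version should take roughly as long as a single wait.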

Page 12: Event Machine

Threading?

Thread-unsafe extensions / libraries

Poor thread implementation

Ruby 1.8: Green Threads

Ruby 1.9: GIL

JRuby: the only good threading solution

Page 13: Event Machine

Alternative?

What if we could...

Have Ruby work on one operation while another waited on I/O?

Fill in the green gaps?

Without threads?

Page 14: Event Machine

Evented IO rules

Page 15: Event Machine

EventMachine

Ruby implementation of the Reactor pattern

Single threaded by default

Allows us to keep multiple IO ops in flight while a single CPU op runs

Reactor pattern: a concurrent programming pattern for handling service requests delivered concurrently to a service handler by one or more inputs. The service handler then demultiplexes the incoming requests and dispatches them synchronously to the associated request handlers.
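To make the definition concrete, here is a toy reactor in plain Ruby. It is a sketch of the pattern, not EventMachine's implementation: `TinyReactor` and its methods are invented for illustration, and pipes stand in for network connections.

```ruby
# A toy reactor: one thread, one loop, one handler per registered IO object.
class TinyReactor
  def initialize
    @handlers = {} # IO => block to call when that IO is readable
  end

  def register(io, &handler)
    @handlers[io] = handler
  end

  def unregister(io)
    @handlers.delete(io)
  end

  # Demultiplex: wait until any registered IO is readable, then dispatch
  # synchronously to its handler. Returns when no handlers remain.
  def run
    until @handlers.empty?
      readable, = IO.select(@handlers.keys)
      readable.each { |io| @handlers[io]&.call(io) }
    end
  end
end

reactor = TinyReactor.new
received = []

r1, w1 = IO.pipe
r2, w2 = IO.pipe

[[r1, "first"], [r2, "second"]].each do |reader, label|
  reactor.register(reader) do |io|
    data = io.read_nonblock(1024, exception: false)
    if data.nil? # EOF: the writer closed its end
      reactor.unregister(io)
      io.close
    elsif data != :wait_readable
      received << "#{label}: #{data}"
    end
  end
end

w1.write("hello")
w1.close
w2.write("world")
w2.close

reactor.run
puts received
```

The handlers never block: each one reads whatever is available and returns to the loop, which is what lets a single thread serve several inputs.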

Page 16: Event Machine

How does it work?

IO.select(rd, wr, ex)

select

epoll on Linux 2.6

kqueue on BSD

/dev/poll on Solaris

All bets are off on Windows
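The call at the top of this slide can be tried directly with Ruby's built-in `IO.select`. A minimal sketch, using a pipe in place of a socket and an arbitrary 0.05 second timeout:

```ruby
reader, writer = IO.pipe

# Nothing has been written yet, so select times out and returns nil.
nothing_ready = IO.select([reader], nil, nil, 0.05)

writer.write("ping")

# Now the read end has data. select returns three arrays:
# the IOs ready for reading, for writing, and those with errors.
readable, writable, = IO.select([reader], [writer], nil, 0.05)
message = readable.first.read_nonblock(4)

puts "timed out first: #{nothing_ready.nil?}"
puts "read: #{message}"
```

An event loop repeats this call forever and runs a callback for each IO object that comes back ready.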

Page 17: Event Machine

Issues

Page 18: Event Machine

Inversion of Control

Application code becomes callbacks

makes error handling difficult

somewhat solved by Fibers

Page 19: Event Machine

Inversion of Control

Without Fibers

With Fibers
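The code from this slide did not survive in the transcript. As a stand-in, the contrast can be sketched in plain Ruby. Everything here is hypothetical: `FakeEvents` imitates an evented library by queueing callbacks, and `fetch` is an invented wrapper name.

```ruby
require 'fiber'

# Hypothetical stand-in for an evented library: callbacks are queued and
# run later by a loop, the way a reactor would run them once IO completes.
module FakeEvents
  @pending = []

  def self.fetch_async(key, &callback)
    @pending << -> { callback.call("value for #{key}") }
  end

  def self.run
    @pending.shift.call until @pending.empty?
  end
end

# Without Fibers: each dependent step nests inside the previous callback.
callback_log = []
FakeEvents.fetch_async(:user) do |user|
  FakeEvents.fetch_async(:profile) do |profile|
    callback_log << [user, profile]
  end
end
FakeEvents.run

# With Fibers: hide the callback so the calling code reads top to bottom.
def fetch(key)
  fiber = Fiber.current
  FakeEvents.fetch_async(key) { |value| fiber.resume(value) }
  Fiber.yield # pause this fiber until the callback resumes it with the value
end

fiber_log = []
Fiber.new do
  user = fetch(:user)
  profile = fetch(:profile)
  fiber_log << [user, profile]
end.resume
FakeEvents.run

puts fiber_log.inspect
```

Both versions do the same work in the same order. The Fiber version keeps ordinary sequential control flow, so `begin`/`rescue` around a `fetch` call behaves as it would in blocking code.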

Page 20: Event Machine

Coding

Difficult to understand

Little, poor documentation

Learning curve for newbies

Page 21: Event Machine

Testing

Global context: reactor

Each test must setup/teardown a reactor

Page 22: Event Machine

Whack-A-Mole

Blocking IO is everywhere

Easy to lose parallelism

Page 23: Event Machine

Code

Page 24: Event Machine

Evented

My EventMachine sample code repository

http://github.com/mperham/evented

Page 25: Event Machine

Thumbnailer

Rack middleware to dynamically create thumbnails

Thin, EventMachine, ImageScience, em-http-request

Page 26: Event Machine

Thumbnailer

Page 27: Event Machine

Qanat

SQS processing daemon

Event-based S3, SimpleDB and SQS APIs

Uses Fibers with Ruby 1.9

Page 28: Event Machine

EventMagick

system ==> EM.system

Execute ‘identify <JPEG>’ 640 times

system: 10 sec

EM.system: 5 sec

Page 29: Event Machine

Example: system()
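The slide's code is missing from the transcript. The idea behind swapping `system` for `EM.system` can be sketched without EventMachine: start every child process first, then collect the output, instead of waiting for each child in turn. The command below is a hypothetical stand-in for `identify <JPEG>`, and `run_blocking` / `run_overlapped` are invented names.

```ruby
require 'rbconfig'

# Hypothetical stand-in for `identify <JPEG>`: a child Ruby that prints a line.
command = [RbConfig.ruby, '-e', 'puts "ok"']

# Blocking: each system() call waits for its child to exit before returning.
def run_blocking(command, count)
  count.times.map { system(*command, out: File::NULL) }
end

# Overlapped: launch every child first, then read each one's output.
def run_overlapped(command, count)
  pipes = count.times.map { IO.popen(command) }
  pipes.map do |pipe|
    output = pipe.read
    pipe.close
    output.strip
  end
end

blocking_results = run_blocking(command, 2)
overlapped_results = run_overlapped(command, 3)

puts blocking_results.inspect
puts overlapped_results.inspect
```

In the overlapped version the children run at the same time, so the total wait is closer to the slowest child than to the sum of all of them.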

Page 30: Event Machine

em_postgresql

ActiveRecord driver for Postgresql with EM

http://github.com/mperham/em_postgresql

Requires Ruby 1.9

MySQL? Use mysqlplus.

Page 31: Event Machine

em_postgresql

Page 32: Event Machine

Conclusions

Threading sucks

Blocking IO is everywhere

Use EM for IO to peg a single core

Use multiple processes for multi-core

Ruby 1.9 makes evented code nicer
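The "multiple processes for multi-core" point can be sketched with `fork`. This is an assumption-laden outline: `run_workers` is an invented helper, the workers exit immediately instead of running a real event loop, and `fork` is not available on Windows.

```ruby
require 'etc'

# Fork one worker per requested slot and wait for all of them to exit.
def run_workers(count)
  pids = count.times.map do
    fork do
      # A real worker would start its own event loop here, one per process.
      exit!(0)
    end
  end
  # Process.wait2 returns [pid, status]; keep whether each worker succeeded.
  pids.map { |pid| Process.wait2(pid).last.success? }
end

worker_count = Etc.nprocessors # one single-threaded worker per core
statuses = run_workers(worker_count)

puts "#{statuses.count(true)} of #{worker_count} workers exited cleanly"
```

Each process has its own interpreter and its own reactor, so the workers can run on separate cores without sharing any thread state.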