truly madly deeply parallel ruby applications

Truly Madly Deeply Parallel Ruby (Web)* Applications @harikrishnan83 * Thanks @headius

Upload: hari-krishnan

Post on 12-Jun-2015

Category:

Software


TRANSCRIPT

Page 1: Truly madly deeply parallel ruby applications

Truly Madly Deeply Parallel Ruby (Web)* Applications

@harikrishnan83

* Thanks @headius

Page 2: Truly madly deeply parallel ruby applications

How many of you are building web applications?

Page 3: Truly madly deeply parallel ruby applications

How many of you have more than 1 user using it at a time?

Page 4: Truly madly deeply parallel ruby applications

How many of you use scaling out as the way to hit scale?

Page 5: Truly madly deeply parallel ruby applications

How many of you are scaling up to hit scale?

Page 6: Truly madly deeply parallel ruby applications

What is a parallel environment?

Page 7: Truly madly deeply parallel ruby applications

Parallelism

Page 8: Truly madly deeply parallel ruby applications

Concurrency

Thread A - load

Thread B - load

Thread A - increment

Thread B - increment

Thread A - save

Thread B - save

i = 0

i = 0

i = 1

i = 1
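The interleaving above is the classic lost update: both threads load 0, both compute 1, and the second save overwrites the first. A minimal sketch that replays the six steps by hand in a single thread, so the result does not depend on the scheduler. The method name `replay_lost_update` and the local variable names are made up for illustration.

```ruby
# Replays the slide's interleaving step by step.
# No real threads are used, so the outcome is deterministic.
def replay_lost_update
  i = 0     # shared value

  a = i     # Thread A - load      (sees 0)
  b = i     # Thread B - load      (sees 0)
  a += 1    # Thread A - increment (A's local copy is 1)
  b += 1    # Thread B - increment (B's local copy is 1)
  i = a     # Thread A - save      (i = 1)
  i = b     # Thread B - save      (i = 1, A's update is lost)

  i
end

puts replay_lost_update # prints 1, although two increments ran
```

With real threads the same interleaving is possible but not guaranteed on any given run, which is what makes this kind of bug hard to reproduce.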

Page 9: Truly madly deeply parallel ruby applications

Ruby web application deployment

Page 10: Truly madly deeply parallel ruby applications

Process Parallelism

Reverse Proxy -> Unicorn Processes -> Database Layer

Page 11: Truly madly deeply parallel ruby applications

On my current project

Page 12: Truly madly deeply parallel ruby applications

We only use EC2 small instances

Page 13: Truly madly deeply parallel ruby applications

Because it is very hard to utilize a high spec machine

Process Context Switch is Expensive

Page 14: Truly madly deeply parallel ruby applications

Today...

● We have 4 small EC2 instances

● 2 Puma processes run on each

● Handles about 100,000 requests per hour

● And this is our Private alpha

Page 15: Truly madly deeply parallel ruby applications

We need to...

● Handle about 1 million requests per hour

● Which means about 40-45 EC2 small instances
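The instance count follows from simple proportional scaling of the numbers on the previous slide. A rough sketch of that arithmetic, assuming throughput grows linearly with the number of instances (an assumption, not something the slides state); `instances_needed` is a hypothetical helper name.

```ruby
# Estimates how many instances are needed for a target load,
# assuming every instance handles an equal share and scaling is linear.
def instances_needed(current_instances, current_requests_per_hour, target_requests_per_hour)
  per_instance = current_requests_per_hour / current_instances.to_f
  (target_requests_per_hour / per_instance).ceil
end

# Figures from the slides: 4 instances serve 100,000 requests per hour.
puts instances_needed(4, 100_000, 1_000_000) # prints 40
```

Each instance handles 25,000 requests per hour, or roughly 7 per second. The upper end of the 40-45 range presumably allows for some headroom; that reading is a guess.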

Page 16: Truly madly deeply parallel ruby applications

This is not trivial

● Costs a lot of money

● A lot of time required to maintain these boxes

● Being elastic will become very important

● Cost also in terms of more DevOps time

Page 17: Truly madly deeply parallel ruby applications

In general

Page 18: Truly madly deeply parallel ruby applications

It is easier to babysit a few boxes

Page 19: Truly madly deeply parallel ruby applications

Than a lot!

Page 20: Truly madly deeply parallel ruby applications

Ideally, we would like to both scale up and scale out

Page 21: Truly madly deeply parallel ruby applications

i.e. we want to achieve the same throughput with, say, just 5 large instances

Page 22: Truly madly deeply parallel ruby applications

Enter thread based parallelism

Page 23: Truly madly deeply parallel ruby applications

Why were we not doing this till now?

Page 24: Truly madly deeply parallel ruby applications

Threads are hard*

They share memory and mutate things

* - Supposedly

Page 25: Truly madly deeply parallel ruby applications

And there is the ubiquitous issue

‘Thread Safety’

Page 26: Truly madly deeply parallel ruby applications

Before we go there, first let's look at some code

Page 27: Truly madly deeply parallel ruby applications
Page 28: Truly madly deeply parallel ruby applications

The real question is...

Page 29: Truly madly deeply parallel ruby applications

Are you “Safe for Parallelization”?

Page 30: Truly madly deeply parallel ruby applications

Understanding this will take you a long way in “getting parallel”

Page 31: Truly madly deeply parallel ruby applications

Things to remember while moving to threaded parallelism

Page 32: Truly madly deeply parallel ruby applications

#1 - Always identify the shared resources

Page 33: Truly madly deeply parallel ruby applications

Shared Resource

● Objects

● DB rows

● Caches

● Log files!
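Once a shared object is identified, every access to it has to go through some form of coordination. A minimal sketch using Ruby's built-in `Mutex`; the class `SharedCounter` and the helper `count_with_threads` are hypothetical names chosen for this example.

```ruby
# A counter shared between threads. Every read and write of @value
# happens while holding @lock, so increments cannot interleave.
class SharedCounter
  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @value += 1 }
  end

  def value
    @lock.synchronize { @value }
  end
end

# Starts several threads that all increment the same counter.
def count_with_threads(thread_count, increments_per_thread)
  counter = SharedCounter.new
  threads = Array.new(thread_count) do
    Thread.new { increments_per_thread.times { counter.increment } }
  end
  threads.each(&:join)
  counter.value
end

puts count_with_threads(8, 1_000) # prints 8000
```

The same idea applies to the other items on the list, though the mechanism differs: database rows are usually protected with transactions or row locks rather than an in-process mutex.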

Page 34: Truly madly deeply parallel ruby applications

#2 - Bank on thread safe libraries

Page 35: Truly madly deeply parallel ruby applications

Libraries

● Data structures

● JSON, XML parsing, HTTP clients etc

● Generally, auditing all the gems you use for thread safety is a good idea
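As one example of a thread safe data structure, Ruby's core `Queue` can be pushed to from several threads without extra locking. A small sketch; the helper name `sum_via_queue` is made up for illustration.

```ruby
# Several producer threads push numbers onto one Queue.
# Queue does its own locking, so no explicit Mutex is needed here.
def sum_via_queue(producer_count, items_per_producer)
  queue = Queue.new
  producers = Array.new(producer_count) do
    Thread.new { items_per_producer.times { |n| queue << n + 1 } }
  end
  producers.each(&:join)

  # All producers have finished, so draining until empty is safe.
  total = 0
  total += queue.pop until queue.empty?
  total
end

puts sum_via_queue(4, 100) # prints 20200 (4 producers, each pushing 1..100)
```

A plain `Array` or `Hash` gives no such guarantee across Ruby implementations, which is one reason auditing matters when moving off MRI.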

Page 36: Truly madly deeply parallel ruby applications

If you only use thread safe libraries are you ‘safe for parallelism’?

Page 37: Truly madly deeply parallel ruby applications

Rails is thread safe right?

Why is everyone concerned about thread safety in the first place?

Page 38: Truly madly deeply parallel ruby applications

#3 - If two libraries are thread safe, code that uses both of them need not be
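The same holds for two calls into a single thread safe object: each call is safe on its own, but the gap between them is not. A sketch of that check-then-act problem, replayed deterministically in one thread. `SafeCache`, `racy_compute_count`, and `atomic_compute_count` are hypothetical names, not from the talk.

```ruby
# Every method takes the lock, so each call is thread safe in isolation.
class SafeCache
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def key?(key)
    @lock.synchronize { @data.key?(key) }
  end

  def store(key, value)
    @lock.synchronize { @data[key] = value }
  end

  # Atomic alternative: the check and the write share one lock acquisition.
  def fetch_or_store(key)
    @lock.synchronize { @data.fetch(key) { @data[key] = yield } }
  end
end

# Replays two requests that both check before either one writes.
def racy_compute_count
  cache = SafeCache.new
  computed = 0
  a_missing = !cache.key?(:report) # request A checks: missing
  b_missing = !cache.key?(:report) # request B checks: still missing
  if a_missing
    computed += 1
    cache.store(:report, "built by A")
  end
  if b_missing
    computed += 1
    cache.store(:report, "built by B") # overwrites A's work
  end
  computed
end

def atomic_compute_count
  cache = SafeCache.new
  computed = 0
  2.times { cache.fetch_or_store(:report) { computed += 1; "built once" } }
  computed
end

puts racy_compute_count   # prints 2
puts atomic_compute_count # prints 1
```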

Page 39: Truly madly deeply parallel ruby applications

Rails thread safety model

● Instantiate everything for every request

● No shared state (global objects)

● Different from, say, Java (single servlet object per container, IOC with singletons etc.)
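The "instantiate everything for every request" idea can be sketched in plain Ruby. This is not Rails code; `GreetingHandler`, `TinyApp`, and `serve_concurrently` are hypothetical names used only to show the shape.

```ruby
# Per-request state lives on a fresh object, never on the class
# or in a global, so concurrent requests cannot see each other's data.
class GreetingHandler
  def initialize(params)
    @params = params
  end

  def call
    "Hello, #{@params.fetch(:name)}"
  end
end

# The long-lived app object holds no request state of its own.
class TinyApp
  def call(params)
    GreetingHandler.new(params).call # new handler for every request
  end
end

# Serves each name on its own thread against one shared app object.
def serve_concurrently(names)
  app = TinyApp.new
  names.map { |name| Thread.new { app.call(name: name) } }.map(&:value)
end

puts serve_concurrently(%w[alice bob carol])
```

The guarantee breaks as soon as a handler writes to a class variable, a constant, or a global, which is the shared state the slide warns about.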

Page 40: Truly madly deeply parallel ruby applications

#4 - Try and stick to Rails’ way of handling requests

Page 41: Truly madly deeply parallel ruby applications

Are you ‘Safe for parallelism’ if you follow these steps?

Page 42: Truly madly deeply parallel ruby applications

Well, it depends...

Page 43: Truly madly deeply parallel ruby applications

Validating, say, through a green bar is very hard.

Page 44: Truly madly deeply parallel ruby applications

Always give yourself some time to stabilize.

The move is definitely not overnight!

Page 45: Truly madly deeply parallel ruby applications

Speaking of the move, move where?

Page 46: Truly madly deeply parallel ruby applications

Since Rubinius is mostly MRI-like, it's simpler

Page 47: Truly madly deeply parallel ruby applications

I personally love JRuby more because of my JVM background

Page 48: Truly madly deeply parallel ruby applications

Lots of good things have been spoken about JRuby

Page 49: Truly madly deeply parallel ruby applications

Some gotchas based on my experience

Page 50: Truly madly deeply parallel ruby applications

JRuby impacts Developers

● The JRuby startup time (mostly because of the JVM startup time) can sometimes kill red-green cycle time

● Sometimes, you should be OK with stooping down to Java code to figure out why something is not working

Page 51: Truly madly deeply parallel ruby applications

JRuby impacts Ops

● You no longer have a Ruby app in prod, it's a Java app

● GC tuning, Process monitoring, Profiling etc. are very different on a JVM

Page 52: Truly madly deeply parallel ruby applications

Thread Parallelism

Reverse Proxy -> Puma Instance (Threads) -> Database Layer
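For the Puma box in this picture, the thread pool is set in Puma's config file. A sketch of such a fragment; the file path and all numbers are illustrative assumptions, not values from the talk.

```ruby
# config/puma.rb (hypothetical values, tune for your own app)
threads 8, 16             # minimum and maximum threads in this Puma process
port 3000                 # port the reverse proxy forwards to
environment "production"
```

This is a config fragment read by Puma, not a standalone script. The maximum thread count should be chosen together with the database connection pool size, since each busy thread may need its own connection.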

Page 53: Truly madly deeply parallel ruby applications

Thank you!

@harikrishnan83