Ruby on Redis


DESCRIPTION

Making an application horizontally scalable in 30 minutes. This presentation describes how a linear processing application (mail merge) can be converted into a horizontally scalable one using Redis, and provides some context on why a multi-process approach is preferable to a multi-threaded approach.

TRANSCRIPT

Ruby on Redis
Pascal Weemaels, Koen Handekyn
Oct 2013

Target

Create a Zip file of PDF’s based on a CSV data file

‣  Linear version

‣  Making it scale with Redis

[Diagram: parse csv, then create pdf for each row, then zip]

Step 1: linear

‣ Parse CSV (a small runnable sketch follows)
  • std lib: require 'csv'
  • docs = CSV.read("#{DATA}.csv")
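A minimal, self-contained sketch of that step (the sample rows and temp file here are made up for illustration, not from the slides):

require 'csv'
require 'tempfile'

# Write a tiny sample CSV so the sketch runs on its own.
sample = Tempfile.new(['invoices', '.csv'])
sample.write("1001,Pascal,Main Street 1,9820,Merelbeke\n1002,Koen,Station Road 2,9000,Gent\n")
sample.close

# CSV.read returns an array of arrays: one inner array per line of the file.
docs = CSV.read(sample.path)
docs.each do |invoice_nr, name, street, zip, city|
  puts "#{invoice_nr}: #{name}, #{street}, #{zip} #{city}"
end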

Simple Templating with String Interpolation

<<Q
<div class="title">
  INVOICE #{invoice_nr}
</div>
<div class="address">
  #{name}<br/>
  #{street}<br/>
  #{zip} #{city}<br/>
</div>
Q

invoice.html (the template shown above)

‣ Merge data into HTML (runnable sketch below)
  • template = File.new('invoice.html').read
  • html = eval("<<QQQ\n#{template}\nQQQ")
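A minimal sketch of the eval-plus-heredoc trick with made-up values (and the usual caveat that eval'ing a template is only safe when you trust both the template and the data):

# Hypothetical invoice data; in the real flow these come from one CSV row.
invoice_nr, name, street, zip, city = '1001', 'Pascal', 'Main Street 1', '9820', 'Merelbeke'

# Stand-in for File.new('invoice.html').read; the single-quoted heredoc keeps
# the #{...} placeholders literal until we eval below.
template = <<'TPL'
<div class="title">INVOICE #{invoice_nr}</div>
<div class="address">#{name}<br/>#{street}<br/>#{zip} #{city}</div>
TPL

# Wrapping the template in a heredoc and eval'ing it lets Ruby's normal string
# interpolation fill in the placeholders from the local variables in scope.
html = eval("<<QQQ\n#{template}\nQQQ")
puts html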

Step 1: linear

‣ Create PDF
  • Prince XML via the princely gem
  • http://www.princexml.com
  • p = Princely.new
    p.add_style_sheets('invoice.css')
    p.pdf_from_string(html)

Step 1: linear

‣ Create ZIP
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files.each do |file, content|
      zos.put_next_entry(file)
      zos.puts content
    end
  end

Full Code

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all pdf's from the hash
create_zip files_h

DEMO

Step 2: from linear ...

[Diagram: the linear pipeline again: parse csv, create pdf per row, zip]

Step 2: ...to parallel

[Diagram: parse csv, then the create pdf steps running in parallel, then zip]

Threads?

Multi Threaded
‣ Advantage
  • Lightweight (minimal overhead)
‣ Challenges (or why it is hard)
  • Hard to code: most data structures are not thread safe by default; they need synchronized access (see the sketch after this list)
  • Hard to test: different execution paths, timings
  • Hard to maintain
‣ Limitation
  • Single machine: not a solution for horizontal scalability beyond the multi-core CPU
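To make the "not thread safe by default" point concrete, a small illustrative sketch (not from the slides): several threads writing into one shared hash, with a Mutex serializing the access.

require 'thread'   # a no-op on modern Rubies; Thread and Mutex are built in

results = {}        # shared between threads, not thread safe on its own
lock    = Mutex.new

threads = (1..4).map do |worker|
  Thread.new do
    25.times do |i|
      pdf = "fake pdf #{worker}-#{i}"   # stand-in for create_pdf(...)
      # Without synchronize, concurrent writes to the shared hash race with
      # each other; with it, access is serialized through the lock.
      lock.synchronize { results["doc-#{worker}-#{i}"] = pdf }
    end
  end
end

threads.each(&:join)
puts results.size   # => 100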

Step 2: ...to parallel

[Diagram: parse csv, create pdf steps in parallel, zip, with a question mark over how to coordinate them]

Multi Process
  • Scales across machines
  • Advanced support for debugging and monitoring at the OS level
  • Simpler (code, testing, debugging, ...) (see the fork sketch below)
  • Slightly more overhead
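For contrast, a minimal multi-process sketch using plain fork (illustrative only; it runs on one machine and shares nothing, which is exactly the gap Redis fills below):

rows = (1..8).to_a   # stand-in for the parsed CSV rows

# Each child process works on its own slice; the parent just waits for them.
pids = rows.each_slice(4).map do |slice|
  fork do
    slice.each { |row| puts "pid #{Process.pid} handling row #{row}" }
  end
end

pids.each { |pid| Process.wait(pid) }
puts "all children finished"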

But ...

[Diagram: the parallel pipeline again: parse csv, create pdf steps in parallel, zip]

All of this assumes "shared state across processes".

SQL? Memcached? File system? Terracotta? ...

... OR ...

Hello Redis

‣ Shared Memory Key Value Store with High Level Data Structure support (a short redis-rb tour follows this list)
  • String (String, Int, Float)
  • Hash (Map, Dictionary)
  • List (Queue)
  • Set
  • ZSet (ordered by member or score)
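A quick tour of those structures from Ruby, assuming the redis gem and a server on the default localhost port (the key names are just examples):

require 'redis'

r = Redis.new   # connects to localhost:6379 by default

# String (ints and floats ride on strings via INCR / INCRBYFLOAT)
r.set('name', 'pascal')
r.incr('counter')             # => 1

# Hash (map / dictionary)
r.hset('pascal', 'address', 'merelbeke')
r.hgetall('pascal')           # => {"address"=>"merelbeke"}

# List (usable as a queue)
r.lpush('todo', 'read')
r.rpop('todo')                # => "read"

# Set
r.sadd('persons', 'pascal')
r.smembers('persons')         # => ["pascal"]

# ZSet (sorted set, ordered by score, then member)
r.zadd('scores', 10, 'pascal')
r.zrange('scores', 0, -1)     # => ["pascal"]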

About Redis

• Single threaded: one thread to serve them all
• Everything (must fit) in memory
• "Transactions" (MULTI/EXEC; see the sketch after this list)
• Expiring keys
• Lua scripting
• Publisher-Subscriber
• Auto create and destroy of keys
• Pipelining
• But ... full clustering (master-master) is not available (yet)
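Two of those features from redis-rb, again assuming a recent version of the redis gem and a local server; a minimal sketch, not from the slides:

require 'redis'

r = Redis.new

# "Transaction": commands queued inside MULTI run atomically at EXEC.
r.multi do |tx|
  tx.incr('ctr')
  tx.lpush('pdf:queue', 'some work')
end

# Pipelining: several commands in one network round trip (no atomicity implied).
r.pipelined do |pipe|
  pipe.set('name', 'pascal')
  pipe.expire('name', 60)   # expiring keys: drop the key after 60 seconds
end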

Hello Redis
‣ redis-cli
  • set name "pascal"          => "pascal"
  • incr counter               => 1
  • incr counter               => 2
  • hset pascal name "pascal"
  • hset pascal address "merelbeke"
  • sadd persons pascal
  • smembers persons           => [pascal]
  • keys *
  • type pascal                => hash
  • lpush todo "read"          => 1
  • lpush todo "eat"           => 2
  • lpop todo                  => "eat"
  • rpoplpush todo done        => "read"
  • lrange done 0 -1           => "read"

Let Redis Distribute

[Diagram: parse csv, create pdf, and zip each running as its own process, coordinated through Redis]

Spread the Work

[Diagram: step 1: the parse csv process pushes each row onto a queue with data and increments a counter; the create pdf and zip processes run separately]

Ruby on Redis

‣ Put the PDF-creation input data on a queue and do the counter bookkeeping

docs.each do |doc|
  data = YAML::dump(doc)
  r.lpush 'pdf:queue', data
  r.incr "ctr" # bookkeeping
end

Create PDF’s

[Diagram: step 2: worker processes pop rows from the queue with data, decrement the counter, and put the resulting PDFs in a hash with pdfs]

Ruby on Redis
‣ Read PDF input data from the queue, do the counter bookkeeping, put each created PDF in a Redis hash, and signal when ready

while (true)
  _, msg = r.brpop 'pdf:queue'
  doc = YAML::load(msg)
  # name of hash, key = docname, value = pdf
  r.hset('pdf:pdfs', doc[0], create_pdf(*doc))
  ctr = r.decr 'ctr'
  r.rpush "ready", "done" if ctr == 0
end

Zip When Done

[Diagram: step 3: when the counter reaches zero a ready signal is pushed and the zip process collects the hash with pdfs]

Ruby on Redis
‣ Wait for the ready signal, fetch all PDFs, and zip them

r.brpop "ready"              # wait for signal
pdfs = r.hgetall 'pdf:pdfs'  # fetch hash
create_zip pdfs              # zip it

More Parallelism

[Diagram: several input files at once: one queue with data, plus a counter, a hash with pdfs, and a ready signal per file]

Ruby on Redis

‣ Put the PDF-creation input data on a queue and do the per-file counter bookkeeping

# unique id for this input file
UUID = SecureRandom.uuid

docs.each do |doc|
  data = YAML::dump([UUID, doc])
  r.lpush 'pdf:queue', data
  r.incr "ctr:#{UUID}" # bookkeeping
end

Ruby on Redis
‣ Read PDF input data from the queue, do the per-file counter bookkeeping, and put each created PDF in a per-file Redis hash

while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", "done" if ctr == 0
end

Ruby on Redis
‣ Wait for the ready signal, fetch all PDFs, and zip them

r.brpop "ready:#{UUID}"   # wait for signal
pdfs = r.hgetall(UUID)    # fetch the per-file hash
create_zip(pdfs)          # zip it

Full Code

LINEAR

require 'csv'
require 'princely'
require 'zip/zip'

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

# create a pdf for each line in the csv
# and put it in a hash
files_h = docs.inject({}) do |files_h, doc|
  files_h[doc[0]] = create_pdf(*doc)
  files_h
end

# zip all pdf's from the hash
create_zip files_h

MAIN

require 'csv'
require 'zip/zip'
require 'redis'
require 'yaml'
require 'securerandom'

# zip files from hash
def create_zip(files_h)
  zipfile_name = "../out/#{DATA_FILE_BASE_NAME}.#{Time.now.to_s}.zip"
  Zip::ZipOutputStream.open(zipfile_name) do |zos|
    files_h.each do |name, content|
      zos.put_next_entry "#{name}.pdf"
      zos.puts content
    end
  end
  zipfile_name
end

DATA_FILE = ARGV[0]
DATA_FILE_BASE_NAME = File.basename(DATA_FILE, ".csv")
UUID = SecureRandom.uuid

r = Redis.new
my_counter = "ctr:#{UUID}"

# load data from csv
docs = CSV.read(DATA_FILE) # array of arrays

docs.each do |doc| # distribute
  r.lpush 'pdf:queue', YAML::dump([UUID, doc])
  r.incr my_counter
end

r.brpop "ready:#{UUID}" # collect

create_zip(r.hgetall(UUID))

# clean up
r.del my_counter
r.del UUID
puts "All done!"

WORKER

require 'redis'
require 'princely'
require 'yaml'

# create a pdf document from a csv line
def create_pdf(invoice_nr, name, street, zip, city)
  template = File.new('../resources/invoice.html').read
  html = eval("<<WTFMF\n#{template}\nWTFMF")
  p = Princely.new
  p.add_style_sheets('../resources/invoice.css')
  p.pdf_from_string(html)
end

r = Redis.new
while (true)
  _, msg = r.brpop 'pdf:queue'
  uuid, doc = YAML::load(msg)
  r.hset(uuid, doc[0], create_pdf(*doc))
  ctr = r.decr "ctr:#{uuid}"
  r.rpush "ready:#{uuid}", "done" if ctr == 0
end

The key functions (create_pdf and create_zip) remain unchanged between the linear version and the main/worker pair; the distribution code (queue, counter, ready signal) is the only addition.

DEMO 2

Multi Language Participants

[Diagram: the same queue / counter / hash / ready setup, with worker processes written in different languages]
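Because the coordination happens entirely through Redis keys and YAML payloads, workers can be written in any language with a Redis client. As a recap, the contract the code above relies on, written out as Ruby helpers (the module and method names are mine, purely for reference; the key names come from the code):

# Recap of the Redis "protocol" used above; illustrative helpers only.
module PdfJobProtocol
  QUEUE = 'pdf:queue'   # producer LPUSHes YAML::dump([uuid, csv_row]); workers BRPOP

  def self.counter_key(uuid)   # INCRed per queued row, DECRed per finished PDF
    "ctr:#{uuid}"
  end

  def self.results_key(uuid)   # HSET results_key(uuid), invoice_nr, pdf_bytes
    uuid
  end

  def self.ready_key(uuid)     # RPUSH "done" here when the counter reaches zero
    "ready:#{uuid}"
  end
end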

Conclusions

Going from linear to multi-process distributed is easy with Redis:

  • Shared memory with high-level data structures
  • Atomic counter for bookkeeping
  • Queue for work distribution
  • Queue as signal
  • Hash for result sets
