
Transcript
Page 1: Riak at Posterous

Riak at Posterous

Julio Capote

San Francisco Riak Meetup, 1/18/2012

Page 2: Riak at Posterous

A/S/L?

• Julio Capote

• Backend Developer at Posterous

• @capotej

Page 3: Riak at Posterous

• Allows anyone to create multiple private or public spaces (blogs)

• Around since 2008

• Millions of posts and users

• Tons of long tail traffic

Some of the first posts are still being accessed today due to search engines

Page 4: Riak at Posterous

How we store posts

• Original post body goes into MySQL

• Multiple variants are generated (nojs, mobile, etc)

• Expensive to generate (sanitizers, expanders)

Page 5: Riak at Posterous

Enter Variant Cache

• A generic read/write-through cache library

• Started with Memcache

• Moved to Redis

At the time, Redis's disk store looked promising, so we moved from Memcache to Redis
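A read/write-through cache of this kind can be sketched as follows. This is a minimal illustration in Python, not the actual Variant Cache library: the class name `VariantCache`, the key format, and the `render` callback are all hypothetical, and a plain dict stands in for Memcache or Redis.

```python
from typing import Callable, Dict


class VariantCache:
    """Illustrative read/write-through cache for rendered post variants."""

    def __init__(self, store: Dict[str, str], render: Callable[[int, str], str]):
        self.store = store    # stand-in for Memcache/Redis (assumption: dict-like)
        self.render = render  # the expensive step (sanitizers, expanders)

    @staticmethod
    def key(post_id: int, variant: str) -> str:
        # Hypothetical key scheme: one entry per (post, variant) pair.
        return f"post:{post_id}:{variant}"

    def read(self, post_id: int, variant: str) -> str:
        """Read-through: return the cached variant, rendering it on a miss."""
        k = self.key(post_id, variant)
        cached = self.store.get(k)
        if cached is not None:
            return cached
        body = self.render(post_id, variant)
        self.store[k] = body
        return body

    def write(self, post_id: int, variant: str, body: str) -> None:
        """Write-through: store a freshly generated variant immediately."""
        self.store[self.key(post_id, variant)] = body
```

With this shape, the backing store can be swapped (Memcache, Redis, later Riak) without touching the callers, which is presumably what made the moves described here practical.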

Page 6: Riak at Posterous

Redis is awesome, but

• Requires both the key and value to go into memory

• Terrible disk store performance

• Even with 3 machines with 64 GB RAM, couldn’t fit entire working set

• Forced to set a TTL

Redis wasn’t really designed to ever hit the disk

Page 8: Riak at Posterous

The Dream

Page 10: Riak at Posterous

What we wanted

• Key/Value store

• Disk backed

• Built in distribution

• Use fewer boxes to serve more users

• Consistent performance over raw performance

Page 11: Riak at Posterous

Percona MySQL / HandlerSocket

Page 12: Riak at Posterous

MySQL / HandlerSocket

The Good

• Great performance

• Can handle a huge number of rows

• Mature / Safe (at least the MySQL part)

Page 13: Riak at Posterous

MySQL / HandlerSocket

The Bad

• Sharding definitely not built in

• HandlerSocket is pretty much abandoned

No support going forward

Page 15: Riak at Posterous

MongoDB

The Good

• Crazy fast

• Built in sharding support

• ...did I mention it was fast?

Page 17: Riak at Posterous

MongoDB

The Bad

• 30% standard deviation on fetch times (!)

• Would falsely acknowledge a write

This is probably tunable, but still

Page 19: Riak at Posterous

Riak + Bitcask

The Good

• Distributed by default

• Consistent and predictable performance

• Highly concurrent, no perf degradation

• Ops guy loves it!

Page 20: Riak at Posterous

Riak + Bitcask

The Bad

• Not crazy fast

• Stuck it behind memcache

• Still way faster than generating

• No multi get support

Writes and reads both go through memcache
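The memcache-in-front-of-Riak arrangement can be sketched as a two-tier lookup. This is only an illustration in Python under stated assumptions: `TieredCache`, `generate`, and `multi_get` are hypothetical names, two dicts stand in for memcache and Riak, and the `multi_get` helper is one possible client-side workaround for the missing multi get (parallel single gets), not something the talk says Posterous built.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


class TieredCache:
    """Illustrative two-tier cache: fast tier in front of a durable tier."""

    def __init__(self, fast: Dict[str, str], durable: Dict[str, str],
                 generate: Callable[[str], str]):
        self.fast = fast          # stand-in for memcache
        self.durable = durable    # stand-in for Riak
        self.generate = generate  # expensive fallback when both tiers miss

    def get(self, key: str) -> str:
        value = self.fast.get(key)
        if value is not None:
            return value
        value = self.durable.get(key)
        if value is None:
            # Miss in both tiers: generate and persist in the durable tier.
            value = self.generate(key)
            self.durable[key] = value
        # Populate the fast tier so the next read is served from memory.
        self.fast[key] = value
        return value

    def put(self, key: str, value: str) -> None:
        # Write-through: durable tier first, then the fast tier.
        self.durable[key] = value
        self.fast[key] = value

    def multi_get(self, keys: Iterable[str]) -> Dict[str, str]:
        # Emulated multi get: issue the single gets concurrently.
        keys = list(keys)
        with ThreadPoolExecutor(max_workers=8) as pool:
            return dict(zip(keys, pool.map(self.get, keys)))
```

The fast tier absorbs hot keys, while the long tail falls through to the durable tier, which is still far cheaper than regenerating a variant.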

Page 21: Riak at Posterous

Riak in production

• Started using our 3 node cluster for the global production cache

• Accidentally turned off a node

• Keys rebalanced, site didn’t skip a beat

• No one even noticed till hours later

Page 23: Riak at Posterous

Stats

• 3 nodes

• 2600+ requests/second

• 300+ GB

• ~200 million keys

• 10 GB memcache/host
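For a rough sense of scale, the figures above can be turned into per-node and per-key averages. The calculation below is a back-of-the-envelope sketch in Python: it uses the lower bounds from the slide and assumes the 300 GB and 200 million keys are cluster-wide totals and that load is spread evenly, neither of which the slide states.

```python
# Lower bounds taken from the slide; everything derived is an estimate.
nodes = 3
requests_per_second = 2600
stored_bytes = 300 * 10**9   # 300 GB, assumed to be the cluster-wide total
keys = 200 * 10**6           # ~200 million keys

# Assumes requests are spread evenly across the nodes.
per_node_rps = requests_per_second / nodes

# Average stored size per key (includes any replication overhead, if counted).
bytes_per_key = stored_bytes / keys

print(f"about {per_node_rps:.0f} requests/second per node")
print(f"about {bytes_per_key:.0f} bytes stored per key")
```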

Page 24: Riak at Posterous

#Protips

• All nodes can serve all requests, so...

• Use a vip, or...

• Pass all cluster nodes to client driver (thanks @aphyr!)

• Use curb instead of net/http

• Use Keep Alive
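Because any node can serve any request, a client can rotate across the full node list and skip a node that is down. The sketch below illustrates that idea in Python; it is not the API of any real Riak client. `NodeRotator`, the node names, and the injected `fetch` callable are hypothetical, and `fetch` is assumed to raise `ConnectionError` when a node is unreachable.

```python
import itertools
from typing import Callable, Sequence


class NodeRotator:
    """Illustrative round-robin over cluster nodes with simple failover."""

    def __init__(self, nodes: Sequence[str], fetch: Callable[[str, str], str]):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)
        self._cycle = itertools.cycle(self.nodes)
        self.fetch = fetch  # fetch(node, key) -> value; hypothetical transport

    def get(self, key: str) -> str:
        last_error = None
        # Try each node at most once per request, starting with the next in turn.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            try:
                return self.fetch(node, key)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError("all nodes failed") from last_error
```

A VIP achieves the same effect outside the application; passing every node to the client keeps the failover logic in the driver instead.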

Page 25: Riak at Posterous

Any Questions?

Page 26: Riak at Posterous

Thanks for listening!

Special thanks to @twoism, @vincentchu, @kangchen, @argv0, @pharkmillups, @seancribbs, @aphyr, @jrecursive

