Spil Storage Platform (Erlang) @ EUG-NL

SSP: Spil Storage Platform
Thijs Terlouw, Senior Backend Engineer
12th July 2012

Upload: thijsterlouw

Post on 25-Jan-2015

DESCRIPTION

Presentation about the Spil Storage Platform (SSP) written in Erlang. This talk was first given at the Erlang User Group Netherlands in July 2012 hosted at Spilgames in Hilversum.

TRANSCRIPT

Page 1: Spil Storage Platform (Erlang) @ EUG-NL

SSP : Spil Storage Platform Thijs Terlouw – Senior Backend Engineer

12th July 2012

Page 2: Spil Storage Platform (Erlang) @ EUG-NL

Schedule

1. Background
   • Problems
   • Wish list
2. Solution
3. Challenges
4. Performance
5. Lessons learned

Page 3: Spil Storage Platform (Erlang) @ EUG-NL

Background

Mission Spil Games: “unite the world in play”
• localized social-gaming platforms
• focus on: teens, girls and family
• many portals:
   • girlsgogames.com
   • agame.com

Page 4: Spil Storage Platform (Erlang) @ EUG-NL


Page 5: Spil Storage Platform (Erlang) @ EUG-NL

Background

• Over 200 countries, 15+ different languages
• On average 85 minutes per month per user
• Over 4000 online games
• 200 million unique users per month

Page 6: Spil Storage Platform (Erlang) @ EUG-NL

Background

• Traditional LAMP stack
• Tweaked over time to keep up with growth
• Reaching the limits of the current system
• One of the largest problems is the database

Page 7: Spil Storage Platform (Erlang) @ EUG-NL

Problems: the database

• Not all developers are DB experts
   • security
   • performance
   • caching
• Changing requirements
• Difficult to shard the databases

Page 8: Spil Storage Platform (Erlang) @ EUG-NL

Wish list

1. Transparent scalability
   • Sharding data
   • Scalable applications on top of sharded data
2. Multi-database transactions
   • atomic operations across machines
3. Fast enough (low-ish latency, high throughput)
4. Highly available (central system)
5. Can handle large datasets
6. Offer flexibility (trade consistency for speed, for instance)
7. Use MySQL (experience in the in-house DB team)
8. Don’t expose SQL to devs, offer a business-specific model
   • Storage-specific security measures (character escaping)
9. Allow changes to storage layer without affecting business (versioning)
10. Centralize ownership of caching

Page 9: Spil Storage Platform (Erlang) @ EUG-NL

Schedule

1. Background
2. Solution
3. Challenges
4. Performance
5. Lessons learned

Page 10: Spil Storage Platform (Erlang) @ EUG-NL

Solution

• No matching Open Source projects
• So we want a massively scalable, soft real-time, highly available system
• Implement it ourselves: Erlang is the obvious candidate

Not the first to think of this:
• Amazon SimpleDB
• Riak

• Use Open Source where possible

Page 11: Spil Storage Platform (Erlang) @ EUG-NL

Solution: mindset

1. Our system should be always on
2. No global locks
3. Inconsistencies are the norm
   • Hardware breaks down (power failures etc.)
   • Version mismatches (upgrading the system is not atomic)
   • State mismatches (adding a new machine)

Page 12: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Spil Storage Platform

[Diagram; surviving labels: Buckets, Erlang, Bucket]

Page 13: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Overview

• A bucket is a list of records of a specific type. Structured data! A bucket can map to one or several MySQL database tables and offers a CRUD-like interface (with filters)
• All data is identified by a unique GID (64-bit integer)
• All requests for a particular GID are handled by one Pipeline process (sequentially)
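The "one Pipeline process per GID, handled sequentially" idea can be sketched with a plain Erlang process whose mailbox acts as the queue. This is a minimal sketch, not SSP's implementation: the module name ssp_pipeline_sketch and the start/0 and call/2 functions are hypothetical, and the real Pipeline presumably does far more (storage access, caching, handover).

```erlang
-module(ssp_pipeline_sketch).
-export([start/0, call/2]).

%% Start one pipeline process. In SSP terms there would be one of these
%% per GID; here the caller decides which pid belongs to which GID.
start() ->
    spawn(fun loop/0).

%% Run Fun inside the pipeline and wait for its result.
call(Pid, Fun) when is_pid(Pid), is_function(Fun, 0) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {run, self(), Ref, Fun},
    receive
        {Ref, Result} ->
            erlang:demonitor(Ref, [flush]),
            Result;
        {'DOWN', Ref, process, Pid, Reason} ->
            exit({pipeline_down, Reason})
    end.

%% The mailbox is the queue: requests are taken one at a time, in
%% arrival order, so two requests for the same GID never interleave.
loop() ->
    receive
        {run, From, Ref, Fun} ->
            From ! {Ref, Fun()},
            loop()
    end.
```

Because each GID would get its own process, requests for different GIDs still run in parallel; only requests for the same GID are serialized.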

Page 14: Spil Storage Platform (Erlang) @ EUG-NL


SSP Overview

Page 15: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Pipeline

• Why do we need Pipelines?
• Sequential = bottleneck!?!
• Don’t you guys know Erlang is about PARALLELIZING work?

Page 16: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Pipeline

• Drawbacks:
   • For hotspots (game with a gazillion ) sequential (read) access is bad indeed
   • Optimization: allow dirty read (try local cache first, outside pipeline), other solutions possible.
• Advantages:
   • Facilitates scalability (no global locks, but per bucket/GID sync)
   • Pipelines make multi-database consistency easier

Requests to most GIDs (users) are evenly distributed

Page 17: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Finding the Pipeline

   {bucket, phash2(Gid, Ringsize)}
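The lookup key above can be wrapped in a small function. This is a sketch under assumptions: the module name ssp_ring_sketch and the function pipeline_key/3 are hypothetical, and how SSP maps a ring slot to an actual Pipeline Factory is not shown in the slides. erlang:phash2/2 is the standard BIF and returns an integer in the range 0..Range-1.

```erlang
-module(ssp_ring_sketch).
-export([pipeline_key/3]).

%% Map a GID to a slot on a fixed-size ring, so that every request for
%% the same {Bucket, Gid} pair resolves to the same key.
-spec pipeline_key(atom(), non_neg_integer(), pos_integer()) ->
          {atom(), non_neg_integer()}.
pipeline_key(Bucket, Gid, RingSize)
  when is_atom(Bucket), is_integer(RingSize), RingSize > 0 ->
    {Bucket, erlang:phash2(Gid, RingSize)}.
```

The hash is deterministic, so any node that knows the ring size computes the same slot for a given GID without coordination.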

Page 18: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Bucket

• Each bucket is an OTP application
• Buckets are largely generated
• XML -> SQL + PIQI -> Erlang
   – Using XSLT
   – Piqic

Page 19: Spil Storage Platform (Erlang) @ EUG-NL

Piqi?

• PIQI is
   • data definition language
   • cross-language data serialization system compatible with Protocol Buffers
   • Piqi-RPC: an RPC-over-HTTP system for Erlang
      • Would be better if transport was pluggable
• http://piqi.org/

Page 20: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Example Bucket XML definition


Page 21: Spil Storage Platform (Erlang) @ EUG-NL

gidlog.piqi

Mostly templated via XSLT

Page 22: Spil Storage Platform (Erlang) @ EUG-NL

gidlog_accessors.hrl

• Parse the Piqi-generated hrl: epp:parse_file/3
• Mostly template
• Added as dep

Page 23: Spil Storage Platform (Erlang) @ EUG-NL

SSP: bucket implementation

• bucketX.erl
   – include_lib("…/bucketX_accessors.hrl")
   – verify_record(R)
   – start/0 and start_link/0
   – init/1
   – get_fun(Version), del_fun(V), insert_fun(V), …
• bucketX_v1.erl
   – del, insert, … (Gid, Shard, Filters)
   – get mysql pool
   – build some SQL
   – emysql:execute(Poolname, Sql)

Page 24: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Versions

1. A bucket is versioned. The interface of a bucket is stable, but the implementation can vary
2. We can go up or down a version, migration is automatic
   • Mirror-mode is introduced so we can write to multiple versions (but read from only one version)
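Mirror-mode as described (write to several versions, read from exactly one) might look roughly like the sketch below. Everything here is an assumption made for illustration: the module name ssp_mirror_sketch, the {single, V} and {mirror, ReadV, Others} mode tuples, and the Backends map of per-version insert/get funs are hypothetical and not taken from SSP. Maps also postdate the talk; they are used only to keep the sketch short.

```erlang
-module(ssp_mirror_sketch).
-export([write/4, read/3]).

%% Mode is {single, V} or {mirror, ReadV, OtherVs}.
%% Backends maps a version to #{insert => fun/2, get => fun/1}.

%% Writes go to every version that is active in the current mode.
write(Mode, Backends, Gid, Record) ->
    [begin
         #{insert := Insert} = maps:get(V, Backends),
         {V, Insert(Gid, Record)}
     end || V <- write_versions(Mode)].

%% Reads are always served by exactly one version.
read(Mode, Backends, Gid) ->
    #{get := Get} = maps:get(read_version(Mode), Backends),
    Get(Gid).

write_versions({single, V})             -> [V];
write_versions({mirror, ReadV, Others}) -> [ReadV | Others].

read_version({single, V})               -> V;
read_version({mirror, ReadV, _Others})  -> ReadV.
```

Switching the read version then only changes which element of the mode tuple is read from; the data is already present in every mirrored version.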

Page 25: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Shards (storage level)

1. GIDs (e.g. users) are sharded automatically.
   • Each version might have multiple shards
2. Redundancy (of data) is handled by MySQL

{bucket, GID} -> {Version, Shard} mapping
• Version default: config
• Shard default: default rule GID % shards
• Actual version/shard per GID stored in DB (cached)
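The {bucket, GID} -> {Version, Shard} resolution with its defaults can be sketched as a pure function. This is an illustration under assumptions: the module name ssp_shard_sketch and locate/3 are hypothetical, and the Overrides map merely stands in for the per-GID mapping that the slide says is stored in the DB and cached.

```erlang
-module(ssp_shard_sketch).
-export([locate/3]).

%% Resolve {Bucket, Gid} to {Version, Shard}.
%% Overrides: #{{Bucket, Gid} => {Version, Shard}}, the explicit per-GID
%%            placement (in SSP: stored in the DB, cached).
%% Defaults:  #{version => V, shards => N}, where version comes from
%%            config and the default shard rule is Gid rem N.
locate({Bucket, Gid}, Overrides, #{version := Version, shards := Shards})
  when is_integer(Gid), Gid >= 0, is_integer(Shards), Shards > 0 ->
    case maps:find({Bucket, Gid}, Overrides) of
        {ok, {V, S}} -> {V, S};
        error        -> {Version, Gid rem Shards}
    end.
```

For example, with 4 shards and no override, GID 10 lands on shard 2 of the default version; an override entry for that GID would take precedence, which is what would let individual GIDs be migrated one at a time.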

Page 26: Spil Storage Platform (Erlang) @ EUG-NL

SSP: Cache

• Each node has a private Memcached instance
• We store all data for a GID/bucket in this cache
   • Filters applied after retrieving data from cache
• Don’t change data in storage outside of the SSP!
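A read-through cache that stores the complete record list per GID/bucket and applies filters only afterwards could be sketched as follows. Assumptions are labeled in the comments: an ETS table stands in for the node-local Memcached instance, the Load fun stands in for the MySQL read, and the module name ssp_cache_sketch and fetch/4 are hypothetical.

```erlang
-module(ssp_cache_sketch).
-export([fetch/4]).

%% Cache:  ETS table (stand-in for the node-local Memcached instance).
%% Key:    e.g. {Bucket, Gid}.
%% Load:   fun(Key) -> [Record] (stand-in for the MySQL read).
%% Filter: fun(Record) -> boolean(), applied after the cache lookup, so
%%         the cache always holds the full, unfiltered record list.
fetch(Cache, Key, Load, Filter) ->
    Records =
        case ets:lookup(Cache, Key) of
            [{Key, Bin}] ->
                binary_to_term(Bin);
            [] ->
                Loaded = Load(Key),
                true = ets:insert(Cache, {Key, term_to_binary(Loaded)}),
                Loaded
        end,
    lists:filter(Filter, Records).
```

Because different filters share one cache entry, a second request with another filter is a cache hit. The flip side is the warning on the slide: anything written to storage behind the cache's back is never seen.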

Page 27: Spil Storage Platform (Erlang) @ EUG-NL

Schedule

1. Background
2. Solution
3. Challenges
4. Performance
5. Lessons learned

Page 28: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: controlled shutdown node


Page 29: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: controlled shutdown node

How do we shut down a node without losing jobs?
• Shut down the bucketX application on a node
• stop pipeline factories (PFs) on this node (for bucketX)
• hand over work to other PFs (on other nodes)
   – couple of mnesia ring reads
   – move ETS table contents to new PF
   – remember which PF took over (so we can forward)
• If we go to another node, clone Pipeline (gen2 pri)
• remove this node from the lookup ring
• all PFs fix their hash range based on ring
• Because there is a race condition handing over many to one (non-continuous blocks) PF
• Sleep a while (actually wait for pipeline handovers)

Page 30: Spil Storage Platform (Erlang) @ EUG-NL

Note: shutdown application

• if you terminate an application, all processes that were started (even if not linked) are terminated!
• bit hidden in documentation of application:start/2 and stop/1
• so we need to explicitly set the group_leader to something that never shuts down:

   init(#state{} = S) ->
       group_leader(whereis(init), self()),
       {ok, S}.

Page 31: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: shutdown pipeline

• The Pipeline process that we spawn per GID needs to shut down when done (less memory)
• When is it actually done?
• Work might be assigned to the Pipeline just when the Pipeline decides it is done: race conditions!

Page 32: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: shutdown pipeline (2)

• All requests for a GID are handled by a single Pipeline Factory (PF)
• The pipeline will issue a ‘work done’ command to the PF with a ‘CommandCounter’
• PF maintains an ETS table
• Lookup if the registered CommandCounter for that GID is the same as the reported number
• If so: tell the Pipeline to die
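The CommandCounter check can be sketched with an ETS table holding one {Gid, Counter} row per active pipeline. This is a guess at the mechanism, not SSP's code: the module name ssp_pf_sketch and the functions new/0, assign/2 and work_done/3 are hypothetical. The sketch assumes a single PF process calls both assign/2 and work_done/3, so the lookup and delete cannot race with a new assignment.

```erlang
-module(ssp_pf_sketch).
-export([new/0, assign/2, work_done/3]).

%% One ETS row per GID: {Gid, CommandCounter}.
new() ->
    ets:new(pf_counters, [set, public]).

%% Called by the PF each time it hands a command for Gid to the
%% pipeline. Creates the row on first use and returns the new counter.
assign(Tab, Gid) ->
    ets:update_counter(Tab, Gid, 1, {Gid, 0}).

%% Called when the pipeline reports 'work done' together with the last
%% counter value it processed. Only if no newer command was assigned in
%% the meantime may the pipeline stop.
work_done(Tab, Gid, Reported) ->
    case ets:lookup(Tab, Gid) of
        [{Gid, Reported}] ->
            true = ets:delete(Tab, Gid),
            stop;
        _ ->
            continue
    end.
```

If a command was assigned after the pipeline sent its report, the registered counter is ahead of the reported one, work_done/3 answers continue, and the pipeline keeps running to process the pending work.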

Page 33: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: high uptime

• We want continuous usage of SSP
   – Even while upgrading bucket versions
   – So there can be multiple versions running simultaneously
• Take care of creating closures
• Atomic behavior per GID

Page 34: Spil Storage Platform (Erlang) @ EUG-NL

Challenge: quite complex system


Page 35: Spil Storage Platform (Erlang) @ EUG-NL

Schedule

1. Background
2. Solution
3. Challenges
4. Performance
5. Lessons learned

Page 36: Spil Storage Platform (Erlang) @ EUG-NL

Performance

• Currently we run SSP in shadow mode, so no real data yet. Making realistic benchmarks is quite a lot of work.
• Latency (local machine):
   – 6-26ms to do a GET request on a primary key (cache miss)
   – 0.6ms with a cache hit
   – Cache stores Erlang terms currently (term_to_binary)
• Always read from cache
   – Does not detect changes in storage done outside SSP

Page 37: Spil Storage Platform (Erlang) @ EUG-NL

Performance

• Requests (local):
   – Getting from cache at about 13.5K req/sec
        elibs_benchmark:test_fun(gidlog_get, fun() -> gidlog:get(123456) end, 10, 10000).
   – Getting from mysql about 615 req/sec incl cache miss
        elibs_benchmark:test_fun(gidlog_get, fun() -> {_,_,C} = os:timestamp(), gidlog:get(C) end, 10, 100).
   – ~2 SSP machines can saturate a MySQL machine
   – 8K writes/sec for 2 MySQL + 4 SSP machines (old hardware)
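elibs_benchmark appears to be an in-house library and its source is not part of the slides. A comparable requests-per-second measurement can be sketched with timer:tc/1 from the standard library. The module name bench_sketch and run/3 are hypothetical, and reading the two numeric arguments as "concurrent workers" and "iterations per worker" is an assumption, since the slides do not say what 10 and 10000 mean.

```erlang
-module(bench_sketch).
-export([run/3]).

%% Run Fun Iterations times in each of Workers concurrent processes and
%% return the overall throughput in calls per second (a float).
run(Fun, Workers, Iterations)
  when is_function(Fun, 0), Workers > 0, Iterations > 0 ->
    Parent = self(),
    {Micros, ok} = timer:tc(fun() ->
        Refs = [start_worker(Parent, Fun, Iterations)
                || _ <- lists:seq(1, Workers)],
        %% Wait until every worker has reported completion.
        lists:foreach(fun(Ref) -> receive {done, Ref} -> ok end end, Refs)
    end),
    (Workers * Iterations) / max(Micros, 1) * 1000000.

start_worker(Parent, Fun, Iterations) ->
    Ref = make_ref(),
    spawn_link(fun() ->
        repeat(Fun, Iterations),
        Parent ! {done, Ref}
    end),
    Ref.

repeat(_Fun, 0) -> ok;
repeat(Fun, N)  -> _ = Fun(), repeat(Fun, N - 1).
```

With this sketch, a call in the spirit of the first slide example would be bench_sketch:run(fun() -> gidlog:get(123456) end, 10, 10000), where gidlog:get/1 is the bucket call shown on the slide.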

Page 38: Spil Storage Platform (Erlang) @ EUG-NL

Schedule

1. Background
2. Solution
3. Challenges
4. Performance
5. Lessons learned

Page 39: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (1)

• There are many good Open Source libraries
   • Emysql: we have added transaction support
   • Eep0018: fast JSON encoder/decoder (yajl c++)
   • Estatsd: graphite-capable monitoring
   • Poolboy: Erlang worker pool factory (for memcached)
   • Twig/Lager: logging (syslog)

Page 40: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (2)

• Mnesia is great to replicate state across machines
   • Faster local lookups
   • Less error prone
• Encapsulate all Mnesia usage in a module
   • Adding nodes to Mnesia
   • Use ram_copies
   • Transactions are great
• We deploy an Erlang cluster (with Mnesia replication) only inside a single DataCenter
   • Not across unreliable connections!
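"Encapsulate all Mnesia usage in a module" with ram_copies and transactions might look like the sketch below. The module name ssp_ring_store_sketch, the ring_entry record and the slot-to-node content are hypothetical; the slides only say that ring state lives in Mnesia, not what the tables look like. Only standard mnesia calls are used, and the sketch covers a single node (adding further nodes is not shown).

```erlang
-module(ssp_ring_store_sketch).
-export([init/0, store/2, lookup/1]).

-record(ring_entry, {slot, node}).

%% Start Mnesia and create a RAM-only table on the local node.
init() ->
    ok = mnesia:start(),
    case mnesia:create_table(ring_entry,
                             [{ram_copies, [node()]},
                              {attributes, record_info(fields, ring_entry)}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, ring_entry}} -> ok
    end.

%% Writes go through a transaction so that replicas stay consistent.
store(Slot, Node) ->
    {atomic, ok} = mnesia:transaction(
                     fun() ->
                         mnesia:write(#ring_entry{slot = Slot, node = Node})
                     end),
    ok.

%% Reads are served from the local RAM copy.
lookup(Slot) ->
    case mnesia:dirty_read(ring_entry, Slot) of
        [#ring_entry{node = Node}] -> {ok, Node};
        []                         -> not_found
    end.
```

Keeping every mnesia call behind these three functions means callers never depend on table names or record layout, so the storage details can change in one place.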

Page 41: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (3)

• XML + XSD + XSLT are great to define an API
   • They might have a bad name, but work great
   • Can transform into any other format
   • Used to generate documentation

Todo:
• generate more code (Buckets)
• write gen_bucket behaviour
• don’t start with generating code

Page 42: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (4)

• Rebar is great
   • Compilation is pretty convenient, but the best part is the “dependencies”
   • Also the worst part
• We have proposed two improvements:
   • Allow different projects to share dependencies (major speedup for compiling)
   • Smarter version conflict resolution (semantic versioning: [ ">= 1.3.1", "< 2.0.0" ] )
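A constraint list such as [ ">= 1.3.1", "< 2.0.0" ] can be checked against a concrete version as sketched below. This only illustrates the idea and is not the proposal submitted to Rebar: the module name semver_sketch and satisfies/2 are hypothetical, only the two operators from the slide are handled, and versions are assumed to be plain MAJOR.MINOR.PATCH strings without pre-release tags.

```erlang
-module(semver_sketch).
-export([satisfies/2]).

%% True if Version meets every constraint in the list.
satisfies(Version, Constraints) ->
    V = parse(Version),
    lists:all(fun(C) -> check(V, C) end, Constraints).

%% Versions are compared as lists of integers; Erlang orders lists
%% element by element, so "1.10.0" correctly sorts after "1.3.1".
check(V, ">= " ++ Rest) -> V >= parse(Rest);
check(V, "< " ++ Rest)  -> V < parse(Rest).

%% "1.3.1" -> [1, 3, 1]
parse(Str) ->
    [list_to_integer(Part) || Part <- string:lexemes(Str, ".")].
```

With these constraints, "1.4.0" and "1.10.0" are accepted while "1.3.0" and "2.0.0" are rejected. Comparing the raw strings instead would wrongly place "1.10.0" before "1.3.1".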

Page 43: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (5)

• We use #records{} for all APIs
   – Piqi input/output
   – Stable and well-defined
   – Will move to ProtocolBuffers
• Use OTP applications everywhere
   – Start/stop stuff
   – See started apps: application:which_applications()
• Terminate on fatal errors
   – Memcached down: terminate all buckets, don’t try to recover (prevent overloading the DB)

Page 44: Spil Storage Platform (Erlang) @ EUG-NL

Lessons learned (6)

• You need to add an admin/monitoring interface

Page 45: Spil Storage Platform (Erlang) @ EUG-NL

Open Source

We will not open-source SSP, but we do actively contribute to libraries used in SSP (so far Emysql, Rebar, Piqi)

Page 46: Spil Storage Platform (Erlang) @ EUG-NL

THANKS!

Questions? [email protected]
