Spil Storage Platform (Erlang) @ EUG-NL
![Page 1: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/1.jpg)
SSP: Spil Storage Platform
Thijs Terlouw, Senior Backend Engineer
12th July 2012
![Page 2: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/2.jpg)
Schedule
1. Background
   • Problems
   • Wish list
2. Solution
3. Challenges
4. Performance
5. Lessons learned
![Page 3: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/3.jpg)
Background
Mission of Spil Games: "unite the world in play"
• Localized social-gaming platforms
• Focus on teens, girls and family
• Many portals:
  • girlsgogames.com
  • agame.com
![Page 4: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/4.jpg)
![Page 5: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/5.jpg)
Background
• Over 200 countries, 15+ different languages
• On average 85 minutes per month per user
• Over 4000 online games
• 200 million unique users per month
![Page 6: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/6.jpg)
Background
• Traditional LAMP stack
• Tweaked over time to keep up with growth
• Reaching the limits of the current system
• One of the largest problems is the database
![Page 7: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/7.jpg)
Problems: the database
• Not all developers are DB experts
  • security
  • performance
  • caching
• Changing requirements
• Difficult to shard the databases
![Page 8: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/8.jpg)
Wish list
1. Transparent scalability
   • Sharding data
   • Scalable applications on top of sharded data
2. Multi-database transactions
   • Atomic operations across machines
3. Fast enough (low-ish latency, high throughput)
4. Highly available (central system)
5. Can handle a large dataset
6. Offer flexibility (trade consistency for speed, for instance)
7. Use MySQL (experience of the in-house DB team)
8. Don't expose SQL to devs, offer a business-specific model
   • Storage-specific security measures (character escaping)
9. Allow changes to the storage layer without affecting business (versioning)
10. Centralize ownership of caching
![Page 9: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/9.jpg)
![Page 10: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/10.jpg)
Solution
• No matching Open Source projects
• So we want a massively scalable, soft real-time, highly available system
• Implement it ourselves: Erlang is the obvious candidate
Not the first to think of this:
• Amazon SimpleDB
• Riak
• Use Open Source where possible
![Page 11: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/11.jpg)
Solution: mindset
1. Our system should be always on
2. No global locks
3. Inconsistencies are the norm
   • Hardware breaks down (power failures etc.)
   • Version mismatches (upgrading the system is not atomic)
   • State mismatches (adding a new machine)
![Page 12: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/12.jpg)
SSP: Spil Storage Platform
[Diagram; recoverable labels: Buckets, Erlang, Bucket]
![Page 13: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/13.jpg)
SSP: Overview
• A bucket is a list of records of a specific type: structured data! A bucket can map to one or several MySQL database tables and offers a CRUD-like interface (with filters)
• All data is identified by a unique GID (64-bit integer)
• All requests for a particular GID are handled sequentially by one Pipeline process
![Page 14: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/14.jpg)
SSP Overview
![Page 15: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/15.jpg)
SSP: Pipeline
• Why do we need Pipelines?
• Sequential = bottleneck!?
• Don't you guys know Erlang is about PARALLELIZING work?
![Page 16: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/16.jpg)
SSP: Pipeline
• Drawbacks:
  • For hotspots (game with a gazillion ) sequential (read) access is bad indeed
  • Optimization: allow dirty reads (try the local cache first, outside the pipeline); other solutions are possible
• Advantages:
  • Facilitates scalability (no global locks, but per bucket/GID sync)
  • Pipelines make multi-database consistency easier
Requests to most GIDs (users) are evenly distributed
![Page 17: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/17.jpg)
SSP: Finding the Pipeline
{bucket, phash2(Gid, Ringsize)}
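The lookup key above can be sketched as a small function. This is only an illustration: the module name, the function name and the guard are assumptions, and the slide does not say how the resulting key is then resolved to a Pipeline Factory. `erlang:phash2/2` itself is a standard BIF that returns an integer in the range 0..Range-1.

```erlang
-module(ssp_ring_sketch).
-export([pipeline_key/3]).

%% Hypothetical helper: build the {Bucket, Slot} key used to find the
%% process responsible for a GID. The same GID always hashes to the
%% same slot, so all its requests end up in the same place.
pipeline_key(Bucket, Gid, RingSize)
  when is_atom(Bucket), is_integer(Gid), is_integer(RingSize), RingSize > 0 ->
    {Bucket, erlang:phash2(Gid, RingSize)}.
```

Calling `ssp_ring_sketch:pipeline_key(gidlog, 123456, 64)` twice yields the same `{gidlog, Slot}` tuple with `0 =< Slot < 64`.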
![Page 18: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/18.jpg)
SSP: Bucket
• Each bucket is an OTP application
• Buckets are largely generated
• XML -> SQL + PIQI -> Erlang
  – Using XSLT
  – Piqic
![Page 19: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/19.jpg)
Piqi?
• PIQI is
  • a data definition language
  • a cross-language data serialization system compatible with Protocol Buffers
  • Piqi-RPC: an RPC-over-HTTP system for Erlang
    • Would be better if the transport was pluggable
• http://piqi.org/
![Page 20: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/20.jpg)
SSP: Example Bucket XML definition
![Page 21: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/21.jpg)
gidlog.piqi
Mostly templated via XSLT
![Page 22: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/22.jpg)
gidlog_accessors.hrl
Parse the Piqi-generated .hrl with epp:parse_file/3; mostly template; added as dep
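To illustrate the idea of parsing a generated .hrl, here is a minimal sketch that extracts record names and field names using `epp:parse_file/3`. The module name, the demo file name and the demo record are made up for illustration; the real generator and the contents of gidlog_accessors.hrl are not shown in the slides.

```erlang
-module(hrl_fields_sketch).
-export([record_fields/1, demo/0]).

%% Return [{RecordName, [FieldName]}] for every record in an .hrl file.
%% epp:parse_file(File, IncludePath, PredefMacros) returns the parsed forms.
record_fields(HrlFile) ->
    {ok, Forms} = epp:parse_file(HrlFile, [], []),
    [{Name, [field_name(F) || F <- Fields]}
     || {attribute, _, record, {Name, Fields}} <- Forms].

%% A field may be typed, plain, or carry a default value.
field_name({typed_record_field, Field, _Type}) -> field_name(Field);
field_name({record_field, _, {atom, _, Name}}) -> Name;
field_name({record_field, _, {atom, _, Name}, _Default}) -> Name.

%% Writes a throwaway header (hypothetical content), parses it, cleans up.
demo() ->
    File = "gidlog_demo.hrl",
    ok = file:write_file(File, <<"-record(gidlog, {gid, action = none, ts}).\n">>),
    Result = record_fields(File),
    ok = file:delete(File),
    Result.
```

With the field list in hand, accessor functions could be produced from a template, which is presumably what "mostly template" refers to.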
![Page 23: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/23.jpg)
SSP: bucket implementation
• bucketX.erl
  – include_lib("…/bucketX_accessors.hrl")
  – verify_record(R)
  – start/0 and start_link/0
  – init/1
  – get_fun(Version), del_fun(V), insert_fun(V), …
• bucketX_v1.erl
  – del, insert, … (Gid, Shard, Filters)
  – get mysql pool
  – build some SQL
  – emysql:execute(Poolname, Sql)
![Page 24: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/24.jpg)
SSP: Versions
1. A bucket is versioned. The interface of a bucket is stable, but the implementation can vary
2. We can go up or down a version; migration is automatic
   • Mirror mode is introduced so we can write to multiple versions (but read from only one version)
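Mirror mode as described above can be sketched as follows. Everything here is an assumption made for illustration: the configuration map, the function names, and the use of ETS tables as stand-ins for two storage versions. The slides do not show how SSP actually implements it.

```erlang
-module(ssp_mirror_sketch).
-export([write/3, read/2, demo/0]).

%% Cfg is a hypothetical map:
%%   #{read => Version, write => [Version], impl => #{Version => Backend}}
%% where Backend is a map of funs: #{insert => fun/2, get => fun/1}.

%% Mirror mode: write the record to every configured version.
write(#{write := Versions, impl := Impl}, Gid, Record) ->
    lists:foreach(
      fun(V) ->
              #{insert := InsertFun} = maps:get(V, Impl),
              ok = InsertFun(Gid, Record)
      end, Versions),
    ok.

%% Reads always go to exactly one version.
read(#{read := V, impl := Impl}, Gid) ->
    #{get := GetFun} = maps:get(V, Impl),
    GetFun(Gid).

%% ETS-backed stand-in for one storage version.
backend(Tab) ->
    #{insert => fun(Gid, Rec) -> true = ets:insert(Tab, {Gid, Rec}), ok end,
      get    => fun(Gid) -> ets:lookup(Tab, Gid) end}.

demo() ->
    T1 = ets:new(v1, [set, public]),
    T2 = ets:new(v2, [set, public]),
    Cfg = #{read => 1, write => [1, 2],
            impl => #{1 => backend(T1), 2 => backend(T2)}},
    ok = write(Cfg, 42, <<"hello">>),
    %% First element: what a reader sees (version 1).
    %% Second element: proof that version 2 received the write too.
    {read(Cfg, 42), ets:lookup(T2, 42)}.
```

Switching the `read` key to the new version once it is fully populated is one plausible way to complete a migration under this model.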
![Page 25: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/25.jpg)
SSP: Shards (storage level)
1. GIDs (e.g. users) are sharded automatically
   • Each version might have multiple shards
2. Redundancy (of data) is handled by MySQL
{bucket, GID} -> {Version, Shard} mapping
• Version default: config
• Shard default: default rule GID % shards
• Actual version/shard per GID stored in DB (cached)
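The mapping rule above (stored value first, otherwise the defaults) can be sketched like this. The function name and the two maps are assumptions; in particular, the `Overrides` map is just a stand-in for the per-GID version/shard stored in the DB and cached.

```erlang
-module(ssp_shard_sketch).
-export([locate/3]).

%% Overrides: #{Gid => {Version, Shard}}, a stand-in for the stored mapping.
%% Defaults:  #{version := V, shards := N}, a stand-in for the config.
locate(Gid, Overrides, #{version := DefaultVersion, shards := NumShards})
  when is_integer(Gid), Gid >= 0, is_integer(NumShards), NumShards > 0 ->
    case maps:find(Gid, Overrides) of
        {ok, {_Version, _Shard} = Stored} ->
            Stored;                               % explicit per-GID placement
        error ->
            {DefaultVersion, Gid rem NumShards}   % default rule: GID % shards
    end.
```

For example, `ssp_shard_sketch:locate(10, #{}, #{version => 1, shards => 4})` returns `{1, 2}`, while a stored entry such as `#{10 => {2, 0}}` overrides that default.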
![Page 26: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/26.jpg)
SSP: Cache
• Each node has a private Memcached instance
• We store all data for a GID/bucket in this cache
• Filters are applied after retrieving data from the cache
• Don't change data in storage outside of the SSP!
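A minimal sketch of "store everything per GID/bucket, filter after retrieval". A plain map stands in for Memcached here, and the function names and record shapes are assumptions. Serialization uses `term_to_binary/1` and `binary_to_term/1`.

```erlang
-module(ssp_cache_sketch).
-export([store/3, fetch/3, demo/0]).

%% Cache is a map {Bucket, Gid} => binary, standing in for Memcached.
%% All records of one GID in one bucket are stored under a single key.
store(Cache, {Bucket, Gid}, Records) when is_list(Records) ->
    Cache#{{Bucket, Gid} => term_to_binary(Records)}.

%% Fetch the whole record list first, then apply the filter locally.
fetch(Cache, {Bucket, Gid}, Filter) when is_function(Filter, 1) ->
    case maps:find({Bucket, Gid}, Cache) of
        {ok, Bin} -> {hit, lists:filter(Filter, binary_to_term(Bin))};
        error     -> miss
    end.

demo() ->
    C = store(#{}, {gidlog, 42}, [{login, 1}, {logout, 2}, {login, 3}]),
    fetch(C, {gidlog, 42}, fun({Type, _Ts}) -> Type =:= login end).
```

Because the cache holds the full data set for a key, any write that bypasses this layer leaves the cached copy stale, which is why the slide warns against changing storage outside the SSP.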
![Page 27: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/27.jpg)
![Page 28: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/28.jpg)
Challenge: controlled shutdown node
![Page 29: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/29.jpg)
Challenge: controlled shutdown node
How do we shut down a node without losing jobs?
• Shut down the bucketX application on a node
  • stop the pipeline factories (PFs) on this node (for bucketX)
  • hand over work to other PFs (on other nodes)
    – couple of mnesia ring reads
    – move ETS table contents to the new PF
    – remember which PF took over (so we can forward)
  • If we go to another node, clone the Pipeline (gen2 pri)
  • remove this node from the lookup ring
  • all PFs fix their hash range based on the ring
    • Because there is a race condition when handing over many to one (non-continuous blocks) PF
  • Sleep a while (actually: wait for the pipeline handovers)
![Page 30: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/30.jpg)
Note: shutdown application
• If you terminate an application, all processes that were started (even if not linked) are terminated!
• A bit hidden in the documentation of application:start/2 and stop/1
• So we need to explicitly set the group_leader to something that never shuts down:

    init(#state{} = S) ->
        %% Make the init process our group leader so that stopping
        %% the application does not take this process down with it.
        group_leader(whereis(init), self()),
        {ok, S}.
![Page 31: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/31.jpg)
Challenge: shutdown pipeline
• The Pipeline process that we spawn per GID needs to shut down when done (less memory)
• When is it actually done?
• Work might be assigned to the Pipeline just when the Pipeline decides it is done: race conditions!
![Page 32: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/32.jpg)
Challenge: shutdown pipeline (2)
• All requests for a GID are handled by a single Pipeline Factory (PF)
• The Pipeline will issue a 'work done' command to the PF with a 'CommandCounter'
• The PF maintains an ETS table
• Look up whether the registered CommandCounter for that GID is the same as the reported number
• If so: tell the Pipeline to die
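The counter check above can be sketched with a small ETS table. This is an illustration under assumptions: the slides do not say when the counter is incremented, so the sketch assumes the PF bumps it each time it assigns a command, and all names are hypothetical. It also assumes only the PF process touches the table, which is what makes the compare-and-stop decision race-free.

```erlang
-module(pf_counter_sketch).
-export([new/0, assign_work/2, work_done/3, demo/0]).

new() ->
    ets:new(pf_counters, [set, public]).

%% Called by the PF whenever it hands a command to the Pipeline for Gid.
%% Returns the new counter value, which travels along with the command.
assign_work(Tab, Gid) ->
    ets:update_counter(Tab, Gid, 1, {Gid, 0}).

%% Called by the PF when a Pipeline reports 'work done' together with
%% the counter of the last command it processed.
work_done(Tab, Gid, ReportedCounter) ->
    case ets:lookup(Tab, Gid) of
        [{Gid, ReportedCounter}] ->
            true = ets:delete(Tab, Gid),
            die;             % nothing newer was assigned: safe to stop
        _ ->
            keep_running     % more work arrived in the meantime
    end.

demo() ->
    Tab = new(),
    1 = assign_work(Tab, 42),
    2 = assign_work(Tab, 42),
    Stale = work_done(Tab, 42, 1),   % Pipeline has only seen command 1
    Final = work_done(Tab, 42, 2),   % Pipeline has caught up
    {Stale, Final}.
```

In the demo, the first report is stale because a second command was assigned before the Pipeline finished, so the Pipeline is kept alive until it reports the latest counter.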
![Page 33: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/33.jpg)
Challenge: high uptime
• We want continuous usage of SSP
  – Even while upgrading bucket versions
  – So there can be multiple versions running simultaneously
• Take care when creating closures
• Atomic behavior per GID
![Page 34: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/34.jpg)
Challenge: quite complex system
![Page 35: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/35.jpg)
![Page 36: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/36.jpg)
Performance
• Currently we run SSP in shadow mode, so no real data yet. Making realistic benchmarks is quite a lot of work.
• Latency (local machine):
  – 6-26 ms to do a GET request on a primary key (cache miss)
  – 0.6 ms with a cache hit
  – Cache currently stores Erlang terms (term_to_binary)
• Always read from cache
  – Does not detect changes in storage done outside SSP
![Page 37: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/37.jpg)
Performance
• Requests (local):
  – Getting from cache at about 13.5K req/sec
    • elibs_benchmark:test_fun(gidlog_get, fun() -> gidlog:get(123456) end, 10, 10000).
  – Getting from MySQL at about 615 req/sec, incl. cache miss
    • elibs_benchmark:test_fun(gidlog_get, fun() -> {_,_,C} = os:timestamp(), gidlog:get(C) end, 10, 100).
  – ~2 SSP machines can saturate a MySQL machine
  – 8K writes/sec for 2 MySQL + 4 SSP machines (old hardware)
![Page 38: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/38.jpg)
![Page 39: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/39.jpg)
Lessons learned (1)
• There are many good Open Source libraries
  • Emysql: we have added transaction support
  • Eep0018: fast JSON encoder/decoder (yajl c++)
  • Estatsd: graphite-capable monitoring
  • Poolboy: Erlang worker pool factory (for memcached)
  • Twig/Lager: logging (syslog)
![Page 40: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/40.jpg)
Lessons learned (2)
• Mnesia is great to replicate state across machines
  • Faster local lookups
  • Less error prone
• Encapsulate all Mnesia usage in a module
  • Adding nodes to Mnesia
  • Use ram_copies
  • Transactions are great
• We deploy an Erlang cluster (with Mnesia replication) only inside a single data center
  • Not across unreliable connections!
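As an illustration of "encapsulate all Mnesia usage in a module", here is a minimal single-node sketch with a `ram_copies` table and transactional access. The module, table and function names are assumptions, and adding further nodes to the cluster is deliberately left out.

```erlang
-module(ssp_state_sketch).
-export([start/0, store/2, fetch/1]).

%% Hypothetical table holding shared key/value state.
-record(ssp_state, {key, value}).

%% Start Mnesia and create the table in RAM on the local node only.
start() ->
    ok = mnesia:start(),
    case mnesia:create_table(ssp_state,
                             [{ram_copies, [node()]},
                              {attributes, record_info(fields, ssp_state)}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, ssp_state}} -> ok
    end.

%% Callers never see Mnesia directly; they only use store/2 and fetch/1.
store(Key, Value) ->
    Write = fun() -> mnesia:write(#ssp_state{key = Key, value = Value}) end,
    {atomic, ok} = mnesia:transaction(Write),
    ok.

fetch(Key) ->
    Read = fun() -> mnesia:read(ssp_state, Key) end,
    {atomic, Rows} = mnesia:transaction(Read),
    [V || #ssp_state{value = V} <- Rows].
```

Keeping the table definition and all transactions behind one module means that a later change, such as a different copy type or extra replicas, touches a single file.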
![Page 41: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/41.jpg)
Lessons learned (3)
• XML + XSD + XSLT are great to define an API
  • They might have a bad name, but they work great
  • Can be transformed into any other format
  • Used to generate documentation
Todo:
• generate more code (Buckets)
• write a gen_bucket behaviour
• don't start with generating code
![Page 42: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/42.jpg)
Lessons learned (4)
• Rebar is great
  • Compilation is pretty convenient, but the best part is the "dependencies"
  • Also the worst part
• We have proposed two improvements:
  • Allow different projects to share dependencies (major speedup for compiling)
  • Smarter version conflict resolution (semantic versioning: [ ">= 1.3.1", "< 2.0.0" ] )
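To make the proposed constraint format concrete, here is a small sketch that checks a version against a list such as `[">= 1.3.1", "< 2.0.0"]`. It is not Rebar code: the module is hypothetical, it only understands the two operators shown on the slide, and it assumes plain numeric `Major.Minor.Patch` versions.

```erlang
-module(vsn_constraint_sketch).
-export([satisfies/2]).

%% True when Vsn (e.g. "1.4.0") meets every constraint in the list.
satisfies(Vsn, Constraints) ->
    V = parse(Vsn),
    lists:all(fun(C) -> check(V, C) end, Constraints).

%% Only the two operators from the slide are handled.
check(V, ">= " ++ Rest) -> V >= parse(Rest);
check(V, "< " ++ Rest)  -> V <  parse(Rest).

%% "1.3.1" -> [1,3,1]; Erlang compares lists element by element,
%% which gives the desired ordering for equal-length versions.
parse(Str) ->
    [list_to_integer(Part) || Part <- string:tokens(Str, ".")].
```

For instance, `"1.10.0"` satisfies both constraints (10 is compared as a number, not as text), while `"2.0.0"` and `"1.3.0"` do not.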
![Page 43: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/43.jpg)
Lessons learned (5)
• We use #records{} for all APIs
  – Piqi input/output
  – Stable and well-defined
  – Will move to Protocol Buffers
• Use OTP applications everywhere
  – Start/stop stuff
  – See started apps: application:which_applications()
• Terminate on fatal errors
  – Memcached down: terminate all buckets, don't try to recover (prevent overloading the DB)
![Page 44: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/44.jpg)
Lessons learned (6)
• You need to add an admin/monitoring interface
![Page 45: Spil Storage Platform (Erlang) @ EUG-NL](https://reader034.vdocuments.us/reader034/viewer/2022051819/54c485874a7959df1c8b456b/html5/thumbnails/45.jpg)
Open Source
We will not open-source SSP, but we do actively contribute to libraries used in SSP (so far Emysql, Rebar, Piqi)