engineering scalable social games

Post on 23-Feb-2016

23 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

DESCRIPTION

Engineering Scalable Social Games. Dr . Robert Zubek. Top social game developer. Growth Example Roller Coaster Kingdom . 1M DAUs over a weekend. source: developeranalytics.com. Growth Example FishVille. 6 M DAUs in its first week. source: developeranalytics.com. - PowerPoint PPT Presentation

TRANSCRIPT

Engineering ScalableSocial Games

Dr. Robert Zubek

Top social game developer

Growth ExampleRoller Coaster Kingdom

1M DAUs over a weekend

source: developeranalytics.com

Growth ExampleFishVille

6M DAUs in its first week

source: developeranalytics.com

Growth ExampleFarmVille

25M DAUs over five months

source: developeranalytics.com

Talk Overview

Part I. Game Architectures

Part II. Scaling Solutions

Introduce game developers to bestpractices for large-scale web development

• server approaches• client approaches

Part I. Game Architectures

• Game logic in MMO server (eg. Java)

• Web stack for everything else

Server Side

• Usually based on a LAMP stack

• Game logic in PHP• HTTP communication

Mixed stackWeb stack

Example 1FishVille

Web Servers + CDNCache and

Queue Servers

DB Servers

Example 2 YoVille

Web Servers + CDN

MMO Servers

Cache and Queue Servers

DB Servers

• HTTP scales very well• Stateless request/response• Easy to load-balance across servers• Easy to add more servers

Some limitations• Server-initiated actions• Storing state between requests

Why web stack?

caches

exampleWeb stack and state

Web #1

DB Servers

Web #2Web #3 Client

Web #4

xxx

MMO servers

• Minimally “MMO”:• Persistent socket connection per client• Live game support: chat, live events• Supports data push• Keeps game state in memory

• Very different from web!

Why MMO servers?

• Harder to scale out!• But on the plus side:

1. Supports live communication and events2. Game logic can run on the server,

independent of client’s actions3. Game state can be entirely in RAM

Example Web + MMO stack

Web Servers

MMO Servers

Client

caches

DB Servers

• The game is “just” a web page

• Minimal sys reqs• Limited graphics and

animation

Client Side

• High production quality games

• Game logic on client• Can keep open

socket

HTML + AJAXFlash

Social Network Integration

• “Easy” because not related to scaling

• Calling the host network to get a list of friends, post something, etc.

• Networks provide REST APIs, and sometimes client libraries

Architectures

Data

Server

Client

database, cache, etc.

Web server stack

HTML Flash

Web + MMO

Flash

Part I. Game Architectures

Part II. Scaling Solutions

Introduce game developers to bestpractices for large-scale web development

not blowing upas you grow

ScalingScale up vs. scale out

I need more boxes

I need a bigger box

ScalingScale up vs. scale out

Scale out has been a clear win• At some point you can’t

get a box big enough, quickly enough

• Much easier to add more boxes

• But: requires architectural support from the app!

Example:Scaling up vs. out

• 500,000 daily players in first week• Started with just one DB, which quickly

became a bottleneck• Short term: scaled up, optimized DB accesses,

fixed slow queries, caching, etc.• Long term: must scale out with the size

of the player base!

ScalingWe’ll look at four elements:

Databases

Game servers (web + MMO)

Caching

Capacity planning

II A. Databases

App Servers

App Servers

Client

caches

queues

DBDB

DBs are the first to fall over

App

W R

App

W R

M S S

App

M S S

App

M

App

M

M

App

MM0

App

M1

How to partition data?

• Two most common ways:• Vertical – by table

• Easy but doesn’t scale with DAUs• Horizontal – by row

• Harder to do, but gives best results!• Different rows live on different DBs• Need a good mapping from row # to DB

id Data

100 foo

101 bar

102 baz

103 xyzzy

Row stripingM0 M1• Row-to-shard mapping:

• Primary key modulo # of DBs• Like a “logical RAID 0”• More clever schemes exist, of course

S0 S1M2 M3

x xScaling out your shards

M0

ServersServersServers

M1

Sharding in a nutshell

• It’s just data striping across databases (“logical RAID 0”)• There’s no automatic replication, etc.• No magic about how it works• That’s also what makes it robust,

easy to set up and maintain!

Example: Sharding

• Biggest challenge: designing how to partition huge existing dataset

• Partitioned both horizontally and vertically• Lots of joins had to be broken • Data patterns needed redesigned

• With sharding, you need to know the shard ID for each query!

• Data replication had difficulty catching up with high-volume usage

Sharding surprises

• Can’t do joins across shards• Instead, do multiple selects,

or denormalize your data

Be careful with joins

Sharding surprises

• CPU-expensive; easier to pay this cost in the application layer (just be careful)

• The more you keep in RAM, the less you’ll need these

Skip transactions and foreign key constraints

II B. Caching

• DBs can be too slow• Spend memory to buy speed!

caches

queues

caches

queues

App Servers

App Servers

ClientDB

Caching

• Two main uses:• Speed up DB access to commonly

used data• Shared game state storage for stateless

web game servers

• Memcached: network-attached RAM cache

Memcache

• Stores simple key-value pairs• eg. “uid_123” => {name:”Robert”, …}

• What to put there?• Structured game data• Mutexes for actions across servers

• Caveat• It’s an LRU cache, not a DB!

No persistence!

Memcache scaling out

DBs MCs App

Shard it just like the DB!

II C. Game Servers

Web Servers

MMO Servers

Client

caches

queues

DBWeb

Servers

MMO Servers

LB

LB

DNSLB

Scaling web servers

LB

Scaling content delivery

CDN

game dataonly

mediafiles

mediafiles

Web Servers Client

Scaling MMO servers

• Scaling out is easiest when servers don’t need to talk to each other• Databases: sharding• Memcache: sharding• Web: load-balanced farms• MMO: ???

• Our MMO server approach: shard just like other servers

Scaling MMO servers

• Keep all servers separate from each other• Minimize inter-server communication• All participants in a live event should be on

the same server• Scaling out means no direct sharing

• Sharing through third parties is okay

Problem #1:Load balancing • What if a server gets overloaded?

• Can’t move a live player to a different server

• Poor man’s load-balancing:• Players don’t pick servers; the LB does• If a server gets hot, remove it from the

LB pool• If enough servers get hot, add more server

instances and send new connections there

Problem #2: Deployment

Downtime = lost revenues• You can measure cost of

downtime in $/minute!

Problem #2: Deployment

Deployment comparison:• Web: just copy over PHP files• Socket servers – much harder

• How to deploy with zero downtime?

• Ideally: set up shadow new server and slowly transition players over

II D. Capacity planning

Capacity

We believein scaling out

• Demand could change rapidly• How to provision enough servers?

Different logistics thanscaling up

Capacity:colo vs. cloud

• Low or zero fixed cost• Higher variable cost• Little or no lead time on

additional capacity• Less control over ops• Limited choice of VMs• Virtualized storage• Can’t scale up! Only out.

CloudColo

• Higher fixed cost• Lower variable cost• Significant lead time on

additional capacity• More control over ops• Any chips you want• Any disks you want• Scale up or out

Operations

• Scaling out: a legion of servers

• You’ll need tools• App health – custom dashboard• Server monitoring – Munin• Automatic alerts – Nagios

Ops: Dashboard

Ops: Munin

Server stats from the entire farm

Ops: Nagios

• SMS alerts for your server stats• Connections drop to zero• Or CPU load exceeds 4• Or test account can’t connect, etc.

• Put alerts on business stats, too!• DAUs dropping below daily avg, etc.

Server failures

• Common in the cloud• Dropping off the net• Sometimes even shutting down

• Be defensive• Ops should be prepared• Devs should program defensively

The EndMany thanks to those who taught me a lot:

Virtual Worlds: Scott Dale, Justin WaldronCoaster: Biren Gandhi, Nathan Brown, Tatung Mei

http://robert.zubek.net/gdc2010

top related