scaling

Post on 01-Nov-2014

DESCRIPTION

Scaling: a naïve approach. A look at how to scale an existing monolithic system, and how companies such as Disqus and Eventbrite have done it.

TRANSCRIPT

SCALING

Òscar Vilaplana @grimborg http://oscarvilaplana.cat

WHAT’S THIS ABOUT?

People

Technology

Tools

PEOPLE

Care

Focus

Automate & Test.

Shared brain

Finish & DRY.

TECH

Design to clone

Separate pieces

API

Offload everything

Measure

VIRTUAL QUEUE

Queue Instance

Queue Instance

Queue Instance

Queue Instance
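One possible reading of the virtual queue diagram, sketched in Python: a single logical queue backed by several interchangeable queue instances that can be added or retired without callers noticing. The class name `VirtualQueue`, its methods, and the round-robin write policy are all assumptions made for illustration; the slides do not describe an implementation.

```python
from collections import deque
from itertools import count


class VirtualQueue:
    """One logical queue backed by several interchangeable instances (hypothetical)."""

    def __init__(self, instances=4):
        self._instances = [deque() for _ in range(instances)]
        self._next = count()  # round-robin counter for writes

    def put(self, item):
        # Spread writes across whichever instances currently exist.
        index = next(self._next) % len(self._instances)
        self._instances[index].append(item)

    def get(self):
        # Read from the first instance that has work; None if all are empty.
        for instance in self._instances:
            if instance:
                return instance.popleft()
        return None

    def add_instance(self):
        # Cloning: a new, empty instance joins the pool.
        self._instances.append(deque())

    def remove_instance(self, index):
        # Retire one instance and hand its items to the survivors.
        if len(self._instances) <= 1:
            raise ValueError("cannot remove the last queue instance")
        retired = self._instances.pop(index)
        for item in retired:
            self.put(item)

    def __len__(self):
        return sum(len(instance) for instance in self._instances)
```

Ordering across instances is not strictly first-in, first-out in this sketch; that is the usual price of spreading one queue over several clones.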

TECH

• Design to clone

• Separate pieces

• API

• Offload everything

• Measure

TYPES OF TASKS

• Realtime

• ASAP (async)

• When you have time (async)

INSTAGRAM’S FEED

• Redis queue per follower.

• New media: push to queues

• Small chained tasks

INSTAGRAM’S FEED

Diagram: followers harro, wouter, orestis, siebejan, oscar; each task schedules the next batch.

SMALL TASKS

• 10k followers per task

• < 2s

• Finer-grained load balancing

• Lower penalty of failure/reload
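The small chained tasks idea can be sketched roughly as follows. Each task pushes new media to one batch of followers and then schedules the next batch instead of looping over everyone. The in-memory `feeds` dict and `task_queue` deque are stand-ins for the per-follower Redis queues and the real task broker, and the names `fan_out` and `run_worker` are hypothetical.

```python
from collections import defaultdict, deque

BATCH_SIZE = 10_000  # followers handled per task, as on the slide

feeds = defaultdict(list)  # stand-in for one Redis queue per follower
task_queue = deque()       # stand-in for the real task broker


def fan_out(media_id, followers, start=0):
    """Push media_id to one batch of followers, then schedule the next batch."""
    batch = followers[start:start + BATCH_SIZE]
    for follower_id in batch:
        feeds[follower_id].append(media_id)
    next_start = start + BATCH_SIZE
    if next_start < len(followers):
        # Chain: enqueue the next small task instead of continuing here.
        task_queue.append((fan_out, (media_id, followers, next_start)))


def run_worker():
    """Drain the queue and return how many tasks were executed."""
    executed = 0
    while task_queue:
        task, args = task_queue.popleft()
        task(*args)
        executed += 1
    return executed
```

If a worker dies mid-batch, only that one batch has to be retried, which is the lower penalty of failure the slide refers to.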

CELERY: REDIS

• Good: fast

• Bad: polling for task distribution; messy non-synchronous replication; memory limits task capacity

CELERY: BEANSTALK

• Good: fast; push to consumers; writes to disk

• Bad: no replication; only useful for Celery

CELERY: RABBITMQ

• Fast

• Writes to disk

• Low-maintenance synchronous replication

• Excellent Celery compatibility

• Supports other use cases

RESERVATIONS

• UI

• Room locking

• Room availability

• Registration manager

• Email, PDF invoice

• Payment

• Login

• …

WE DON’T DO THIS

def do_everything(request):
    # One view does locking, availability, reservation, pricing and payment.
    hotel_id = request.GET.hotel_id
    room_number = request.GET.room_number
    with room_mutex(hotel_id, room_number):
        room = (session.query(Room)
                .filter(Room.hotel_id == hotel_id)
                .filter(Room.room_number == room_number)
                .one())
        if not room.available:
            return Response("Room not available", template=room_template)
        reservation = Reservation(client=request.client, room=room)
        session.add(reservation)
        room.available = False
        price = ...  # price calculation (elided on the slide)
        payment = Payment(reservation=reservation, price=price)
        session.add(payment)
        session.commit()
    url = payment.get_psp_url()
    return Redirect(url)

BUT WE DO THIS

• Frontend UI

• Locking rooms

• Calculating room availability

• Temporarily locking rooms

• Payment processing

• Mail

• PDF invoice generation
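As an illustration of treating temporary room locking as its own piece, here is a minimal sketch of a lock that expires by itself if the holder never completes the reservation. `RoomLocker`, its method names, and the default timeout are assumptions, not something the slides specify, and this in-memory version only works inside a single process.

```python
import time


class RoomLocker:
    """Temporary room locks that expire on their own (hypothetical sketch)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # (hotel_id, room_number) -> (holder, expires_at)

    def acquire(self, hotel_id, room_number, holder, ttl_seconds=600):
        key = (hotel_id, room_number)
        now = self._clock()
        current = self._locks.get(key)
        if current is not None:
            current_holder, expires_at = current
            if expires_at > now and current_holder != holder:
                return False  # someone else still holds the room
        # Free, expired, or re-acquired by the same holder: (re)start the timer.
        self._locks[key] = (holder, now + ttl_seconds)
        return True

    def release(self, hotel_id, room_number, holder):
        key = (hotel_id, room_number)
        current = self._locks.get(key)
        if current is not None and current[0] == holder:
            del self._locks[key]
            return True
        return False
```

A real locking service would need an atomic shared store behind the same small interface; the point here is only that the rest of the system talks to locking through an API rather than through a mutex buried in a view.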

BUT WE CAN SCALE!

SCALE DB: HARD

• Slaves

• Master-Master?

• Sharding?

SCALING

MINOR SCALE

MAJOR SCALE

FRONTEND

Everything Frontend

External payment providers

User

Everything Frontend

Master

Read slaves

SPLIT

• Responsibility

• Stateful/stateless

• Type of system

TYPES OF SYSTEMS

• Unique (mutex, datastore)

• Multiple

TYPES OF TASKS

• Realtime

• ASAP

• When you have time } Async!

SPLIT THIS

Everything Frontend

External payment providers

User

Everything Frontend

Master

Read slaves

AUTONOMOUS SYSTEMS

Payment

External payment providers

Locking

Invoice PDF

Mailer

UI

Reservations Manager

User

Session Storage

Data warehouse reporting

Configuration

Payout

CLONABILITY

Frontend

CLONABILITY

Everything Frontend

External payment providers

User

Everything Frontend

Master

Read slaves

WHAT’S IN AN EASY STEP

As little change as possible.

Reuse.

Unintrusive.

Measure.

Go in the right direction.

SMALL STEPS

PROBLEMS?

Oversells, Configuration, Reporting, Payout

Diagram: six cloned Everything Frontend instances.

SMALL STEPS

PROBLEMS?

Oversells, Configuration, Reporting, Payout, Sessions

Diagram: a Room Availability service (Lock, Read) alongside the cloned Everything Frontend instances.

ISOLATED SYSTEM

Best technology

Decoupled

API

Testable

SMALL STEPS

PROBLEMS?

Oversells, Configuration, Reporting, Payout, Sessions

Diagram: a Config Backend (Settings) alongside the cloned Everything Frontend instances.

INITIAL SYSTEM

Everything Frontend

INITIAL SYSTEM (MODIFIED)

Everything Frontend Sales

Sync

INITIAL SYSTEM (MODIFIED)

Sales Backend

SMALL STEPS

PROBLEMS?

Oversells, Configuration, Reporting, Payout, Sessions

Diagram: a Sales Backend (Sales, Main DB) alongside the cloned Everything Frontend instances.

SMALL STEPS

PROBLEMS?

Oversells, Configuration, Reporting, Payout, Sessions

Diagram: Session Storage alongside the cloned Everything Frontend instances.

WHEN?

• Difficult.

• Measure everything.

• Find patterns.

• Define thresholds.

• Design: address as risk.

• Don’t overengineer, but don’t ignore.
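One way to turn "measure everything" and "define thresholds" into something concrete is a rolling window over a metric plus a limit that signals when the next scaling step is due. The sketch below is an assumption-laden illustration: `ThresholdMonitor`, the window size, and the simplified percentile calculation are not from the talk.

```python
from collections import deque


class ThresholdMonitor:
    """Rolling-window percentile check against a fixed threshold (hypothetical)."""

    def __init__(self, threshold, window=100, percentile=0.95):
        self.threshold = threshold
        self.percentile = percentile
        self._samples = deque(maxlen=window)  # oldest samples fall off automatically

    def record(self, value):
        self._samples.append(value)

    def current(self):
        # Simplified percentile: pick the sample at the percentile position.
        if not self._samples:
            return None
        ordered = sorted(self._samples)
        index = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[index]

    def breached(self):
        value = self.current()
        return value is not None and value > self.threshold
```

For example, recording request latencies in milliseconds and checking `breached()` periodically gives an early warning before users notice the slowdown.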

EVENTBRITE

• 2012: $600M ticket sales

• Accumulated: $1B

TECHNOLOGY

• Monitoring: Nagios, Ganglia, Pingdom

• Email: offloaded to StrongMail

• Load-balanced read slave pool

• Feature flags

• Automated server configuration and release with Puppet and Jenkins
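The slides do not say how Eventbrite's feature flags work. As a generic illustration of the technique, the sketch below enables a flag for a stable percentage of users by hashing the flag name and user id; the function name `is_enabled` and the percentage-rollout policy are assumptions.

```python
import hashlib


def is_enabled(flag_name, user_id, rollout_percent):
    """Deterministically enable a flag for roughly rollout_percent of users."""
    if rollout_percent <= 0:
        return False
    if rollout_percent >= 100:
        return True
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99 for this flag and user
    return bucket < rollout_percent
```

Because the bucket depends only on the flag name and the user id, a given user keeps seeing the same behaviour as the rollout percentage grows.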

TECHNOLOGY

• Feature flags

• Develop on Vagrant

• Celery + RabbitMQ

• Virtual customer queue

• Big data for reporting, fraud, spam, event recommendations

TECHNOLOGY

• Hadoop

• Cassandra

• HBase

• Hive

• Separated into independent services

TIPS

• Instrument and monitor everything

• Lean

HOW BIG?

• 2Gb/day database transactions

• 3.5Tb/day social data analyzed

• 15Gb/day logs

ORDER PROCESSOR

• Pub/sub queue with Cassandra and Zookeeper

PUBLISHING

Publisher

Get queue lock+last batch id

Create new batch: “process orders 10, 11, 12”

Store batch id, release lock

SUBSCRIBING

Subscriber

Get my latest processed batch id

Store result

Update my latest processed batch id
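The publishing and subscribing steps above can be sketched in a few lines. This version keeps everything in memory: a `threading.Lock` plays the role of the queue lock and plain dicts play the role of the batch store. Which of Cassandra and Zookeeper holds which piece is not stated on the slides, and `BatchQueue`, `publish`, and `consume` are hypothetical names.

```python
import threading


class BatchQueue:
    """In-memory stand-in for a batch-based pub/sub queue (hypothetical sketch)."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of the queue lock
        self._last_batch_id = 0
        self._batches = {}   # batch id -> list of order ids
        self._cursors = {}   # subscriber name -> latest processed batch id
        self.results = {}    # (subscriber, batch id) -> result

    def publish(self, order_ids):
        # Get queue lock + last batch id, create the batch, store the id, release.
        with self._lock:
            batch_id = self._last_batch_id + 1
            self._batches[batch_id] = list(order_ids)
            self._last_batch_id = batch_id
        return batch_id

    def consume(self, subscriber, handler):
        """Process every batch this subscriber has not handled yet."""
        processed = []
        cursor = self._cursors.get(subscriber, 0)  # my latest processed batch id
        while cursor < self._last_batch_id:
            batch_id = cursor + 1
            result = handler(self._batches[batch_id])
            self.results[(subscriber, batch_id)] = result  # store result
            self._cursors[subscriber] = batch_id           # then advance the cursor
            cursor = batch_id
            processed.append(batch_id)
        return processed
```

Each subscriber tracks its own position, so a slow or restarted subscriber simply resumes from its last processed batch without affecting the others.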

SCALING STORAGE

• Move to NoSQL

• Aggressively move queries to slaves

• Different indexes per slave

• Better hardware

• Most optimal tables for large and highly-utilized datasets
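Moving queries to slaves usually comes down to a routing decision per statement. A minimal sketch, assuming plain SQL strings and opaque connection handles: `ReplicaRouter` and its `pick` method are hypothetical, and a real router would also have to account for replication lag.

```python
import itertools


class ReplicaRouter:
    """Send SELECTs to read slaves and everything else to the master (sketch)."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the read slaves; None when there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, sql, in_transaction=False):
        statement = sql.lstrip().lower()
        is_read = statement.startswith("select")
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        # Writes, DDL, and anything inside a transaction stay on the master.
        return self.master
```

Reads inside a transaction stay on the master in this sketch so that code can read back its own writes.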

EMAIL ADDRESSES

• Users have many email addresses.

• Lookup by email, join to users table

FIRST ATTEMPT

CREATE TABLE `user_emails` (
  `id` int NOT NULL AUTO_INCREMENT,
  `email_address` varchar(255) NOT NULL,
  ... -- other columns about the user
  `user_id` int, -- foreign key to users
  KEY (`email_address`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

FIRST ATTEMPT

LOOKUP

CAN IT BE IMPROVED?

INDEX VS PK

• InnoDB: B+trees, O(log n)

• Known user id: index on email not needed.

• Small win on lookup: O(1)

• Big win on not storing the index.

INNODB INDEXES

HASH TABLE

DISQUS

• >165K messages per second

• <10ms latency

• 1.3B unique visitors

• 10B page views

• 500M users in discussions

• 3M communities

• 25M comments

ORIGINAL REALTIME BACKEND

• Python + gevent

• NginxPushStream

• Network IO: great

• CPU: choking at peaks

• <15ms latency

CURRENT REALTIME BACKEND

• Go

• Handles all users

• Normal load: 3200 connections/machine/sec

• <10ms latency

• Only 10%-20% CPU

Workers

CURRENT REALTIME BACKEND

Subscribed to results

Push result to user

NginxPushStream

TESTING

• Test with real traffic

• Measure everything

LESSONS

• Do work once, distribute results.

• Most likely to fail: your code. Don’t reinvent. Keep team small.

• End-to-end ACKs are expensive. Avoid.

• Understand use cases when load testing.

• Tune architecture to scale.
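The first lesson, doing the work once and distributing the result, can be illustrated with a tiny fan-out helper. `ResultFanout` is a hypothetical name and this is not Disqus's code; it only shows the shape of the idea, including the absence of per-subscriber acknowledgements.

```python
class ResultFanout:
    """Compute each result once, then hand it to every subscriber (sketch)."""

    def __init__(self, compute):
        self._compute = compute
        self._subscribers = []
        self.compute_calls = 0  # how many times the expensive work actually ran

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        result = self._compute(message)  # the expensive work happens once
        self.compute_calls += 1
        for callback in self._subscribers:
            callback(result)  # fire and forget: no end-to-end ACK is awaited
        return result
```

With many subscribers per message, the cost of `compute` stays constant while only the cheap delivery step grows.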

LEARN MORE

• Instagram

• Braintree

• highscalability.com

• VelocityConf (YouTube; Nov 2014 in Barcelona?)

QUESTIONS? ANSWERS?

THANKS!

Òscar Vilaplana @grimborg http://oscarvilaplana.cat
