scaling
DESCRIPTION
Scaling: a naïve approach
A look at how to scale an existing monolithic system, and how companies such as Disqus and Eventbrite have done it.
TRANSCRIPT
![Page 2: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/2.jpg)
WHAT’S THIS ABOUT?
People
Technology
Tools
![Page 3: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/3.jpg)
PEOPLE
Care
Focus
Automate & Test.
Shared brain
Finish & DRY.
![Page 4: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/4.jpg)
TECH
Design to clone
Separate pieces
API
Offload everything
Measure
![Page 5: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/5.jpg)
![Page 6: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/6.jpg)
VIRTUAL QUEUE
Queue Instance
Queue Instance
Queue Instance
Queue Instance
![Page 7: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/7.jpg)
VIRTUAL QUEUE
Queue Instance
Queue Instance
Queue Instance
![Page 8: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/8.jpg)
VIRTUAL QUEUE
Queue Instance
Queue Instance
Queue Instance
![Page 9: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/9.jpg)
VIRTUAL QUEUE
Queue Instance
Queue Instance
Queue Instance
Queue Instance
![Page 10: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/10.jpg)
TECH
• Design to clone
• Separate pieces
• API
• Offload everything
• Measure
![Page 11: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/11.jpg)
TYPES OF TASKS
• Realtime
• ASAP (async)
• When you have time (async)
![Page 12: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/12.jpg)
INSTAGRAM’S FEED
• Redis queue per follower.
• New media: push to queues
• Small chained tasks
![Page 13: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/13.jpg)
INSTAGRAM’S FEED
harro wouter orestis siebejan oscar
Schedule next batch
![Page 14: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/14.jpg)
SMALL TASKS
• 10k followers per task
• < 2s
• Finer-grained load balancing
• Lower penalty of failure/reload
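The small chained tasks described above can be sketched roughly as follows. This is a minimal in-memory illustration, not Instagram's code: `feeds` stands in for the per-follower Redis queues, `task_queue` stands in for the Celery broker, and `CHUNK_SIZE`, `fan_out` and `run_worker` are hypothetical names. Each task handles one chunk of followers and then schedules the next batch.

```python
from collections import defaultdict, deque

CHUNK_SIZE = 3  # the slides use 10k followers per task; tiny here for readability

feeds = defaultdict(deque)  # assumption: stand-in for one Redis queue per follower
task_queue = deque()        # assumption: stand-in for the broker's task queue


def fan_out(media_id, followers, start=0):
    """Push media_id to one chunk of followers, then schedule the next chunk."""
    chunk = followers[start:start + CHUNK_SIZE]
    for follower in chunk:
        feeds[follower].appendleft(media_id)
    next_start = start + CHUNK_SIZE
    if next_start < len(followers):
        # Chain: enqueue a small follow-up task instead of looping here.
        task_queue.append((fan_out, (media_id, followers, next_start)))


def run_worker():
    """Drain the queue and return the number of tasks executed."""
    executed = 0
    while task_queue:
        func, args = task_queue.popleft()
        func(*args)
        executed += 1
    return executed


followers = ["harro", "wouter", "orestis", "siebejan", "oscar"]
task_queue.append((fan_out, ("photo-1", followers, 0)))
tasks_run = run_worker()
```

Because every task is short, a failed or restarted worker only loses one chunk of work, and any idle worker can pick up the next chunk.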
![Page 15: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/15.jpg)
CELERY: REDIS
• Good: Fast
• Bad:
• Polling for task distribution
• Messy non-synchronous replication
• Memory limits task capacity
![Page 16: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/16.jpg)
CELERY: BEANSTALK
• Good:
• Fast
• Push to consumers
• Writes to disk
• Bad:
• No replication
• Only useful for Celery
![Page 17: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/17.jpg)
CELERY: RABBITMQ
• Fast
• Writes to disk
• Low-maintenance synchronous replication
• Excellent Celery compatibility
• Supports other use cases
![Page 18: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/18.jpg)
![Page 19: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/19.jpg)
![Page 20: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/20.jpg)
![Page 21: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/21.jpg)
RESERVATIONS
• UI
• Room locking
• Room availability
• Registration manager
• Email, PDF invoice
• Payment
• Login
• …
![Page 22: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/22.jpg)
WE DON’T DO THIS

    def do_everything(request):
        hotel_id = request.GET.hotel_id
        room_number = request.GET.room_number
        # One function holds the room mutex while it checks availability,
        # books the room, creates the payment and commits.
        with room_mutex(hotel_id, room_number):
            room = (session.query(Room)
                    .filter(Room.hotel_id == hotel_id)
                    .filter(Room.room_number == room_number)
                    .one())
            if not room.available:
                return Response("Room not available", template=room_template)
            reservation = Reservation(client=request.client, room=room)
            session.add(reservation)
            room.available = False
            price = ...  # price calculation (elided on the slide)
            payment = Payment(reservation=reservation, price=price)
            session.add(payment)
            session.commit()
            url = payment.get_psp_url()
            return Redirect(url)

![Page 23: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/23.jpg)
BUT WE DO THIS
• Frontend UI
• Locking rooms
• Calculating room availability
• Temporarily locking rooms
• Payment processing
• PDF invoice generation
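As an illustration of treating one of these responsibilities as its own piece, here is a minimal sketch of temporary room locking with an expiry time. Everything in it is an assumption made for the example: the `RoomLocks` class, the `ttl_seconds` parameter and the in-memory dictionary are hypothetical, and a real deployment would need a shared, thread-safe store rather than a local dict.

```python
import time


class RoomLocks:
    """In-memory temporary locks; a sketch, not safe across threads or processes."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._locks = {}  # (hotel_id, room_number) -> (client, expires_at)

    def acquire(self, hotel_id, room_number, client):
        """Return True if client now holds the room, False if someone else does."""
        key = (hotel_id, room_number)
        now = self.clock()
        holder = self._locks.get(key)
        if holder and holder[1] > now and holder[0] != client:
            return False  # another client holds a lock that has not expired
        self._locks[key] = (client, now + self.ttl)
        return True

    def release(self, hotel_id, room_number, client):
        """Drop the lock, but only if this client is the current holder."""
        key = (hotel_id, room_number)
        holder = self._locks.get(key)
        if holder and holder[0] == client:
            del self._locks[key]


locks = RoomLocks(ttl_seconds=600)
got_room = locks.acquire(hotel_id=1, room_number=101, client="alice")
```

The expiry means an abandoned checkout frees the room on its own, so the payment step never has to hold a long-lived mutex.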
![Page 24: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/24.jpg)
BUT WE CAN SCALE!
![Page 25: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/25.jpg)
SCALE DB: HARD
• Slaves
• Master-Master?
• Sharding?
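The first option, read slaves, usually means routing writes to the master and spreading reads over the slaves. A minimal sketch under that assumption follows. `ReplicaRouter` and the connection names are hypothetical, and the sketch ignores replication lag, which a real router has to account for when a read must see a recent write.

```python
import itertools


class ReplicaRouter:
    """Send writes to the master and spread reads across slaves (round robin)."""

    def __init__(self, master, replicas):
        self.master = master
        # itertools.cycle repeats the replica list forever, one item per read.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self.master

    def for_read(self):
        # Fall back to the master when no read slave is configured.
        return next(self._replicas) if self._replicas else self.master


router = ReplicaRouter("db-master", ["db-slave-1", "db-slave-2"])
reads = [router.for_read() for _ in range(3)]
```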
![Page 26: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/26.jpg)
SCALING
![Page 27: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/27.jpg)
MINOR SCALE
![Page 28: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/28.jpg)
MAJOR SCALE
![Page 29: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/29.jpg)
![Page 30: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/30.jpg)
FRONTEND
Everything Frontend
External payment providers
User
Everything Frontend
Master
Read slaves
![Page 31: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/31.jpg)
SPLIT
• Responsibility
• Stateful/stateless
• Type of system
![Page 32: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/32.jpg)
TYPES OF SYSTEMS
• Unique (mutex, datastore)
• Multiple
![Page 33: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/33.jpg)
TYPES OF TASKS
• Realtime
• ASAP (async)
• When you have time (async)
![Page 34: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/34.jpg)
SPLIT THIS
Everything Frontend
External payment providers
User
Everything Frontend
Master
Read slaves
![Page 35: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/35.jpg)
AUTONOMOUS SYSTEMS
Payment
External payment providers
Locking
Invoice PDF
Mailer
UI
Reservations Manager
User
Session Storage
Data warehouse
Reporting
Configuration
Payout
![Page 36: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/36.jpg)
CLONABILITY
![Page 37: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/37.jpg)
CLONABILITY
![Page 38: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/38.jpg)
CLONABILITY
Frontend
![Page 39: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/39.jpg)
CLONABILITY
Everything Frontend
External payment providers
User
Everything Frontend
Master
Read slaves
![Page 40: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/40.jpg)
WHAT’S IN AN EASY STEP
As little change as possible.
Reuse.
Unintrusive.
Measure.
Go in the right direction.
![Page 41: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/41.jpg)
SMALL STEPS
PROBLEMS?
Oversells, Configuration, Reporting, Payout
Everything Frontend ×6
![Page 42: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/42.jpg)
SMALL STEPS
PROBLEMS?
Oversells, Configuration, Reporting, Payout, Sessions
Room Availability
Lock
Read
Everything Frontend ×6
![Page 43: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/43.jpg)
ISOLATED SYSTEM
Best technology
Decoupled
API
Testable
![Page 44: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/44.jpg)
SMALL STEPS
PROBLEMS?
Oversells, Configuration, Reporting, Payout, Sessions
Config Backend
Settings
Everything Frontend ×7
![Page 45: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/45.jpg)
INITIAL SYSTEM
Everything Frontend
![Page 46: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/46.jpg)
INITIAL SYSTEM (MODIFIED)
Everything Frontend Sales
Sync
![Page 47: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/47.jpg)
INITIAL SYSTEM (MODIFIED)
Sales Backend
![Page 48: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/48.jpg)
SMALL STEPS
PROBLEMS?
Oversells, Configuration, Reporting, Payout, Sessions
Sales Backend
Sales
Main DB
Everything Frontend ×7
![Page 49: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/49.jpg)
SMALL STEPS
PROBLEMS?
Oversells, Configuration, Reporting, Payout, Sessions
Session Storage
Everything Frontend ×6
![Page 50: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/50.jpg)
WHEN?
• Difficult.
• Measure everything.
• Find patterns.
• Define thresholds.
• Design: address as risk.
• Don’t overengineer, but don’t ignore.
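"Measure everything" and "define thresholds" can be as simple as timing each operation and flagging the ones whose average crosses a limit. The sketch below assumes just that. The `measure` context manager, the `timings` store and the `over_threshold` helper are hypothetical names for illustration, not part of any monitoring tool named in the talk.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # metric name -> list of durations in seconds


@contextmanager
def measure(name, clock=time.perf_counter):
    """Record how long the wrapped block takes under the given metric name."""
    start = clock()
    try:
        yield
    finally:
        timings[name].append(clock() - start)


def over_threshold(thresholds):
    """Return the metrics whose average duration exceeds their threshold."""
    return sorted(
        name for name, limit in thresholds.items()
        if timings[name] and sum(timings[name]) / len(timings[name]) > limit
    )


with measure("checkout"):
    sum(range(1000))  # placeholder for the real work being timed
```

A periodic job could call `over_threshold({"checkout": 0.3})` and treat a non-empty result as the signal that it is time to take the next scaling step.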
![Page 51: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/51.jpg)
EVENTBRITE
• 2012: $600M ticket sales
• Accumulated: $1B
![Page 52: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/52.jpg)
TECHNOLOGY
• Monitoring: Nagios, Ganglia, Pingdom
• Email: offloaded to StrongMail
• Load-balanced read slave pool
• Feature flags
• Automated server configuration and release with Puppet and Jenkins
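Feature flags let new code ship disabled and be turned on gradually. The following is a minimal sketch of a percentage rollout, assuming users are bucketed by a stable hash. `FeatureFlags`, the `rollout` mapping and the flag names are hypothetical, and nothing here reflects Eventbrite's actual flag system.

```python
import hashlib


class FeatureFlags:
    """Percentage-based flags with a stable per-user bucket."""

    def __init__(self, rollout):
        # rollout maps flag name -> percentage of users (0-100) that see it
        self.rollout = rollout

    def is_enabled(self, flag, user_id):
        percent = self.rollout.get(flag, 0)  # unknown flags are off
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100       # stable bucket in [0, 100)
        return bucket < percent


flags = FeatureFlags({"new_checkout": 100, "beta_search": 0})
checkout_on = flags.is_enabled("new_checkout", user_id=42)
```

Hashing the flag name together with the user id keeps each user's experience stable between requests while giving different flags independent user samples.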
![Page 53: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/53.jpg)
TECHNOLOGY
• Feature flags
• Develop on Vagrant
• Celery + RabbitMQ
• Virtual customer queue
• Big data for reporting, fraud, spam, event recommendations
![Page 54: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/54.jpg)
TECHNOLOGY
• Hadoop
• Cassandra
• HBase
• Hive
• Separated into independent services
![Page 55: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/55.jpg)
TIPS
• Instrument and monitor everything
• Lean
![Page 56: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/56.jpg)
HOW BIG?
• 2 GB/day database transactions
• 3.5 TB/day social data analyzed
• 15 GB/day logs
![Page 57: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/57.jpg)
ORDER PROCESSOR
• Pub/sub queue with Cassandra and Zookeeper
![Page 58: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/58.jpg)
PUBLISHING
Publisher
Get queue lock + last batch id
Create new batch: “process orders 10, 11, 12”
Store batch id, release lock
![Page 59: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/59.jpg)
SUBSCRIBING
Subscriber
Get my latest processed batch id
Store result
Update my latest processed batch id
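The publishing and subscribing steps can be sketched in memory as follows. This is only an illustration of the batch-id bookkeeping: a `threading.Lock` plays the role of the Zookeeper queue lock and plain dictionaries play the role of Cassandra, and the `BatchQueue` class with its `publish` and `consume` methods is a hypothetical name, not Eventbrite's implementation.

```python
import threading


class BatchQueue:
    """In-memory stand-in for a pub/sub queue keyed by increasing batch ids."""

    def __init__(self):
        self._lock = threading.Lock()  # assumption: stands in for the queue lock
        self._last_batch_id = 0
        self._batches = {}             # batch id -> list of order ids
        self._processed = {}           # subscriber -> latest processed batch id

    def publish(self, order_ids):
        with self._lock:                               # get queue lock + last batch id
            batch_id = self._last_batch_id + 1
            self._batches[batch_id] = list(order_ids)  # create new batch
            self._last_batch_id = batch_id             # store batch id
        return batch_id                                # lock released on exit

    def consume(self, subscriber, handler):
        """Run handler on every batch this subscriber has not processed yet."""
        results = []
        last = self._processed.get(subscriber, 0)      # my latest processed batch id
        while last < self._last_batch_id:
            last += 1
            results.append(handler(self._batches[last]))  # store result
            self._processed[subscriber] = last            # update my latest batch id
        return results


queue = BatchQueue()
queue.publish([10, 11, 12])
mailer_results = queue.consume("mailer", len)
```

Each subscriber tracks its own position, so a slow or restarted subscriber resumes from its last processed batch without affecting the others.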
![Page 60: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/60.jpg)
SCALING STORAGE
• Move to NoSQL
• Aggressively move queries to slaves
• Different indexes per slave
• Better hardware
• Optimized tables for large, highly utilized datasets
![Page 61: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/61.jpg)
EMAIL ADDRESSES
• Users have many email addresses.
• Lookup by email, join to users table
![Page 62: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/62.jpg)
FIRST ATTEMPT

    CREATE TABLE `user_emails` (
      `id` int NOT NULL AUTO_INCREMENT,
      `email_address` varchar(255) NOT NULL,
      ... -- other columns about the user
      `user_id` int, -- foreign key to users
      PRIMARY KEY (`id`), -- MySQL requires an AUTO_INCREMENT column to be a key
      KEY (`email_address`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

![Page 63: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/63.jpg)
FIRST ATTEMPT
![Page 64: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/64.jpg)
LOOKUP
![Page 65: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/65.jpg)
CAN IT BE IMPROVED?
![Page 66: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/66.jpg)
INDEX VS PK
• InnoDB: B+trees, O(log n)
• Known user id: index on email not needed.
• Small win on lookup: O(1)
• Big win on not storing the index.
![Page 67: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/67.jpg)
INNODB INDEXES
![Page 68: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/68.jpg)
HASH TABLE
![Page 69: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/69.jpg)
DISQUS
• >165K messages per second
• <10ms latency
• 1.3B unique visitors
• 10B page views
• 500M users in discussions
• 3M communities
• 25M comments
![Page 70: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/70.jpg)
ORIGINAL REALTIME BACKEND
• Python + gevent
• NginxPushStream
• Network IO: great
• CPU: choking at peaks
• <15ms latency
![Page 71: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/71.jpg)
CURRENT REALTIME BACKEND
• Go
• Handles all users
• Normal load: 3200 connections/machine/sec
• <10ms latency
• Only 10%-20% CPU
![Page 72: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/72.jpg)
CURRENT REALTIME BACKEND
Workers
Subscribed to results
Push result to user
NginxPushStream
![Page 73: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/73.jpg)
TESTING
• Test with real traffic
• Measure everything
![Page 74: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/74.jpg)
LESSONS
• Do work once, distribute results.
• Most likely to fail: your code. Don’t reinvent. Keep team small.
• End-to-end ACKs are expensive. Avoid.
• Understand use cases when load testing.
• Tune architecture to scale.
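The first lesson, doing the work once and distributing the result, can be sketched as a small hub that computes a result a single time per message and hands the same result to every subscriber. `ResultHub` and its methods are hypothetical names chosen for this example, and the sketch says nothing about how Disqus actually wires its workers to NginxPushStream.

```python
class ResultHub:
    """Compute each message once, then fan the result out to all subscribers."""

    def __init__(self, compute):
        self.compute = compute
        self.subscribers = {}  # topic -> list of callbacks
        self.compute_calls = 0

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, payload):
        self.compute_calls += 1
        result = self.compute(payload)  # the expensive work, done once
        for callback in self.subscribers.get(topic, []):
            callback(result)            # the same result goes to every subscriber
        return result


received = []
hub = ResultHub(str.upper)
hub.subscribe("thread-1", received.append)
hub.subscribe("thread-1", received.append)
hub.publish("thread-1", "new comment")
```

The cost of `compute` stays constant per message no matter how many users are watching the same thread; only the cheap delivery step grows with the audience.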
![Page 75: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/75.jpg)
LEARN MORE
• Instagram
• Braintree
• highscalability.com
• VelocityConf (YouTube; Nov 2014 in Barcelona?)
![Page 76: Scaling](https://reader035.vdocuments.us/reader035/viewer/2022062613/54561799b1af9f33608b4965/html5/thumbnails/76.jpg)
QUESTIONS? ANSWERS?
THANKS!
Òscar Vilaplana @grimborg http://oscarvilaplana.cat