scaling
Post on 01-Nov-2014
WHAT’S THIS ABOUT?
People
Technology
Tools
PEOPLE
• Care
• Focus
• Automate & Test.
• Shared brain
• Finish & DRY.
TECH
• Design to clone
• Separate pieces
• API
• Offload everything
• Measure
VIRTUAL QUEUE
[Diagram: four virtual queues, each backed by three or four queue instances]
TECH
• Design to clone
• Separate pieces
• API
• Offload everything
• Measure
TYPES OF TASKS
• Realtime
• ASAP
• When you have time
ASAP and when-you-have-time tasks: async!
INSTAGRAM’S FEED
• Redis queue per follower.
• New media: push to queues
• Small chained tasks
INSTAGRAM’S FEED
[Diagram: followers harro, wouter, orestis, siebejan, oscar; "Schedule next batch"]
SMALL TASKS
• 10k followers per task
• < 2s
• Finer-grained load balancing
• Lower penalty of failure/reload
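A minimal sketch of the small chained tasks described above. It assumes an in-memory deque in place of the Celery broker and a dict of lists in place of the per-follower Redis queues; `fan_out`, `run_worker`, and `BATCH_SIZE` are hypothetical names for illustration, not Instagram's code.

```python
from collections import defaultdict, deque

BATCH_SIZE = 3  # the slide uses 10k followers per task; tiny here for the demo

feeds = defaultdict(list)  # stand-in for one Redis queue per follower
task_queue = deque()       # stand-in for the task broker


def fan_out(media_id, followers, offset=0):
    """Push media_id to one batch of follower feeds, then schedule the next batch."""
    batch = followers[offset:offset + BATCH_SIZE]
    for follower in batch:
        feeds[follower].append(media_id)
    next_offset = offset + BATCH_SIZE
    if next_offset < len(followers):
        # Chain: enqueue a new small task instead of looping inside this one.
        task_queue.append((fan_out, (media_id, followers, next_offset)))


def run_worker():
    """Drain the queue and return how many small tasks were executed."""
    executed = 0
    while task_queue:
        func, args = task_queue.popleft()
        func(*args)
        executed += 1
    return executed


followers = ["harro", "wouter", "orestis", "siebejan", "oscar"]
task_queue.append((fan_out, ("media-1", followers, 0)))
tasks_run = run_worker()
```

Because each task only touches one batch, a failed or reloaded worker loses at most one batch of work, and any free worker can pick up the next link of the chain.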
CELERY: REDIS
• Good: Fast
• Bad:
  • Polling for task distribution
  • Messy non-synchronous replication
  • Memory limits task capacity
CELERY: BEANSTALK
• Good:
  • Fast
  • Push to consumers
  • Writes to disk
• Bad:
  • No replication
  • Only useful for Celery
CELERY: RABBITMQ
• Fast
• Writes to disk
• Low-maintenance synchronous replication
• Excellent Celery compatibility
• Supports other use cases
RESERVATIONS
• UI
• Room locking
• Room availability
• Registration manager
• Email, PDF invoice
• Payment
• Login
• …
WE DON’T DO THIS
def do_everything(request):
    hotel_id = request.GET.hotel_id
    room_number = request.GET.room_number
    # Everything below happens in one request: locking, availability,
    # reservation, pricing and payment.
    with room_mutex(hotel_id, room_number):
        room = (session.query(Room)
                .filter(Room.hotel_id == hotel_id)
                .filter(Room.room_number == room_number)
                .one())
        if not room.available:
            return Response("Room not available", template=room_template)
        reservation = Reservation(client=request.client, room=room)
        session.add(reservation)
        room.available = False
        price = ...  # price calculation (elided on the slide)
        payment = Payment(reservation=reservation, price=price)
        session.add(payment)
        session.commit()
    url = payment.get_psp_url()
    return Redirect(url)
BUT WE DO THIS
• Frontend UI
• Locking rooms
• Calculating room availability
• Temporarily locking rooms
• Payment processing
• PDF invoice generation
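One of the pieces listed above, temporarily locking rooms, can be sketched in isolation. This is a minimal, single-process illustration that assumes an in-memory dict and an injectable clock; the class `TemporaryRoomLocks` and its methods are hypothetical, and a real deployment would need a shared, thread-safe store.

```python
import time


class TemporaryRoomLocks:
    """In-memory stand-in for a service that holds rooms for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._locks = {}  # (hotel_id, room_number) -> (holder, expiry time)

    def acquire(self, hotel_id, room_number, holder):
        """Return True if holder now owns the lock, False if someone else does."""
        key = (hotel_id, room_number)
        now = self.clock()
        current = self._locks.get(key)
        if current is not None:
            current_holder, expires_at = current
            if expires_at > now and current_holder != holder:
                return False
        # Free, expired, or re-acquired by the same holder: (re)start the timer.
        self._locks[key] = (holder, now + self.ttl_seconds)
        return True

    def release(self, hotel_id, room_number, holder):
        """Release the lock if holder still owns it."""
        key = (hotel_id, room_number)
        current = self._locks.get(key)
        if current is not None and current[0] == holder:
            del self._locks[key]


fake_now = [0.0]  # controllable clock for the demo
locks = TemporaryRoomLocks(ttl_seconds=600, clock=lambda: fake_now[0])
first = locks.acquire("hotel-1", 101, "alice")
second = locks.acquire("hotel-1", 101, "bob")   # alice still holds the room
fake_now[0] = 601.0                              # alice never paid; lock expired
third = locks.acquire("hotel-1", 101, "bob")
```

The expiry means an abandoned checkout frees the room without any cleanup job having to run.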
BUT WE CAN SCALE!
SCALE DB: HARD
• Slaves
• Master-Master?
• Sharding?
SCALING
MINOR SCALE
MAJOR SCALE
FRONTEND
[Diagram: User, two Everything Frontend instances, Master database with read slaves, external payment providers]
SPLIT
• Responsibility
• Stateful/stateless
• Type of system
TYPES OF SYSTEMS
• Unique (mutex, datastore)
• Multiple
TYPES OF TASKS
• Realtime
• ASAP
• When you have time
ASAP and when-you-have-time tasks: async!
SPLIT THIS
[Diagram: User, two Everything Frontend instances, Master database with read slaves, external payment providers]
AUTONOMOUS SYSTEMS
• Payment
• External payment providers
• Locking
• Invoice PDF
• Mailer
• UI
• Reservations Manager
• User
• Session Storage
• Data warehouse
• Reporting
• Configuration
• Payout
CLONABILITY
[Diagram: Frontend; User, two Everything Frontend instances, Master database with read slaves, external payment providers]
WHAT’S IN AN EASY STEP?
• As little change as possible.
• Reuse.
• Unintrusive.
• Measure.
• Go in the right direction.
SMALL STEPS
PROBLEMS?!
• Oversells
• Configuration
• Reporting
• Payout
[Diagram: several Everything Frontend instances]
SMALL STEPS
PROBLEMS?!
• Oversells
• Configuration
• Reporting
• Payout
• Sessions
[Diagram: Room Availability, with Lock and Read connections to several Everything Frontend instances]
ISOLATED SYSTEM
• Best technology
• Decoupled
• API
• Testable
SMALL STEPS
PROBLEMS?!
• Oversells
• Configuration
• Reporting
• Payout
• Sessions
[Diagram: Config Backend (Settings) next to several Everything Frontend instances]
INITIAL SYSTEM
[Diagram: Everything Frontend]
INITIAL SYSTEM (MODIFIED)
[Diagram: Everything Frontend, Sales, Sync]
INITIAL SYSTEM (MODIFIED)
[Diagram: Sales Backend]
SMALL STEPS
PROBLEMS?!
• Oversells
• Configuration
• Reporting
• Payout
• Sessions
[Diagram: Sales Backend (Sales, Main DB) next to several Everything Frontend instances]
SMALL STEPS
PROBLEMS?!
• Oversells
• Configuration
• Reporting
• Payout
• Sessions
[Diagram: Session Storage next to several Everything Frontend instances]
WHEN?
• Difficult.
• Measure everything.
• Find patterns.
• Define thresholds.
• Design: address as risk.
• Don’t overengineer, but don’t ignore.
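"Measure everything" and "define thresholds" can be sketched as a rolling-window check. This is an illustrative example only: the class `ThresholdMonitor`, the 200 ms budget, and the window size are assumptions, not values from the talk.

```python
from collections import deque
from statistics import mean


class ThresholdMonitor:
    """Keep a rolling window of samples and flag when the average crosses a threshold."""

    def __init__(self, threshold, window=5):
        self.threshold = threshold
        self.samples = deque(maxlen=window)

    def record(self, value):
        self.samples.append(value)

    def breached(self):
        # Only judge once the window is full, to avoid reacting to a single spike.
        full = len(self.samples) == self.samples.maxlen
        return full and mean(self.samples) > self.threshold


monitor = ThresholdMonitor(threshold=200.0, window=3)  # hypothetical latency budget in ms
for latency_ms in (120, 150, 180):
    monitor.record(latency_ms)
calm = monitor.breached()      # window average is 150 ms
for latency_ms in (260, 300):
    monitor.record(latency_ms)
alarmed = monitor.breached()   # window is now 180, 260, 300
```

A breached threshold is the signal to take the next small step, rather than guessing in advance which part will need to scale.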
EVENTBRITE
• 2012: $600M ticket sales
• Accumulated: $1B
TECHNOLOGY
• Monitoring: Nagios, Ganglia, Pingdom
• Email: offloaded to StrongMail
• Load-balanced read slave pool
• Feature flags
• Automated server configuration and release with Puppet and Jenkins
TECHNOLOGY
• Feature flags
• Develop on Vagrant
• Celery + RabbitMQ
• Virtual customer queue
• Big data for reporting, fraud, spam, event recommendations
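The slides only name feature flags; as an illustration of the idea, here is a minimal percentage-rollout sketch. The class `FeatureFlags`, its methods, and the hashing scheme are hypothetical and are not Eventbrite's implementation.

```python
import hashlib


class FeatureFlags:
    """Percentage-based feature flags with a stable per-user bucket."""

    def __init__(self):
        self._rollout = {}  # flag name -> percentage of users (0-100)

    def set_rollout(self, flag, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        percent = self._rollout.get(flag, 0)  # unknown flags are off
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100        # stable bucket in 0..99
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new-checkout", 100)  # hypothetical flag name
flags.set_rollout("dark-launch", 0)
everyone = all(flags.is_enabled("new-checkout", uid) for uid in range(50))
nobody = any(flags.is_enabled("dark-launch", uid) for uid in range(50))
```

Hashing the flag name together with the user id keeps each user's experience stable between requests while letting different flags pick different subsets of users.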
TECHNOLOGY
• Hadoop
• Cassandra
• HBase
• Hive
• Separated into independent services
TIPS
• Instrument and monitor everything
• Lean
HOW BIG?
• 2 GB/day database transactions
• 3.5 TB/day social data analyzed
• 15 GB/day logs
ORDER PROCESSOR
• Pub/sub queue with Cassandra and Zookeeper
PUBLISHING
Publisher:
• Get queue lock + last batch id
• Create new batch: “process orders 10, 11, 12”
• Store batch id, release lock
SUBSCRIBING
Subscriber:
• Get my latest processed batch id
• Store result
• Update my latest processed batch id
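The publishing and subscribing steps above can be sketched in a single process. This is an illustration under loud assumptions: a dict stands in for Cassandra, a `threading.Lock` stands in for the Zookeeper queue lock, and `BatchQueue`, `publish`, and `consume` are hypothetical names.

```python
import threading


class BatchQueue:
    """In-memory stand-in for a batch-based pub/sub queue."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of the queue lock
        self._batches = {}             # batch id -> list of order ids
        self._last_batch_id = 0
        self._cursors = {}             # subscriber -> latest processed batch id

    def publish(self, order_ids):
        with self._lock:                               # get queue lock + last batch id
            batch_id = self._last_batch_id + 1
            self._batches[batch_id] = list(order_ids)  # create new batch
            self._last_batch_id = batch_id             # store batch id
        return batch_id                                # lock released on exit

    def consume(self, subscriber, handler):
        """Process every batch this subscriber has not seen yet; return the results."""
        results = []
        cursor = self._cursors.get(subscriber, 0)      # my latest processed batch id
        while cursor < self._last_batch_id:
            cursor += 1
            results.append(handler(self._batches[cursor]))  # store result
            self._cursors[subscriber] = cursor              # update my batch id
        return results


queue = BatchQueue()
queue.publish([10, 11, 12])
queue.publish([13])
mailer_first = queue.consume("mailer", len)
mailer_again = queue.consume("mailer", len)
reporting = queue.consume("reporting", sum)
```

Each subscriber keeps its own cursor, so a slow or restarted subscriber resumes from its last processed batch without affecting the others.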
SCALING STORAGE
• Move to NoSQL
• Aggressively move queries to slaves
• Different indexes per slave
• Better hardware
• Optimized tables for large, heavily used datasets
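Moving queries to slaves can be sketched as a small router that sends reads to replicas in round-robin order and everything else to the master. The class `ReplicaRouter` and the connection names are hypothetical; the sketch ignores replication lag, so reads that must see a just-committed write would still need to go to the master.

```python
import itertools


class ReplicaRouter:
    """Send SELECT statements to read slaves (round-robin), everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReplicaRouter("master", ["slave-1", "slave-2"])  # names stand in for connections
targets = [
    router.route("SELECT * FROM rooms"),
    router.route("INSERT INTO reservations VALUES (1)"),
    router.route("  select 1"),
    router.route("SELECT 2"),
]
```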
EMAIL ADDRESSES
• Users have many email addresses.
• Lookup by email, join to users table
FIRST ATTEMPT
CREATE TABLE `user_emails` (
  `id` int NOT NULL AUTO_INCREMENT,
  `email_address` varchar(255) NOT NULL,
  ... -- other columns about the user
  `user_id` int, -- foreign key to users
  PRIMARY KEY (`id`), -- MySQL requires the AUTO_INCREMENT column to be a key
  KEY (`email_address`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
FIRST ATTEMPT
LOOKUP
CAN IT BE IMPROVED?
INDEX VS PK
• InnoDB: B+trees, O(log n)
• Known user id: index on email not needed.
• Small win on lookup: O(1)
• Big win on not storing the index.
INNODB INDEXES
HASH TABLE
DISQUS
• >165K messages per second
• <10ms latency
• 1.3B unique visitors
• 10B page views
• 500M users in discussions
• 3M communities
• 25M comments
ORIGINAL REALTIME BACKEND
• Python + gevent
• NginxPushStream
• Network IO: great
• CPU: choking at peaks
• <15ms latency
CURRENT REALTIME BACKEND
• Go
• Handles all users
• Normal load: 3200 connections/machine/sec
• <10ms latency
• Only 10%-20% CPU
CURRENT REALTIME BACKEND
[Diagram: Workers; Subscribed to results; Push result to user; NginxPushStream]
TESTING
• Test with real traffic
• Measure everything
LESSONS
• Do work once, distribute results.
• Most likely to fail: your code. Don’t reinvent. Keep team small.
• End-to-end ACKs are expensive. Avoid.
• Understand use cases when load testing.
• Tune architecture to scale.
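The first lesson, do work once and distribute results, can be sketched as a tiny broadcaster that computes a result a single time and hands the same result to every subscriber. The class `ResultBroadcaster` and the channel name are hypothetical and do not reflect Disqus's actual code.

```python
class ResultBroadcaster:
    """Compute each result once, then deliver it to every subscriber of the channel."""

    def __init__(self, compute):
        self._compute = compute
        self._subscribers = {}  # channel -> list of callbacks
        self.compute_calls = 0

    def subscribe(self, channel, callback):
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        result = self._compute(message)  # the expensive work happens once
        self.compute_calls += 1
        for callback in self._subscribers.get(channel, []):
            callback(result)             # every subscriber gets the same result
        return result


inbox_a, inbox_b = [], []
broadcaster = ResultBroadcaster(str.upper)  # str.upper stands in for real processing
broadcaster.subscribe("thread-1", inbox_a.append)
broadcaster.subscribe("thread-1", inbox_b.append)
broadcaster.publish("thread-1", "new comment")
```

The cost of processing a message stays constant no matter how many users are watching the same thread; only the cheap delivery step grows with the audience.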
LEARN MORE
• Instagram
• Braintree
• highscalability.com
• VelocityConf (YouTube; Nov 2014 in Barcelona?)
QUESTIONS? ANSWERS?
THANKS!
Òscar Vilaplana @grimborg http://oscarvilaplana.cat