The Web Scale
Tuenti architecture to withstand 1500+ million pageviews / day
Guillermo Pérez - [email protected]
Security & Backend Architecture Tech Lead
What is a scalable system?
What is scalability
Some Tuenti stats
Tuenti Stats
13M users
REALLY ACTIVE
50%+ active weekly
>1h browsing per DAY!
Tuenti Stats
- Each month, over:
    40,000 M pageviews
    50,000 M requests
    100 M new photos
    2,000+ Tb served photos
- On peaks:
    1,600 million pageviews/day
    35,000 requests/second
    6,000 million served photos/day
Tuenti Stats
- 1200+ servers
    ~500 FEs
    ~300 DBs
    ~100 MCs
    ~100 image servers
    Others: Chat, HBase, Queues, Processors...
How to scale?
No silver bullet
Monitor
Know your tools
Evolve, iterate
Learn
Monitoring
- Your crystal ball!
    Glimpse of the future
    Answer questions
- Detect bottlenecks
- Detect what needs to be optimized
    The 90/10 rule
    No premature optimization
- Detect bad usages
- Detect browser patterns
- Detect changes, issues
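As a rough illustration of using monitoring data to decide what to optimize, the sketch below ranks endpoints by total time spent and returns the few that account for most of it. The endpoint names, the sample timings, and the `hot_spots` helper are all made up for the example; real input would come from whatever stats pipeline is in place.

```python
from collections import defaultdict


def hot_spots(samples, share=0.9):
    """Return the smallest prefix of endpoints (ranked by total time)
    that accounts for at least `share` of all measured time."""
    totals = defaultdict(int)
    for endpoint, millis in samples:
        totals[endpoint] += millis
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    picked, running = [], 0
    for endpoint, millis in ranked:
        if running >= share * grand_total:
            break
        picked.append(endpoint)
        running += millis
    return picked


# Hypothetical (endpoint, milliseconds) samples.
samples = [
    ("/home", 900), ("/photos", 6000), ("/home", 1100),
    ("/profile", 500), ("/photos", 3000), ("/settings", 500),
]
print(hot_spots(samples))
```

Anything outside the returned list is, by this measure, not worth optimizing yet.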
Monitoring
Monitor
Know your tools
Evolve, iterate
Learn
Know your tools
- Stop reading blogs
- Read internals documentation
- Test software
- Test hardware
- Experiment
Know your tools
- MySQL (InnoDB) IS fast
    photos table (photo_id, user_id, ...)
        PK photo_id, KEY user_id
        PK user_id, photo_id, KEY photo_id
    Usage: select * from photos where user_id=X
        sorting
        covering index
    Even No SQL :)
    Hardware limits, replication
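A rough way to compare the two key layouts from the slide, using SQLite (bundled with Python) as a stand-in for InnoDB. InnoDB stores rows ordered by primary key, and a SQLite WITHOUT ROWID table does the same, so the query plans show the same basic difference. The table names, index names, and the `caption` column are invented for the example, and MySQL's own EXPLAIN output will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Layout 1: PK photo_id, KEY user_id
CREATE TABLE photos_by_id (
    photo_id INTEGER NOT NULL,
    user_id  INTEGER NOT NULL,
    caption  TEXT,
    PRIMARY KEY (photo_id)
) WITHOUT ROWID;
CREATE INDEX idx_user ON photos_by_id (user_id);

-- Layout 2: PK (user_id, photo_id), KEY photo_id
CREATE TABLE photos_by_user (
    user_id  INTEGER NOT NULL,
    photo_id INTEGER NOT NULL,
    caption  TEXT,
    PRIMARY KEY (user_id, photo_id)
) WITHOUT ROWID;
CREATE UNIQUE INDEX idx_photo ON photos_by_user (photo_id);
""")


def plan(sql):
    """Return the human-readable detail column of the query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


plan_a = plan("SELECT * FROM photos_by_id WHERE user_id = 42")
plan_b = plan("SELECT * FROM photos_by_user WHERE user_id = 42")
print(plan_a)  # goes through the secondary index, then back to the rows
print(plan_b)  # reads one contiguous range of the primary key
```

With the second layout, all photos of one user sit next to each other and come back already ordered by photo_id, which is likely what the "sorting" note on the slide refers to.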
Know your tools
- Memcache
    Tons of persistent TCP conns eat your RAM
        UDP performance issues
        Single thread for UDP
        Multiport patch
    Proxies
    Stresses the network to the max
        Driver issues, configuration
        Variable performance with net devices
Know your tools
- No SQL
    Not magic!
    Good for heavy write loads
    Good for data processing
    Still needs tweaking: partitioning, schemas
Monitor
Know your tools
Evolve, iterate
Learn
Evolve, iterate
- All architectures scale up to a certain point
- Then you must rethink everything
    Then, and only then!
    Remember premature optimization?
    Scale != efficient
    Future is hard to predict
Monitor
Know your tools
Evolve, iterate
Learn
Learn
Learn from:
    Experience
    Failure
    Others
Architecture
Architecture
- Basic rules:
    Static: Add layers (easy caching)
    Dynamic: Move responsibility to edges
    General: Decentralize, redundancy
Architecture
- Design for failure:
    Support disabling
    Graceful degradation, fallbacks
    Controlled launches
- Test with dark launches
- Think about storage operations
- Be able to migrate live
- Focus on your core, use CDNs
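One possible shape for "support disabling", "fallbacks" and "controlled launches" is a per-feature switch with a percentage rollout. This is only a sketch: the `FeatureSwitch` class, the `render_suggestions` function, and the friend-suggestions feature are hypothetical and not taken from the Tuenti codebase.

```python
import zlib


class FeatureSwitch:
    """Hypothetical kill switch with a percentage-based rollout."""

    def __init__(self, name, enabled=True, rollout_percent=100):
        self.name = name
        self.enabled = enabled
        self.rollout_percent = rollout_percent

    def is_on_for(self, user_id):
        if not self.enabled:
            return False
        # Stable bucket in [0, 100): the same user always gets the same answer.
        bucket = zlib.crc32(f"{self.name}:{user_id}".encode()) % 100
        return bucket < self.rollout_percent


def render_suggestions(user_id, switch, fetch):
    """Return suggestions, or an empty fallback if disabled or failing."""
    if not switch.is_on_for(user_id):
        return []
    try:
        return fetch(user_id)
    except Exception:
        return []  # degrade: the page still renders without this block
```

Raising `rollout_percent` step by step gives a controlled launch, and setting `enabled=False` turns the feature off without a deploy.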
Architecture
- Move work to the browser:
    Request routing
    Templates
    Cache
    Prefetch
- Move remaining to your FEs:
    Data relations
    Consistency
    Privacy, access check
    Live migrations
    Knowledge of the storage infrastructure
Architecture
- All teams involved
    Frontend
        Good JS, templating, caching, prefetching
    Backend
        Data design, parallelization, optimizations
    Systems
        Iron benchmarks, tuning, networking
Dynamic site example
Scaling a website
- Setup: 1 server
- Bottleneck: CPU
- Solution: Add frontends
- Changes: Share sessions
Scaling a website
- Setup: N frontends, 1 DB
- Bottleneck: DB Reads
- Solution: Add DB slaves
- Changes: Split reads to slaves or DB proxy
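A minimal sketch of splitting reads to slaves inside the application, assuming the decision can be made from the SQL verb alone. The `ReplicatedDB` class and the host names are invented; a DB proxy would make the same decision outside the application.

```python
import random


class ReplicatedDB:
    """Routes writes to the master and reads to a randomly chosen slave."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self.rng = rng or random.Random()

    def pick(self, sql):
        """Return the host that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS or not self.slaves:
            return self.master
        return self.rng.choice(self.slaves)


db = ReplicatedDB("db-master", ["db-slave1", "db-slave2"])
print(db.pick("INSERT INTO photos (user_id) VALUES (1)"))
print(db.pick("SELECT * FROM photos WHERE user_id = 1"))
```

Because slaves lag behind the master, a read issued right after a write may need to go to the master as well; the sketch leaves that out.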
Scaling a website
- Setup: N frontends, 1 DB Master + N Slaves
- Bottleneck: Limited # of slaves, so DB Reads
- Solution: Chain replication / Add cache layer
- Changes: Big ones!
    Adding some caches in certain places is easy
    But for a dynamic app, Memcache as storage
    Makes your DB non-relational
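The cache layer is commonly wired in as read-through with invalidation on write. The sketch below uses plain dicts as stand-ins for Memcache and for a MySQL table; the `CachedPhotos` class and the key format are assumptions for illustration.

```python
class CachedPhotos:
    """Read-through cache in front of a table, invalidated on write."""

    def __init__(self, db, cache):
        self.db = db          # dict standing in for a table: photo_id -> row
        self.cache = cache    # dict standing in for Memcache
        self.db_reads = 0     # counts how often the DB is actually hit

    def get(self, photo_id):
        key = f"photo:{photo_id}"
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        row = self.db.get(photo_id)
        if row is not None:
            self.cache[key] = row
        return row

    def update(self, photo_id, row):
        self.db[photo_id] = row
        # Drop the cached copy so the next read sees the new row.
        self.cache.pop(f"photo:{photo_id}", None)


store = CachedPhotos({1: {"title": "beach"}}, {})
store.get(1)
store.get(1)
print(store.db_reads)
```

Every access now goes by key, which is one way to read the slide's remark that the database stops being relational: joins and ad hoc queries bypass the cache.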
Scaling a website
- Setup: N FEs, 1 DB Master + N Slaves, Caches
- Bottleneck: DB Writes
- Solution: Split tables into DB clusters
- Changes: Add some DB abstraction
Scaling a website
- Setup: N FEs, N DB clusters, Caches
- Bottleneck: DB Writes on certain tables
- Solution: Partition tables
- Changes: DB abstraction and big changes
    DB no longer relational, more key based
    Partition key limits queries
    Denormalization, duplication
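To make "partition key limits queries" concrete, here is a sketch that partitions a photos table by user_id. The modulo scheme, the partition count, and the `PartitionedPhotos` class are assumptions for the example; each dict stands in for one database.

```python
NUM_PARTITIONS = 4  # assumption for the example


def partition_for(user_id, num_partitions=NUM_PARTITIONS):
    """Map the partition key to a partition index."""
    return user_id % num_partitions


class PartitionedPhotos:
    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [dict() for _ in range(num_partitions)]

    def _shard(self, user_id):
        return self.partitions[partition_for(user_id, len(self.partitions))]

    def insert(self, user_id, photo_id, row):
        self._shard(user_id)[(user_id, photo_id)] = row

    def photos_of(self, user_id):
        # Query carries the partition key: exactly one partition is touched.
        return sorted(pid for (uid, pid) in self._shard(user_id) if uid == user_id)

    def find_photo(self, photo_id):
        # No partition key: every partition has to be asked.
        for shard in self.partitions:
            for (uid, pid), row in shard.items():
                if pid == photo_id:
                    return row
        return None


photos = PartitionedPhotos()
photos.insert(5, 100, "a")
photos.insert(5, 101, "b")
photos.insert(6, 200, "c")
print(photos.photos_of(5))
```

Avoiding the fan-out in `find_photo` usually means keeping a second copy of the data keyed by photo_id, which is where the denormalization and duplication come from.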
Scaling a website
- Setup: N FEs, N partitioned DBs, Caches
- Bottleneck: Disk space, DB cost
- Solution: Archive tables
- Changes: DB abstraction + migration scripts
Scaling a website
- Setup: N FEs, N partition+archive DBs, Cache
- Bottleneck: Internal network traffic
- Solution: 2 level caches, split services, cache affinity
- Changes: Cache abstraction, browsers
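A two-level cache can be sketched as a small per-process cache in front of the shared one, so repeated reads of hot keys never leave the frontend. The `TwoLevelCache` class, its capacity, and the eviction policy are assumptions; a dict stands in for the shared Memcache pool.

```python
class TwoLevelCache:
    """Local in-process cache in front of a shared remote cache."""

    def __init__(self, remote, local_capacity=1000):
        self.local = {}                 # level 1: no network hop
        self.remote = remote            # level 2: dict standing in for Memcache
        self.local_capacity = local_capacity
        self.remote_lookups = 0         # counts simulated network round trips

    def get(self, key):
        if key in self.local:
            return self.local[key]
        self.remote_lookups += 1
        value = self.remote.get(key)
        if value is not None:
            self._remember(key, value)
        return value

    def _remember(self, key, value):
        if len(self.local) >= self.local_capacity:
            # Evict the oldest inserted entry (dicts keep insertion order).
            self.local.pop(next(iter(self.local)))
        self.local[key] = value


shared = {"user:1": "Ana", "user:2": "Bo"}
cache = TwoLevelCache(shared, local_capacity=1)
cache.get("user:1")
cache.get("user:1")
print(cache.remote_lookups)
```

Local copies can go stale when the shared value changes, so a real first level would need short expiry times or explicit invalidation.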
Scaling a website
- Setup: N FEs, N partition+archive DBs, multilayered Cache, services
- Bottleneck: Datacenter
- Solution:
    Split services
    Partition user data
- Changes: Big ones!
    Greater replication lags, inconsistencies
The Tuenti Backend Framework
Backend Framework
- Our mission:
    Provide an easy to use, productive, easy to debug, testable, fast, extensible, customizable, deterministic, reusable, instrumented (stats) framework and tools to ease developers' daily work and manage the infrastructure.
Backend Framework
- From Request routing to Storage
- Simple layers, clean responsibilities
- Clean, organized codebase
- Using:
    convention over configuration
    configuration over coding
- Queuing system for async execution
- Gathering stats from all levels
Backend Framework
- Request routing:
    Multiple entry points
    Fast request parsers route to Agents
    Data centric agents
    Printers
Backend Framework
- Domain API:
    Expose top-level business actions
    Clean, semantic API
    No state, no magic, all data in params
    Check privacy (the right place!)
Backend Framework
- Domain Backend:
    Implement public/internal business actions
    Clean, semantic API
    No state, no magic, all data in params
    Coordinate transactions
    No privacy
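The split between the two layers might look like the sketch below: the API layer checks privacy and then delegates, while the backend layer only implements the action. Class names, method names, and the friends-only rule are invented for illustration; the framework described in the slides is PHP and its real interfaces are not shown in the source.

```python
class PhotoBackend:
    """Domain Backend: implements the action, performs no privacy checks."""

    def __init__(self, photos):
        self.photos = photos  # dict: owner_id -> list of photo ids

    def get_photos(self, owner_id):
        return list(self.photos.get(owner_id, []))


class PhotoApi:
    """Domain API: stateless, all data in params, privacy checked here."""

    def __init__(self, backend, friendships):
        self.backend = backend
        self.friendships = friendships  # set of frozenset({user_a, user_b})

    def get_photos(self, viewer_id, owner_id):
        is_owner = viewer_id == owner_id
        is_friend = frozenset((viewer_id, owner_id)) in self.friendships
        if not (is_owner or is_friend):
            raise PermissionError("viewer may not see these photos")
        return self.backend.get_photos(owner_id)


backend = PhotoBackend({1: [10, 11]})
api = PhotoApi(backend, {frozenset((1, 2))})
print(api.get_photos(viewer_id=2, owner_id=1))
```

Keeping the check in one layer means internal callers can reuse the backend freely, and every external entry point goes through the same gate.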
Backend Framework
- Domain Storages (ORM like)
    Configure storage access for a table
        Fields, validation, partitioning, primary key, caching techniques, custom queries.
    Provide access to storage via standard APIs:
        CRUD actions
        Cached Lists
        Cached Queries
        + Custom
    Data container
Backend Framework
- Storage Strategies
    CRUD
    Cached Lists
    Cached Queries
    CUD Observers for custom actions
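One reading of "CUD Observers" is that create, update and delete operations notify registered callbacks, which can then run custom actions such as invalidating a cached list. The sketch below follows that reading; the `ObservedStorage` class and its callback signature are assumptions, not the framework's actual interface.

```python
class ObservedStorage:
    """CRUD storage that notifies observers on create, update and delete."""

    def __init__(self):
        self.rows = {}
        self.observers = []  # callables taking (action, key, row)

    def subscribe(self, observer):
        self.observers.append(observer)

    def _notify(self, action, key, row):
        for observer in self.observers:
            observer(action, key, row)

    def create(self, key, row):
        self.rows[key] = row
        self._notify("create", key, row)

    def read(self, key):
        return self.rows.get(key)  # reads do not notify anyone

    def update(self, key, row):
        self.rows[key] = row
        self._notify("update", key, row)

    def delete(self, key):
        row = self.rows.pop(key, None)
        self._notify("delete", key, row)


events = []
storage = ObservedStorage()
storage.subscribe(lambda action, key, row: events.append((action, key)))
storage.create("photo:1", {"owner": 7})
storage.update("photo:1", {"owner": 7, "title": "beach"})
storage.delete("photo:1")
print(events)
```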
Backend Framework
- Storage Service
    Provides access to the different storage services:
        mysql, memcache, hbase...
    Coordinates transactions
    Abstracts the infrastructure complexities:
        partitioning, read/write, weights, hosts
    Handles transactions
Backend Framework
- Storage Services (concrete ones)
    Abstract the infrastructure complexities:
        partitioning, read/write, weights, hosts
    API close to the real one:
        Memcache: set, get, cas...
        MySQL: insert, select, update...
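A sketch of how a concrete storage service could hide "read/write, weights, hosts" from its callers: the caller names a table and a mode, and the service picks a host from configuration. The config layout, the host names, and the `MysqlService` class are hypothetical.

```python
import random

# Hypothetical config: per table, per mode, a list of (host, weight).
HOSTS = {
    "photos": {
        "write": [("db-photos-master", 1)],
        "read": [
            ("db-photos-slave1", 3),
            ("db-photos-slave2", 1),
            ("db-photos-slave3", 0),  # weight 0: drained, never picked
        ],
    }
}


class MysqlService:
    def __init__(self, hosts, rng=None):
        self.hosts = hosts
        self.rng = rng or random.Random()

    def host_for(self, table, mode):
        """Pick a host for the table and mode, proportionally to its weight."""
        candidates = self.hosts[table][mode]
        names = [name for name, _ in candidates]
        weights = [weight for _, weight in candidates]
        return self.rng.choices(names, weights=weights, k=1)[0]


service = MysqlService(HOSTS, random.Random(0))
print(service.host_for("photos", "write"))
print(service.host_for("photos", "read"))
```

Setting a weight to 0 in the config takes a host out of rotation without touching calling code.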
Backend Framework
- Storage Drivers (concrete ones)
    Read config
    Manage PHP drivers
    Enhance API
Love challenges?
We are hiring!
http://jobs.tuenti.com
And... Stay tuned for our Tuenti Challenge 2!
http://contest.tuenti.net
Thanks!
?
Guillermo Pérez - [email protected]
Security & Backend Architecture Tech Lead
Images: Creative Commons from flickr: heydanielle, eschipul, deanfotos66, nrbelex, mikolski, fdecomite, guldfisken