Software architecture for high traffic website
Case study - Stack Overflow
Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet)
Hanoi .NET Meetup
Contents
About Stack Overflow
● Founders
● Principles
SO architecture
● Beginning
● Restructure #1
● Restructure #2
Open-source Libs
● StackExchange.Redis
● Dapper
● Jil
About Stack Overflow
Founders
Jeff Atwood
Joel Spolsky
Timeline, 2008 to 2011: Stack Overflow (2008), then Server Fault, Stack Exchange 1.0, Stack Exchange 2.0, and Stack Overflow Careers
Rome wasn’t built in a day!
● 100+ Q&A Sites
● 600+ million pageviews a month
● 3000+ requests per second
● 16+ million users
● 8+ million questions
● 40+ million answers
Principles
● Performance Is a Feature
● Cache All the Things!
● Reinvention Is OK
Stack Overflow Architecture
Two rounds of restructuring
Stack Exchange 1.0
● ASP.NET MVC
● SQL Server
● LINQ to SQL
● Wikipedia DB Design
Stack Exchange Network
● LINQ to SQL
● HAProxy
● Redis
● Lucene.NET
Scale Up
● Cache everything
● Elastic Search
● Reinvention
Stack Exchange 1.0 Structure
● Load balancing: Windows NLB
● Web servers: IIS Server, IIS Server
● Database: SQL Server
Windows NLB
● Cons:
○ Limited to 8 nodes
○ Cannot detect a failed service
Web-tier
ASP.NET MVC
LINQ to SQL
SQL Server
● All-in-memory
● Full text search
● 16 million pageviews a month
● 3 million unique visitors a month
● 6 million visits a month
Follow none but learn from everyone!
Pros
● Simple
Cons
● Bottleneck: the SQL Server database
● High cost to scale up
Restructure #1 - Stack Exchange Network
● HAProxy
● Redis Cache
● Lucene.NET
● Tag Engine
Stack Exchange Network Structure
Diagram components: HAProxy, IIS Servers, Redis, Database
Connection labels: http, http, protobuf, sql
Load Balancing
● HAProxy:
○ Runs on Linux
○ Free
Web-tier
ASP.NET MVC 3
LINQ to SQL
jQuery 1.4.5
Lucene.Net
Redis
● In-memory cache
● Master-slave
● Messaging notification
3 types of cache
Local Cache
● Uses HttpRuntime.Cache
● Caches:
○ User Session
○ View Count
○ ...
Site Cache
● Uses Redis
● Caches the site's data:
○ Q&As
○ Acceptance rates
○ ...
Global Cache
● Uses Redis
● Caches system data:
○ User info
○ Inbox
○ ...
Update cache flow - Local cache
Diagram components: Local Cache, Redis, DB, Other sites
1 - On startup: subscribe to invalidation messages from Redis
2.1 - Data changed (by other sites, apps…)
2.2 - Send message to Redis
3 - Redis sends notification to subscribers
4 - Get data from DB and update local cache
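The invalidation steps for the local cache can be sketched roughly as follows. This is a minimal illustration in Python, not Stack Overflow's .NET code: `FakeBroker` is a hypothetical in-memory stand-in for Redis pub/sub, and `WebServer`, the channel name, and the key format are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class FakeBroker:
    """In-memory stand-in for Redis pub/sub (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Deliver the message to every subscriber of the channel.
        handlers = self._subscribers[channel]
        for handler in handlers:
            handler(message)
        return len(handlers)


class WebServer:
    """One web server holding a local cache in front of a shared database."""

    CHANNEL = "cache-invalidation"  # assumed channel name

    def __init__(self, broker: FakeBroker, db: Dict[str, str]) -> None:
        self.broker = broker
        self.db = db
        self.local_cache: Dict[str, str] = {}
        # Step 1: subscribe to invalidation messages on startup.
        broker.subscribe(self.CHANNEL, self._on_invalidate)

    def read(self, key: str) -> str:
        # Serve from the local cache, loading from the database on a miss.
        if key not in self.local_cache:
            self.local_cache[key] = self.db[key]
        return self.local_cache[key]

    def write(self, key: str, value: str) -> None:
        # Step 2.1: the data changes in the database.
        self.db[key] = value
        # Step 2.2: tell the broker which key changed.
        self.broker.publish(self.CHANNEL, key)

    def _on_invalidate(self, key: str) -> None:
        # Steps 3 and 4: on notification, reload the key from the database.
        self.local_cache[key] = self.db[key]


if __name__ == "__main__":
    db = {"question:1": "old title"}
    broker = FakeBroker()
    server_a, server_b = WebServer(broker, db), WebServer(broker, db)
    server_b.read("question:1")               # server_b now caches the old value
    server_a.write("question:1", "new title")
    print(server_b.read("question:1"))        # server_b sees the refreshed value
```

With a real Redis server the broker would be replaced by a Redis client's publish/subscribe calls, and delivery would be asynchronous rather than immediate.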
Deployment flow with HAProxy
● Tell HAProxy to take the server out of rotation via a POST
● Delay to let IIS finish current requests (~5 sec)
● Stop the website
● Copy files
● Start the website
● Local testing, update local cache, etc…
● Re-enable HAProxy via another POST
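The deployment steps above could be scripted along these lines. This is only a sketch: every step is passed in as a hypothetical callable (`set_rotation`, `stop_site`, and so on) because the slides do not give the HAProxy endpoint or the IIS commands. Leaving a server out of rotation when local testing fails is an assumption, not something the slides state.

```python
import time
from typing import Callable


def deploy_to_server(
    server: str,
    set_rotation: Callable[[str, bool], None],  # hypothetical: POSTs to HAProxy
    stop_site: Callable[[str], None],
    copy_files: Callable[[str], None],
    start_site: Callable[[str], None],
    smoke_test: Callable[[str], bool],          # local testing, cache warm-up, etc.
    drain_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Deploy to one server; return True if it went back into rotation."""
    set_rotation(server, False)   # take the server out of rotation
    sleep(drain_seconds)          # let IIS finish in-flight requests
    stop_site(server)
    copy_files(server)
    start_site(server)
    if not smoke_test(server):
        return False              # assumption: keep a failing server out of rotation
    set_rotation(server, True)    # re-enable the server in HAProxy
    return True


if __name__ == "__main__":
    steps = []
    ok = deploy_to_server(
        "web1",
        set_rotation=lambda s, enabled: steps.append(f"rotation={enabled}"),
        stop_site=lambda s: steps.append("stop"),
        copy_files=lambda s: steps.append("copy"),
        start_site=lambda s: steps.append("start"),
        smoke_test=lambda s: True,
        drain_seconds=0,
    )
    print(ok, steps)
```

Running the function over the server list one server at a time keeps the rest of the farm serving traffic while each node is updated.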
Pros
● High performance
● Low-cost load balancing (using HAProxy)
● Uses Redis messaging for cache invalidation
Cons
● Too many SQL queries
● 95 million pageviews a month
● 800 requests per second
● 16 million users
Restructure #2 - Scale Up
● Cache All the Things
● Elastic Search
● Reinvention
Stack Exchange Network Structure
Diagram components: HAProxy, Redis, Elastic Search, Tag Engine, Databases
5-level cache
Levels: Network Level, Local Cache, Redis Cache, SQL Server Cache, SSD
● Network Level: browser cache…
● Local Cache: HttpRuntime.Cache, caches all data in memory
● Redis Cache: caches all data
● SQL Server Cache: caches all data in memory (the database servers have 384GB of RAM)
Cache Flow
● Check Local Cache
● Else, check Redis Cache and update Local Cache
● If Redis Cache doesn’t have the data, fetch it from the databases, then update Redis Cache and Local Cache
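The lookup order in this cache flow can be sketched as a small read-through cache. This is a minimal Python illustration under stated assumptions: `TwoLevelCache` and `load_from_db` are hypothetical names, and a plain dictionary stands in for Redis.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Read-through lookup: local cache, then shared cache, then database."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self.local: Dict[str, str] = {}    # per-server, in-process cache
        self.shared: Dict[str, str] = {}   # stand-in for Redis
        self.load_from_db = load_from_db   # hypothetical database loader

    def get(self, key: str) -> str:
        # 1. Check the local cache first.
        if key in self.local:
            return self.local[key]
        # 2. Otherwise check the shared (Redis) cache.
        if key in self.shared:
            value = self.shared[key]
        else:
            # 3. On a miss in both, fetch from the database and fill Redis.
            value = self.load_from_db(key)
            self.shared[key] = value
        # Fill the local cache on the way back.
        self.local[key] = value
        return value


if __name__ == "__main__":
    db_calls = []

    def load(key: str) -> str:
        db_calls.append(key)
        return f"row for {key}"

    cache = TwoLevelCache(load)
    cache.get("user:42")
    cache.get("user:42")
    print(len(db_calls))  # the database is queried only once
```

The sketch leaves out expiry and invalidation; it only shows which layer is consulted in which order.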
Cache All the Things!
Pros
● Very, very fast (<400ms)
● Low server load:
○ IIS: 10-15% CPU usage
○ DB: 10% CPU usage
● 99% of requests served by cache
Cons
● Data has latency (cached data can be stale)
● 95 million pageviews a month
● 800 requests per second
● 16 million users
Open-source Libs
● StackExchange.Redis - high-performance Redis client
● Dapper - a micro ORM - very fast
● Jil - fast JSON serializer
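To illustrate what "micro ORM" means here, the sketch below runs hand-written SQL and maps each result row onto a typed object by column name, which is the general idea behind a library like Dapper. It is written in Python with `sqlite3` purely for illustration and is not Dapper's API. The `query` helper, the `Question` class, and the table are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import List, Type, TypeVar

T = TypeVar("T")


def query(conn: sqlite3.Connection, cls: Type[T], sql: str, params: tuple = ()) -> List[T]:
    """Run hand-written SQL and map each row onto `cls` by column name."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    wanted = {f.name for f in fields(cls)}
    # Keep only the columns that match a field of the target class.
    return [
        cls(**{col: val for col, val in zip(columns, row) if col in wanted})
        for row in cursor.fetchall()
    ]


@dataclass
class Question:  # hypothetical model
    id: int
    title: str


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO questions (title) VALUES (?)", ("How do I cache?",))
    rows = query(conn, Question, "SELECT id, title FROM questions WHERE id = ?", (1,))
    print(rows)
```

The SQL stays fully under the developer's control; the helper only removes the row-to-object boilerplate.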
Reinvention is OK!
Reference sources
● http://stackoverflow.com
● http://highscalability.com
● http://codinghorror.com
● http://www.joelonsoftware.com
● http://nickcraver.com
● http://josephwoodward.co.uk/2014/02/the-architecture-of-stackoverflow/
Thank you!
Ngô Xuân Hòa