software architecture for high traffic website

Software architecture for high traffic website Case study - Stack Overflow Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet) Hanoi .Net Meetup

Upload: tung-nguyen-thanh

Post on 15-Apr-2017

TRANSCRIPT

Page 1: Software architecture for high traffic website

Software architecture for high traffic website

Case study - Stack Overflow

Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet)

Hanoi .Net Meetup

Page 2: Software architecture for high traffic website

Contents

About Stack Overflow

● Founders

● Principles

SO architecture

● Beginning

● Restructure #1

● Restructure #2

Open-source Libs

● StackExchange.Redis

● Dapper

● Jil

Page 3: Software architecture for high traffic website

About Stack Overflow

Page 4: Software architecture for high traffic website

Founders

Jeff Atwood

Joel Spolsky

Page 5: Software architecture for high traffic website

2008

Stack Overflow

2009 2010 2011

Server Fault

Stack Exchange 1.0

Stack Exchange 2.0

Stack Overflow Careers

Rome wasn’t built in a day!

Page 6: Software architecture for high traffic website

● 100+ Q&A Sites

● 600+ million pageviews a month

● 3000+ requests per second

● 16+ million users

● 8+ million questions

● 40+ million answers

Page 7: Software architecture for high traffic website

Principles

● Performance Is a Feature

● Cache All the Things!

● Reinvention Is OK

Page 8: Software architecture for high traffic website

Stack Overflow Architecture

Page 9: Software architecture for high traffic website

Two rounds of restructuring

Stack Exchange 1.0

● ASP.NET MVC

● SQL Server

● LINQ to SQL

● Wikipedia DB Design

Stack Exchange Network

● LINQ to SQL

● HAProxy

● Redis

● Lucene.NET

Scale Up

● Cache everything

● Elastic Search

● Reinvention

Page 10: Software architecture for high traffic website

Stack Exchange 1.0 Structure

Load balancing: Windows NLB

Web servers: IIS Server, IIS Server

Database: SQL Server

Page 11: Software architecture for high traffic website

Windows NLB

● Cons:

○ Limited to 8 nodes

○ Cannot detect a failed service

Web-tier

ASP.NET MVC

LINQ to SQL

SQL Server

● All-in-memory

● Full text search

Page 12: Software architecture for high traffic website

● 16 million pageviews a month

● 3 million unique visitors a month

● 6 million visits a month

Page 13: Software architecture for high traffic website

Follow none but learn from everyone!

Page 14: Software architecture for high traffic website

Pros

● Simple

Cons

● Bottleneck: the SQL Server database

● High cost to scale up

Page 15: Software architecture for high traffic website

Restructure #1 - Stack Exchange Network

● HAProxy

● Redis Cache

● Lucene.NET

● Tag Engine

Page 16: Software architecture for high traffic website

Stack Exchange Network Structure

[Diagram: HAProxy, Redis, IIS Servers, and Database, with connections labeled http, http, protobuf, and sql]

Page 17: Software architecture for high traffic website

Load Balancing

● HAProxy:

○ Runs on Linux

○ Free

Web-tier

ASP.NET MVC 3

LINQ to SQL

jQuery 1.4.5

Lucene.Net

Redis

● In-memory cache

● Master-slave replication

● Messaging for notifications

Page 18: Software architecture for high traffic website

3 Types of Cache

Local Cache

● Uses HttpRuntime.Cache

● Caches: user sessions, view counts, ...

Site Cache

● Uses Redis

● Caches the site's data: Q&As, acceptance rates, ...

Global Cache

● Uses Redis

● Caches system data: user info, inbox, ...

Page 19: Software architecture for high traffic website

Update cache flow - Local cache

[Diagram: Local Cache, Redis, DB, and Other sites, connected by the numbered steps below]

1 - On startup: subscribe to invalidation messages from Redis

2.1 - Data changed (by other sites, apps…)

2.2 - Send message to Redis

3 - Redis sends a notification to subscribers

4 - Get data from DB and update the local cache
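
The numbered steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FakeRedisPubSub` is an in-process stand-in for Redis publish/subscribe (not a real Redis client), and the channel name, the `WebServer` class, and the `update` helper are hypothetical names that do not come from the talk.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class FakeRedisPubSub:
    """In-process stand-in for Redis pub/sub (an assumption, not a Redis client)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Step 3: deliver the notification to every subscriber of the channel.
        handlers = list(self._subscribers[channel])
        for handler in handlers:
            handler(message)
        return len(handlers)


class WebServer:
    """One web server holding a local in-memory cache in front of the DB."""

    CHANNEL = "cache-invalidation"  # hypothetical channel name

    def __init__(self, name: str, bus: FakeRedisPubSub, db: Dict[str, str]) -> None:
        self.name = name
        self.db = db
        self.local_cache: Dict[str, str] = {}
        # Step 1: subscribe to invalidation messages on startup.
        bus.subscribe(self.CHANNEL, self._on_invalidate)

    def get(self, key: str) -> str:
        # Serve from the local cache, filling it from the DB on a miss.
        if key not in self.local_cache:
            self.local_cache[key] = self.db[key]
        return self.local_cache[key]

    def _on_invalidate(self, key: str) -> None:
        # Step 4: reload the changed key from the DB into the local cache.
        if key in self.db:
            self.local_cache[key] = self.db[key]
        else:
            self.local_cache.pop(key, None)


def update(db: Dict[str, str], bus: FakeRedisPubSub, key: str, value: str) -> None:
    db[key] = value                      # Step 2.1: data changed
    bus.publish(WebServer.CHANNEL, key)  # Step 2.2: send message to Redis


if __name__ == "__main__":
    db = {"question:1": "old title"}
    bus = FakeRedisPubSub()
    servers = [WebServer("web1", bus, db), WebServer("web2", bus, db)]
    for server in servers:
        server.get("question:1")         # warm both local caches
    update(db, bus, "question:1", "new title")
    for server in servers:
        print(server.name, server.get("question:1"))
```

In this sketch the message carries only the key, so each subscriber re-reads the value from the DB itself, which matches step 4.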

Page 20: Software architecture for high traffic website

Deployment flow with HAProxy

● Tell HAProxy to take the server out of rotation via a POST

● Delay to let IIS finish current requests (~5 sec)

● Stop the website

● Copy files

● Start the website

● Local testing, update local cache, etc…

● Re-enable HAProxy via another POST
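
The deployment steps above could be scripted roughly as follows. This is a minimal sketch, not the actual Stack Exchange deployment script: `DeploySteps`, `deploy_one`, and `rolling_deploy` are hypothetical names, and each hook is a placeholder that a real script would back with the HAProxy POST or an IIS command. Leaving a server out of rotation when the local test fails is an assumption made here; the talk does not say what happens in that case.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DeploySteps:
    """Hypothetical hooks; each would wrap a real HAProxy POST or IIS command."""
    set_in_rotation: Callable[[str, bool], None]
    stop_site: Callable[[str], None]
    copy_files: Callable[[str], None]
    start_site: Callable[[str], None]
    smoke_test: Callable[[str], bool]


def deploy_one(server: str, steps: DeploySteps, drain_seconds: float = 5.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Deploy to one server; return True if it went back into rotation."""
    steps.set_in_rotation(server, False)  # take the server out of rotation
    sleep(drain_seconds)                  # let in-flight requests finish (~5 sec)
    steps.stop_site(server)
    steps.copy_files(server)
    steps.start_site(server)
    if not steps.smoke_test(server):      # local testing, cache warm-up, etc.
        return False                      # assumption: keep it out of rotation
    steps.set_in_rotation(server, True)   # re-enable the server
    return True


def rolling_deploy(servers: List[str], steps: DeploySteps, **kwargs) -> List[str]:
    """Deploy one server at a time and stop at the first failure."""
    deployed: List[str] = []
    for server in servers:
        if not deploy_one(server, steps, **kwargs):
            break
        deployed.append(server)
    return deployed
```

Going one server at a time keeps the remaining servers in rotation while each one is updated, which is what makes the out-of-rotation and re-enable steps useful.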

Page 21: Software architecture for high traffic website

Pros

● High performance

● Low-cost load balancing (using HAProxy)

● Uses Redis messaging for cache invalidation

Cons

● Too many SQL queries

Page 22: Software architecture for high traffic website

● 95 million pageviews a month

● 800 requests per second

● 16 million users

Page 23: Software architecture for high traffic website

Restructure #2 - Scale Up

● Cache All the Things

● Elastic Search

● Reinvention

Page 24: Software architecture for high traffic website

Stack Exchange Network Structure

Elastic Search

Tag Engine

Databases

Redis

HAProxy

Page 25: Software architecture for high traffic website

5-Level Cache

Levels: Network Level, Local Cache, Redis Cache, SQL Server Cache, SSD

● Network Level: browser cache…

● Local Cache: HttpRuntime.Cache - caches all data in memory

● Redis Cache: caches all data

● SQL Server Cache: caches all data in memory (the database servers have 384GB of RAM)

Page 26: Software architecture for high traffic website

Cache Flow

● Check Local Cache

● On a miss, check Redis Cache and update Local Cache

● If Redis Cache doesn’t have the data, fetch it from the databases, then update Redis Cache and Local Cache
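
The lookup order above is a read-through cache with two levels in front of the database. A minimal sketch in Python, assuming plain dictionaries as stand-ins for HttpRuntime.Cache and Redis; the class name `TwoLevelCache`, the `load_from_db` callback, and the `db_reads` counter are hypothetical and only serve the illustration.

```python
from typing import Callable, Dict, Optional


class TwoLevelCache:
    """Read-through lookup: local cache, then Redis, then the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]],
                 redis: Optional[Dict[str, str]] = None) -> None:
        self.local: Dict[str, str] = {}  # stands in for HttpRuntime.Cache
        # Stands in for Redis; pass the same dict to several instances to
        # imitate web servers sharing one Redis (an assumption of this sketch).
        self.redis: Dict[str, str] = redis if redis is not None else {}
        self._load_from_db = load_from_db
        self.db_reads = 0                # counts how often the DB is hit

    def get(self, key: str) -> Optional[str]:
        # 1. Check the local cache.
        if key in self.local:
            return self.local[key]
        # 2. On a miss, check Redis and update the local cache.
        if key in self.redis:
            value = self.redis[key]
            self.local[key] = value
            return value
        # 3. Otherwise fetch from the database, then update both caches.
        self.db_reads += 1
        value = self._load_from_db(key)
        if value is not None:
            self.redis[key] = value
            self.local[key] = value
        return value


if __name__ == "__main__":
    database = {"question:1": "How do I scale a website?"}
    shared_redis: Dict[str, str] = {}
    web1 = TwoLevelCache(database.get, shared_redis)
    web2 = TwoLevelCache(database.get, shared_redis)
    web1.get("question:1")               # goes to the database
    web2.get("question:1")               # served from the shared Redis stand-in
    print(web1.db_reads, web2.db_reads)
```

Expiry and invalidation are left out on purpose; the sketch covers only the read path described in the bullets.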

Page 27: Software architecture for high traffic website

Cache All the Things!

Page 28: Software architecture for high traffic website

Pros

● Very, very fast (<400ms)

● Low server load:

○ IIS: 10-15% CPU usage

○ DB: 10% CPU usage

● 99% of requests served by cache

Cons

● Data has latency (cached data can lag behind the database)

Page 29: Software architecture for high traffic website

● 95 million pageviews a month

● 800 requests per second

● 16 million users

Page 30: Software architecture for high traffic website

Open-source Libs

• StackExchange.Redis - high performance Redis client

• Dapper - a micro ORM - very fast

• Jil - fast JSON Serializer

Reinvention is OK!

Page 32: Software architecture for high traffic website

Thank you!

Ngô Xuân Hòa