Aviran Mordo Head of Engineering @aviranm linkedin.com/in/aviran aviransplace.com Scaling with Microservices Architecture and Multi-cloud platforms

Post on 14-Jan-2016

Page 1

Aviran Mordo, Head of Engineering

@aviranm

linkedin.com/in/aviran

aviransplace.com

Scaling with Microservices Architecture and Multi-cloud platforms

Page 3

Wix in Numbers

Over 72M users (website builders)

Static storage is >2PB of data

3 data centers + 3 clouds (Google, Amazon, Azure)

2B HTTP requests/day

1000 people work at Wix
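
For a sense of scale, the request figure above can be converted to an average per-second rate. This is only a back-of-the-envelope sketch: it assumes traffic is spread evenly over the day, which real traffic is not, so peak rates will be higher than the roughly 23,000 requests/second it yields.

```python
# Back-of-the-envelope: average request rate implied by 2B HTTP requests/day.
# Assumption (not from the slides): traffic is spread evenly across the day.
REQUESTS_PER_DAY = 2_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
print(f"average: {average_rps:,.0f} requests/second")
```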

Page 4

Initial Architecture

Built for fast development

Stateful login (Tomcat session), Ehcache, file uploads

No consideration for performance, scalability, or testing

Intended for short-term use

Tomcat, Hibernate, custom web framework

[Diagram: Wix (Tomcat), Lighttpd (file serving), MySQL DB]

Page 5

The Monolithic Giant

One monolithic server that handled everything

Dependency between features

Changes in unrelated areas of the system caused deployment of the whole system

Failures in unrelated areas caused system-wide downtime

Page 6

Breaking the System Apart

Page 8

Concerns and SLA

Edit websites: data validation, security / authentication, data consistency, lots of data

Serving media: high availability, high performance, lots of static files, very high traffic volume, viewport optimization, long tail (immutable)

View sites (created by the Wix editor): high availability, high performance, high traffic volume, long tail (mutable)

Page 9

Wix Segmentation

1. Editor Segment

2. Media Segment

3. Public Segment

Networking

Page 10

HTML Editor

Flash Editor

MSM

Private Media

Public Media

Editor Segment Public Segment

Premium Services

eCommerce

List DB

App Builder

App Store

App Market

Dashboard

Statics/media

Mailer

TimeZone

Public HTML API

Public API (Flash)

MSP

Public Server

HTML Renderer

HTML SEO Renderer

Flash Renderer

Flash SEO Renderer

Sitemap Renderer

Robots.txt Renderer

User Server

Template Viewer

Contacts

HUB

Activity

Site Members

Provided Mailing Service

Comments

Snapshoter

User Pref

Feed Me

Shout-out

Hotels

PETRI

Site Pref

Dist Logger

Slicer

eCom Renderer

eCom Cart

eCom Checkout

eCom Catalog

eCom Orders

Payment Facade

Account Info

HTML API

HTML Embeder

Blog

Mobile

Page 11

It is all about TRADE-OFF

Page 13

Microservices Guidelines

Each service has its own DB schema (if one is needed)

Only one service should write to any given DB table

There may be additional read-only services that directly access the DB (for performance reasons)

Services are stateless

No DB transactions

Cache is not a building block, but an optimization
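
The guideline that a cache is an optimization rather than a building block can be sketched as follows. This is a minimal illustration, not Wix's code: `PageStore` and `read_page` are hypothetical names, and a dict stands in for both the database and the cache. The point is that the read path returns the same answer with the cache switched off.

```python
from typing import Dict, Optional


class PageStore:
    """Hypothetical source of truth (stands in for the service's own DB schema)."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._rows[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._rows.get(key)


def read_page(key: str, store: PageStore,
              cache: Optional[Dict[str, str]]) -> Optional[str]:
    """Read through an optional cache; correctness never depends on it."""
    if cache is not None and key in cache:
        return cache[key]
    value = store.get(key)  # the store alone is sufficient to answer
    if value is not None and cache is not None:
        cache[key] = value  # populate opportunistically
    return value
```

Passing `cache=None` exercises the same code path a stateless service would take on a freshly started server with nothing cached.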

Page 14

Microservices Tradeoffs

Each service has its own DB schema (if one is needed)
Gain: easy to scale microservices based on service-level concerns
Tradeoff: system complexity, performance

Only one service should write to a specific DB table(s)
Gain: decoupled architecture, faster development
Tradeoff: system complexity, performance

May have additional read-only services that access the DB
Gain: performance
Tradeoff: coupling

Services are stateless
Gain: easy to scale out (just add more servers)
Tradeoff: performance, consistency

No DB transactions
Gain: better DB performance, easier to scale
Tradeoff: system complexity

Page 15

1. Editor Segment

Page 16

Editor Server

Immutable JSON pages (~3M / day)

Site revisions

Active-standby MySQL across data centers

[Diagram: Editor Server, MySQL Active Sites, MySQL Archive]

Page 17

Find Your Critical Path

Page 18

Protect The Data

DB outage with fast recovery = replication

Data poisoning/corruption = revisions / backup

Make the data available at all times = data distribution to multiple locations / providers

Page 19

Saving Editor Data

[Diagram: Browser, Editor Server, MySQL Active Sites and MySQL Archive (in two data centers), GCS, S3, WixMedia (DC-1), WixMedia (Google), WixMedia (Amazon). Arrows: Save Page(s), Save Page, 200 OK, Upload, DC replication, Notify]
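
One possible reading of the save flow in the diagram is sketched below: the page is written to a primary store, the browser gets its 200 OK, and copies to other locations happen afterwards. The ordering and every name here (`Store`, `save_page`, `replicate`) are assumptions made for illustration; the slide does not spell out the mechanism.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Store:
    """Hypothetical key-value store standing in for a database or cloud bucket."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self.data[key] = value


def save_page(page_id: str, body: str, primary: Store,
              pending: Deque[Tuple[str, str]]) -> int:
    """Persist to the primary store, queue the copy work, and acknowledge."""
    primary.put(page_id, body)
    pending.append((page_id, body))  # replication happens after the response
    return 200


def replicate(pending: Deque[Tuple[str, str]], replicas: List[Store]) -> int:
    """Drain the queue, copying each page to every replica. Returns pages copied."""
    copied = 0
    while pending:
        page_id, body = pending.popleft()
        for replica in replicas:
            replica.put(page_id, body)
        copied += 1
    return copied
```

In a real system the queue would be durable and drained by a background worker; an in-memory deque is used here only to keep the sketch self-contained.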

Page 20

Self Healing Process

[Diagram: Browser, Editor Server, MySQL Active Sites and MySQL Archive (in two data centers), GCS, S3, WixMedia (DC-1), WixMedia (Google), WixMedia (Amazon). Arrows: Save Page(s), Save Page, 200 OK, Upload, DC replication, Notify]

Page 21

No DB Transactions

Save each page (JSON) as an atomic operation

Page ID is a content-based hash (immutable/idempotent)

Finalize transaction by sending site header (list of pages)

Can generate orphaned pages, not a problem in practice
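
The scheme above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Wix implementation: SHA-256 over canonical JSON is an assumed choice of hash, and `EditorStore`, `save_page`, `finalize`, and `orphaned_pages` are hypothetical names backed by in-memory dicts.

```python
import hashlib
import json
from typing import Dict, List, Set


def page_id(page: dict) -> str:
    """Content-based ID: identical page content always yields the same ID."""
    canonical = json.dumps(page, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class EditorStore:
    def __init__(self) -> None:
        self.pages: Dict[str, dict] = {}
        self.site_headers: Dict[str, List[str]] = {}

    def save_page(self, page: dict) -> str:
        """Each page save is one atomic write; repeating it changes nothing."""
        pid = page_id(page)
        self.pages[pid] = page
        return pid

    def finalize(self, site_id: str, page_ids: List[str]) -> None:
        """Writing the site header (list of pages) is the commit point."""
        missing = [pid for pid in page_ids if pid not in self.pages]
        if missing:
            raise KeyError(f"pages not saved yet: {missing}")
        self.site_headers[site_id] = list(page_ids)

    def orphaned_pages(self) -> Set[str]:
        """Pages saved but never referenced by any site header."""
        referenced = {pid for ids in self.site_headers.values() for pid in ids}
        return set(self.pages) - referenced
```

If a client saves pages and then fails before sending the header, the pages remain as harmless orphans and the previous header stays in effect, which is why no multi-row transaction is needed.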

Page 22

2. Media Segment (WixMP)

Page 23

Wix Media Platform (WixMP)

Eventually consistent distributed file system (2PB of user media files)

Dynamic media processing

Multi-datacenter aware

Automatic cross-DC fallback

Runs on commodity servers & cloud

Page 24

Prospero – Wix Media Manager

[Diagram: a "get image.jpg" request goes to the CDN; if the file is not in the CDN, it is fetched from a media location, with a first fallback and a second fallback. Locations shown: Austin, Google Cloud, Amazon, with server groups labeled x36, x36, and x32]
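
The fallback behavior in the diagram amounts to trying an ordered list of locations until one returns the file. The sketch below assumes that reading; `fetch_with_fallback` and the location labels are hypothetical, and the slide does not say which location is tried first.

```python
from typing import Callable, List, Optional, Tuple

# A fetcher returns the file bytes, or None if this location does not have it.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(name: str,
                        sources: List[Tuple[str, Fetcher]]) -> Tuple[str, bytes]:
    """Try each location in order; return (location label, data) from the first hit."""
    for label, fetch in sources:
        try:
            data = fetch(name)
        except OSError:
            continue  # location unreachable: move on to the next fallback
        if data is not None:
            return label, data
    raise FileNotFoundError(name)
```

Because the file system is eventually consistent, a miss at one location does not mean the file is gone, only that this copy has not arrived yet, so continuing down the list is the natural response.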

Page 25

3. Public Segment

Page 26

Public Segment Roles

Routing (resolve URLs)

Dispatching (to a renderer)

Rendering (HTML, XML, TXT)

[Diagram: www.example.com, Public Server, HTML Renderer, HTML SEO Renderer, Flash Renderer, Flash SEO Renderer, Sitemap Renderer, Robots.txt Renderer]
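
The three roles can be illustrated with a toy request handler. Everything here is hypothetical: the routing table, the renderer functions, and the path-based dispatch rule are stand-ins chosen to keep the sketch short, not a description of the real Public Server.

```python
from typing import Callable, Dict, Tuple

# Hypothetical routing table: host name -> site ID.
ROUTES: Dict[str, str] = {"www.example.com": "site-1"}


def render_html(site_id: str) -> str:
    return f"<html><body>site {site_id}</body></html>"


def render_sitemap(site_id: str) -> str:
    return f'<?xml version="1.0"?><urlset><!-- {site_id} --></urlset>'


def render_robots(site_id: str) -> str:
    return "User-agent: *\nAllow: /\n"


# Hypothetical dispatch rule: special paths get their own renderer.
RENDERERS: Dict[str, Callable[[str], str]] = {
    "/sitemap.xml": render_sitemap,
    "/robots.txt": render_robots,
}


def handle(host: str, path: str) -> Tuple[int, str]:
    site_id = ROUTES.get(host)                   # routing: resolve the URL
    if site_id is None:
        return 404, "unknown site"
    renderer = RENDERERS.get(path, render_html)  # dispatching: pick a renderer
    return 200, renderer(site_id)                # rendering: HTML, XML, or TXT
```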

Page 27

Public SLA

Our goal: 99% of responses served in under 100 ms at peak traffic

Page 28

Publish Site

Publish site header (a map of pages for a site)

Publish routing table

[Diagram: Editor Segment publishes site header / routes to Public Segment (CQRS)]
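
The CQRS split can be sketched as two separate models, with publishing as the only step that copies data from the write side to the read side. The class and function names below are hypothetical, and in-memory dicts stand in for each segment's own database.

```python
from typing import Dict, List, Tuple


class EditorSegment:
    """Write side: holds the working copy of each site's header."""

    def __init__(self) -> None:
        self.site_headers: Dict[str, List[str]] = {}


class PublicSegment:
    """Read side: published copies, looked up by key only."""

    def __init__(self) -> None:
        self.routes: Dict[str, str] = {}
        self.site_headers: Dict[str, Tuple[str, ...]] = {}

    def lookup(self, host: str) -> Tuple[str, ...]:
        return self.site_headers[self.routes[host]]


def publish(editor: EditorSegment, public: PublicSegment,
            site_id: str, host: str) -> None:
    """Copy the current site header and route into the read model."""
    public.site_headers[site_id] = tuple(editor.site_headers[site_id])
    public.routes[host] = site_id
```

Edits made after `publish` stay invisible to visitors until the next publish, and serving a site never requires a call into the Editor Segment.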

Page 29

Built For Speed

Minimize out-of-process hops (2 DB, 1 RPC)

Lookup tables are cached in memory, updated every few minutes

Denormalized data – optimize for read by primary key (MySQL)

Minimize business logic
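
A lookup table held in memory and refreshed every few minutes might look like the sketch below. It is an assumption-laden illustration: `RefreshingLookup` is a hypothetical name, the 300-second default is arbitrary, and the table is reloaded lazily on the first read after it goes stale rather than by a background job.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingLookup:
    """In-memory lookup table that reloads itself once it is older than max_age."""

    def __init__(self, loader: Callable[[], Dict[str, str]],
                 max_age_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._table: Dict[str, str] = {}
        self._loaded_at: Optional[float] = None

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            # One bulk load replaces a database hit on every request.
            self._table = dict(self._loader())
            self._loaded_at = now
        return self._table.get(key)
```

The cost of this design is staleness: a change to the underlying table can take up to `max_age_seconds` to become visible, which is acceptable only for data that changes rarely.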

Page 30

How a Page Gets Rendered

Bootstrap HTML template that contains only data

Only JavaScript imports

JSON data (site-header + dynamic data)

No “real” HTML view
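
A bootstrap page of this kind can be sketched as a function that emits only script imports and embedded JSON, leaving the body empty for the browser to fill in. The function name, the `site-data` element ID, and the script URL in the usage note are hypothetical.

```python
import html
import json
from typing import List


def bootstrap_html(site_data: dict, script_urls: List[str]) -> str:
    """Build a page with no markup of its own: JavaScript imports plus JSON data."""
    # Escape "</" so data containing "</script>" cannot close the tag early;
    # "\/" is a valid JSON escape, so the parsed value is unchanged.
    data = json.dumps(site_data).replace("</", "<\\/")
    scripts = "\n".join(
        f'<script src="{html.escape(url, quote=True)}"></script>'
        for url in script_urls
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<script type="application/json" id="site-data">{data}</script>\n'
        f"{scripts}\n"
        "</head><body></body></html>\n"
    )
```

For example, `bootstrap_html({"pages": ["home"]}, ["https://static.example.com/viewer.js"])` returns a document whose body is empty; the imported script would read the JSON and build the visible page in the browser.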

Page 31

Offload rendering work to the browser

Page 32

The average Intel Core i750 can push up to 7 GFLOPS without overclocking

Page 33

Why JSON?

Easy to parse in JavaScript and Java/Scala

Fairly compact text format

Highly compressible (5:1 even for small payloads)

Easy to fix rendering bugs and cross-browser issues (just deploy new code)
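
The compressibility claim is easy to check on any payload. The sketch below gzips a made-up, repetitive site description and prints the ratio; the payload is invented for illustration, so the number it prints says nothing about real Wix data.

```python
import gzip
import json

# Invented payload: repeated keys and similar values, as page JSON tends to have.
site = {
    "pages": [
        {"id": f"page-{i}", "title": f"Page {i}", "layout": "default", "components": []}
        for i in range(50)
    ]
}
payload = json.dumps(site).encode("utf-8")
compressed = gzip.compress(payload)

ratio = len(payload) / len(compressed)
print(f"{len(payload)} bytes -> {len(compressed)} bytes (about {ratio:.1f}:1)")
```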

Page 34

Minimum Number of Public Servers Needed to Serve 66M Sites

4

Page 35

Public SLA

Be Available 99.999%
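
To make the availability target concrete, it can be converted into a yearly downtime budget. Assuming a 365-day year, 99.999% leaves roughly 5.3 minutes of downtime per year; the helper name below is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a 365-day year


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year a service may be down while still meeting the target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


budget = downtime_budget_minutes(0.99999)
print(f"99.999% availability allows about {budget:.2f} minutes of downtime per year")
```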

Page 36

Serving a Site – Sunny Day

[Diagram: Browser (http://example.wix.com), LB, Public, Renderer, Archive, CDN, WixMP. Arrows: HTTP Request, HTML, Store HTML to cache, Notify site view, Resources / Media, HTTP Request. Callout: Failure Points]

Page 37

Serving a Site – DC Lost

[Diagram: Browser (http://example.wix.com), two sets of LB, Public, and Renderer, Archive, CDN, WixMP. Arrows: Change DNS, HTTP Request]

Page 38

Serving a Site – Public Lost

[Diagram: Browser (http://example.wix.com), two sets of LB, Public, and Renderer, Archive. Arrows: HTTP Request, Get Cached HTML Version, HTML, Fallback to 2nd DC]
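
The cached-HTML fallback can be sketched as follows: on a normal request the rendered HTML is stored, and when rendering fails the last stored copy is served instead. `RendererUnavailable` and `serve` are hypothetical names, and a dict stands in for the archive; the slides do not describe the real mechanism in this detail.

```python
from typing import Callable, Dict


class RendererUnavailable(Exception):
    """Hypothetical error raised when the Public / Renderer path cannot respond."""


def serve(host: str, render: Callable[[str], str],
          html_cache: Dict[str, str]) -> str:
    try:
        html = render(host)
    except RendererUnavailable:
        if host in html_cache:
            return html_cache[host]  # "Get Cached HTML Version"
        raise                        # nothing cached: the failure is visible
    html_cache[host] = html          # "Store HTML to cache" on the sunny-day path
    return html
```

The served copy may be slightly out of date, which trades freshness for availability during an outage.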

Page 39

Living in the Browser

[Diagram: Browser (http://example.wix.com), LB, Public, Renderer, CDN, WixMP, Editor Pages. Arrows: HTTP Request, HTML, JSON / Media, Fallback, Fallback]

Page 40

Summary

Identify concerns and SLA for different parts of the system

Build redundancy in critical path (for availability)

De-normalize data (for performance)

Minimize out-of-process hops (for performance)

Take advantage of client’s CPU power

Page 42

Aviran Mordo, Head of Engineering

@aviranm

linkedin.com/in/aviran

aviransplace.com

http://engineering.wix.com

http://goo.gl/3xhpNW

@WixEng