fast by default -...

44
Fast by default: to achieve sustainable high performance Velocity China, Dec 8 th , 2010, Beijing Xiaoliang “David” Wei Facebook Inc. www.facebook.com/DavidWei [email protected]

Upload: others

Post on 17-Oct-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Fast by default: to achieve sustainable high performance

Velocity China, Dec 8th, 2010, Beijing

Xiaoliang “David” Wei Facebook Inc.

www.facebook.com/DavidWei [email protected]

Page 2: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

1 Facebook: always moving fast

2 Focus on abstracts

3 Data-driven

4 Empower the engineering team

Agenda

Page 3: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Facebook: always moving fast

Page 4: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  Saving 10ms per page => saving almost 4 man-years per day

=> saving 15 human-life of time per year

▪  Home page: -600 ms => +8% clicks on feeds (2008)

▪  Another study in summer 2009:

+500 ms latency => -3% page views (50th percentile)

+1 sec latency => -6% page views

Site speed matters!

300 million users, more than 12 billion page views / day

Page 5: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Site speed matters!

300 500 million users, more than 12 20 billion page views / day

▪  Saving 10ms per page => saving almost 4 7 man-years per day

=> saving 15 25 human-life of time per year

▪  Home page: -600 ms => +8% clicks on feeds (2008)

▪  Another study in summer 2009:

+500 ms latency => -3% page views (50th percentile)

+1 sec latency => -6% page views

Page 6: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

•  User adoption is always evolving

•  One new revision each week

•  Patches pushed everyday

•  Urgent fixes 24/7

Facebook: fast evolution

Page 7: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

•  User adoption is always evolving

•  One new revision each week

•  Patches pushed everyday

•  Urgent fixes 24/7

Facebook: fast evolution

•  Product cycle: in weeks

Page 8: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Facebook: fast evolution

•  Product cycle: in weeks

Page 9: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

“Fast by default” Making Facebook sustainably fast

▪  Mar 2008 – Jul 2008: Perf SWAT 20+ people

▪  Aug 2008 – Jun 2009: focus on infrastructure 3~4 people

▪  Jul 2009 – Dec 2009: Perf as a company goal ~ 10 people

▪  2010: Empowering the whole engineering team ~ 10 people

▪  How to make Facebook sustainably fast by a small team on performance and frontend infrastructure?

Page 10: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

The cycle of web performance improvement

Measurement

Understanding Improvement

Abstraction ? ?

? ?

Page 11: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement

Understanding Improvement

Abstraction ? ?

?

The cycle of web performance improvement

Page 12: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement

Understanding Improvement

Abstraction

The cycle of web performance improvement

Page 13: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

“Fast by default” Making Facebook sustainably fast

▪  Focus on abstracts

▪  Data driven

▪  Empowering the whole engineering team

Measurement

Understanding Improvement

Abstraction

Page 14: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement"

Understanding"Improvement"

Abstraction"

Focus on abstracts

Page 15: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction

Frontend Infrastructure

NewsFeed Messaging Photo

Platform API

Farmville TexasPoker “like” … …

Page 16: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Design principle

▪  Allow developer focus on product and move fast

▪  Only one way to do one thing

▪  A clear set of best-practice rules to follow

▪  Hide details of performance and reliability tunings

Page 17: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Design principle

▪  Allow developer focus on product and move fast

▪  Only one way to do one thing

▪  A clear set of best-practice rules to follow

▪  Hide details of performance and reliability tunings

Dream

▪  Product development should be as easy & fun as building LEGO

Page 18: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Example 1: Static Resource Management

▪  Only one way to do one thing:

▪  At least 4 ways to load Javascripts --> “require_static”

▪  Clear set of best-practice rules:

▪  “no inline JS script tag”

▪  “no manual packaging”

▪  Hide details of performance and reliability tunings:

▪  System optimizes the delivery of static resources (automatically choosing dynamic script tag injection and etc)

Page 19: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Example 2: User tracking

▪  Only one way to do one thing:

▪  HTTP Cookie / Cookie / Server-side storage => Server-side cookie

▪  Clear set of best-practice rules:

▪  “no product-specific cookies”

▪  “Yummy Yummy… Your cookie is eaten by the Cookie Monster.”

▪  Hide details of performance and reliability tunings:

▪  Server-side cookie takes care of data storage/propagation and user tracking

(by John Allen)

Page 20: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Example 3: Javascript Primer

▪  Only one way to do one thing:

▪  href + onclick / event delegation => primer

▪  Clear set of best-practice rules:

▪  “no inline javascript for event handler”

▪  Hide details of performance and reliability tunings:

▪  System optimizes the JS / non-JS experience

▪  System can optimize the pre-fetching / packaging of necessary JS for interactions

(by Makinde Adeagbo)

Page 21: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Primer: dialog links Example: <a rel="dialog" href="/ajax/intl/language_dialog.php”>English (US)</

a>!

In the endpoint: $dialog = ! new DialogResponse(kAsync_Auth_Any, false);!...!$dialog->setTitle($title)! ->setBody($body)! ->setButtons(DialogResponse::CLOSE)! ->send();!

(by Makinde Adeagbo)

Page 22: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Abstraction Examples:

▪  Static Resource Management

▪  User tracking: Server-side Cookie

▪  Interaction: Javascript Primer

▪  Page compositions: Pagelets & XHP

▪  PHP Preparable

Page 23: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement"

Understanding"Improvement"

Abstraction"

Data driven

Page 24: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  User click: T1

▪  First byte arrival

▪  CSS arrivals (from CDN)

▪  Time-to-interact (TTI)

▪  Full HTML arrival (from server)

▪  JS arrivals (from CDN)

▪  Image arrivals (from CDN)

▪  Report time: T3

Measurement: End-to-end performance

▪  Server start time: T2

▪  Server end time

▪  Report hit time: T4

Page 25: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  User click: T1

▪  First byte arrival

▪  CSS arrivals (from CDN)

▪  Time-to-interact (TTI)

▪  Full HTML arrival (from server)

▪  JS arrivals (from CDN)

▪  Image arrivals (from CDN)

▪  Report time: T3

Measurement: End-to-end performance

▪  Server start time: T2

▪  Server end time

▪  Report hit time: T4

Page 26: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  User click: T1

▪  First byte arrival

▪  CSS arrivals (from CDN)

▪  Time-to-interact (TTI)

▪  Full HTML arrival (from server)

▪  JS arrivals (from CDN)

▪  Image arrivals (from CDN)

▪  Report time: T3

Measurement: End-to-end performance

▪  Server start time: T2

▪  Server end time

▪  Report hit time: T4

Page 27: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  User click: T1

▪  First byte arrival

▪  CSS arrivals (from CDN)

▪  Time-to-interact (TTI)

▪  Full HTML arrival (from server)

▪  JS arrivals (from CDN)

▪  Image arrivals (from CDN)

▪  Report time: T3

Measurement: End-to-end performance

▪  Server start time: T2

▪  Server end time

▪  Report hit time: T4

Page 28: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  User click: T1

▪  First byte arrival

▪  CSS arrivals (from CDN)

▪  Time-to-interact (TTI)

▪  Full HTML arrival (from server)

▪  JS arrivals (from CDN)

▪  Image arrivals (from CDN)

▪  Report time: T3

Measurement: End-to-end performance

▪  Server start time: T2

▪  Server end time

▪  Report hit time: T4

Assumption: similar request latency

T3 – T1 = T4 – T2

Page 29: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

▪  Similar to full page load

▪  T1 can be obtained by Javascript at client side most of the time;

Extra difficulties:

▪  More than one AJAX actions can happen at the same time;

▪  Much more measurement data to deliver

▪  Definition of “meaningful” / “important” AJAX actions

Measurement: AJAX performance

Page 30: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Understanding: Day-to-day monitoring What’s our speed?

▪  Collect gen time / network transfer time and render time

Network Time

Perf Logs

GenTime

Browser onload time

Daily site speed monitoring

Page 31: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Understanding: Day-to-day monitoring Example: TTI graph and its breakdown

Page 32: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Understanding: Project-based analysis Who made it faster / slower?

▪  Integrated with Launch System: ~100 experimental launches

Launch System

Network Time

Perf Logs

GenTime

Browser onload time

Daily site speed monitoring

Project-based regression detection

Page 33: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Understanding: Project-based analysis Example: A project launch’s performance impact

Page 34: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Project-based regression detection

Understanding: Numeric metrics Why are we fast / slow? How can I fix it?

▪  Technical metrics

Network Time

Perf Logs

GenTime

Browser onload time

Daily site speed monitoring

Regression analysis

Technical metrics

Launch System

Page 35: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Understanding: Numeric metrics Example: HTML bytes of home page hits

Page 36: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement"

Understanding"Improvement"

Abstraction"

Empower the engineering team

Page 37: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Setting up goals Areas that we work with product teams:

▪  Fighting regression

▪  Annual/Quarterly per product quality goals

▪  Performance

▪  reliability

▪  Code quality

▪  New product design

▪  Performance expectation

▪  Code quality (usage of abstractions)

Page 38: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Providing convenient tools Good tools to empower the product team:

▪  UI Component Library

▪  Consistent user experience

▪  Speedup product development

▪  Greatly reduce CSS and HTML sizes

▪  Pagelet Gallery: Per pagelet performance analysis

▪  XHProf: PHP latency and CPU time drilldown analysis (Open sourced)

Page 39: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Providing convenient tools Example: UI Component Library

Page 40: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Creating necessary processes Necessary processes that can be helpful:

▪  New hire: Bootcamp / onboarding sessions

▪  New product: performance guideline

▪  Fire fighting vs development: Perf oncall

▪  Cross-team communication: “Perf point” -- “Perf adviser”

Page 41: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Creating necessary processes Example: Bootcamp

▪  Each engineering employee, from fresh undergrads to highly experienced engineering directors, spends 6 weeks on Bootcamp

▪  Two onboarding courses about Web Performance

▪  Basic tools (XHProf and etc)

▪  Basic infrastructure (Static resource management, Pagelets, and etc)

Page 42: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Measurement"

Understanding"Improvement"

Abstraction"

Summary

Page 43: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Achieving “Fast by default” Making a large scale web site sustainably fast

▪  Focus on abstracts

▪  Data driven

▪  Empowering the whole engineering team

Measurement

Understanding Improvement

Abstraction

Page 44: Fast by default - velocity.oreilly.com.cnvelocity.oreilly.com.cn/2010/ppts/VelocityChina2010Dec8Keynote.pdf · Mar 2008 – Jul 2008: Perf SWAT 20+ people Aug 2008 – Jun 2009: focus

Thank you!

Velocity China, Dec 8th, 2010, Beijing

Xiaoliang “David” Wei Facebook Inc.

www.facebook.com/DavidWei [email protected]