an analysis of facebook photo caching

by Huang et al., SOSP 2013. Presented by Phuong Nguyen. Some animations and figures are borrowed from the original paper and presentation.



Page 1: An Analysis of Facebook Photo Caching

An Analysis of Facebook Photo Caching

by Huang et al., SOSP 2013

Presented by Phuong Nguyen

Some animations and figures are borrowed from the original paper and presentation

Page 2: An Analysis of Facebook Photo Caching

Photos on Facebook: Overview

• Profile
• Feed
• Album

250 billion photos, as of Sep 2013

Page 3: An Analysis of Facebook Photo Caching

Photos on Facebook: Overview

[Diagram labels: Storage Backend, FB Cache Layers (Full-stack Study), Akamai CDN]

Page 4: An Analysis of Facebook Photo Caching

FACEBOOK PHOTO CACHING: HOW IT WORKS

Page 5: An Analysis of Facebook Photo Caching

Client-based Browser Cache

[Diagram: Client -> Browser Cache (local fetch)]

Page 6: An Analysis of Facebook Photo Caching

Geo-distributed Edge Cache (FIFO)

[Diagram: Browser Cache (Client, millions) -> Edge Cache (PoP, tens)]

Page 7: An Analysis of Facebook Photo Caching

Single Global Origin Cache (FIFO)

[Diagram: Browser Cache (Client, millions) -> Edge Cache (PoP, tens) -> Origin Cache (Data Center, four); the Origin server is selected by Hash(url)]
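
The two mechanisms named on this slide, FIFO eviction and Hash(url) routing, can be sketched as follows. This is a minimal illustration and not Facebook's implementation: the class name `FIFOCache`, the function `origin_server_for`, the use of MD5 and the example URL are all assumptions made for the sketch. The slides only state that the caches are FIFO and that routing uses a hash of the URL.

```python
import hashlib
from collections import OrderedDict


class FIFOCache:
    """Fixed-capacity cache that evicts the oldest inserted entry first (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # insertion order doubles as eviction order

    def get(self, key):
        # A hit does not change the eviction order; that is what makes this FIFO and not LRU.
        return self.items.get(key)

    def put(self, key, value):
        if self.capacity <= 0:
            return  # a zero-sized cache stores nothing
        if key in self.items:
            self.items[key] = value  # update in place, original position is kept
            return
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # drop the oldest insertion
        self.items[key] = value


def origin_server_for(url, num_servers):
    """Map a photo URL to one Origin server index with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


# Usage: every Edge that misses on this URL is sent to the same Origin server.
origin = [FIFOCache(capacity=2) for _ in range(4)]
url = "https://example.com/photo/123_n.jpg"  # placeholder URL
server = origin_server_for(url, len(origin))
origin[server].put(url, b"photo bytes")
```

Because the mapping depends only on the URL, the Origin servers behave as one logical cache even though they are spread across data centers.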

Page 8: An Analysis of Facebook Photo Caching

Haystack Backend

[Diagram: Browser Cache (Client, millions) -> Edge Cache (PoP, tens) -> Origin Cache (Data Center, four) -> Backend (Haystack)]

Page 9: An Analysis of Facebook Photo Caching

FULL-STACK CACHE STUDY: DATA COLLECTION


Page 10: An Analysis of Facebook Photo Caching

Trace Collection

• Objective: collect a representative sample that permits correlation of events related to the same request

Instrumentation Scope

[Diagram: Browser Cache (Client) -> Edge Cache (PoP) -> Origin Cache (Data Center) -> Backend (Haystack)]

Page 11: An Analysis of Facebook Photo Caching

Sampling Strategies

• Request-based: sample requests at random
  • Biased toward popular content

• Object-based: focus on a subset of photos selected by a deterministic test on photoId
  • Fair coverage of unpopular photos
  • Enables cross-stack analysis
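
Object-based sampling can be sketched as a hash test on the photo id. The slides do not say which deterministic test was used, so the SHA-1 bucket test, the names `is_sampled` and `sample_trace`, the `photo_id` field and the 10% default are all assumptions for illustration.

```python
import hashlib


def is_sampled(photo_id, sample_percent=10):
    """Deterministic test on photoId: a given photo is always in or always out of the sample."""
    digest = hashlib.sha1(str(photo_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in [0, 100)
    return bucket < sample_percent


def sample_trace(requests, sample_percent=10):
    """Keep every logged request whose photo passes the test, whatever layer logged it."""
    return [r for r in requests if is_sampled(r["photo_id"], sample_percent)]
```

Since the decision depends only on the photo id, every layer of the stack keeps or drops the same photos, which is what makes it possible to correlate events for one photo across layers.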

Page 12: An Analysis of Facebook Photo Caching

WORKLOAD ANALYSIS


Page 13: An Analysis of Facebook Photo Caching

Analysis Objectives

• Traffic sheltering effects of caches

• Photo popularity distribution

• Geographic traffic distribution & collaborative caching

• Can we make the cache better?

• Impact of sizes & algorithm

• Could we know which photos to cache?


Page 14: An Analysis of Facebook Photo Caching

ANALYSIS: TRAFFIC SHELTERING

Page 15: An Analysis of Facebook Photo Caching

Traffic Sheltering

Layer               Location      Requests   Hit ratio   Traffic share
Browser Cache       Client        77.2M      65.5%       65.5%
Edge Cache          PoP           26.6M      58.0%       20.0%
Origin Cache        Data Center   11.2M      31.8%        4.6%
Backend (Haystack)  Data Center    7.6M      -            9.9%
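
The traffic shares follow from the per-layer hit ratios: each layer serves its hit ratio times the fraction of requests that reached it, and the backend serves whatever is left. A short check of that arithmetic, using the hit ratios from this slide (the function name `traffic_shares` is made up for the sketch):

```python
def traffic_shares(hit_ratios):
    """Return the share of all client requests served by each cache layer, then by the backend."""
    shares = []
    remaining = 1.0  # fraction of client requests that reach the current layer
    for ratio in hit_ratios:
        shares.append(remaining * ratio)
        remaining *= 1.0 - ratio
    shares.append(remaining)  # requests that miss every cache are served by the backend
    return shares


# Hit ratios for Browser, Edge and Origin, as given on the slide.
shares = traffic_shares([0.655, 0.580, 0.318])
print([round(100 * s, 1) for s in shares])  # [65.5, 20.0, 4.6, 9.9]
```

The result matches the traffic-share row on the slide, so the three caches together shelter the backend from about 90% of client requests.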

Page 16: An Analysis of Facebook Photo Caching

ANALYSIS: PHOTO POPULARITY IMPACT

Page 17: An Analysis of Facebook Photo Caching

Popularity Distribution

Popularity skew is reduced after each layer of cache

Page 18: An Analysis of Facebook Photo Caching

Popularity Impact on Caches


Page 19: An Analysis of Facebook Photo Caching

ANALYSIS: GEOGRAPHIC TRAFFIC DISTRIBUTION & COLLABORATIVE CACHING

Page 20: An Analysis of Facebook Photo Caching

Substantial Remote Traffic at Edge

Share of each city's requests served by its local Edge Cache:

• Atlanta: 20% local
• Miami: 35% local
• Dallas: 50% local
• Chicago: 60% local
• LA: 18% local
• NYC: 35% local

Page 21: An Analysis of Facebook Photo Caching

Substantial Remote Traffic at Edge

Where requests from Atlanta are served:

• Atlanta (local): 20%
• D.C.: 35%
• Miami: 20%
• Chicago: 10%
• Dallas: 5%
• NYC: 5%
• California: 5%

• 80% of Atlanta's requests are served by remote Edges

Page 22: An Analysis of Facebook Photo Caching

Collaborative Edge


Page 23: An Analysis of Facebook Photo Caching

Impact of Using Collaborative Edge

Collaborative Edge increases hit ratio by 18%
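
Why pooling helps can be shown on a toy trace. The sketch below compares independent Edge Caches with a single collaborative cache of the same total capacity. It is only an illustration under simplifying assumptions: FIFO eviction, one item per photo, a made-up trace, and hypothetical function names. It does not reproduce the 18% figure.

```python
from collections import OrderedDict


def fifo_hits(keys, capacity):
    """Count hits for a sequence of keys replayed through one FIFO cache."""
    cache = OrderedDict()
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
            continue
        if capacity <= 0:
            continue
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the oldest insertion
        cache[key] = True
    return hits


def independent_hit_ratio(trace, capacity_per_edge):
    """Each Edge caches only the requests routed to it. Assumes a non-empty trace."""
    per_edge = {}
    for edge, photo in trace:
        per_edge.setdefault(edge, []).append(photo)
    hits = sum(fifo_hits(keys, capacity_per_edge) for keys in per_edge.values())
    return hits / len(trace)


def collaborative_hit_ratio(trace, capacity_per_edge):
    """All Edges act as one logical cache with the combined capacity."""
    num_edges = len({edge for edge, _ in trace})
    photos = [photo for _, photo in trace]
    return fifo_hits(photos, capacity_per_edge * num_edges) / len(trace)


# Toy trace of (edge, photo) pairs: the same photos are requested through two Edges.
trace = [("atlanta", "a"), ("miami", "a"), ("atlanta", "b"), ("miami", "b")]
print(independent_hit_ratio(trace, 1))    # 0.0
print(collaborative_hit_ratio(trace, 1))  # 0.5
```

With independent caches each Edge fetches its own copy of every photo, so popular photos are stored many times. A collaborative cache stores each photo once and can serve the second request from the shared copy.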

Page 24: An Analysis of Facebook Photo Caching

ANALYSIS: IMPACTS OF CACHE SIZE & ALGORITHM

Page 25: An Analysis of Facebook Photo Caching

Potential Improvement Study

• Methodology: cache simulation
  • Replay the trace (25% warm-up)
  • Evaluate using the remaining 75%

• Improvement factors:
  • Cache size
  • Caching algorithm

• Evaluation metric: hit ratio
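
The methodology above can be sketched as a small trace-replay simulator: the first 25% of the trace only warms the cache, and the hit ratio is measured on the remaining 75%. The function name `simulate`, its parameters and the toy trace are assumptions for illustration. LRU is used here only as a stand-in for a better-performing algorithm, since the slides do not list which algorithms were compared.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy="fifo", warmup_fraction=0.25):
    """Replay `trace` (a list of photo ids) and return the hit ratio after warm-up."""
    cache = OrderedDict()
    warmup = int(len(trace) * warmup_fraction)
    hits = evaluated = 0
    for i, key in enumerate(trace):
        hit = key in cache
        if hit and policy == "lru":
            cache.move_to_end(key)  # LRU refreshes recency on a hit; FIFO does not
        elif not hit and capacity > 0:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the head of the queue
            cache[key] = True
        if i >= warmup:  # only the post-warm-up portion is scored
            evaluated += 1
            hits += hit
    return hits / evaluated if evaluated else 0.0


# Toy trace: photo "a" is popular, the others are requested once.
trace = ["a", "b", "a", "c", "a", "d", "a", "e"]
for policy in ("fifo", "lru"):
    print(policy, simulate(trace, capacity=2, policy=policy))
```

On this toy trace LRU keeps the popular photo resident while FIFO evicts it on schedule, so LRU scores the higher hit ratio at the same cache size. Varying `capacity` and `policy` over a real trace gives the two improvement factors listed above.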

Page 26: An Analysis of Facebook Photo Caching

Edge Cache with Different Sizes & Algorithms

[Figure: Edge Cache hit ratio for different cache sizes and algorithms, with an "Infinite Cache" reference line]

The same hit ratio can be achieved with a smaller cache and higher-performing algorithms

Page 27: An Analysis of Facebook Photo Caching

Edge Cache with Different Sizes & Algorithms

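[Figure: Edge Cache hit ratio for different cache sizes and algorithms, with an "Infinite Cache" reference line]

A more sophisticated algorithm can achieve a better hit ratio with the same cache size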

Page 28: An Analysis of Facebook Photo Caching

ANALYSIS: WHICH PHOTOS TO CACHE?

Page 29: An Analysis of Facebook Photo Caching

Intuitions

• Properties intuitively associated with photo traffic:
  • The age of the photo
  • The number of Facebook followers associated with the owner

Page 30: An Analysis of Facebook Photo Caching

Content Age Effect

• An age-based cache replacement algorithm could be effective

• Fresh content is popular and tends to be effectively cached throughout the hierarchy

Page 31: An Analysis of Facebook Photo Caching

Social Effect

• The more popular the photo owner is, the more likely the photo is to be accessed

• Browser caches tend to have lower hit ratios for popular users ("viral" effect)

Page 32: An Analysis of Facebook Photo Caching

DISCUSSIONS


Page 33: An Analysis of Facebook Photo Caching

Discussions

• Evaluation method:
  • Only desktop clients are considered; mobile clients are excluded
  • Trends by mobility of users

• Sampling: object-based sampling might not represent a realistic workload

• Impact of caching done by Akamai CDN
  • The method for correlating requests is not perfect

• Latency issue
  • Evaluation mainly focuses on hit ratio & traffic sheltering, not latency
  • Latency of collaborative caching is not evaluated

Page 34: An Analysis of Facebook Photo Caching

Discussions (cont.)

• Other potential improvements:
  • Improved caching algorithm that takes photo metadata into account
  • Optimal placement of resizing functionality along the stack
  • Clairvoyant caching might be approximated by predicting future accesses (e.g., photos from the same album, photos that appear on the news feed)
  • Address geographical diversity by improving the routing policy (e.g., put more weight on locality)

Page 35: An Analysis of Facebook Photo Caching

THANK YOU!
