finding a needle in haystack - university of texas at...

Finding a needle in Haystack FACEBOOK’S PHOTO STORAGE AKIB ZAMAN

Post on 31-Jan-2021

TRANSCRIPT

  • Finding a needle in Haystack

    FACEBOOK’S PHOTO STORAGE

    AKIB ZAMAN

  • MOTIVATION

    Facebook stores an enormous amount of data:

    260 billion images

    20 petabytes of data

    Traditional filesystems perform poorly under Facebook’s workload

    Several disk operations were necessary to read a single photo

    Need to address the long-tail issue (requests for less popular photos)

    Design pre-requisites:

    Data is written once, read often, never modified and rarely deleted

  • QUESTION

    “In our experience, we find that the disadvantages of a traditional

    POSIX based filesystem are directories and per file metadata.”

    Explain how this disadvantage becomes the limiting factor for the

    read throughput.

  • FOUR MAIN GOALS

    High throughput and low latency

    Fault-tolerant

    Cost-effective

    Simple

  • QUESTION

    “We accomplish this by keeping all metadata in main memory,…”.

    Why did keeping metadata in memory become a challenge in

    Facebook’s system? Is it possible just to keep metadata of the most

    popular files in memory and to achieve the objective (“at most one

    disk operation per read”) by exploiting access locality?
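    A rough back-of-the-envelope calculation helps frame this question. The
    sketch below takes the image count from the Motivation slide and two
    per-photo metadata sizes that are treated here as assumptions: 536 bytes
    for a filesystem inode and 10 bytes for Haystack's in-memory entry (both
    figures as quoted in the Haystack paper). The function name is
    hypothetical.

```python
# Rough arithmetic: memory needed to keep metadata for every photo in RAM.
# Only PHOTO_COUNT comes from the slides; the byte sizes are assumptions.

PHOTO_COUNT = 260_000_000_000   # images stored (Motivation slide)
INODE_BYTES = 536               # assumed per-file inode size (one file per photo)
HAYSTACK_BYTES = 10             # assumed average in-memory bytes per photo in Haystack


def metadata_terabytes(count: int, bytes_per_item: int) -> float:
    """Total metadata size in terabytes (1 TB = 10**12 bytes)."""
    return count * bytes_per_item / 1e12


posix_tb = metadata_terabytes(PHOTO_COUNT, INODE_BYTES)
haystack_tb = metadata_terabytes(PHOTO_COUNT, HAYSTACK_BYTES)
print(f"one file per photo: {posix_tb:.2f} TB, Haystack: {haystack_tb:.2f} TB")
```

    Under these assumptions the one-file-per-photo layout needs roughly fifty
    times more memory for metadata, which is why shrinking per-photo metadata
    matters more than caching only the popular entries.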

  • QUESTION

    “That simplicity lets us build and deploy a working system in a few

    months instead of a few years.” Comment on this statement (why

    can Haystack be considered a simple adaptation of UNIX file

    systems?)

  • QUESTION

    “Haystack takes a straight-forward approach: it stores multiple

    photos in a single file and therefore maintains very large files.” Is

    there such a need to apply the technique in conventional file

    systems? If applied, what are its potential issues (give two examples)?

  • BRIEF OVERVIEW

  • QUESTION : HAYSTACK vs GFS

    Compare serving a photo in Haystack with GFS

    architecture.

  • QUESTION

    “… we explored whether it would be useful to build a system similar

    to GFS.” Comment on the statement. Why does “Serving photo

    requests in the long tail represents a problem” on GFS?

  • THE HAYSTACK ARCHITECTURE

    Haystack Directory

    Haystack Cache

    Haystack Store

    Photo Read

    Photo Write

    Photo Delete

  • QUESTION

    The Cache “… caches a photo only if two conditions are met: (a)

    the request comes directly from a user and not the CDN and (b) the

    photo is fetched from a write-enabled Store machine.” Please

    explain this design choice.
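    The two conditions in the quote can be written down as a small predicate.
    This is a minimal sketch, not Facebook's code: the `Request` type and its
    field names are hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical description of a photo request reaching the Cache."""
    from_cdn: bool             # True if the CDN forwarded the request after a miss
    store_write_enabled: bool  # True if the photo was fetched from a write-enabled Store machine


def should_cache(req: Request) -> bool:
    """Cache only if (a) the request comes directly from a user, not the CDN,
    and (b) the photo came from a write-enabled Store machine."""
    return (not req.from_cdn) and req.store_write_enabled


if __name__ == "__main__":
    print(should_cache(Request(from_cdn=False, store_write_enabled=True)))
    print(should_cache(Request(from_cdn=True, store_write_enabled=True)))
```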

  • HAYSTACK STORE

  • QUESTION

    “To retrieve needles quickly, each Store machine maintains an

    in-memory data structure for each of its volumes.” What does this data

    structure contain?
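    One way to picture the structure is as a map from (key, alternate key) to
    the needle's flags, size, and offset in the volume file. The sketch below
    is a toy model under stated assumptions: the header layout, the `Volume`
    class, and the use of an in-memory buffer in place of a real volume file
    are all hypothetical.

```python
import io
import struct
from typing import Dict, NamedTuple, Optional, Tuple


class NeedleMeta(NamedTuple):
    flags: int   # status bits (0 = live in this toy model)
    size: int    # photo size in bytes
    offset: int  # where the needle starts in the volume file


# Hypothetical on-disk header: key, alternate key, flags, data size.
HEADER = struct.Struct("<QIBI")


class Volume:
    """Toy volume: one big append-only file plus an in-memory lookup map."""

    def __init__(self) -> None:
        self._file = io.BytesIO()  # stands in for the large volume file
        self.index: Dict[Tuple[int, int], NeedleMeta] = {}

    def write(self, key: int, alt_key: int, data: bytes) -> None:
        offset = self._file.seek(0, io.SEEK_END)  # always append at the end
        self._file.write(HEADER.pack(key, alt_key, 0, len(data)) + data)
        self.index[(key, alt_key)] = NeedleMeta(0, len(data), offset)

    def read(self, key: int, alt_key: int) -> Optional[bytes]:
        meta = self.index.get((key, alt_key))  # memory lookup, no disk access
        if meta is None:
            return None
        self._file.seek(meta.offset + HEADER.size)  # one seek ...
        return self._file.read(meta.size)           # ... and one read
```

    Because the offset and size come from memory, locating a photo needs no
    metadata read from disk before the photo data itself is fetched.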

  • QUESTION

    “As Haystack disallows overwriting needles, photos can only be

    modified by adding an updated needle with the same key and

    alternate key.” Could you think of reason(s) why Haystack disallows

    overwriting?
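    The update rule in the quote can be illustrated with an append-only log
    in which the newest needle for a key wins. This is a minimal sketch; the
    class and method names are hypothetical, and a Python list stands in for
    the volume file.

```python
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (key, alternate key)


class AppendOnlyVolume:
    """Toy model: needles are only ever appended, never rewritten in place."""

    def __init__(self) -> None:
        self.log: List[Tuple[Key, bytes]] = []  # stands in for the volume file
        self.latest: Dict[Key, int] = {}        # key -> position of newest needle

    def put(self, key: Key, data: bytes) -> int:
        """Append a needle; an update is just another append with the same key."""
        self.log.append((key, data))
        position = len(self.log) - 1
        self.latest[key] = position  # the highest position is the current version
        return position

    def get(self, key: Key) -> bytes:
        return self.log[self.latest[key]][1]


if __name__ == "__main__":
    vol = AppendOnlyVolume()
    vol.put((42, 1), b"original")
    vol.put((42, 1), b"rotated")
    print(vol.get((42, 1)), len(vol.log))
```

    The old needle stays in the log until it is reclaimed later, so writes
    remain sequential appends.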

  • THE INDEX FILE

  • QUESTION

    “Store machines maintain an index file for each of their volumes.”

    What is this index and why is it needed? Does maintaining the index

    significantly increase disk load?
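    As a hedged illustration, the index file can be thought of as a
    checkpoint of the in-memory map that may lag behind the volume file. On
    restart, a Store machine can load the index and then pick up any needles
    written after the last index record. The record layout and function name
    below are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Record(NamedTuple):
    key: int
    alt_key: int
    offset: int  # position of the needle in the volume file
    size: int


def rebuild(index_records: List[Record],
            volume_needles: List[Record]) -> Dict[Tuple[int, int], Record]:
    """Rebuild the in-memory map from the index, then add needles the index missed.

    `volume_needles` models the volume file contents; a real implementation
    would start reading just past the last indexed needle instead of
    iterating over every needle.
    """
    mapping = {(r.key, r.alt_key): r for r in index_records}
    last_offset = index_records[-1].offset if index_records else -1
    for needle in volume_needles:
        if needle.offset > last_offset:  # written after the last index record
            mapping[(needle.key, needle.alt_key)] = needle
    return mapping
```

    Loading the small index avoids reading the whole volume file at startup,
    which is the main reason for keeping it in this sketch.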

  • QUESTION

    “Store machines maintain an index file for each of their volumes.”

    What is this index and why is it needed? How is space for deleted

    photos reclaimed?
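    Space reclamation can be sketched as a compaction pass: copy needles into
    a new file while skipping deleted entries and superseded duplicates. The
    `Needle` type and `compact` function are hypothetical, and lists stand in
    for the old and new volume files.

```python
from typing import Dict, List, NamedTuple, Tuple


class Needle(NamedTuple):
    key: int
    alt_key: int
    deleted: bool  # models the needle's delete flag
    data: bytes


def compact(volume: List[Needle]) -> List[Needle]:
    """Return a new volume holding only the newest, non-deleted needle per key."""
    newest: Dict[Tuple[int, int], int] = {}
    for position, needle in enumerate(volume):
        newest[(needle.key, needle.alt_key)] = position  # later entries win

    return [
        needle
        for position, needle in enumerate(volume)
        if newest[(needle.key, needle.alt_key)] == position and not needle.deleted
    ]


if __name__ == "__main__":
    old = [
        Needle(1, 1, False, b"a"),
        Needle(2, 1, True, b"b"),    # deleted photo
        Needle(1, 1, False, b"a2"),  # newer version of the first needle
    ]
    print(compact(old))
```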

  • THE STORE FILESYSTEM

    Each Store machine uses XFS

    XFS has two main advantages:

    The block maps for several large files can be small enough to be stored

    in main memory

    XFS provides efficient file pre-allocation and avoids fragmentation.

    XFS helps eliminate disk operations for filesystem metadata when reading a photo.

  • RECOVERY FROM FAILURES

    Haystack needs to tolerate a variety of failures: faulty hard drives,

    misbehaving RAID controllers, bad motherboards.

    They use the following techniques to tolerate failures:

    Pitchfork: a background task that periodically checks the health of each

    Store machine.

    Bulk Sync: Reset the data of a Store machine using the volume files supplied

    by a replica.
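    The health-check idea can be sketched as a single pass over the Store
    machines that counts consecutive failures and flags the volumes of a
    repeatedly failing machine as read-only. This is an illustrative model
    only: the function name, the failure threshold, and the `is_healthy`
    callback are assumptions, not details from the slides.

```python
from typing import Callable, Dict, List, Set


def pitchfork_pass(machines: Dict[str, List[str]],
                   is_healthy: Callable[[str], bool],
                   failures: Dict[str, int],
                   threshold: int = 3) -> Set[str]:
    """Run one health-check pass.

    machines   maps a Store machine name to the volumes it hosts
    is_healthy hypothetical probe, e.g. "can I connect and read a photo?"
    failures   running count of consecutive failed checks per machine
    threshold  assumed number of consecutive failures before acting

    Returns the set of volumes that should be marked read-only.
    """
    read_only: Set[str] = set()
    for machine, volumes in machines.items():
        if is_healthy(machine):
            failures[machine] = 0  # a healthy check resets the streak
        else:
            failures[machine] = failures.get(machine, 0) + 1
            if failures[machine] >= threshold:
                read_only.update(volumes)
    return read_only
```

    Volumes flagged this way stop receiving writes, and the underlying fault
    can then be diagnosed offline, for example with a bulk sync from a
    replica.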

  • CONCLUSION

    Haystack provides a fault-tolerant and simple solution for storing photos.

    It does so at dramatically lower cost and higher throughput than a

    traditional approach using NAS appliances.

    Haystack is incrementally scalable.