Fast, Inexpensive Content-Addressed Storage in Foundation
Sean Rhea* (Meraki, Inc.); Russ Cox, Alex Pesterev* (MIT CSAIL)
*Work done while at Intel Research, Berkeley.
“Digital Dark Ages?”
• Users increasingly store their most valuable data digitally
  – Wedding/baby photographs
  – Letters (now called email)
  – Diaries, scrapbooks, tax returns
• Yet digital information remains especially vulnerable
• Terry Kuny: “We are living in the midst of digital Dark Ages”
  – Hard drives crash
  – Removable media evolve (e.g., 5 ¼” floppies)
  – File formats become obsolete (e.g., WordStar, Lotus 1-2-3)
  – What will the world remember of the late 20th century?
As a community, we’re not bad at storing important data over the long term.
We’ve only just begun to think about how we’ll interpret that data 30 years from now.
For Example…
• Viewing an old PowerPoint presentation
  – Do we still have PowerPoint at all? And Windows?
  – Does the presentation use non-standard fonts/codecs?
  – Has some newer application overwritten a shared library with an incompatible version (“DLL Hell”)?
• Not just a Microsoft problem: consider a web page
  – Even current IE/Safari/Firefox don’t agree on formatting
  – All kinds of plugins necessary: sound, video, Flash
The Foundation Idea
• Make daily backups of entire software stack
  – Archives users’ applications, OS, and configuration state
• Don’t worry about identifying dependencies
  – Just save it all: “Every byte, every night”
• To recover an obscure file, boot the relevant stack in an emulator
  – View file with the application that created it
Foundation FAQ
• Why preserve the entire disk?
  – Preserve software stack dependencies: preserve the data with the right application, libraries, and operating system as a single unit
  – Works for all applications, not just ones designed for preservation
• Why daily images?
  – Want to preserve machine state as close as possible to last write of user’s data (i.e., preserve image before something changes)
  – Also allows recovery from user errors
• Why emulate hardware?
  – Much better track record than emulating software
  – Software example: OpenOffice emulating Microsoft Word (yikes)
  – Hardware emulators available today for Amiga, PDP-11, Nintendo…
I would love to give a talk about why Foundation is a great solution to the
digital preservation problem.
Really, though, I think it’s just a pretty good start.
Instead, I’m going to talk about a fun problem we had to solve to make it work.
Every Byte, Every Night? Indefinitely? Really?
• Plan 9 did exactly that
  – Archive changed blocks every night to optical jukebox
  – Found that storage capacity grew faster than usage
• Later with Content-Addressable Storage (Venti)
  – Automatically coalesces duplicate data to save space
  – Required multiple, high-speed disks for performance
• Challenge for Foundation: provide similar storage efficiency on consumer hardware
  – “Time Machine model”: one external USB drive
Talk Outline
• Introduction
  – What is Foundation?
  – Review of Content-Addressed Storage (Venti)
• Contributions
  – Making Cheap Content-Addressed Storage Fast
  – Avoiding Concerns over Hash Collisions
• Related Work
• Conclusions
Venti Review
• Plan 9 file system was two-level
  – Spinning storage, mostly a normal file system
  – Archival storage, optical write-once jukebox
• Venti replaced optical jukebox
  – Still write-once
  – Chunks of data named by their SHA-1 hashes: “Content-Addressable Storage (CAS)”
  – Automatically coalesces duplicate writes
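The naming scheme above can be illustrated with a minimal Python sketch. It assumes an in-memory dict stands in for the on-disk index and a bytearray for the data log; the names `TinyCAS`, `put`, and `get` are hypothetical and are not Venti's actual interface.

```python
import hashlib


class TinyCAS:
    """Toy write-once store: each block is named by its SHA-1 hash."""

    def __init__(self):
        self.log = bytearray()   # append-only data log
        self.index = {}          # SHA-1 digest -> (offset, length) in the log

    def put(self, block: bytes) -> bytes:
        key = hashlib.sha1(block).digest()
        if key not in self.index:            # new contents: append exactly once
            self.index[key] = (len(self.log), len(block))
            self.log += block
        return key                           # a duplicate write gets the same name

    def get(self, key: bytes) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])


store = TinyCAS()
k1 = store.put(b"x" * 512)
k2 = store.put(b"y" * 512)
k3 = store.put(b"x" * 512)   # same contents as the first block
```

Writing the first block a second time returns the same name and leaves the log at two blocks, which is the coalescing behaviour the slide describes.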
Venti Review
[Diagram: archival. An archival process in RAM reads blocks from the user’s hard drive and hashes each one. The external USB drive holds the index (hash → offset) and the data log. For each block the process asks the index “seen it before?”. New blocks (1st, 2nd): append to log, update index, append hash to summary. Duplicate block (4th): append hash to summary only; no log write!]
Venti Review
[Diagram: restore after a crash (“Crash!”). The restore process walks the summary’s list of hashes. For each block: lookup hash of block, map hash to log offset in the index, read block from log, restore block to the user’s hard drive.]
Final step (not shown): archive summary in data log as well
Notes on Venti
• The Good News:
  – CAS stores each block with particular contents only once
  – Changing any one block and re-archiving uses only one more block in archive
  – Adding a duplicate file from a different source uses no additional storage
• The Bad News:
  – Synchronous, random reads to on-disk index
Venti Review
[Diagram: the archival diagram again, at the “seen it before?” step for the 4th block. Callout: have to seek to the right bucket.]
Venti Review
[Diagram: the restore diagram again, at the “lookup hash of 1st block” and “map hash to log offset” steps. Callout: have to seek to the right bucket.]
Notes on Venti
• The Good News:
  – CAS stores each block with particular contents only once
  – Changing any one block and re-archiving uses only one more block in archive
  – Adding a duplicate file from a different source uses no additional storage
• The Bad News:
  – Synchronous, random reads to on-disk index
  – Best case, one-disk performance for 512-byte blocks: one 5 ms seek per 512 bytes archived = 100 kB/s
  – That’s 12 days to archive a 100 GB disk!
  – Larger blocks give better throughput, less sharing
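The arithmetic behind these figures can be checked in a few lines of Python. The seek time, block size, and disk size come from the slide; treating 1 GB as 10**9 bytes is an assumption made here for the estimate.

```python
SEEK_S = 5e-3          # one 5 ms seek per index lookup (from the slide)
BLOCK_BYTES = 512      # block size (from the slide)
DISK_BYTES = 100e9     # 100 GB disk, assuming 1 GB = 10**9 bytes

throughput = BLOCK_BYTES / SEEK_S            # bytes archived per second
days = DISK_BYTES / throughput / 86_400      # seconds -> days

print(f"{throughput / 1e3:.1f} kB/s, {days:.1f} days")
```

The unrounded throughput is about 102 kB/s, giving a little over 11 days; rounding the rate down to 100 kB/s first gives 10**6 seconds, or roughly the 12 days quoted.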
Notes on Venti (cont.)
• Venti’s solution: use 8 high-speed disks for index
  – Untenable in consumer space
  – Wears disks out pretty quickly, too
• The “compare-by-hash” controversy:
  – Fear of hash collisions: two different blocks with the same hash break Venti
  – May be very unlikely, but cost (data corruption) is huge
Does CAS really require a cryptographically strong hash?
Making Inexpensive CAS Fast
• The problem: disk seeks
  – Secure hash randomizes an otherwise sequential disk-to-disk transfer
  – To reduce seeks, must reduce hash table lookups
• When do hash table lookups occur?
  1. When writing data, to determine if we’ve seen it before
  2. When writing data, to update the index
  3. When reading data, to map hashes to disk locations
2. Updating the Index
• After appending a block to the data log, must update the index
  – Pseudorandom hash causes a seek
Updating the Index
[Diagram: the archival process reads the 2nd block, appends it to the log, then updates the index. Callout: have to seek to the right bucket.]
2. Updating the Index
• After appending a block to the data log, must update the index
  – Pseudorandom hash causes a seek
• Easy to fix: use a write-back index cache
  – Store index writes in memory
  – Flush to disk sequentially in large batches
  – On crash, reconstruct index from the data log
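A rough Python sketch of a write-back index cache follows. It is a toy under stated assumptions, not Foundation's implementation: dicts stand in for the on-disk hash table, log offsets are simply block positions, and `WriteBackIndex`, `flush_threshold`, and `rebuild_from_log` are hypothetical names.

```python
import hashlib


class WriteBackIndex:
    """Toy index: updates stay in RAM until a batched, ordered flush."""

    def __init__(self, flush_threshold=4):
        self.on_disk = {}        # stands in for the on-disk hash table
        self.pending = {}        # dirty entries not yet written out
        self.flush_threshold = flush_threshold
        self.flushes = 0         # number of batched passes over the "disk"

    def update(self, key: bytes, offset: int) -> None:
        self.pending[key] = offset
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Writing entries in key (bucket) order turns many random
        # writes into one sequential sweep over the index.
        for key in sorted(self.pending):
            self.on_disk[key] = self.pending[key]
        self.pending.clear()
        self.flushes += 1

    def lookup(self, key: bytes):
        if key in self.pending:
            return self.pending[key]
        return self.on_disk.get(key)


def rebuild_from_log(log_blocks):
    """After a crash, recompute every entry by rescanning the data log."""
    return {hashlib.sha1(b).digest(): off for off, b in enumerate(log_blocks)}


blocks = [bytes([i]) * 512 for i in range(6)]
idx = WriteBackIndex(flush_threshold=4)
for off, blk in enumerate(blocks):
    idx.update(hashlib.sha1(blk).digest(), off)
```

With six updates and a threshold of four, only one batched flush has happened and two entries are still in memory; if those were lost, `rebuild_from_log(blocks)` would recover them from the log.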
3. Mapping Hashes to Disk Locations During Reads
• To restore disk
  – Start with the list of original blocks’ hashes
  – Look up each block in index
  – Read block from data log and restore to disk
[Diagram: the restore process at the “lookup hash of 1st block” and “map hash to log offset” steps. Callout: have to seek to the right bucket.]
3. Mapping Hashes to Disk Locations During Reads
• To restore disk
  – Start with the list of original blocks’ hashes
  – Look up each block in index
  – Read block from data log and restore to disk
• Observation: data log is mostly ordered
  – Duplicate blocks often occur as part of duplicate files
Ordering in Data Log
[Diagram: the summary’s list of hashes in RAM alongside the on-disk index (hash → offset) and the data log on the external USB drive.]
3. Mapping Hashes to Disk Locations During Reads
• To restore disk
  – Start with the list of original blocks’ hashes
  – Look up each block in index
  – Read block from data log and restore to disk
• Observation: data log is mostly ordered
  – Duplicate blocks often occur as part of duplicate files
  – Idea: add another index, ordered by log offset
  – Read-ahead in this index to eliminate future lookups in original index
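The read-ahead idea can be sketched in Python as below. This is an illustrative model only: lists and dicts stand in for the two on-disk indexes, each miss in the primary index is counted as one seek, the prefetch cache is unbounded, and the names `PrefetchingIndex` and `readahead` are hypothetical.

```python
import hashlib


def h(block: bytes) -> bytes:
    return hashlib.sha1(block).digest()


class PrefetchingIndex:
    """Primary index (hash -> offset) plus secondary index (offset -> hash)."""

    def __init__(self, log_blocks, readahead=4):
        self.by_offset = [h(b) for b in log_blocks]     # secondary index
        self.by_hash = {}                               # primary index
        for off, key in enumerate(self.by_offset):
            self.by_hash.setdefault(key, off)
        self.readahead = readahead
        self.cache = {}            # hash -> offset, filled by read-ahead
        self.primary_lookups = 0   # each one models a random seek

    def offset_of(self, key: bytes) -> int:
        if key in self.cache:
            return self.cache[key]                      # no primary-index seek
        self.primary_lookups += 1                       # seek into primary index
        off = self.by_hash[key]
        # One more seek into the secondary index, then a sequential read
        # of the hashes stored at the next few log offsets.
        end = min(off + self.readahead, len(self.by_offset))
        for nxt in range(off, end):
            self.cache.setdefault(self.by_offset[nxt], nxt)
        return off


log = [bytes([i]) * 512 for i in range(8)]
index = PrefetchingIndex(log, readahead=4)
offsets = [index.offset_of(h(block)) for block in log]   # restore in log order
```

Restoring eight blocks in log order with a read-ahead of four touches the primary index only twice; the other six lookups are answered from the prefetched entries.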
Index by Offset
[Diagram: restore after a crash (“Crash!”) with a new index, sorted by offset (offset → hash), next to the original index (hash → offset). 1st block: lookup hash of 1st block; map hash to log offset (seek!); prefetch hashes for next few offsets from secondary index (seek!); read block from log (seek!); restore block. 2nd block: lookup hash of 2nd block; find log offset in secondary index – no seek!; read block from log (no seek!).]
1. Is a Block New, or Duplicate?
• Optimization for reads also helps duplicate writes
  – Index misses on first duplicate block
  – Hits on subsequent blocks rewritten in same order
• Doesn’t help for new data
  – Every lookup in primary index fails
  – Still suffer a seek for every new block
1. Is a Block New, or Duplicate?
• Idea: use a Bloom filter to identify new blocks
  – Lossy representation of the primary index
  – Uses much less memory than index itself
• For any given block, Bloom filter tells us:
  – It’s definitely new → append to log, update index
  – It might be duplicate → lookup in index
    • If it really is a duplicate, we get the prefetch benefit
    • Otherwise, called a “false positive”
    • Using enough memory keeps false positives at ~1%
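A small Bloom filter sketch in Python shows the two possible answers. The sizing (`num_bits`, `num_hashes`), the use of SHA-256 slices to derive bit positions, and the helper `classify` are assumptions chosen for illustration, not details taken from Foundation.

```python
import hashlib


class BloomFilter:
    """Bit array with k hash positions per key; no false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits         # must be a multiple of 8 here
        self.num_hashes = num_hashes     # at most 8 with a 32-byte digest
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key: bytes):
        # Derive k bit positions from 4-byte slices of one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def classify(bloom, index, key):
    """Decide what the archiver should do with a block's key."""
    if not bloom.might_contain(key):
        return "new"                     # definitely new: skip the index seek
    # Possibly a duplicate: only now pay for the on-disk index lookup.
    return "duplicate" if key in index else "false positive"


bloom, index = BloomFilter(), {}
for i in range(100):
    key = b"block-%d" % i
    bloom.add(key)
    index[key] = i                       # key -> log offset
```

Every key that was added answers "might be duplicate", so no real duplicate is ever missed; the only cost of the lossy representation is an occasional wasted index lookup.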
Results
• Do these optimizations pay off?
  – Buffering index writes is an obvious win
  – Bloom filter is, too: removes 99% of seeks when writing new data
  – Both trade RAM for seeks
• Benefit of secondary index less clear
  – If duplicate data comes in long sequences, it reduces index seeks to two per sequence
  – If duplicate data comes in little fragments, it doubles the number of index seeks
  – Need traces of real data to answer this question
Results (cont.)
• Research group at MIT has been running Venti as its backup server for two years
  – We looked at 400 nightly snapshots
  – Simulated archiving and restoring these in both Venti and Foundation

| | Venti | Foundation |
|---|---|---|
| Average archival speed | < 1 MB/s | 20.1 MB/s |
| % time spent seeking (archival) | 96% | 10% |
| Average restore speed | 1.2 MB/s | 13.6 MB/s |
| % time spent seeking (restore) | 95% | 58% |
Eliminating “Compare by Hash”
• Some worried that same SHA-1 doesn’t imply same contents (i.e., hash collisions are possible)
  – Even if very rare, consequences (corruption) too great
• Stepping back a bit, CAS as a black box:
  – Give it a data block, get back an opaque ID
  – Give it an opaque ID, get back the data block
• Do we care that the ID is a SHA-1 hash?
  – What if the “opaque” ID was just the block’s location in the data log?
Using Locations As IDs
• Pros
  + Reads require no index lookups at all
  + System can still find potential duplicates using hashing (with a weaker, faster hash function)
• Cons
  – Need another mechanism to check integrity
  – Since hash untrusted, must compare suspected duplicates byte-by-byte
• Others have claimed these byte-by-byte comparisons are a non-starter
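One way to picture locations as IDs is the Python sketch below. It is an assumption-laden toy: CRC-32 (via `zlib.crc32`) stands in for "a weaker, faster hash function", a list models the data log, and `LocationStore` and `hints` are hypothetical names rather than anything from Foundation.

```python
import zlib


class LocationStore:
    """Toy store whose opaque block ID is the block's position in the log."""

    def __init__(self):
        self.log = []        # data log; a block's ID is its index here
        self.hints = {}      # weak hash -> candidate locations in the log

    def put(self, block: bytes) -> int:
        weak = zlib.crc32(block)             # fast, not collision-resistant
        for loc in self.hints.get(weak, []):
            if self.log[loc] == block:       # byte-by-byte check of the suspect
                return loc                   # confirmed duplicate: reuse its ID
        self.log.append(block)               # new contents, or a hash collision
        loc = len(self.log) - 1
        self.hints.setdefault(weak, []).append(loc)
        return loc

    def get(self, loc: int) -> bytes:
        return self.log[loc]                 # reads need no index lookup


store = LocationStore()
first = store.put(b"a" * 512)
second = store.put(b"b" * 512)
again = store.put(b"a" * 512)                # duplicate of the first block
```

Because a suspected duplicate is only accepted after its bytes match, a collision in the weak hash costs one extra comparison and can never cause two different blocks to share an ID.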
2nd Disk Arm to the Rescue
• Once we eliminate most index reads (via our previous optimizations), the backup disk is otherwise idle while backing up duplicate data
• Can instead put it to work doing byte-by-byte comparisons of suspected duplicates
| | Venti | Foundation (By Hash) | Foundation (By Value) |
|---|---|---|---|
| Archival | < 1 MB/s | 20.1 MB/s | 15.4 MB/s |
| Restore | 1.2 MB/s | 13.6 MB/s | 15.0 MB/s |
Related Work
• Apple Time Machine
  – Duplicates coalesced at file level via hard links
• NetApp WAFL, ZFS
  – Copy-on-write coalesces blocks at the FS level
  – Misses duplicates that come into system separately
• Data Domain Deduplication FS
  – Very similar to Foundation, in enterprise context
  – Depends on collision-freeness of hash function
• Lots of other Content-Addressed Storage work
  – LBFS, SUNDR, Peabody
Conclusions
• Consumer-grade CAS works now
  – A single, external USB drive is enough
  – Just have to be crafty about avoiding seeks
• Lots of uses other than preservation
  – E.g., inexpensive household backup server that automatically coalesces duplicate media collections
• Doesn’t require a collision-free hash function