Post on 12-Jan-2016

Page 1: Improving Content Addressable Storage For Databases

Improving Content Addressable Storage For Databases

Conference on Reliable Awesome Projects (no acronyms please)

Advanced Operating Systems (CS736)

Brandon Smith, Vandhana Selvaprakash

Page 2: Improving Content Addressable Storage For Databases

Background

• Content Addressable Storage (CAS)
  – Store information by content
  – Content addressable vs. location addressable
• Idea
  – Divide files into ‘chunks’
  – Index by chunk fingerprints
• Allows for…
  – Exploiting commonality for data reduction
  – Enriched metadata for better searches
• Near-line storage
  – WORM
  – Data integrity
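The idea on this slide (divide files into chunks, index them by fingerprint) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `ChunkStore`, the use of SHA-1 as the fingerprint, and fixed-size chunks. The slides do not say which hash or chunk size the authors used.

```python
import hashlib

class ChunkStore:
    """Toy content-addressable store: each chunk is keyed by its SHA-1 fingerprint."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size      # illustrative default, not from the slides
        self.chunks = {}                  # fingerprint (hex) -> chunk bytes, stored once

    def put(self, data: bytes) -> list:
        """Split data into fixed-size chunks and return the list of fingerprints."""
        recipe = []
        for off in range(0, len(data), self.chunk_size):
            chunk = data[off:off + self.chunk_size]
            fp = hashlib.sha1(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)   # a repeated chunk takes no extra space
            recipe.append(fp)
        return recipe

    def get(self, recipe) -> bytes:
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)

store = ChunkStore(chunk_size=4)
r1 = store.put(b"AAAABBBBAAAA")   # "AAAA" appears twice, stored once
r2 = store.put(b"BBBBCCCC")       # "BBBB" is already in the store
```

After both calls the store holds three distinct chunks for five chunk references, which is the "exploiting commonality for data reduction" effect in miniature.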

Page 3: Improving Content Addressable Storage For Databases

Motivation

• CAS is great, but not for databases
  – Why? Metadata is tightly interspersed with data…
• Cannot just ignore database storage
  – Makes up a significant fraction of storage system clients
• Goal: improve CAS for databases

Page 4: Improving Content Addressable Storage For Databases

Approach

• Evaluate state-of-the-art CAS methods

• Study different database systems for storage characteristics

• Suggest techniques that improve CAS for databases

Page 5: Improving Content Addressable Storage For Databases

What’s the State-of-the-Art?

• Avoiding the Disk Bottleneck in the Data Domain Deduplication File System, B. Zhu et al. FAST ’08

• Content-aware variable-length chunking

• Chunk Compression

• Acceleration Methods
  – Summary Vector
  – Stream-Informed Segment Layout
  – Locality Preserved Caching
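Content-aware variable-length chunking places chunk boundaries where the data itself says so, rather than at fixed offsets. The sketch below is one simple way to do that, using a polynomial rolling hash over a sliding window and cutting when the low bits of the hash are zero. It is not the Data Domain implementation; the constants `WINDOW`, `PRIME`, `MOD` and the parameter defaults are all assumed values chosen for illustration.

```python
import random

WINDOW = 16                      # bytes covered by the rolling hash (assumed)
PRIME = 31                       # hash base (assumed)
MOD = (1 << 61) - 1              # hash modulus (assumed)
TOP = pow(PRIME, WINDOW, MOD)    # weight of the byte that slides out of the window

def chunk_boundaries(data: bytes, avg_bits=6, min_size=16, max_size=1024):
    """Return chunk end offsets; cut when the low avg_bits bits of the window hash are 0."""
    mask = (1 << avg_bits) - 1
    cuts, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = (h * PRIME + b) % MOD                      # bring the new byte in
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD     # drop the byte leaving the window
        size = i + 1 - start
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            cuts.append(i + 1)
            start = i + 1
    if start < len(data):
        cuts.append(len(data))                         # final partial chunk
    return cuts

def split(data: bytes):
    """Cut data into chunks at the boundaries chosen above."""
    cuts = chunk_boundaries(data)
    return [data[a:b] for a, b in zip([0] + cuts, cuts)]

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
edited = b"!" + original          # one byte inserted at the front
shared = set(split(original)) & set(split(edited))
```

Because the hash depends only on the last `WINDOW` bytes, an insertion shifts the boundaries along with the content, so most chunks of `edited` match chunks of `original`. With fixed-size chunks the same one-byte insertion would change every chunk.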

Page 6: Improving Content Addressable Storage For Databases

Analysis of State-of-the-Art

• Implemented:
  – Variable-length chunking
  – Efficient hash value storage and lookup
  – Summary Vector
• Questions:
  – How well does average chunk size match desired?
  – Can we use this for other problems?
  – Performance on different types of data?

(Diagram labels: Hash value, Bloom filter)
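The Summary Vector is a Bloom filter kept in memory so that most lookups for never-seen fingerprints can be answered without touching the on-disk index. A minimal sketch follows, assuming a filter whose bit positions are derived from a SHA-256 digest of the fingerprint. The class name, sizes, and hashing scheme are illustrative choices, not details taken from the slides or from the Data Domain paper.

```python
import hashlib

class SummaryVector:
    """Toy Bloom filter over chunk fingerprints (at most 8 hash positions per item)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, fingerprint: bytes):
        # Carve independent-looking 32-bit values out of one SHA-256 digest.
        digest = hashlib.sha256(fingerprint).digest()
        for k in range(self.num_hashes):
            yield int.from_bytes(digest[4 * k:4 * k + 4], "big") % self.num_bits

    def add(self, fingerprint: bytes):
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, fingerprint: bytes) -> bool:
        """False means definitely new; True means 'possibly stored, check the index'."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(fingerprint))

sv = SummaryVector()
stored = [hashlib.sha1(b"chunk-%d" % i).digest() for i in range(100)]
for fp in stored:
    sv.add(fp)
```

A `False` answer lets the system skip the disk lookup entirely. A `True` answer can be a false positive, so the real index must still be consulted.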

Page 7: Improving Content Addressable Storage For Databases

How well does average chunk size match desired?

Desired (bytes)    Actual Average
128                175
256                303
512                559
1024               1070
2048               2094
4096               4137
8192               8220
16384              16413
32768              32848

…
8192               21335

…
8192               15443
…

4096               1100
8192               1849
16384              2062

Data sets:
560 MB Mix of Files
330 MB Empty Oracle Tablespace
1 GB Music Data
5 MB Empty PostgreSQL Tablespace

Page 8: Improving Content Addressable Storage For Databases

Can we use this for other problems?

Yes! Efficiently discover duplicate files (e.g., music files)

0.9920000000000000 C:\Workspace\Alicia\Oasis_-_What's_The_Story_...
0.9907834101382489 C:\Workspace\Alicia\Incubus_-_Morning_View_-_...
0.9903381642512077 C:\Workspace\Alicia\Hard_Candy_-_Counting_Cro...
0.9891304347826086 C:\Workspace\Songs\Audioslave_-_Show_Me_How_T...
0.9880239520958084 C:\Workspace\Alicia\The_donavon_frankenreiter...
0.9872611464968153 C:\Workspace\Songs\Cornell-Rage_Demos_-_Track...
0.9837398373983740 C:\Workspace\Alicia\Eric_clapton_-_The_cream_...
... ...
0.09090909090909091 C:\Workspace\Songs\Blink_182_-_Reebok_Commerc...
0.08602150537634409 C:\Workspace\Songs\311_Wake_Your_Mind_Up-from...
0.08333333333333333 C:\Workspace\Songs\AUDIOSLAVE-getaway_car-.mp...
0.07826086956521740 C:\Workspace\Songs\Glassjaw_-_Lovebites_and_R...
0.07692307692307693 C:\Workspace\Songs\Glassjaw_-_Lovebites_and_R...
0.07692307692307693 C:\Workspace\Songs\aladdin_-_i_can_show_you_t...
0.07594936708860760 C:\Workspace\Songs\Glassjaw_-_Trailer_Park_Je...
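The slides do not say how the per-file scores above were computed. One plausible reading is "the fraction of a file's chunks that also occur in some other file"; a value such as 0.992 is consistent with a ratio like 124/125. The sketch below implements that reading only. The function names are hypothetical, and it uses fixed-size chunks and SHA-1 for brevity, which are assumptions as well.

```python
import hashlib
from collections import Counter

def fingerprints(data: bytes, chunk_size: int):
    """Fingerprint every fixed-size chunk of data (fixed size is a simplification)."""
    return [hashlib.sha1(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]

def duplicate_scores(files: dict, chunk_size: int = 4):
    """files maps name -> bytes. Returns (score, name) pairs, highest score first."""
    fps = {name: fingerprints(data, chunk_size) for name, data in files.items()}
    owners = Counter()                       # fingerprint -> number of files containing it
    for chunk_list in fps.values():
        owners.update(set(chunk_list))       # count each file once per distinct chunk
    scores = []
    for name, chunk_list in fps.items():
        if not chunk_list:                   # skip empty files
            continue
        shared = sum(1 for fp in chunk_list if owners[fp] > 1)
        scores.append((shared / len(chunk_list), name))
    return sorted(scores, reverse=True)

ranking = duplicate_scores({
    "a.mp3": b"AAAABBBBCCCC",
    "copy_of_a.mp3": b"AAAABBBBCCCC",
    "other.mp3": b"XXXXYYYYZZZZ",
})
```

Files near the top of the ranking are almost entirely made of chunks that exist elsewhere, which is what makes them duplicate candidates.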

Page 9: Improving Content Addressable Storage For Databases

Performance on different types of data?

• Audio data
  – Very well
• Common mix of files (PDF, text, PPT, image data, etc.)
  – Reasonably well
• Empty database tablespaces
  – Terrible

Page 10: Improving Content Addressable Storage For Databases

Observations

• Even Udi Manber makes mistakes!

• Debugging code especially difficult when dealing with raw data and complex hash functions

• Variable size chunking does not work well for databases

F_2 = (p · F_1 + t_51 − t_1 · p^49) mod M

should be

F_2 = (p · F_1 + t_51 − t_1 · p^50) mod M
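The exponent in the rolling update can be checked numerically. The sketch below assumes a 50-byte window with F_1 = (t_1 · p^49 + t_2 · p^48 + … + t_50) mod M, and uses illustrative values for `P` and `M` (not the constants from the original work). It compares the incremental update against a fingerprint recomputed from scratch.

```python
P = 31                 # base (illustrative value)
M = (1 << 31) - 1      # prime modulus (illustrative value)
W = 50                 # window length in bytes

def direct(t: bytes, start: int) -> int:
    """Fingerprint of t[start:start+W] from scratch: t_1*P^(W-1) + ... + t_W (mod M)."""
    f = 0
    for b in t[start:start + W]:
        f = (f * P + b) % M
    return f

def roll(f_prev: int, outgoing: int, incoming: int, exponent: int) -> int:
    """Slide the window by one byte, removing `outgoing` with weight P^exponent."""
    return (P * f_prev + incoming - outgoing * pow(P, exponent, M)) % M

data = bytes(range(1, 52))          # 51 bytes: t_1 .. t_51
f1 = direct(data, 0)                # fingerprint of t_1 .. t_50
f2 = direct(data, 1)                # fingerprint of t_2 .. t_51

with_p50 = roll(f1, data[0], data[50], 50)
with_p49 = roll(f1, data[0], data[50], 49)
```

Multiplying F_1 by p raises the weight of t_1 from p^49 to p^50, so that is the term that must be subtracted. Here `with_p50` equals `f2`, while `with_p49` does not.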

Page 11: Improving Content Addressable Storage For Databases

What’s different about Databases?

• Rigid format of DB structures
  – Database Block => Slotted Page
  – Indifferent performance for variable sized chunking
• Metadata Galore
  – ‘Understand’ Data being CASed
  – Finer grained Commonality Factoring
  – Similar Blocks common, Identical Blocks rare.

Page 12: Improving Content Addressable Storage For Databases

Naïve CAS for DB

• Finer ‘Chunk’ Granularity
• Computational Overhead overwhelms Commonality Factoring
• Doesn’t work


Page 14: Improving Content Addressable Storage For Databases

Overlaying Template Chunks

• Identifying Special Cases
  – The ‘Almost Zero’ Block
  – Template Chunks
  – Delta Chunks
• Identifying Template Chunks
  – Static Vs. Dynamic
  – Memory Vs. Computation
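The slides give no detail on how a chunk is overlaid on a template. One simple interpretation is to store, for each block, only the byte runs that differ from a shared template block. The sketch below shows that interpretation for an 'almost zero' block against an all-zero template. The function names and the delta format (a list of offset/bytes pairs) are assumptions made for illustration.

```python
def make_delta(template: bytes, block: bytes):
    """Return [(offset, bytes), ...] for the runs where block differs from template."""
    if len(template) != len(block):
        raise ValueError("template and block must have the same length")
    delta, i, n = [], 0, len(block)
    while i < n:
        if block[i] != template[i]:
            j = i
            while j < n and block[j] != template[j]:
                j += 1                      # extend the run of differing bytes
            delta.append((i, block[i:j]))
            i = j
        else:
            i += 1
    return delta

def apply_delta(template: bytes, delta) -> bytes:
    """Rebuild the block by overlaying the delta runs on the template."""
    out = bytearray(template)
    for off, run in delta:
        out[off:off + len(run)] = run
    return bytes(out)

template = bytes(8192)                      # all-zero template chunk
block = bytearray(8192)                     # an 'almost zero' block
block[0:4] = b"HDR1"
block[100] = 7
block = bytes(block)

delta = make_delta(template, block)
stored_bytes = sum(len(run) for _, run in delta)
```

Here the 8192-byte block reduces to two short runs (5 bytes of payload plus offsets). Computing the delta means comparing every byte against the template, which is one source of the extra computation the slides weigh against memory.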

Page 15: Improving Content Addressable Storage For Databases

Overlaying - Results

• Good, but not good enough.
• Too much work

(Chart: With Overlays vs. Naïve Chunking)

Page 16: Improving Content Addressable Storage For Databases

Need For DB Knowledge

• The Oracle Database Block
• Variable Block Sizes (4/8 KB)

Page 17: Improving Content Addressable Storage For Databases

Need For DB Knowledge

• The Postgres Data Block
• 8 KB

+----------------+--------------------------------------+
| PageHeaderData | linp1 linp2 linp3 ..                 |
+-----------+----+--------------------------------------+
| ... linpN |                                           |
+-----------+-------------------------------------------+
|           ^ pd_lower                                  |
|                                                       |
|             v pd_upper                                |
+-------------+-----------------------------------------+
|             | tupleN ...                              |
+-------------+------------------+----------------------+
|       ... tuple3 tuple2 tuple1 | "special space"      |
+--------------------------------+----------------------+
                                 ^ pd_special
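The free region of a page lies between `pd_lower` and `pd_upper`, so reading those two header fields is enough to find it. The sketch below assumes the 24-byte `PageHeaderData` layout of recent PostgreSQL releases (as declared in `src/include/storage/bufpage.h`) and native byte order. The layout in the release the authors studied may differ, so the field offsets should be checked against the target version. The helper name `page_layout` and the synthetic page are hypothetical.

```python
import struct

# Assumed layout: pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (uint16 each), pd_prune_xid (uint32) = 24 bytes.
HEADER = struct.Struct("=IIHHHHHHI")

def page_layout(page: bytes) -> dict:
    """Extract the offsets that delimit line pointers, free space, and tuples."""
    (_lsn_hi, _lsn_lo, _checksum, _flags,
     lower, upper, special, _size_version, _prune_xid) = HEADER.unpack_from(page)
    return {
        "line_pointers_end": lower,     # pd_lower: end of the line pointer array
        "tuples_start": upper,          # pd_upper: start of tuple data
        "special_start": special,       # pd_special: start of the special space
        "free_bytes": upper - lower,    # the hole between the two regions
    }

# Synthetic 8 KB page: header plus three 4-byte line pointers, tuples from offset 8000.
page = bytearray(8192)
HEADER.pack_into(page, 0, 0, 0, 0, 0, HEADER.size + 3 * 4, 8000, 8192, 8192 | 4, 0)
layout = page_layout(bytes(page))
```

On this synthetic page `free_bytes` is 8000 − 36 = 7964, meaning most of the block carries no tuple data.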

Page 18: Improving Content Addressable Storage For Databases

Semantics-Aware CAS for DB

• Fixed Chunk Size
  – Chunk Size = Data Block Size
  – Identical Blocks
• Exploiting Similarity - Data Aware Chunk Compression
  – Uses metadata within block, DB system level tables
    • Free space
    • Redundant Tuples
    • Redundant Items
• Memory Saved Vs. Data Coherence
  – Intra block Vs Inter Block
• Column-based Storage for improved savings
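Of the data-aware compression ideas listed here, free-space elimination is the simplest to sketch: once the block's metadata says where the unused region starts and ends, the store can keep only the bytes around it. The code below is a minimal illustration under two assumptions that the slides do not state: the free region is all zeros, and the caller supplies its bounds (for example from the block header). The function names are hypothetical.

```python
def compact_block(block: bytes, free_start: int, free_end: int):
    """Drop the free region [free_start, free_end); keep what is needed to restore it."""
    if any(block[free_start:free_end]):
        raise ValueError("free region is not all zeros; cannot elide it losslessly")
    payload = block[:free_start] + block[free_end:]
    return (len(block), free_start, payload)

def restore_block(compacted) -> bytes:
    """Re-insert a zero-filled hole of the original size at the recorded offset."""
    size, free_start, payload = compacted
    hole = size - len(payload)
    return payload[:free_start] + bytes(hole) + payload[free_start:]

# Synthetic 8 KB block: 36 bytes of header/pointers, free space, 192 bytes of tuples.
block = b"H" * 36 + bytes(8192 - 36 - 192) + b"T" * 192
compacted = compact_block(block, 36, 8192 - 192)
saved_fraction = 1 - len(compacted[2]) / len(block)
```

The chunk size stays equal to the block size, as on the slide; only the stored representation shrinks. The saving depends entirely on how full the block is, which matches the gap between sparse and almost full tablespaces reported in the evaluation.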

Page 19: Improving Content Addressable Storage For Databases

Semantics-Aware CAS - Evaluation

• Oracle
  – Chunk Size = 8 KB
  – Leveraging Free Space (PCTFREE)
    • Saves 73.6% of space used (sparse tablespace)
    • Saves 27.1% of space used (almost full tablespace)
• Observations
  – Better savings than naïve methods
  – Solution tightly coupled to database
  – Changes in DB configuration?

Page 20: Improving Content Addressable Storage For Databases

Future Work

• Implement
  – Redundant Tuples (in progress)
    • Within block
    • Reorganize tuples across blocks?
  – Redundant Items
  – Explore Column Stores
    • more CAS friendly

Page 21: Improving Content Addressable Storage For Databases

Summary and Questions

• Analysis of state-of-the-art

• Variable sized chunking is usually good

• Semantics-aware techniques are better for databases

• High computation cost, but OK for near-line storage

Page 22: Improving Content Addressable Storage For Databases

Content Addressable Storage - Old &

• Old: Fixed-size chunks (Venti)

• New: data-aware variable-sized chunking (LBFS, Data Domain Inc.)

(Diagram: Insert Data)