6.814/6.830 lecture 8
DESCRIPTION
6.814/6.830 Lecture 8. Memory Management. Column Representation Reduces Scan Time. Idea: Store each column in a separate file. Column Representation. Reads Just 3 Columns. Assuming each column is same size, reduces bytes read from disk by factor of 3/5. - PowerPoint PPT PresentationTRANSCRIPT
6.814/6.830 Lecture 8
Column StoresMemory Management
Recap : Column Representation Reduces Scan Time
• Idea: Store each column as a collection of sorted, compressed files– Only scan the columns that are needed
• Simple per-column compression (RLE/Delta/Dictionary/LZ)– Direct operation on compressed data
• Buffer writes in write-optimized store, periodically compress, sort, and write out as new column fragments
3
Write Performance
Tuple MoverAsynchronous Data
Movement
Queries read from both WOS and ROS
Batched
Amortizes seeks
Amortizes recompression
Enables continuous load
Trickle load: Very Fast Inserts
When to Rewrite ROS Objects?
• Store multiple ROS objects, instead of just one• Each of which must be scanned to answer a
query
• Tuple mover writes new objects• Avoids rewriting whole ROS on merge
• Periodically merge ROS objects to limit number of distinct objects that must be scanned (like Big Table)
Tuple Mover
WOS ROS
Older objects
5
When Are Columns Right?
• Warehousing (OLAP)• Read-mostly; batch update• Queries: Scan and aggregate a few
columns
• Vs. Transaction Processing (OLTP)• Write-intensive, mostly single record
ops.
• Column-stores: OLAP optimized
• In practice >10x performance on comparable HW, for many real world analytic applications• True even if w/ Flash or main memory!
Different architectures for different workloads
C-Store Performance
• How much do these optimizations matter?
• Wanted to compare against best you could do with a commercial system
7
Emulating a Column Store
• Two approaches:
1. Vertical partitioning: for n column table, store n two-column tables, with ith table containing a tuple-id, and attribute i
• Sort on tuple-id
• Merge joins for query results
2. Index-only plans
• Create a secondary index on each column
• Never follow pointers to base table
Two Emulation Approaches
9
Bottom Line
C-Store, Compression
C-Store, No Compression
C-Store, Early Materialize
Rows
Rows, Vert. Part.
Rows, All Indexes
4
15
41
26
80
221
Time (s)
SSBM (Star Schema Benchmark -- O’Neil et al ICDE 08) Data warehousing benchmark based on TPC-H Scale 100 (60 M row table), 17 columns Average across 12 queries Row store is a commercial DB, tuned by professional DBA vs
C-Store
Commercial System Does Not Benefit From Vertical Partitioning
10
Problems with Vertical Partitioning
① Tuple headers Total table is 4GB Each column table is ~1.0 GB Factor of 4 overhead from tuple headers and tuple-ids
② Merge joins Answering queries requires joins Row-store doesn’t know that column-tables are
sorted Sort hurts performance
Would need to fix these, plus add direct operation on compressed data, to approach C-Store performance
Problems with Index-Only Plans
Consider the query:
SELECT store_name, SUM(revenue) FROM Facts, Stores WHERE fact.store_id = stores.store_id AND stores.country = “Canada” GROUP BY store_name
• Two WHERE clauses result in a list of tuple IDs that pass all predicates
• Need to go pick up values from store_name and revenue columns
• But indexes map from valuetuple ID!
• Column stores can efficiently go from tuple IDvalue in each column
12
Recommendations for Row-Store Designers
• Might be possible to get C-Store like performance
① Need to store tuple headers elsewhere (not require that they be read from disk w/ tuples)
② Need to provide efficient merge join implementation that understands sorted columns
③ Need to support direct operation on compressed data• Requires “late materialization” design
13
Summary
• C-Store is a “next gen” column-oriented databases
• Key New Ideas:• Late materialization• Compression & direct operation• Fast load via “write optimized store”
• Row-stores do a poor job of emulation• Need better support for compression, late
materialization• Need support for narrow tuples, efficient merge
joins
C-Store: http://db.csail.mit.edu/cstore13
Study Break
pgadmin3 demo
14
Database Internals OutlineFront End
Admission ControlConnection Management
(sql)Parser
(parse tree)Rewriter
(parse tree) Planner & Optimizer
(query plan) Executor
Query System
Storage System
Access MethodsLock ManagerBuffer ManagerLog Manager
This time
Last time
Buffer Manager
• Cache of recently accessed pages
• Mediates all access to page data / index methods
• Today: DBMIN Algorithm for Page Management
16
LRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
17
LRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
18
LRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
19
LRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 4 2 3 4 m
20
LRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 4 2 3 4 m
1 4 2 1 3 1 m
21
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
22
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
23
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
24
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
25
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
26
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
27
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
28
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
1 1 3 4 3 m
29
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
1 1 3 4 3 m
1 3 4 4 h
30
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
1 1 3 4 3 m
1 3 4 4 h
1 3 4 1 h
31
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
1 1 3 4 3 m
1 3 4 4 h
1 3 4 1 h
1 2 3 4 2 m
32
MRU w/ 3 pages
A B C Read Hit/miss?
1 1 m
1 2 2 m
1 2 3 3 m
1 2 1 4 4 m
1 2 4 1 h
1 2 4 2 h
1 1 3 4 3 m
1 3 4 4 h
1 3 4 1 h
1 2 3 4 2 m
2 3 4 3 h
33