C-Store: Self-Organizing Tuple Reconstruction
Jianlin Feng, School of Software, SUN YAT-SEN UNIVERSITY, Apr. 17, 2009
Review of Tuple Reconstruction
Stitch together separate column values of the same logical tuple.
Join on tuple IDs/positions.
Two strategies: early materialization and late materialization.
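As a small illustration of stitching column values together by position, here is a minimal sketch. The column names, the data, and the helper `reconstruct_tuples` are hypothetical and not part of C-Store or MonetDB; the filter-then-fetch order mimics late materialization.

```python
# Hypothetical column data; names and values are illustrative only.
name = ["ann", "bob", "cid", "dee"]
age = [31, 25, 47, 25]

def reconstruct_tuples(positions, *columns):
    """Stitch the values found at the same positions of several columns into tuples."""
    return [tuple(col[p] for col in columns) for p in positions]

# Late materialization: filter one column first, producing positions only,
# then fetch the other column values for the qualifying positions.
qualifying = [p for p, a in enumerate(age) if a < 40]
result = reconstruct_tuples(qualifying, name, age)
```

Because both columns share the same tuple order, the positions produced by the filter on `age` can be used directly to look up `name`.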
Motivation
Tuple Reconstruction is easy if columns are sorted in the same order
However, this prerequisite cannot always be preserved. During query processing, many operators (joins, group by, order by, etc.) are not tuple order-preserving.
The Ultimate Access Pattern
For each relation R, we keep one copy per attribute in R. Each copy is pre-sorted on the corresponding attribute.
Any tuple reconstruction initiated by a restriction on an attribute R.a can be done using the copy that is sorted on R.a.
The limitations: space constraints, and the idle time needed for pre-sorting.
The Proposed Solution
Partial Sideways Cracking
Uses auxiliary self-organizing data structures to materialize mappings between pairs of attributes used together for tuple reconstruction.
Background
MonetDB
Selection-based cracking
MonetDB: http://monetdb.cwi.nl/
Every relational table is represented as a collection of Binary Association Tables (BATs).
Each BAT is a set of two columns. For a relation R of k attributes, there exist k BATs. Each BAT stores (key, attr) pairs. In each BAT, keys are system-generated tuple IDs. For a base BAT, the key column is typically virtual.
This is similar to the STORAGE KEY in the Read Store of C-Store.
MonetDB’s Basic Operators (1): select(A, v1, v2)
Searches all (key, attr) pairs in base column A for attribute values between v1 and v2.
Output: A list of keys/positions.
In the output, the tuple order is usually preserved.
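The behavior of select can be sketched as follows. This is an illustrative Python model, not MonetDB's actual implementation: the BAT is modeled as a list of (key, attr) pairs, and treating both bounds as inclusive is an assumption, since the slide only says "between v1 and v2".

```python
def select(bat, v1, v2):
    """Return the keys of all (key, attr) pairs with v1 <= attr <= v2."""
    return [key for key, attr in bat if v1 <= attr <= v2]

# A base BAT for a hypothetical attribute A; keys are tuple IDs 0..n-1.
A = list(enumerate([12, 3, 5, 9, 15, 22, 7]))
keys = select(A, 5, 12)
```

The scan visits the pairs in storage order, so the resulting keys come out in the original tuple order.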
MonetDB’s Basic Operators (2): join(j1, j2)
Performs a join between attr1 of j1 and attr2 of j2.
Output: a list of (key1, key2) pairs.
In the output, the tuple order is mainly preserved for outer join.
Outer Join
An outer join does not require each record in the two joined tables to have a matching record.
The joined table retains each record, even if no matching record exists.
Three variants: left outer join, right outer join, full outer join.
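A left outer join over two BAT-like inputs can be sketched as below. The function name, the list-of-pairs layout, and the use of `None` for a missing match are assumptions made for illustration only.

```python
def left_outer_join(left, right):
    """Join (key, attr) pairs on attr; keep every left record even without a match."""
    index = {}
    for key2, val in right:
        index.setdefault(val, []).append(key2)
    out = []
    for key1, val in left:
        matches = index.get(val)
        if matches:
            out.extend((key1, key2) for key2 in matches)
        else:
            out.append((key1, None))  # unmatched left record is retained
    return out

left = [(0, "x"), (1, "y"), (2, "z")]
right = [(10, "y"), (11, "x"), (12, "y")]
pairs = left_outer_join(left, right)
```

In this sketch the output follows the order of the left input, while the keys of the right input appear in whatever order the matches dictate.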
Left Outer Join
MonetDB’s Basic Operators (3): reconstruct(A, r)
Output: all (key, attr) pairs of base column A at the positions specified by r.
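A minimal sketch of reconstruct, assuming the base BAT is a Python list of (key, attr) pairs whose key equals its position (the virtual key column). The data is hypothetical.

```python
def reconstruct(bat, r):
    """Return the (key, attr) pairs of base BAT `bat` at the positions listed in r."""
    return [bat[pos] for pos in r]

# Base BAT of a hypothetical attribute B; key i sits at position i.
B = list(enumerate(["a", "b", "c", "d", "e", "f", "g"]))
r = [0, 2, 3, 6]  # e.g. positions produced by an earlier selection
pairs = reconstruct(B, r)
```

When r is in ascending order, the lookups walk the base column front to back; an unordered r would turn them into random accesses.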
Selection-Based Cracking: Cracker Column
The first time an attribute A is required by a query, a copy of column A is created, called the cracker column CA of A.
Each selection operator on A triggers a range-based physical reorganization of CA.
Each cracker column has a cracker index (an AVL tree) that maintains partitioning information.
Future queries benefit from the physically clustered data and do not need to access the whole column.
AVL-Tree
An AVL tree is a self-balancing binary search tree.
In an AVL tree, the heights of the two child subtrees of any node differ by at most one.
An example of an unbalanced non-AVL tree
The same tree after being height-balanced
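The height condition can be checked with a short sketch. The tuple-based node layout `(value, left, right)` is a hypothetical representation chosen for brevity, and the check covers only the height balance, not the search-tree ordering.

```python
# Hypothetical node layout for illustration: (value, left, right) or None.
def height(node):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if node is None:
        return 0
    _, left, right = node
    return 1 + max(height(left), height(right))

def is_height_balanced(node):
    """True if, at every node, the two child subtrees differ in height by at most one."""
    if node is None:
        return True
    _, left, right = node
    return (abs(height(left) - height(right)) <= 1
            and is_height_balanced(left)
            and is_height_balanced(right))

chain = (1, None, (2, None, (3, None, None)))     # degenerate, list-like tree
balanced = (2, (1, None, None), (3, None, None))  # same keys after rebalancing
```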
Order for Tuple Reconstruction
The order in which tuples are inserted is used for tuple reconstruction.
Physical reorganization happens only on cracker columns.
The crackers.select Operator
crackers.select(A, v1, v2)
First, it creates CA if it does not exist.
It searches the index of CA for the area where v1 and v2 fall.
If the bounds do not exist, i.e., no query used them in the past, then CA is physically reorganized to cluster all qualifying tuples into a contiguous area.
Output: A list of keys/positions.
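The following is a minimal sketch of selection-based cracking, not MonetDB's implementation. Assumptions: the class name `CrackerColumn` is invented, a pair of sorted Python lists stands in for the AVL-tree cracker index, the range is treated as half-open (v1 <= value < v2), and each new bound is handled by splitting one piece in two.

```python
import bisect

class CrackerColumn:
    def __init__(self, base):
        # Copy of the base column as (value, key) pairs; key = original position.
        self.data = [(v, k) for k, v in enumerate(base)]
        # Cracker index: bound b at position p means data[:p] holds values < b
        # and data[p:] holds values >= b.
        self.bounds, self.positions = [], []

    def _crack(self, bound):
        """Ensure a piece boundary exists at `bound`; return its position."""
        i = bisect.bisect_left(self.bounds, bound)
        if i < len(self.bounds) and self.bounds[i] == bound:
            return self.positions[i]          # bound used by an earlier query
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.bounds) else len(self.data)
        piece = self.data[lo:hi]
        below = [t for t in piece if t[0] < bound]
        rest = [t for t in piece if t[0] >= bound]
        self.data[lo:hi] = below + rest       # reorganize only this piece
        self.bounds.insert(i, bound)
        self.positions.insert(i, lo + len(below))
        return lo + len(below)

    def select(self, v1, v2):
        """Keys of all values v with v1 <= v < v2 (v1 <= v2 assumed)."""
        start, end = self._crack(v1), self._crack(v2)
        return [k for _, k in self.data[start:end]]

ca = CrackerColumn([12, 3, 5, 9, 15, 22, 7])
first = ca.select(5, 13)   # cracks the whole column
second = ca.select(9, 13)  # only touches the piece holding values in [5, 13)
```

After the first query the qualifying values sit in one contiguous area, so the second query reorganizes a four-element piece instead of scanning all seven values.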
Cracker Map
A cracker map MAB is defined as a two-column table over two attributes A and B of a relation R.
Values of A are stored in the left column, called head.
Values of B are stored in the right column, called tail.
Values of A and B in the same position of MAB belong to the same tuple.
Maps Are Created on Demand Only
When a query q needs access to attribute B based on a restriction on attribute A and MAB does not exist, q creates MAB by performing a scan over base columns A and B.
For each cracker map MAB, there is a cracker index (an AVL tree) that maintains information about how A values are distributed over MAB.
Queries Trigger Cracking
Query style: access B based on A.
Each such query triggers cracking (physical reorganization) of MAB based on the restriction applied to A.
Cracking: afterwards, all tuples whose A values satisfy the restriction are in a contiguous area of MAB. This is realized by splitting a piece of MAB into two or three new pieces.
The sideways.select(A, v1, v2, B) Operator
Returns tuples of attribute B of relation R based on a predicate on attribute A of R as follows:
(1) If there is no cracker map MAB, then create one.
(2) Search the index of MAB to find the contiguous area w of the pieces related to the restriction σ on A.
If σ does not match existing piece boundaries:
(3) Physically reorganize w to move false hits out of the contiguous area of qualifying tuples.
(4) Update the cracker index of MAB accordingly.
(5) Return a non-materialized view of the tail of w.
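Steps (1) to (5) can be sketched as follows. This is an illustrative model under stated assumptions: `CrackerMap`, `maps`, and `sideways_select` are invented names, sorted lists replace the AVL-tree index, the range is half-open (v1 <= A < v2), and step (5) returns a plain list rather than a non-materialized view.

```python
import bisect

class CrackerMap:
    """Two-column table M_AB: head holds A values, tail holds B values."""

    def __init__(self, head_col, tail_col):
        self.rows = list(zip(head_col, tail_col))  # one scan over base columns
        self.bounds, self.positions = [], []       # stand-in for the AVL index

    def crack(self, bound):
        """Ensure a piece boundary at `bound`; return its position."""
        i = bisect.bisect_left(self.bounds, bound)
        if i < len(self.bounds) and self.bounds[i] == bound:
            return self.positions[i]               # boundary already exists
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.bounds) else len(self.rows)
        piece = self.rows[lo:hi]
        below = [r for r in piece if r[0] < bound]
        rest = [r for r in piece if r[0] >= bound]
        self.rows[lo:hi] = below + rest            # step 3: move false hits out
        self.bounds.insert(i, bound)               # step 4: update the index
        self.positions.insert(i, lo + len(below))
        return lo + len(below)

maps = {}  # (head attribute, tail attribute) -> CrackerMap

def sideways_select(relation, a, v1, v2, b):
    if (a, b) not in maps:                         # step 1: create on demand
        maps[(a, b)] = CrackerMap(relation[a], relation[b])
    m = maps[(a, b)]
    start, end = m.crack(v1), m.crack(v2)          # steps 2 to 4
    return [tail for _, tail in m.rows[start:end]] # step 5 (materialized here)

R = {"A": [12, 3, 5, 9, 15, 22, 7], "B": list("abcdefg")}
b_values = sideways_select(R, "A", 5, 13, "B")
```

Because head and tail values travel together during every reorganization, the B values of the qualifying tuples are read from one contiguous slice without any lookup into the base column.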
Multi-Projection Queries
A single-selection query q that projects n attributes requires n maps, one for each attribute to be projected.
Select B, C
From R
Where A < 4;
For this query, we need 2 maps: MAB and MAC.
All maps that have been created using A as head are collected in the map set SA.
Adaptive Alignment
The problem: naive use of the sideways.select operator may lead to non-aligned cracker maps.
The solution: extend the sideways.select operator with an alignment step that keeps the maps aligned.
The basic idea is to apply all physical reorganizations due to selections on an attribute A in the same order to all maps in the map set SA.
Cracker Tape
For each map set SA, introduce a cracker tape TA.
TA logs (in order of their occurrence) all selections on attribute A that trigger cracking of any map in SA.
Each map MAx is equipped with a cursor pointing to the entry in TA that represents the last crack on MAx.
Given a tape TA, a map MAx is aligned (synchronized) by successively forwarding its cursor towards the end of TA and incrementally cracking MAx according to all selections it passes on its way.
All maps whose cursors point to the same position in TA are physically aligned.
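The tape-and-cursor mechanism can be sketched as below. This is a simplified model, not the paper's implementation: `Map` and `MapSet` are invented names, sorted lists replace the AVL tree, ranges are half-open, and the tape logs individual bounds rather than whole selections.

```python
import bisect

class Map:
    """Cracker map M_Ax: rows of (head, tail) plus a simple cracker index."""

    def __init__(self, head_col, tail_col):
        self.rows = list(zip(head_col, tail_col))
        self.bounds, self.positions = [], []  # sorted lists, not an AVL tree
        self.cursor = 0                       # number of tape entries applied

    def crack(self, bound):
        """Ensure a piece boundary at `bound`; return its position."""
        i = bisect.bisect_left(self.bounds, bound)
        if i < len(self.bounds) and self.bounds[i] == bound:
            return self.positions[i]
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.bounds) else len(self.rows)
        piece = self.rows[lo:hi]
        below = [r for r in piece if r[0] < bound]
        rest = [r for r in piece if r[0] >= bound]
        self.rows[lo:hi] = below + rest
        self.bounds.insert(i, bound)
        self.positions.insert(i, lo + len(below))
        return lo + len(below)

class MapSet:
    """Map set S_A together with its cracker tape T_A."""

    def __init__(self, relation, head):
        self.relation, self.head = relation, head
        self.tape = []  # bounds that triggered a crack, in order of occurrence
        self.maps = {}  # tail attribute name -> Map

    def aligned_map(self, tail):
        if tail not in self.maps:
            self.maps[tail] = Map(self.relation[self.head], self.relation[tail])
        m = self.maps[tail]
        while m.cursor < len(self.tape):      # replay missed cracks in tape order
            m.crack(self.tape[m.cursor])
            m.cursor += 1
        return m

    def select(self, v1, v2, tail):
        m = self.aligned_map(tail)
        for bound in (v1, v2):
            if bound not in m.bounds:         # only new cracks are logged
                self.tape.append(bound)
                m.crack(bound)
                m.cursor = len(self.tape)
        start, end = m.crack(v1), m.crack(v2)
        return [t for _, t in m.rows[start:end]]

R = {"A": [12, 3, 5, 9, 15, 22, 7],
     "B": list("abcdefg"),
     "C": [120, 30, 50, 90, 150, 220, 70]}
s = MapSet(R, "A")
s.select(5, 13, "B")            # cracks M_AB
s.select(9, 20, "C")            # creates M_AC, replays the tape, cracks further
b_vals = s.select(5, 13, "B")   # M_AB first catches up with the tape
c_vals = s.select(5, 13, "C")
```

Since both maps have replayed the same cracks in the same order, `b_vals` and `c_vals` line up position by position and can be zipped into tuples without any further join.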
The Extended sideways.select Operator
Map Set Choice: Self-organizing Histograms
Following the “cracking philosophy”: in an unpredictable environment with no idle system time, always perform the minimum investment.
Thus, for a query q, the set SA is chosen such that the restriction on A is the most selective in q, which yields a minimal bit vector.
The most selective restriction can be found using the cracker indices.
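One way a cracker index could serve as a rough histogram is sketched below. This is an assumption about how such an estimate might be computed, not the method from the paper: `estimate_hits` and `choose_map_set` are hypothetical helpers, the index is modeled as two sorted lists (bounds and their positions), and ranges are half-open.

```python
import bisect

def estimate_hits(bounds, positions, n, v1, v2):
    """Upper bound on tuples with v1 <= value < v2, using piece boundaries only.

    A bound b at position p means rows[:p] hold values < b and rows[p:] values >= b.
    """
    i = bisect.bisect_right(bounds, v1)   # bounds[:i] are all <= v1
    lo = positions[i - 1] if i > 0 else 0
    j = bisect.bisect_left(bounds, v2)    # bounds[j] is the first bound >= v2
    hi = positions[j] if j < len(bounds) else n
    return hi - lo

def choose_map_set(indices, n, restrictions):
    """Pick the attribute whose restriction has the smallest estimated result."""
    def cost(attr):
        bounds, positions = indices[attr]
        v1, v2 = restrictions[attr]
        return estimate_hits(bounds, positions, n, v1, v2)
    return min(restrictions, key=cost)

# Hypothetical index state for two attributes of a 7-tuple relation.
indices = {"A": ([5, 13], [1, 5]), "B": ([100], [3])}
restrictions = {"A": (5, 13), "B": (0, 100)}
best = choose_map_set(indices, 7, restrictions)
```

The estimate is exact when both bounds already exist in the index and becomes coarser when a bound falls inside an uncracked piece.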
Complex Queries
No relational operator other than tuple reconstruction (joins, aggregations, groupings, etc.) depends on tuple insertion order.
Potentially, many operators can exploit the clustering information in the maps. For example, a MAX operator can consider only the last piece of a map.
Such directions are for future work.
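The MAX idea can be illustrated with a speculative sketch, since the slides leave this direction to future work. Assumptions: the map is a list of (head, tail) rows, `positions` holds the piece boundaries in ascending order, and `max_head` is a hypothetical helper.

```python
def max_head(rows, positions):
    """Maximum head value, scanning pieces from the last one backwards.

    Values in a later piece are never smaller than values in an earlier piece,
    so the first non-empty piece from the end contains the maximum.
    """
    starts = [0] + positions
    ends = positions + [len(rows)]
    for lo, hi in zip(reversed(starts), reversed(ends)):
        if hi > lo:
            return max(h for h, _ in rows[lo:hi])
    raise ValueError("empty map")

# A cracked map with piece boundaries at positions 1, 3, 5 and 6.
rows = [(3, "b"), (5, "c"), (7, "g"), (12, "a"), (9, "d"), (15, "e"), (22, "f")]
largest = max_head(rows, [1, 3, 5, 6])  # inspects only rows[6:7]
```

The backwards loop covers the corner case where the last piece is empty because an earlier query used a bound larger than every value.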
Experimental Analysis
Compare the implementation of selection and sideways cracking on top of MonetDB against the latest non-cracking version of MonetDB and against MonetDB on presorted data.
Results
Sideways cracking achieves similar performance to presorted data, but without the heavy initial cost and the restrictions on updates and workload prediction.
Partial Sideways Cracking
Consider storage restrictions.
Partial maps: maps are only partially materialized, driven by the workload.
A map consists of several chunks.
Each chunk is a separate two-column table.
Each chunk contains a given value range of the head attribute of this map.
Each chunk is cracked separately.
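Workload-driven chunk materialization can be sketched as follows. This is a simplified model under stated assumptions: `PartialMap` is an invented name, head values are integers, chunks cover fixed-width value ranges (the slides do not say how ranges are chosen), ranges are half-open with v1 < v2, and per-chunk cracking is left out to keep the example short.

```python
class PartialMap:
    """Map M_AB split into chunks, each covering one value range of the head."""

    def __init__(self, head_col, tail_col, chunk_width):
        self.head_col, self.tail_col = head_col, tail_col
        self.width = chunk_width
        self.chunks = {}  # chunk id -> list of (head, tail); filled on demand

    def chunk(self, cid):
        """Materialize chunk `cid` the first time a query needs it."""
        if cid not in self.chunks:
            lo, hi = cid * self.width, (cid + 1) * self.width
            self.chunks[cid] = [(h, t)
                                for h, t in zip(self.head_col, self.tail_col)
                                if lo <= h < hi]
        return self.chunks[cid]

    def select(self, v1, v2):
        """Tail values of tuples with v1 <= head < v2 (v1 < v2 assumed)."""
        out = []
        for cid in range(v1 // self.width, (v2 - 1) // self.width + 1):
            out.extend(t for h, t in self.chunk(cid) if v1 <= h < v2)
        return out

pm = PartialMap([12, 3, 5, 9, 15, 22, 7], list("abcdefg"), chunk_width=10)
tails = pm.select(5, 13)  # touches the chunks for [0, 10) and [10, 20) only
```

The chunk for head values in [20, 30) is never built by this query, which is where the storage saving comes from.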
A Research Direction
Improving performance by compression.
C-Store uses compression heavily.
Can we integrate compression with cracking?
References
S. Idreos, M. L. Kersten, S. Manegold. Self-organizing Tuple Reconstruction in Column-stores. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Providence, RI, USA, June 2009 (accepted for publication).
Daniel J. Abadi, Daniel S. Myers, David J. DeWitt, and Samuel R. Madden. Materialization Strategies in a Column-Oriented DBMS. In Proceedings of ICDE, Istanbul, Turkey, April 2007.