Optimizing HBase Scanner Performance
Mikhail Bautin, Software Engineer
01/19/2012
HBase Scanners
What happens on a Get
RegionScanner
    StoreScanner (ColumnFamily1), StoreScanner (ColumnFamily2), . . .
        StoreFileScanner, StoreFileScanner, StoreFileScanner, . . .
            (R1,C1,T3) (R1,C2,T2) (R1,C2,T1)
            (R1,C1,T1) (R1,C2,T3) (R2,C1,T2)
            (R2,C2,T1) . . .
Store = (Region, CF)
HBase Scanner State
What happens on a next()
RegionScanner: priority queue
    StoreScanner (ColumnFamily1): priority queue
    StoreScanner (ColumnFamily2): priority queue
    . . .
        StoreFileScanner: current KeyValue
        StoreFileScanner: current KeyValue
        StoreFileScanner: current KeyValue
        . . .
Store = (Region, CF)
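The priority-queue arrangement on this slide can be sketched as a k-way merge. This is a minimal Python model, not HBase code: each list stands in for one StoreFileScanner, the heap holds each scanner's current KeyValue, and KeyValues are assumed to sort by row, then column, then newest timestamp first. The names `kv_sort_key`, `merged_scan` and `FILES` are hypothetical; the data is the three example files used in the Bloom filter slides of this talk, with T1..T4 written as 1..4.

```python
import heapq

def kv_sort_key(kv):
    """Order by row, then column, then newest timestamp first."""
    row, col, ts = kv
    return (row, col, -ts)

def merged_scan(files):
    """Yield KeyValues from several sorted files in global sort order."""
    iterators = [iter(f) for f in files]
    heap = []
    # Seed the heap with each scanner's current (first) KeyValue.
    for i, it in enumerate(iterators):
        current = next(it, None)
        if current is not None:
            heapq.heappush(heap, (kv_sort_key(current), i, current))
    while heap:
        _, i, kv = heapq.heappop(heap)
        yield kv
        # Advance only the scanner whose KeyValue was just consumed.
        current = next(iterators[i], None)
        if current is not None:
            heapq.heappush(heap, (kv_sort_key(current), i, current))

FILES = [
    [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
     ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
     ("R2", "C2", 3)],
    [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
]

merged = list(merged_scan(FILES))
```

Only the scanner that supplied the last KeyValue is advanced, which is why every extra next() on a StoreFileScanner matters in the slides that follow.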
Avoiding next() on StoreFileScanner
Every next() call may result in disk I/O
▪ HBASE-4433: avoid extra next if done with row/column (Kannan)
    ▪ An optimization for queries specifying a column set
    ▪ INCLUDE_AND_SEEK_NEXT_COL
    ▪ INCLUDE_AND_SEEK_NEXT_ROW
▪ HBASE-4434: Don't do HFile Scanner next() unless the next KV is needed (Kannan)
    ▪ Avoid aggressive pre-fetching
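One way to read the two match codes named above, as a rough Python sketch rather than the actual matcher logic: once the newest version of a requested column has been included, the scanner already knows whether it is done with the column or with the whole row, so it can return a combined "include and seek" code instead of an include followed by another next(). The function name `code_after_include` is hypothetical, and the sketch assumes one version per column.

```python
def code_after_include(column, requested_columns):
    """Match code for the newest version of a requested column.

    Assumption: max versions = 1, so nothing more is needed from this
    column once its newest KeyValue has been included.
    """
    requested = sorted(requested_columns)
    if column not in requested:
        raise ValueError("column is not part of the requested column set")
    if column == requested[-1]:
        # Last requested column: the rest of the row is not needed.
        return "INCLUDE_AND_SEEK_NEXT_ROW"
    # More requested columns follow: skip older versions of this one.
    return "INCLUDE_AND_SEEK_NEXT_COL"
```

For a query on columns C1 and C3, including C1 yields INCLUDE_AND_SEEK_NEXT_COL and including C3 yields INCLUDE_AND_SEEK_NEXT_ROW.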
Simple ROWCOL Bloom Filters
Do we have to read all of these files?
File 1: (R1,C1,T2) (R1,C1,T1) (R1,C2,T1) (R2,C1,T1) (R2,C2,T2) (R2,C2,T1) (R2,C3,T1)
File 2: (R1,C1,T3) (R1,C2,T3) (R1,C3,T2) (R2,C1,T2) (R2,C2,T3)
File 3: (R1,C1,T4) (R1,C2,T2) (R2,C1,T1)
Query: (R1, C3)
In some cases, we only have to read one file: here only File 2 contains (R1, C3).
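The file-skipping idea for the (R1, C3) query can be sketched with a toy Bloom filter keyed on (row, column). This is an illustration only: the `BloomFilter` class, its sizes, the SHA-256 based hashing and the `files_to_read` helper are all assumptions made for the sketch, not HBase's implementation. A Bloom filter never gives false negatives, so a file that holds the key is always read; false positives are possible, so an occasional unnecessary read remains.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter; bit count and hash count are arbitrary choices."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

def files_to_read(files, row, col):
    """Return indexes of files whose ROWCOL filter may contain (row, col)."""
    candidates = []
    for i, kvs in enumerate(files):
        bloom = BloomFilter()
        for r, c, _ts in kvs:
            bloom.add(f"{r}/{c}")
        if bloom.might_contain(f"{row}/{col}"):
            candidates.append(i)
    return candidates

FILES = [
    [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
     ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
     ("R2", "C2", 3)],
    [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
]

candidates = files_to_read(FILES, "R1", "C3")
```

The second file (index 1) is always among the candidates because it really contains (R1, C3); the other two are skipped unless the filter happens to report a false positive.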
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
File 1: (R1,C1,T2) (R1,C1,T1) (R1,C2,T1) (R2,C1,T1) (R2,C2,T2) (R2,C2,T1) (R2,C3,T1)
File 2: (R1,C1,T3) (R1,C2,T3) (R1,C3,T2) (R2,C1,T2) (R2,C2,T3)
File 3: (R1,C1,T4) (R1,C2,T2) (R2,C1,T1)
Query: C1 and C3 in all rows
▪ Seek to (R1, C1)
▪ Seek to (R1, C3): the two files without (R1, C3) are positioned at the fake key (R1, end of C3)
▪ Seek to (R2, C1): candidates are (R2, C1, T1), (R2, C1, T1) and (R2, C1, T2); (R2, C1, T2) wins by timestamp
▪ Seek to (R2, C3): one file returns (R2, C3, T1); the other two are positioned at the fake key (R2, end of C3)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (timestamps T1 – T2): (R1,C1,T2) (R1,C1,T1) (R1,C2,T1) (R2,C1,T1) (R2,C2,T2) (R2,C2,T1) (R2,C3,T1)
File 2 (timestamps T2 – T3): (R1,C1,T3) (R1,C2,T3) (R1,C3,T2) (R2,C1,T2) (R2,C2,T3)
File 3 (timestamps T1 – T4): (R1,C1,T4) (R1,C2,T2) (R2,C1,T1)
Seek to (R1, C1):
▪ Fake keys: (R1, C1, T2) for File 1, (R1, C1, T3) for File 2, (R1, C1, T4) for File 3
▪ File 3 does a real seek and returns (R1, C1, T4); the fake keys (R1, C1, T3) and (R1, C1, T2) stay in place
Seek to (R1, C3):
▪ Fake keys: (R1, C3, T2) for File 1, (R1, C3, T3) for File 2, (R1, C3, T4) for File 3
▪ File 3 does a real seek and lands on (R2, C1, T1); the fake keys (R1, C3, T3) and (R1, C3, T2) stay in place
▪ File 2 does a real seek: (R1, C3, T2) is next; File 1 still holds the fake key (R1, C3, T2)
Seek to (R2, C1):
▪ Fake key (R2, C1, T3) for File 2, to be selected next; fake key (R2, C1, T2) for File 1; File 3 is already at (R2, C1, T1)
▪ File 2 does a real seek: (R2, C1, T2) wins by timestamp; File 1 still holds the fake key (R2, C1, T2)
Seek to (R2, C3):
▪ Fake keys: (R2, C3, T2) for File 1, (R2, C3, T3) for File 2, (R2, C3, T4) for File 3
▪ File 3 reaches EOF; File 2 does a real seek to (R2, C3, T3); File 1 still holds the fake key (R2, C3, T2)
▪ File 2 reaches EOF; File 1 returns (R2, C3, T1)
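The walkthrough above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the HBASE-4465 patch: every file scanner answers a seek with a fake key built from its own maximum timestamp, and a real seek (the step that may touch disk) happens only when a fake key reaches the top of the heap. The class `LazyFileScanner`, the function `lazy_seek`, and the rule that a real key beats an equal fake key are hypothetical choices made for the illustration.

```python
import bisect
import heapq

def sort_key(kv):
    """Order by row, then column, then newest timestamp first."""
    row, col, ts = kv
    return (row, col, -ts)

class LazyFileScanner:
    def __init__(self, kvs):
        self.kvs = sorted(kvs, key=sort_key)
        self.keys = [sort_key(kv) for kv in self.kvs]
        self.max_ts = max(ts for _, _, ts in kvs)
        self.real_seeks = 0

    def fake_key(self, row, col):
        # Newest KeyValue this file could possibly hold for (row, col).
        return (row, col, self.max_ts)

    def real_seek(self, kv):
        # First stored KeyValue at or after kv, or None at end of file.
        self.real_seeks += 1
        i = bisect.bisect_left(self.keys, sort_key(kv))
        return self.kvs[i] if i < len(self.kvs) else None

def lazy_seek(scanners, row, col):
    """Return the first KeyValue at or after (row, col) across all files."""
    # Heap entries are (sort key, is_fake, scanner index); False sorts
    # before True, so a real key wins a tie against a fake key (assumption).
    heap = [(sort_key(s.fake_key(row, col)), True, i)
            for i, s in enumerate(scanners)]
    heapq.heapify(heap)
    while heap:
        (r, c, neg_ts), is_fake, i = heapq.heappop(heap)
        if not is_fake:
            return (r, c, -neg_ts)
        kv = scanners[i].real_seek((r, c, -neg_ts))
        if kv is not None:
            heapq.heappush(heap, (sort_key(kv), False, i))
    return None

FILES = [
    [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
     ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
     ("R2", "C2", 3)],
    [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
]

scanners = [LazyFileScanner(f) for f in FILES]
first = lazy_seek(scanners, "R1", "C1")
seeks_for_first = [s.real_seeks for s in scanners]
```

With the slide's data, the seek to (R1, C1) returns (R1, C1, T4) after a single real seek in File 3; the two files with older timestamp ranges are never read, which is the saving when most reads want recent data.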
Top-of-the-row seek
Some applications do not use DeleteFamily
▪ We always seek to the top of the row first
    ▪ DeleteFamily comes before all columns, i.e. at (R1, empty column)
    ▪ Even if we only need (R1, C1), there might be a DeleteFamily for R1
▪ Some applications do not even use DeleteFamily
▪ Two fixes by Liyin Tang:
    ▪ Utilize existing ROWCOL Bloom filter (HBASE-4469)
    ▪ Added a separate ROW-only Bloom filter for DeleteFamily (HBASE-4532)
Seek on deleted KV (HBASE-4585)
What if the requested column has been deleted?
▪ We are requesting C1, C2, ..., Cn
▪ What if we see a delete marker for Ci?
▪ Previously, we would keep calling next()
▪ Now, we seek to the (i + 1)-th requested column (also a fix by Liyin)
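The difference between the two behaviours can be sketched by counting how many KeyValues get examined in one row. This is a simplified Python model, not the HBASE-4585 change itself: `scan_row` and `ROW` are hypothetical names, one version per column is returned, and a DeleteColumn marker is assumed to sort first within its column and to mask every Put of that column.

```python
import bisect

def scan_row(kvs, requested, seek_on_delete):
    """Scan one row; return (results, number of KeyValues examined).

    kvs: list of (column, timestamp, kind) in file order, where kind is
    "Put" or "DeleteColumn" (see the assumptions in the text above).
    """
    requested = sorted(requested)
    columns = [kv[0] for kv in kvs]   # sorted, because kvs is sorted
    results, deleted, seen = [], set(), set()
    examined = 0
    pos = 0
    while pos < len(kvs):
        col, ts, kind = kvs[pos]
        examined += 1
        if kind == "DeleteColumn":
            deleted.add(col)
            if seek_on_delete:
                # Jump straight to the next requested column, if any.
                nxt = bisect.bisect_right(requested, col)
                if nxt == len(requested):
                    break
                pos = bisect.bisect_left(columns, requested[nxt])
                continue
        elif col in requested and col not in deleted and col not in seen:
            seen.add(col)
            results.append((col, ts))
        pos += 1   # the old behaviour: step over KeyValues one by one
    return results, examined

ROW = [("C1", 5, "DeleteColumn"), ("C1", 4, "Put"), ("C1", 3, "Put"),
       ("C1", 2, "Put"), ("C1", 1, "Put"), ("C2", 1, "Put")]

with_next = scan_row(ROW, ["C1", "C2"], seek_on_delete=False)
with_seek = scan_row(ROW, ["C1", "C2"], seek_on_delete=True)
```

Both calls return the same result, the single live value in C2, but the next()-based scan examines all six KeyValues while the seek-based scan examines two.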
Data block read requests (dark launch)
Thu, Sep 15 – Sun, Sep 25 2011
Pushed on Tue Sep 20th:
• No extra next when done with column/row (HBASE-4433)
• No KV prefetch (HBASE-4434)
• Lazy Seek (HBASE-4465)
Fri Sep 16th vs. Sep 23rd: 45% savings in logical block read requests (cache hits + misses)
Data block read requests (dark launch)
Sun, Sep 25 – Mon, Oct 3 2011
Pushed on Fri Sep 30th:
• Avoid top-of-the-row seek (HBASE-4469, Liyin)
• Off-peak compactions (HBASE-4463, Karthik)
Sun Sep 25th vs. Oct 2nd: 33% savings in logical block read requests (cache hits + misses)
Data block cache misses (dark launch)
▪ 20.6 K (Mon Sep 19th) -> 11.8 K (Mon Sep 26th) -> 9.8 K (Mon Oct 3rd)
▪ 52% savings (42% and then 17% more)
• No next KV prefetch
• No next() when done with row/column
• Lazy Seek
• No top-of-the-row seek
• Off-peak compactions
Avoid loading previous block (HBASE-4443)
We sometimes go to previous block on exact match
▪ Future work
▪ Suppose the first key of a block matches (Row, Column)
▪ But maybe there is an earlier key that would also match?
▪ We load the previous block to find out
▪ Possible fixes:
    ▪ Track deletes and optimize the MAX_VERSIONS=1 case
    ▪ Add last key in block to index (increases index size)
Top-of-the-column seek (HBASE-4962)
Some applications do not use DeleteColumn
▪ Future work
▪ DeleteColumn deletes all versions of a particular column
▪ Comes before all Puts for a (Row, Column)
▪ Slows down timestamp range queries
▪ Proposed solution:
    ▪ Add a (Row, Column) Bloom filter for DeleteColumn only
    ▪ Seek to (Row, Column, T2) for a [T1, T2] range query