HBase State of the Union
![Page 2: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/2.jpg)
About Me
Enis Söztutar
• Committer and PMC member in Apache HBase, Phoenix, and Hadoop
• HBase/Phoenix dev @Hortonworks
![Page 3: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/3.jpg)
Outline
Versions, compatibility
Releases, what is in HBase-{1.1, 1.2, 1.3}
New Developments
HBase-2.0
![Page 4: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/4.jpg)
Versions, Compatibility
![Page 5: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/5.jpg)
Semantic Versioning
Starting with the 1.0 release, HBase works toward Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: backward-compatible (BC) bug fixes only
MINOR: BC new features
MAJOR: incompatible changes
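To make the scheme concrete, here is a minimal sketch that classifies an upgrade by comparing two MAJOR.MINOR.PATCH[-identifiers] strings. The function names `parse_version` and `upgrade_kind` are made up for this illustration and are not part of HBase; the sketch also assumes both versions follow the scheme, which the slide only claims from 1.0 onward.

```python
import re

# MAJOR.MINOR.PATCH with an optional "-identifiers" suffix, as on the slide.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse_version(text):
    """Split a version string into (major, minor, patch, identifiers)."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch, identifiers = match.groups()
    return int(major), int(minor), int(patch), identifiers


def upgrade_kind(old, new):
    """Name the most significant component that changes between old and new."""
    old_parts = parse_version(old)[:3]
    new_parts = parse_version(new)[:3]
    if new_parts < old_parts:
        return "downgrade"
    for label, before, after in zip(("major", "minor", "patch"), old_parts, new_parts):
        if before != after:
            return label
    return "same"
```

For example, `upgrade_kind("1.1.5", "1.2.0")` yields `"minor"`, so under the rules above it should bring only backward-compatible new features.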
![Page 6: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/6.jpg)
SemVer in Action
1.0 Released last year. Started following semantic versioning
10 releases with 1.x.y versions. More coming!
Release notes contain “compatibility” report for source / binary
Patch upgrades do not have new features. Drop-in replacement.
Minor versions are “compatible”
![Page 7: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/7.jpg)
To be, or not to be (Compatible)
![Page 8: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/8.jpg)
To be, or not to be (Compatible)
Compatibility is NOT a simple yes or no
Many dimensions
• source, binary, wire, command line, dependencies, etc.
What is the client interface?
• InterfaceAudience.{Public,Private,LimitedPrivate}
Read https://hbase.apache.org/book.html#upgrading
![Page 9: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/9.jpg)
| | Major | Minor | Patch |
| --- | --- | --- | --- |
| Client-Server Wire Compatibility | ✗ | ✓ | ✓ |
| Server-Server Compatibility | ✗ | ✓ | ✓ |
| File Format Compatibility | ✗* | ✓ | ✓ |
| Client API Compatibility | ✗ | ✓ | ✓ |
| Client Binary Compatibility | ✗ | ✗ | ✓ |
| Server Side Limited API Compatibility | ✗ | ✗* / *✓ | ✓ |
| Dependency Compatibility | ✗ | ✓ | ✓ |
| Operation Compatibility | ✗ | ✗ | ✓ |
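The table can be read as a lookup from upgrade type to the guarantees that are dropped. The sketch below encodes only the rows without footnote markers; the dictionary layout and the name `broken_by` are illustrative assumptions, not an HBase API.

```python
# Rows copied from the slide's table; True means the guarantee is kept (✓).
# Rows that carry footnote markers (*) on the slide are left out of this sketch.
COMPATIBILITY = {
    "client-server wire": {"major": False, "minor": True, "patch": True},
    "server-server": {"major": False, "minor": True, "patch": True},
    "client API": {"major": False, "minor": True, "patch": True},
    "client binary": {"major": False, "minor": False, "patch": True},
    "dependency": {"major": False, "minor": True, "patch": True},
    "operation": {"major": False, "minor": False, "patch": True},
}


def broken_by(upgrade):
    """List the dimensions NOT guaranteed for a 'major', 'minor' or 'patch' upgrade."""
    return sorted(name for name, row in COMPATIBILITY.items() if not row[upgrade])
```

Calling `broken_by("minor")` returns the client binary and operation dimensions, which is why a minor upgrade may still require recompiling client code and reviewing operational procedures.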
![Page 10: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/10.jpg)
Releases
![Page 11: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/11.jpg)
2015 H2 – 2016 H1 (repo and releases)
[Figure: branch and release timeline]
(master) 2.0.0-SNAPSHOT
(branch-1) 1.4.0-SNAPSHOT
(branch-1.3) 1.3.0 RC
(branch-1.2) 1.2.0, 1.2.1, 1.2.2 RC
(branch-1.1) 1.1.0 … 1.1.5
(branch-1.0) 1.0.0 … 1.0.3
(0.98) … 0.98.19, 0.98.20
![Page 12: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/12.jpg)
![Page 13: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/13.jpg)
RTFM – HBase-1.1 Release Notes
• Async RPC client
• Simple RPC throttling
• Improved compaction controls
• Scan improvements
• Procedure V2 for improved reliability of cluster operations (HBASE-12439)
• New extension interfaces for coprocessor users
• Per-column family flush
• WAL on SSD
• BlockCache in Memcached
• Region replica enhancements around META, WAL, and bulk loading
![Page 14: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/14.jpg)
RTFM – HBase-1.2 Release Notes
• JDK8 is now supported
• Hadoop 2.6.1+ and Hadoop 2.7.1+ are now supported
• Per column-family time ranges for scan
• Daemons respond to SIGHUP to reload configs
• Region location methods added to thrift2 proxy
• Table-level sync that sends deltas
• Client side metrics via JMX
![Page 15: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/15.jpg)
RTFM – HBase-1.3 Release Notes
• Date-based tiered compactions
• Maven archetypes for HBase client applications
• Throughput controller for flushes
• Controlled delay (CoDel) based RPC scheduler (HBASE-15136)
• Bulk loaded HFile replication
• More improvements to Procedure V2
• Improvements to Multi WAL
• Many improvements and optimizations in metrics subsystem
• Reduced memory allocation in RPC layer
• Region location lookup optimizations in the HBase client
![Page 16: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/16.jpg)
Releases – How to choose
0.98 is still released frequently and will likely continue until the end of 2016
1.0 is EOL’ed. Move to 1.1 at least
Both 1.1 and 1.2 are pretty stable
Starting from scratch, use 1.2 or 1.3
1.3 is coming shortly
Moving between minor versions is easy for 1.x
![Page 17: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/17.jpg)
New Developments
![Page 18: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/18.jpg)
New Compaction Policies for Time series
FIFO: First In, First Out
• No Compaction!
• Only data with very short TTL
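A rough sketch of the FIFO idea, assuming each store file records the timestamp of its newest cell: once that timestamp is older than the TTL, the whole file can be dropped without rewriting anything. `StoreFile` and `fifo_expired` are hypothetical names for this illustration, not HBase classes.

```python
from collections import namedtuple

# Hypothetical stand-in for an HFile: a name plus its newest cell timestamp (ms).
StoreFile = namedtuple("StoreFile", ["name", "max_ts"])


def fifo_expired(files, now_ms, ttl_ms):
    """Return files whose newest cell is already past the TTL.

    Such files hold only expired data, so they can be deleted whole
    instead of being merged and rewritten by a compaction.
    """
    cutoff = now_ms - ttl_ms
    return [f for f in files if f.max_ts < cutoff]


files = [StoreFile("old", 1_000), StoreFile("recent", 9_500)]
expired = fifo_expired(files, now_ms=10_000, ttl_ms=2_000)
```

This also shows why the policy suits only short-TTL data: nothing is ever merged, so files accumulate until they age out.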
Date Tiered Compaction
• Dramatic reduction in IO!
• Partition hfiles and compaction by time windows
• Scans with time ranges filter out whole files
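The two date-tiered bullets can be sketched as follows. This is a simplification under stated assumptions: it uses fixed-size windows, whereas the actual policy tiers its windows, and `StoreFile`, `group_by_window` and `files_for_scan` are hypothetical names rather than HBase APIs.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for an HFile with its oldest and newest cell timestamps (ms).
StoreFile = namedtuple("StoreFile", ["name", "min_ts", "max_ts"])


def group_by_window(files, window_ms):
    """Bucket files by the fixed-size time window that holds their newest cell.

    Compaction would then merge files only within one bucket.
    """
    windows = defaultdict(list)
    for f in files:
        windows[f.max_ts // window_ms].append(f)
    return dict(windows)


def files_for_scan(files, start_ts, end_ts):
    """Keep only files whose [min_ts, max_ts] overlaps the scan range [start_ts, end_ts)."""
    return [f for f in files if f.max_ts >= start_ts and f.min_ts < end_ts]


files = [
    StoreFile("a", 0, 999),
    StoreFile("b", 1000, 1999),
    StoreFile("c", 1200, 1800),
    StoreFile("d", 2000, 2500),
]
windows = group_by_window(files, window_ms=1000)
needed = files_for_scan(files, start_ts=1000, end_ts=2000)
```

Because files stay partitioned by time, a scan over one window touches only that window's files; here files `a` and `d` are skipped without being opened.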
![Page 19: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/19.jpg)
Date Tiered Compaction
From https://labs.spotify.com/2014/12/18/date-tiered-compaction/
![Page 20: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/20.jpg)
Spark Integration
• RDD
• DataFrame / DataSet / SparkSQL
• Partition pruning
• Column pruning
• Data locality
• Predicate pushdown
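As an illustration of partition pruning, the sketch below picks only the regions whose key range overlaps a row-key range predicate. It is a toy model, not the connector's code: regions are plain `(start_key, end_key)` tuples, an empty byte string marks an open boundary, and `regions_for_range` is a made-up name.

```python
def regions_for_range(regions, start_row, stop_row):
    """Return the regions that overlap the row range [start_row, stop_row).

    regions: list of (start_key, end_key) byte-string pairs, where b"" as
    start_key or end_key means the region is unbounded on that side.
    stop_row == b"" means the scan has no upper bound.
    """
    selected = []
    for region_start, region_end in regions:
        starts_before_stop = stop_row == b"" or region_start < stop_row
        ends_after_start = region_end == b"" or region_end > start_row
        if starts_before_stop and ends_after_start:
            selected.append((region_start, region_end))
    return selected


regions = [(b"", b"g"), (b"g", b"p"), (b"p", b"")]
pruned = regions_for_range(regions, b"h", b"k")
```

With the predicate limited to rows between `h` and `k`, only the middle region needs a task; the other two partitions are never scheduled.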
![Page 21: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/21.jpg)
Spark Integration
![Page 22: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/22.jpg)
Perf
Async
• Async RPC client already in
• Async Client
• Async WAL Writer
Row locks, Read / Write
Write path re-ordered
![Page 23: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/23.jpg)
New Development – In Progress
RPC Scheduling improvements
Replication 2.0
Reduce Garbage
C++ Client
Backup / Restore
![Page 24: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/24.jpg)
New Development – In Progress
Offheaping
Read path (done)
Write path in development
In-memory flushes/compactions
Compact in-memory representations
Fatter flushes
Assignment Manager/Master
![Page 25: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/25.jpg)
HBase-2.0
![Page 26: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/26.jpg)
HBase-2.0
Target is 2016 EOY
Learnt from singularity (0.94 -> 0.96+)
2.0 will be rolling upgradable!
• Disclaimer: to the extent that we can make it
JDK-8 only
Will work with Hadoop-3?
Assignment and data layout changes are the big drivers
![Page 27: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/27.jpg)
How to prepare for HBase-2.0
2.0 contains more API cleanup
Clean up PB (Protocol Buffers) and Guava “leaks” into the API
Some deprecated APIs (HConnection, HTable, HBaseAdmin, etc) going away
Start using JDK-8 (and G1). You will like it.
1.x client should be able to do read / write / scan against 2.0 clusters
Some DDL / Admin operations may not work
![Page 28: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/28.jpg)
Other HBase talks
Today
(3:00pm) Omid: A Transactional Framework for HBase
(4:10pm) Hive HBase Metastore - Improving Hive with a Big Data Metadata Storage
(5:00pm) Operating and Supporting Apache HBase - Best Practices and Improvements
Thursday
(2:10pm) Managing Hadoop, HBase, and Storm Clusters at Yahoo Scale
(3:00pm) Phoenix + HBase: An Enterprise Grade Data-Warehouse Appliance for Interactive Analytics?
(4:10pm) The DAP: Where Yarn, HBase, Kafka and Spark go to Production
(5:00pm) HBase BoF
![Page 29: HBase state of the union](https://reader035.vdocuments.us/reader035/viewer/2022062522/587b98a21a28ab4e4f8b6e85/html5/thumbnails/29.jpg)
Questions
Thanks for listening *.
*Here is a picture of a cat for your suffering!