HBase London Meetup
HBase status: 0.94, 0.96, 0.98, and future releases
Matteo Bertozzi | @Cloudera
17 February 2014 (HBase London Meetup)
What is HBase?
Apache HBase is an open source, distributed, consistent, non-relational database that provides low-latency, random read/write operations on top of HDFS.
[Diagram labels: HDFS, ZooKeeper, MR, App]
Open Source - Developer Community
• Vibrant, highly active community!
• We’re growing!
What is HBase? Non-relational
• Key:Column/Value interface
• Dynamic columns (qualifiers), “no schema required”
• “Fixed” column groups (families)
• table[row:family:column] = value

Key     Family  Qualifier  Value
User-A  info    name       Theo
User-A  info    address    3 Abbey Rd - London NW8 9AY
User-B  info    name       Dave
User-C  info    . . .      . . .
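
The addressing scheme table[row:family:column] = value can be illustrated with a small Python sketch. This is only an in-memory model built from nested dictionaries, not the HBase client API; the names COLUMN_FAMILIES, put, and get are hypothetical helpers chosen for illustration.

```python
from collections import defaultdict

# Column families are fixed when the table is created (assumed set for this sketch).
COLUMN_FAMILIES = {"info"}

# table[row][family][qualifier] = value; rows and qualifiers appear on first write.
table = defaultdict(lambda: defaultdict(dict))


def put(row, family, qualifier, value):
    """Store a value; reject families that were not declared up front."""
    if family not in COLUMN_FAMILIES:
        raise KeyError(f"unknown column family: {family}")
    table[row][family][qualifier] = value


def get(row, family, qualifier):
    """Return the stored value, or None if the cell does not exist."""
    return table.get(row, {}).get(family, {}).get(qualifier)


# The rows from the example table above.
put("User-A", "info", "name", "Theo")
put("User-A", "info", "address", "3 Abbey Rd - London NW8 9AY")
put("User-B", "info", "name", "Dave")
```

Note that User-A and User-B carry different qualifiers inside the same family: qualifiers need no prior declaration, while an undeclared family is rejected.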
What is HBase? Distributed
[Diagram: a Client/App sends put, get, and scan requests to the Region Servers and create/delete table operations to the HMaster. Three Region Servers each host several Regions on top of HDFS; ZooKeeper sits alongside the HMaster.]
• Region Server
  • Server that contains a set of Regions
  • Handles read and write requests
• Region
  • Basic unit of scalability
  • Subset of the table’s data
  • Contiguous, sorted range of rows stored together
• Master
  • Coordinates the cluster (e.g. balancing)
  • Admin ops (create/delete table, …)
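
Because each Region holds a contiguous, sorted range of rows, finding the Region for a row key reduces to a search over sorted start keys. A minimal sketch, assuming made-up split points and server names (REGION_START_KEYS, REGION_SERVERS, and locate_region are hypothetical; a real client resolves this through cluster metadata rather than a local list):

```python
from bisect import bisect_right

# Sorted region start keys; "" marks the first region (assumed split points).
REGION_START_KEYS = ["", "User-G", "User-N", "User-T"]
# Hypothetical assignment of each region to a region server.
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs1"]


def locate_region(row_key):
    """Return (region index, server) for the region whose range contains row_key."""
    # The owning region is the last one whose start key is <= row_key.
    index = bisect_right(REGION_START_KEYS, row_key) - 1
    return index, REGION_SERVERS[index]
```

Adding a split point only inserts one more start key into the sorted list, which is why the Region works as the basic unit of scalability.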
Timeline and Features
Apache HBase
Apache HBase Timeline
• Nov ’06: Google BigTable, OSDI ’06
• Apr ’07: First Apache HBase commit as Hadoop contrib project
• Jan ’08: Promoted to Hadoop subproject
• Apr ’10: Apache HBase becomes top-level project
• Apr ’11: 0.90.1
• Jan ’12: 0.92.0
• May ’12: 0.94.0
• Oct ’13: 0.96.0
• Feb ’14: 0.98.0
• Q3 ’14: 1.0
Apache HBase 0.94
• Create/Delete Tables
• Table Insert, Update, Delete, Get, Scan
• Import/Export tools
• Map-Reduce job helpers
• Kerberos & ACLs
• …
(The Latest Release)
Apache HBase 0.96
0.96: Major Changes, Minimal disturbance
• …more than a year in the making
• Lots of changes under the hood
  • Hadoop Writables replaced with protobuf (RPC, metadata, …)
  • -ROOT- table removed
  • /hbase dir layout changes
• Minimal disturbance to the API
• Improved stability
• Mean Time To Recovery (MTTR)
0.96: New Features
• Online Region Merge
• Online “Schema” Change
• Snapshots
• MTTR
• Favored Nodes
• New Balancers
• Namespaces
https://blogs.apache.org/hbase/entry/hbase_0_96_0_released
Namespaces
Abstraction for multiple tenants to create and manage their own tables within a large HBase instance.
• Separate ACLs
• Performance isolation *
• Region Server groups *
[Diagram labels: Namespace blue, Namespace green, Namespace orange; RSG green, orange; RSG blue]
Mean Time to Recovery (MTTR)
[Diagram: recovery timeline with the stages detect, split, assign, replay, and recovered, several of them involving hdfs; the region goes from “Region unavailable” to “Region available for RW”.]
• Machine failures happen in distributed systems
• Repair == split, assign, replay
• Distributed log replay with fast write recovery
  • Writes in HBase do not incur reads
  • Regions are open for writes during distributed log replay
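
The “split, assign, replay” sequence can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions, not how HBase implements recovery: the recover function, the (region, row, value) log-entry shape, and the round-robin assignment are all hypothetical.

```python
from collections import defaultdict


def recover(wal_entries, live_servers):
    """Toy recovery of a failed server's regions from its write-ahead log.

    wal_entries: ordered list of (region, row, value) edits (assumed shape).
    live_servers: names of the servers still running (assumed non-empty).
    """
    # Split: group the failed server's log entries by region.
    per_region = defaultdict(list)
    for region, row, value in wal_entries:
        per_region[region].append((row, value))

    # Assign: hand each region to a live server (round-robin for simplicity).
    assignment = {
        region: live_servers[i % len(live_servers)]
        for i, region in enumerate(sorted(per_region))
    }

    # Replay: apply edits in log order so the latest write to a row wins.
    recovered = {region: dict(edits) for region, edits in per_region.items()}
    return assignment, recovered
```

For example, recover([("r1", "a", 1), ("r2", "b", 2), ("r1", "a", 3)], ["rs2", "rs3"]) assigns r1 to rs2 and r2 to rs3, and the replayed state of r1 keeps the later value 3 for row "a".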
(The Next Release)
Apache HBase 0.98
0.98: HBase
• Wire compatible with 0.96
• “No binary guarantee with 0.96”
• 0.94 -> 0.98 upgrade is possible
• Map-Reduce over Snapshots
• Stripe Compaction (pluggable compaction algorithm)
• Improved WAL write throughput
• Reverse Scan
• Per-Cell ACLs, Visibility Labels, Encryption
HBase Security
• 0.90+ Kerberos (RPC level)
• 0.92+ Access Control List (aka ACL)
• 0.98+ Per-Cell ACLs
• 0.98+ Visibility Labels (aka Tags)
• 0.98+ Transparent Table/CF encryption (HBASE-7544)
  • Java KeyStore support
What’s Next?
Apache HBase 1.0 and beyond
Questions?
17 February 2014 (HBase London Meetup)