
Page 1: HBase Incremental Backup

2012/07/23

HBase Incremental Backup / Restore

Page 2: HBase Incremental Backup

How to perform Incremental Backup/Restore?

• HBase ships with a handful of useful tools:
  – CopyTable
  – Export / Import

Page 3: HBase Incremental Backup

CopyTable

• Purpose:
  – Copy part of or all of a table, either to the same cluster or to another cluster

• Usage:
  – bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename

• Options:
  – starttime: Beginning of the time range.
  – endtime: End of the time range. Omitting endtime means from starttime onward with no upper bound.
  – new.name: New table's name.
  – peer.adr: Address of the peer cluster, given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
  – families: Comma-separated list of ColumnFamilies to copy.
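The options above can be combined into a concrete invocation. A minimal sketch, assuming hypothetical table names usertable and usertable_backup; the time range covers the previous UTC day, expressed in epoch milliseconds as CopyTable expects. The command is echoed rather than executed, since it must run on a node with the hbase binary and cluster configuration:

```shell
# Epoch-millisecond bounds for the previous UTC day (GNU date assumed)
STARTTIME=$(( $(date -u -d "yesterday 00:00" +%s) * 1000 ))
ENDTIME=$(( $(date -u -d "today 00:00" +%s) * 1000 ))

# Hypothetical table names; run the echoed command on a cluster node
CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$STARTTIME --endtime=$ENDTIME --new.name=usertable_backup usertable"
echo "$CMD"
```

Omitting --peer.adr, as here, copies within the same cluster; adding it sends the Puts to another cluster instead.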

Page 4: HBase Incremental Backup

CopyTable (cont.)

• Limitations
  – Can only back up to another table (Scan + Put).
  – While CopyTable is running, rows may be inserted or updated concurrently, and these concurrent edits may cause inconsistency in the copy.

Page 5: HBase Incremental Backup

Export

• Purpose:
  – Dump the contents of a table to HDFS as a sequence file

• Usage:
  – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]

• Options:
  – *tablename: The name of the table to export
  – *outputdir: The location in HDFS to store the exported data
  – versions: Number of cell versions to export (defaults to 1)
  – starttime: Beginning of the time range
  – endtime: The matching end time for the time range of the scan used
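As a sketch, exporting one day's edits might look like the following; the table name, the /backup HDFS path, and the UTC day are all hypothetical, versions is set to 1, and the time bounds are epoch milliseconds. The command is echoed rather than run, since it needs a node with the hbase binary:

```shell
# Millisecond bounds for the UTC day 2012-07-22 (GNU date assumed)
STARTTIME=$(( $(date -u -d "2012-07-22 00:00" +%s) * 1000 ))
ENDTIME=$(( $(date -u -d "2012-07-23 00:00" +%s) * 1000 ))
OUTPUTDIR="/backup/usertable/2012/07/22"   # hypothetical HDFS path

echo bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
  usertable "$OUTPUTDIR" 1 "$STARTTIME" "$ENDTIME"
```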

Page 6: HBase Incremental Backup

Export (cont.)

• Limitations
  – Can only back up to HDFS as a sequence file (Scan + write to HDFS).
  – While an Export is running, rows may be inserted or updated concurrently, and these concurrent edits may cause inconsistency in the backup.

Page 7: HBase Incremental Backup

Import

• Purpose:
  – Load data that has been exported back into HBase

• Usage:
  – $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
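Restoring is the mirror image of the Export step. A minimal sketch, assuming the sequence files from a previous Export live under a hypothetical HDFS path, and that the target table already exists with the matching column families (Import does not create the table):

```shell
# Hypothetical table and path; run the echoed command on a cluster node
TABLE=usertable_restored
INPUTDIR=/backup/usertable/2012/07/22
echo bin/hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$INPUTDIR"
```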

Page 8: HBase Incremental Backup

Conclusion

• Regular (e.g. daily) incremental backup
  – Use Export and organize the output dir as a meaningful hierarchy
    • /table_name/2012 (year)/07 (month)/01 … /31 (date)/01 … /24 (hour)
  – Perform Import to restore data on demand
• To reduce the overhead, don't perform backups during peak time
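The hierarchy above can be produced by a small daily job. A sketch under stated assumptions: GNU date, a hypothetical table name, and a /backup root in HDFS; each run exports the previous UTC day's edits into /backup/<table>/YYYY/MM/DD, and the Export command is echoed rather than executed:

```shell
TABLE=usertable                                 # hypothetical table name
DAY=$(date -u -d yesterday +%Y/%m/%d)           # e.g. 2012/07/22
STARTTIME=$(( $(date -u -d "yesterday 00:00" +%s) * 1000 ))
ENDTIME=$(( $(date -u -d "today 00:00" +%s) * 1000 ))
OUTPUTDIR="/backup/$TABLE/$DAY"

# Schedule off-peak (e.g. via cron) to keep the Scan load away from peak time
echo bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
  "$TABLE" "$OUTPUTDIR" 1 "$STARTTIME" "$ENDTIME"
```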

Page 9: HBase Incremental Backup

Questions?