An Introduction to Apache HBase
Apache Hadoop HBase
What is it?
Why use it?
Architecture
Storage
Related Projects
HBase: What is it?
A Hadoop Data Store
A NoSQL store for big data
It is Open Source, written in Java
It is a distributed database
Automatic sharding: table data is spread over the cluster
Automatic region server failover
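A minimal sketch of the two "automatic" points above, assuming a table is split into sorted row-key ranges (regions) and each region is hosted by one region server. The class RegionMap, the split keys, and the server names rs1 to rs3 are hypothetical and only for illustration; this is a toy model, not the HBase client API.

```python
import bisect


class RegionMap:
    """Toy model: a table split into row-key ranges, each hosted by one server."""

    def __init__(self, split_keys, servers):
        # n sorted split keys divide the key space into n + 1 regions.
        self.split_keys = sorted(split_keys)
        self.servers = list(servers)
        # Assumption: regions are handed out round-robin across servers.
        self.assignment = {
            region: self.servers[region % len(self.servers)]
            for region in range(len(self.split_keys) + 1)
        }

    def region_for(self, row_key):
        # A key equal to a split key belongs to the region that starts at it.
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        return self.assignment[self.region_for(row_key)]

    def fail_over(self, dead_server):
        """Reassign every region of a dead server to the surviving servers."""
        survivors = [s for s in self.servers if s != dead_server]
        if not survivors:
            raise RuntimeError("no region servers left")
        self.servers = survivors
        moved = 0
        for region in sorted(self.assignment):
            if self.assignment[region] == dead_server:
                self.assignment[region] = survivors[moved % len(survivors)]
                moved += 1
        return moved


table = RegionMap(split_keys=["g", "n", "t"], servers=["rs1", "rs2", "rs3"])
print(table.server_for("melon"))  # row "melon" falls in the range starting at "g"
table.fail_over("rs2")
print(table.server_for("melon"))  # same row, now served by a surviving server
```

The client only needs the row key to find the right server, which is why losing one server means moving its regions rather than re-sharding the whole table.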
HBase: Why / When to use it?
Data in billions of rows
Complex data
High volume of I/O
A large number of data nodes, 5 or more
No need for extra RDBMS functions, e.g. transactions
HBase Architecture
Where does HBase sit in relation to Hadoop?
HBase Architecture
HBase is a data store
Uses Hadoop for distributed storage
Data stored across region servers
Region server data spread across HDFS data nodes
A write ahead log (WAL) is used to record changes
HBase Storage
What is the architecture?
HBase Storage
Client makes a call, e.g. put
Request is sent by RPC as a key value to the region server
Key value is routed to the region that holds the row
Data is written to the WAL
Data is written to the region MemStore
If the region server crashes, the WAL can be used to recover data
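The write path above can be sketched as follows. This is a simplified model under stated assumptions: the WAL is a plain Python list standing in for a durable log, the MemStore is a dict, and the class RegionServer and its methods put, get, and recover are hypothetical names, not the HBase API. It ignores the RPC layer, timestamps, and flushing the MemStore to disk.

```python
class RegionServer:
    """Toy write path: log the change first, then apply it in memory."""

    def __init__(self, wal=None):
        # Assumption: the WAL outlives the server process (here, just a list).
        self.wal = wal if wal is not None else []
        # The MemStore is in-memory only and is lost on a crash.
        self.memstore = {}

    def put(self, row, column, value):
        # Step 1: record the change in the write-ahead log.
        self.wal.append((row, column, value))
        # Step 2: apply the change to the in-memory store.
        self.memstore[(row, column)] = value

    def get(self, row, column):
        return self.memstore.get((row, column))

    @classmethod
    def recover(cls, wal):
        """Rebuild the MemStore by replaying the surviving WAL in order."""
        server = cls(wal=list(wal))
        for row, column, value in server.wal:
            server.memstore[(row, column)] = value
        return server


server = RegionServer()
server.put("row1", "cf:name", "alice")
server.put("row1", "cf:name", "bob")   # later write to the same cell
surviving_wal = list(server.wal)       # the log survives the crash
del server                             # the MemStore is gone

recovered = RegionServer.recover(surviving_wal)
print(recovered.get("row1", "cf:name"))
```

Because the log is written before the MemStore, any change the client saw acknowledged is in the log, and replaying it in order restores the latest value of each cell.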
HBase Related Projects
Apache Flume: moves large data sets to Hadoop
Apache Sqoop: command-line tool, moves RDBMS data to Hadoop
Apache HBase: non-relational database
Apache Pig: analyses large data sets
Apache Oozie: workflow scheduler
Apache Mahout: machine learning and data mining
Apache Hue: Hadoop user interface
Apache ZooKeeper: distributed configuration and coordination
Contact Us
Feel free to contact us at
www.semtech-solutions.co.nz
We offer IT project consultancy
We are happy to hear about your problems
You pay only for the hours that you need to solve your problems