An Introduction to Apache Accumulo
Apache Accumulo
What is it?
Design
Integrity
Administration
Squirrel
Accumulo: What is it?
A key/value store
A column-oriented database
Based on Google's Bigtable
Built on
Apache Hadoop
Apache ZooKeeper
Apache Thrift
Written in Java
Licensed under the Apache License, Version 2.0
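
To make the key/value model concrete, here is a minimal sketch in Python of a Bigtable-style sorted map. It is a toy model, not the Accumulo client API: the `Key` type, the `scan` helper and the sample data are illustrative assumptions. It reflects that an Accumulo key is made up of row, column family, column qualifier, visibility and timestamp, and that entries are kept sorted with the newest timestamp first.

```python
from typing import NamedTuple


class Key(NamedTuple):
    """Toy stand-in for an Accumulo key (names are illustrative)."""
    row: str
    family: str
    qualifier: str
    visibility: str
    timestamp: int


def sort_order(key):
    # Ascending on every field except timestamp, which sorts newest first.
    return (key.row, key.family, key.qualifier, key.visibility, -key.timestamp)


def scan(table, start_row, end_row):
    """Return (key, value) pairs with start_row <= row <= end_row, in key order."""
    keys = sorted((k for k in table if start_row <= k.row <= end_row), key=sort_order)
    return [(k, table[k]) for k in keys]


# Hypothetical sample data.
entries = {
    Key("user2", "profile", "name", "public", 5): "Bob",
    Key("user1", "profile", "name", "public", 7): "Alice",
    Key("user1", "profile", "name", "public", 9): "Alicia",
    Key("user1", "contact", "email", "private", 3): "alice@example.com",
}

user1 = scan(entries, "user1", "user1")
```

Because the map is sorted by key, all cells for one row sit next to each other, which is what makes range scans over rows cheap.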
Accumulo Design
Cell-level security via column visibility
Server-side programming via iterators
Table-based constraints written in Java
Sharding can be used for parallel document storage
Rows can be larger than available memory
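
The cell-level security bullet can be illustrated with a small evaluator for visibility expressions. This is a simplified Python sketch, not Accumulo's own parser: it assumes labels made of letters, digits and underscores, combined with `&`, `|` and parentheses, and it assumes that mixing `&` and `|` without parentheses is rejected. The function names are hypothetical.

```python
import re


def tokenize(expression):
    """Split a visibility expression into labels, operators and parentheses."""
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    if "".join(tokens) != expression:
        raise ValueError("unsupported character in expression")
    return tokens


def is_visible(expression, authorizations):
    """True if a user holding `authorizations` may read a cell labelled `expression`."""
    if expression == "":
        return True  # an empty label is visible to everyone
    tokens = tokenize(expression)
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expression()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in ("&", "|", ")"):
            raise ValueError("unexpected token " + repr(token))
        return token in authorizations

    def parse_expression():
        nonlocal pos
        value = parse_term()
        operator = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            if operator is not None and tokens[pos] != operator:
                raise ValueError("mixing & and | requires parentheses")
            operator = tokens[pos]
            pos += 1
            right = parse_term()
            value = (value and right) if operator == "&" else (value or right)
        return value

    result = parse_expression()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

For example, a cell labelled `admin|(audit&finance)` is readable by a user holding both `audit` and `finance`, but not by a user holding only `audit`.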
Accumulo Integrity
ZooKeeper used to manage master failover
Write-ahead logs written to each server
Logical time managed for
Consistent transactions
Bulk data import
FATE transactions (Fault-Tolerant Executor)
Transactions complete even after master failure
Isolation
Transactions see a consistent view of data at row level
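
The idea behind FATE, that an operation completes even if the master fails part-way through, can be sketched as a list of steps whose progress is persisted after each one. This is a toy Python model under stated assumptions: the dictionary stands in for a durable store such as ZooKeeper, and the step names, `MasterCrash` and the `crash_before` hook are hypothetical, not Accumulo internals.

```python
class MasterCrash(Exception):
    """Simulates the master process dying part-way through an operation."""


def run_operation(store, op_id, steps, crash_before=None):
    """Run `steps` in order, persisting progress in `store` after each one.

    `store` stands in for a durable store; `crash_before` is a hypothetical
    test hook that kills the run just before the named step.
    """
    done = store.setdefault(op_id, [])
    for name, action in steps:
        if name in done:
            continue  # finished before the crash; do not repeat it
        if name == crash_before:
            raise MasterCrash(name)
        action()
        done.append(name)  # record progress only after the step succeeds


log = []
steps = [
    ("reserve_table_id", lambda: log.append("reserve_table_id")),
    ("write_metadata", lambda: log.append("write_metadata")),
    ("bring_online", lambda: log.append("bring_online")),
]
durable_store = {}

try:
    run_operation(durable_store, "create-table-1", steps, crash_before="write_metadata")
except MasterCrash:
    pass  # the first master is gone; its progress survives in durable_store

# A replacement master picks the operation up from the persisted state.
run_operation(durable_store, "create-table-1", steps)
```

Each step runs exactly once across both masters, because the second run skips whatever the store already records as done.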
Accumulo Administration
System monitoring and stats via web page
System and table config stored in ZooKeeper
Table names mapped to IDs stored in ZooKeeper
Follow threads of execution using tracing
Record the time each action takes
Accumulo can be used with a Squirrel server
As the next slide shows
A future presentation will cover Squirrel
Accumulo with Squirrel
Accumulo Data Management
Internal Data Management
Locality groups
Group columns within a single file
Smart compaction
Smaller files are merged into larger ones using a configurable ratio until all files are merged
Minor compaction
When a tablet reaches its maximum file count, in-memory data is merged with an existing file instead of creating a new one
Loading user created jars
Load JARs from HDFS using VFS
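
The ratio-based compaction bullet above can be sketched as a file-selection rule. This is a simplified Python model, not Accumulo's source: it assumes a set of files is compacted when the sum of their sizes is larger than the ratio times the largest file, and that otherwise the largest file is dropped from consideration and the check repeats. The function name is hypothetical and the default ratio of 3 is an assumption.

```python
def select_files_to_compact(file_sizes, ratio=3.0):
    """Return the sizes of the files one compaction would merge, or []."""
    candidates = sorted(file_sizes, reverse=True)  # largest first
    while len(candidates) > 1:
        if sum(candidates) > ratio * candidates[0]:
            return candidates
        candidates = candidates[1:]  # drop the largest file and retry
    return []


# One large file plus four small ones: only the small files are merged.
small_only = select_files_to_compact([100, 10, 10, 10, 10])

# Files too uneven in size for any subset to qualify: nothing is compacted.
nothing = select_files_to_compact([50, 40, 30, 20])
```

A lower ratio makes compaction more eager (fewer files, more rewriting); a higher ratio leaves more files in place.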
Accumulo Data Management
On Demand Data Management
Compactions
Force tablets (table partitions) to compact to a single file
Tablet merging
Request tablet merging via shell
Table cloning
Clone a table from an existing one, referencing its data and config
Table import / export
Copy a table and its metadata to another cluster
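
Table cloning is cheap because the clone points at the source table's existing data files rather than copying them. A minimal Python sketch of that idea follows. It is a toy model: the `Table` class and file names are hypothetical, and it assumes the configuration is copied at clone time so that later changes to the clone do not affect the source.

```python
class Table:
    """Toy table: a config map plus a list of immutable data file names."""

    def __init__(self, config=None, files=None):
        self.config = dict(config or {})
        self.files = list(files or [])

    def clone(self):
        # Copy the config and the list of file references only;
        # the data files themselves are shared, not duplicated.
        return Table(self.config, self.files)

    def add_file(self, name):
        self.files.append(name)


source = Table({"table.file.replication": "3"}, ["F0001.rf", "F0002.rf"])
cloned = source.clone()

# Changes to the clone do not reach the source table.
cloned.add_file("F0003.rf")
cloned.config["table.file.replication"] = "2"
```

After the clone, both tables share `F0001.rf` and `F0002.rf`, while `F0003.rf` and the changed setting belong to the clone alone.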
Accumulo Screen Shot
Contact Us
Feel free to contact us at
www.semtech-solutions.co.nz
We offer IT project consultancy
We are happy to hear about your problems
You pay only for the hours you need to solve your problems