apache phoenix: oltp in hadoop, by james taylor - saleforce.com
TRANSCRIPT
V5
OLTP for Hadoop@ApachePhoenix
http://phoenix.apache.org/James Taylor (@JamesPlusPlus)
About James
• Architect at Salesforce.com– Part of the Big Data group– Lead of the Apache Phoenix– PMC of Apache Calcite
What is Apache Phoenix?
• A relational database layer for Apache HBase
What is
• High performance horizontally scalable byte store• Suitable as store of record for mission critical data
?
What is Apache Phoenix?
What is Apache Phoenix?
• A relational database layer for Apache HBase– OLTP/Query engine
• Transforms SQL into native HBase API calls• Pushes work to cluster for parallel execution• Supports ACID transactions
– Metadata repository• Typed access to data stored in HBase tables• Support multi-tenancy modeled as SQL views
– JDBC driver• A top level Apache Software Foundation project
– Originally developed at Salesforce– A growing community with momentum
Where Does Phoenix Fit In?Sq
oop
RDB
Data
Col
lect
orFl
ume
Log
Data
Col
lect
or
Zook
eepe
rCo
ordi
natio
n
YARNCluster Resource
Manager / MapReduce
HDFS 2.0Hadoop Distributed File System
GraphXGraph analysis
framework
PhoenixQuery execution engine
HBaseDistributed Database
The Java Virtual Machine HadoopCommon JNI
SparkIterative In-Memory
Computation
MLLibData mining
PigData Manipulation
HiveStructured Query
PhoenixJDBC client
Why is Phoenix fast?• Pushes down computation to region servers
– Start/stop key range(s)– Time range min/max– Predicates– Aggregation– Sort– Limit– TopN
• Parallelizes query from client– Intra-region through statistics collection
• Supports secondary indexes– Global & co-located
Client
Server Server Server Server Server ServerPartitioned TableR1 R2 … … Rn
Intra-region guideposts
Parallel scans
Scan rangeScan ranges
• Filter• Aggregate• Sort• Limit
Scan range
Why is Phoenix fast?
Skip Scans
• Push key ranges through filter• Use SEEK_NEXT_HINT to skip data
R1
Include
Include
Include
Skip
Skip
Skip
Filters
• Pushes down WHERE clause to server• Filters entire HFile based on time-range
R1HFilet1 - 10
HFilet11 - 20
HFilet21 - 30
HFilet31 - 40 ScanHFilet31 - 40 Scan
Transactions
• Snapshot isolation model– Using Tephra (http://tephra.io/)– Supports SERIALIZABLE isolation level– Allows reading your own uncommitted data
• Optional– Enabled on a table by table basis– No performance penalty when not used
• Available in 4.7.0 release (being voted on now)
Optimistic Concurrency Control
• Avoids cost of locking rows and tables• No deadlocks or lock escalations• Cost of conflict detection and possible rollback is
higher• Good if conflicts are rare: short transaction, disjoint
partitioning of work• Conflict detection not always necessary:
write-once/append-only data
Tephra Architecture
ZooKeeper
Tx Manager(standby)
HBase
Master 1
Master 2
RS 1
RS 2 RS 4
RS 3
Client 1
Client 2
Client N
Tx Manager(active)
Transaction Lifecycle
time outtry abort
failedroll backin HBase
writeto
HBasedo work
Client Tx Manager
none
complete Vabortsucceeded
in progress
start txstart
start tx
committry commit check conflicts
invalid Xinvalidatefailed
Tephra Architecture
• TransactionAware client• Coordinates transaction lifecycle with manager• Communicates directly with HBase for reads and writes
• Transaction Manager• Assigns transaction IDs• Maintains state on in-progress, committed and invalid transactions
• Transaction Processor coprocessor• Applies server-side filtering for reads• Cleans up data from failed transactions, and no longer visible
versions
Demo
• Debit/credit example• Multiple clients updating account balance through stream of
debits/credits.• Without transactions, balance will not be correct
What’s Next?
You are here
Introducing Apache Calcite• Query parser, compiler, and planner framework
– SQL-92 compliant• Pluggable cost-based optimizer framework
– Sane way to model push down through rules• Interop with other Calcite adaptors
– Already used by Drill, Hive, Kylin, Samza– Supports any JDBC source (RDBMS - remember them )– One cost-model to rule them all
How does Phoenix plug in?
Calcite Parser & Validator
Calcite Query Optimizer
Phoenix Query Plan GeneratorPhoenix Runtime
JDBC Client
SQL + Phoenix specific grammar
Built-in rules + Phoenix specific
rules
What’s Next?Query
Drillbit
Calcite
Phoenix/HBase HDFS
Drillbit
Calcite
RDBMS
Drillbit
Calcite
Samza/ Streams
Drillbit
Calcite
Thank you!Questions?
* who uses Phoenix