apache phoenix: oltp in hadoop, by james taylor - saleforce.com

V5

OLTP for Hadoop@ApachePhoenix

http://phoenix.apache.org/James Taylor (@JamesPlusPlus)

About James

• Architect at Salesforce.com– Part of the Big Data group– Lead of the Apache Phoenix– PMC of Apache Calcite

What is Apache Phoenix?

• A relational database layer for Apache HBase

What is

• High performance horizontally scalable byte store• Suitable as store of record for mission critical data

?


• A relational database layer for Apache HBase– OLTP/Query engine

• Transforms SQL into native HBase API calls• Pushes work to cluster for parallel execution• Supports ACID transactions

– Metadata repository• Typed access to data stored in HBase tables• Support multi-tenancy modeled as SQL views

– JDBC driver• A top level Apache Software Foundation project

– Originally developed at Salesforce– A growing community with momentum

Where Does Phoenix Fit In?Sq

oop

RDB

Data

Col

lect

orFl

ume

Log

Data

Col

lect

or

Zook

eepe

rCo

ordi

natio

n

YARNCluster Resource

Manager / MapReduce

HDFS 2.0Hadoop Distributed File System

GraphXGraph analysis

framework

PhoenixQuery execution engine

HBaseDistributed Database

The Java Virtual Machine HadoopCommon JNI

SparkIterative In-Memory

Computation

MLLibData mining

PigData Manipulation

HiveStructured Query

PhoenixJDBC client

Why is Phoenix fast?• Pushes down computation to region servers

– Start/stop key range(s)– Time range min/max– Predicates– Aggregation– Sort– Limit– TopN

• Parallelizes query from client– Intra-region through statistics collection

• Supports secondary indexes– Global & co-located

Client

Server Server Server Server Server ServerPartitioned TableR1 R2 … … Rn

Intra-region guideposts

Parallel scans

Scan rangeScan ranges

• Filter• Aggregate• Sort• Limit

Scan range

Why is Phoenix fast?

Skip Scans

• Push key ranges through filter• Use SEEK_NEXT_HINT to skip data

R1

Include

Include

Include

Skip

Skip

Skip

Filters

• Pushes down WHERE clause to server• Filters entire HFile based on time-range

R1HFilet1 - 10

HFilet11 - 20

HFilet21 - 30

HFilet31 - 40 ScanHFilet31 - 40 Scan

Transactions

• Snapshot isolation model– Using Tephra (http://tephra.io/)– Supports SERIALIZABLE isolation level– Allows reading your own uncommitted data

• Optional– Enabled on a table by table basis– No performance penalty when not used

• Available in 4.7.0 release (being voted on now)

http://tephra.io/

Optimistic Concurrency Control

• Avoids cost of locking rows and tables• No deadlocks or lock escalations• Cost of conflict detection and possible rollback is

higher• Good if conflicts are rare: short transaction, disjoint

partitioning of work• Conflict detection not always necessary:

write-once/append-only data

Tephra Architecture

ZooKeeper

Tx Manager(standby)

HBase

Master 1

Master 2

RS 1

RS 2 RS 4

RS 3

Client 1

Client 2

Client N

Tx Manager(active)

Transaction Lifecycle

time outtry abort

failedroll backin HBase

writeto

HBasedo work

Client Tx Manager

none

complete Vabortsucceeded

in progress

start txstart

start tx

committry commit check conflicts

invalid Xinvalidatefailed

Tephra Architecture

• TransactionAware client• Coordinates transaction lifecycle with manager• Communicates directly with HBase for reads and writes

• Transaction Manager• Assigns transaction IDs• Maintains state on in-progress, committed and invalid transactions

• Transaction Processor coprocessor• Applies server-side filtering for reads• Cleans up data from failed transactions, and no longer visible

versions

Demo

• Debit/credit example• Multiple clients updating account balance through stream of

debits/credits.• Without transactions, balance will not be correct

What’s Next?

You are here

Introducing Apache Calcite• Query parser, compiler, and planner framework

– SQL-92 compliant• Pluggable cost-based optimizer framework

– Sane way to model push down through rules• Interop with other Calcite adaptors

– Already used by Drill, Hive, Kylin, Samza– Supports any JDBC source (RDBMS - remember them )– One cost-model to rule them all

How does Phoenix plug in?

Calcite Parser & Validator

Calcite Query Optimizer

Phoenix Query Plan GeneratorPhoenix Runtime

JDBC Client

SQL + Phoenix specific grammar

Built-in rules + Phoenix specific

rules

What’s Next?Query

Drillbit

Calcite

Phoenix/HBase HDFS

Drillbit

Calcite

RDBMS

Drillbit

Calcite

Samza/ Streams

Drillbit

Calcite

Thank you!Questions?

* who uses Phoenix

apache phoenix: oltp in hadoop, by james taylor - saleforce.com

Technology