Introducing Apache Kudu (incubating) - Montreal HUG May 2016

Mladen Kovacevic | Solutions Architect
May 2016
Adapted from Jeremy Beard’s original presentation in New York, Dec 2015
© Cloudera, Inc. All rights reserved.




Presenter

• Mladen Kovacevic
• Solutions Architect at Cloudera (Toronto based)
• 2+ yrs in Big Data
• 3+ yrs building workload optimized systems, exploiting hardware features to improve software performance and capability (compute, storage, networking)
• 6+ yrs engine development enterprise RDBMS

[email protected]


Current storage landscape in Hadoop

HDFS (GFS) excels at:
• Scanning large amounts of data
• Accumulating data with high throughput

HBase (BigTable) excels at:
• Efficiently finding and writing individual rows
• Making data mutable

Gaps exist when these properties are needed simultaneously


Managing the gap (today)

Code Complexity
• Manage flow and sync of data between HDFS and HBase

Monitoring and Security
• Managing consistent backups, security policies, monitoring and more is hard

Performance
• Significant lag between arrival of HBase data “staging” and time when data is available for analytics.


Hardware landscape is changing: rethink design!

• Spinning disk (HDD) -> solid state storage (SSD)
  • NAND flash: Up to 450k read / 250k write IOPS, about 2GB/sec read and 1.5GB/sec write throughput, at a price of less than $3/GB and dropping
  • 3D XPoint memory – persistent storage (1000x faster than NAND, cheaper than RAM)
• RAM is cheaper and more abundant:
  • 64->128->256GB over last few years
• Takeaway 1: The next bottleneck is CPU, and current storage heavy applications weren’t designed with CPU efficiency in mind
• Takeaway 2: Column stores are feasible for random access


Apache Kudu (Incubating)
Storage for Fast Analytics on Fast Data

• New updatable column store for Hadoop
• Simplifies the architecture for building analytic applications on changing data
• Designed for fast analytic performance
• Natively integrated with Hadoop
• Donated as incubating project at Apache Software Foundation (November 17, 2015)
• Beta v0.8.0 now available

[Diagram: platform stack. INTEGRATE: structured (Sqoop), unstructured (Kafka, Flume). PROCESS, ANALYZE, SERVE: batch (Spark, Hive, Pig, MapReduce), stream (Spark), SQL (Impala), search (Solr), SDK (Kite). UNIFIED SERVICES: resource management (YARN), security (Sentry, RecordService). STORE: filesystem (HDFS), relational (Kudu), NoSQL (HBase).]


Kudu design goals

• High throughput for big scans (columnar storage and replication)
  Goal: Within 2x of Parquet
• Low-latency for short accesses (primary key indexes and quorum design)
  Goal: 1ms read/write on SSD
• Database-like semantics (initially single-row ACID)
• Relational data model (tables/schemas)
  • Easily integrate with SQL engines (Impala, Drill, Hive)
  • “NoSQL” style scan/insert/update (Java, C++, Python clients)


What Kudu is not

• Not a SQL interface
  • Just the storage layer
  • “BYOSQL” – Bring-your-own SQL
• Not a file system
  • Data must have tabular structure
• Not an application that runs on HDFS
  • An alternative, native Hadoop storage engine
• Not a replacement for HDFS or HBase
  • Select the right storage for the right use case
  • Cloudera will continue to support and invest in all three


What Kudu is

STORAGE SYSTEM for TABLES of STRUCTURED DATA

Storage system
• No SQL engine, nor query planner, nor optimizer.
• Storage system that lets you get at your data quickly (random/scan).
• Highly scalable, with the ability to give higher-level systems the information they need to exploit its abilities (e.g. locality).

Tables
• Data is logically presented to clients as rows and a finite set of columns
• Stored in columnar format
• Tables are partitioned across tablets managed by tablet servers.
• Tablets are replicated, with replication managed by Raft consensus.

Structured data
• Strongly typed columns
• Finite number of columns
• Add/remove columns solely by ALTER TABLE statements


About Kudu

• Apache-licensed open source software
• Structured data model
• Basic construct: tables
  • Tables broken down into tablets (roughly equivalent to partitions)
• Architecture supports geographically disparate, active/active systems
  • Not the initial design goal


Kudu data model

• Tables have an RDBMS-like schema
  • Finite number of columns (unlike HBase/Cassandra)
  • Types: BOOL, INT8/16/32/64, FLOAT, DOUBLE, STRING, BINARY, TIMESTAMP
• Some subset of columns makes up a primary key
  • Fast random reads/writes by primary key
  • No secondary indexes (yet)
• Columnar layout on disk – Parquet-like
  • Lazy materialization
  • Encoding and compression options
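The columnar layout and lazy materialization bullets can be illustrated with a small Python sketch. This is a toy model, not Kudu's storage code: the sample data and the helper name `scan_lazy` are made up for illustration. The idea shown is that a scan first reads only the column the predicate needs, and only then fetches the other projected columns for the rows that matched.

```python
# Toy column store: each column lives in its own list (hypothetical sample data).
columns = {
    "tweet_id": [1, 2, 3, 4],
    "user": ["ann", "bob", "ann", "cat"],
    "text": ["hi", "kudu", "raft", "scan"],
}


def scan_lazy(columns, predicate_col, predicate, projected_cols):
    """Filter on one column, then materialize only the matching rows."""
    # Step 1: read only the predicate column and record matching row positions.
    matches = [i for i, v in enumerate(columns[predicate_col]) if predicate(v)]
    # Step 2: read the projected columns at those positions only.
    return [tuple(columns[c][i] for c in projected_cols) for i in matches]


rows = scan_lazy(columns, "user", lambda u: u == "ann", ["tweet_id", "text"])
print(rows)  # [(1, 'hi'), (3, 'raft')]
```

In this sketch the `text` column is never touched for rows that fail the filter, which is the saving lazy materialization aims for on wide tables.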



Table partitioning

• Hash bucketing
  • Distribute records by hash of partition column(s)
  • N buckets lead to N tablets
• Range partitioning
  • Distribute records by ranges of the partition column(s)
  • N split keys lead to N+1 tablets
• Can be a mix for different columns of the primary key
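As a rough sketch of the two partitioning schemes, the Python below maps a key to a bucket or a range index. It is only an illustration under stated assumptions: `hash_bucket` and `range_tablet` are hypothetical helper names, and `zlib.crc32` is used purely because it is deterministic, not because Kudu uses it.

```python
import bisect
import zlib


def hash_bucket(key: str, num_buckets: int) -> int:
    """Pick a bucket from a stable hash of the partition column value."""
    # crc32 is a stand-in hash chosen for determinism; it is NOT Kudu's hash.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def range_tablet(key, split_keys) -> int:
    """Return the index of the range holding key, given sorted split keys."""
    # N split keys cut the key space into N + 1 ranges; in this sketch a key
    # equal to a split key falls into the range that starts at that split key.
    return bisect.bisect_right(split_keys, key)


splits = [100, 200, 300]
print(range_tablet(50, splits), range_tablet(100, splits), range_tablet(999, splits))
# 0 1 3

buckets = {hash_bucket(f"tweet-{i}", 4) for i in range(1000)}
print(sorted(buckets))
```

Hash bucketing spreads sequential keys across tablets, while range partitioning keeps neighbouring keys together, which is why the two are sometimes combined on different primary key columns.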


Kudu tables are partitioned into “tablets” (regions)

Partitioning based on primary key (PK)
• Native support for range partitioning and/or hash partitioning (salting)
• Hash example:
  PRIMARY KEY (tweet_id) DISTRIBUTE BY HASH(tweet_id) INTO 100 BUCKETS


Each tablet is fault tolerant via Raft consensus

• A single Tablet Server can host many tablets
• Metadata is stored on just another tablet, but only Master Server processes host that tablet
• Raft consensus:
  • Strong consistency
  • Leader election on failure
  • Replication factor 3 or 5 is typical
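The choice of 3 or 5 replicas follows from majority-quorum arithmetic. The short Python sketch below works it out; the function names are made up for illustration and nothing here calls Kudu.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replicas - majority(replicas)


for n in (3, 5):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 5 3 2
```

An even replica count adds cost without adding fault tolerance (4 replicas still tolerate only 1 failure), which is one reason odd factors are typical.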


Consistency model

• Consistency and replication enforced by Raft consensus (similar to Paxos)
  • Replication by operation, not data
• Single-row transactions now
  • Multi-row transactions later

• Geo-distributed replicas will be possible under strict time synchronization

• Techniques drawn from Google Spanner and others
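One way to read "replication by operation" is that replicas receive the same ordered log of logical operations and each applies it to its own local storage, rather than copying data files between servers. The Python sketch below is a toy model of that reading, with a made-up operation format; it leaves out everything Raft actually does (terms, elections, acknowledgements).

```python
def apply_op(state: dict, op: tuple) -> None:
    """Apply one logical operation (kind, key, value) to a replica's state."""
    kind, key, value = op
    if kind in ("insert", "update"):
        state[key] = value
    elif kind == "delete":
        state.pop(key, None)


# Hypothetical ordered operation log agreed on by the replicas.
log = [
    ("insert", 1, "a"),
    ("insert", 2, "b"),
    ("update", 1, "c"),
    ("delete", 2, None),
]

replicas = [{}, {}, {}]
for replica in replicas:
    for op in log:
        apply_op(replica, op)

print(replicas[0])  # {1: 'c'}
```

Because every replica applies the same operations in the same order, they converge on the same logical contents even if their on-disk layouts differ.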


Kudu interfaces

• NoSQL-style APIs
  • Insert(), Update(), Delete(), Scan()
  • Java, C++, limited Python

• Integrations with MapReduce, Spark, and Impala

• No direct access to underlying Kudu tablet files

• Beta does not have authentication, authorization, encryption


Impala integration

• Opens up Kudu to JDBC/ODBC clients

• Intuitive way to get data into Kudu
  • INSERT INTO kudu_table SELECT * FROM src_table;
• Additional commands
  • UPDATE
  • DELETE
  • Efficient INSERT VALUES

• Runs on the Kudu C++ client


Performance characteristics

Very CPU efficient
• Written in modern C++, uses specialized CPU instructions (SIMD), JIT compilation with LLVM

Latency dependent on storage hardware capabilities
• Expect sub-millisecond response on SSDs and upcoming technologies

No garbage collection allows very large memory footprint with no pauses

Bloom filters reduce the need for many disk accesses
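To show why a Bloom filter saves disk accesses, here is a minimal Python sketch. It is a generic textbook Bloom filter, not Kudu's implementation; the class name, sizes, and the use of SHA-256 for hashing are all illustrative assumptions. A "no" answer is definite, so the store can skip reading a file that cannot contain the key; a "yes" answer only means "maybe".

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means the key was definitely never added.
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for k in ("row-1", "row-2", "row-3"):
    bf.add(k)

print(bf.might_contain("row-2"))  # True
```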


Operating Kudu

• Easiest through Cloudera Manager integration
  • Separate parcel for now
• Kudu is always compacting
  • No minor vs. major compaction
  • No compaction latency spikes

• Web UI is full of metrics and logs


Cluster layout

• One or multiple masters for failover
  • Only one in current beta
  • Low CPU and memory impact
• One tablet server per worker node
  • Can share disks with HDFS
  • One SSD per worker node just for Kudu WAL can speed up writes
• No dependencies on other Hadoop ecosystem components
  • But interfacing components like Impala or Spark do


Kudu: Cluster metadata management

• Replicated master
  • Acts as a tablet directory
  • Acts as a catalog (which tables exist, etc.)
  • Acts as a load balancer (tracks tablet server liveness, re-replicates under-replicated tablets)
  • Caches all metadata in RAM for high performance
  • Under heavy load, 99.99th-percentile response times are still in microseconds (650 µs)
• Client configured with master addresses
  • Client asks master for tablet locations as needed and caches them locally
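The client-side caching of tablet locations can be sketched as follows. This is a toy model in Python, not the Kudu client API: `FakeMaster`, `CachingClient`, and the host names are hypothetical. It shows the pattern the slide describes, where the master is asked only on a cache miss.

```python
class FakeMaster:
    """Stand-in for the master's tablet directory (hypothetical, not Kudu's API)."""

    def __init__(self, locations):
        self.locations = locations
        self.lookups = 0  # counts how often the master is actually asked

    def locate(self, tablet_id):
        self.lookups += 1
        return self.locations[tablet_id]


class CachingClient:
    """Asks the master for a tablet's location once, then reuses the answer."""

    def __init__(self, master):
        self.master = master
        self.cache = {}

    def tablet_server_for(self, tablet_id):
        if tablet_id not in self.cache:
            self.cache[tablet_id] = self.master.locate(tablet_id)
        return self.cache[tablet_id]


# Made-up tablet ids and tablet server addresses.
master = FakeMaster({"tablet-a": "ts1:7050", "tablet-b": "ts2:7050"})
client = CachingClient(master)
for _ in range(3):
    client.tablet_server_for("tablet-a")
client.tablet_server_for("tablet-b")

print(master.lookups)  # 2
```

A real client would also need to invalidate entries when a tablet moves or its leader changes; the sketch leaves that out.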


Real-time analytics in Hadoop today
Merging in new data = storage complexity

Downsides:
● Multiple storage layers
● Latest data is hidden
● Files are messy
● Complex to do updates without breaking running queries

[Diagram labels: Incoming Data (Messaging System); HBase; New Partition; Most Recent Partition; Historic Data; Parquet File; "Have we accumulated enough data?"; Reorganize HBase file into Parquet (wait for running operations to complete, then define new Impala partition referencing the newly written Parquet file); Reporting Request; HDFS + Impala]


Real-time analytics in Hadoop with Kudu

Improvements:

● One system to operate

● No schedules or background processes

● Handle late arrivals or data corrections with ease

● New data available immediately for analytics or operations

[Diagram labels: Incoming Data (Messaging System); Historical and Real-time Data; Reporting Request; Kudu + Impala]


Kudu for data warehousing

• Near real time data visibility
  • BI tools can display events that happened seconds earlier
• Excellent for star schemas
  • Fast scans of deep fact tables
  • Efficient wide fact tables
  • Simplified updates of slowly changing dimensions


Near real time data warehousing on Kudu


Resources

Join the community: http://[email protected]

Download the beta: cloudera.com/downloads

Read the whitepaper: getkudu.io/kudu.pdf


Demo

• Load a standard HDFS Parquet table
• Create a Kudu table via Impala
• INSERT/UPDATE/DELETE statement with Impala
• Spark insert into Kudu table sample job
  • Reading from Parquet files as input
  • Map task to insert each row into Kudu


Creating a Kudu table - Impala

Kudu Storage handler

Table name in Impala does NOT match table name in Kudu. Kudu is its own storage layer.

Kudu Master hostname and port

A primary key is mandatory


Spark (Scala) code – Java API

DataFrame Row

Kudu Master hostname and port

Kudu table name

Create a client, session and table object

Extract values from the row, strong types

Create an insert object and row

Set the values by type, column name and column value

Perform the actual insert

Cleanup


Kudu with Spark DataFrames

Tie in Kudu with Spark

Map of the Kudu table name, and Kudu master

Table name to refer to within Spark

SparkSQL statement


Kudu code examples and docs

https://github.com/cloudera/kudu-examples

http://www.cloudera.com/documentation/betas/kudu/0-7-0/topics/kudu_development.html

http://getkudu.io/docs/developing.html


Thank you

[email protected]