
Enabling High-Performance Big Data Platform with RDMA

Tong Liu

HPC Advisory Council

Nov 5th, 2014

2

Shortcomings of Hadoop (from the 451 Research 2013 Hadoop survey)

• Administration tooling
• Performance
• Reliability
• SQL support
• Backup and recovery

3

Where can we improve Hadoop?

• Issues
  – Inherent data-latency issue with HDFS
  – Cannot support a large number of small files (see the sizing sketch below)
  – Efficiency of MapReduce, HBase, Hive, etc.

(Diagram: the Hadoop stack – HDFS™ (Hadoop Distributed File System) underneath MapReduce, HBase, Hive, Pig, and SQL engines such as Impala.)

• High demand to improve
  – Real-time operation
  – Fast execution
  – Streaming data
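Why the small-file limit matters: the NameNode keeps the entire namespace in memory, and a commonly quoted rule of thumb (an approximation, not a figure from these slides) is roughly 150 bytes of heap per file or block object. A rough sizing sketch, with the file count and per-object cost chosen purely for illustration:

// Rough NameNode heap estimate for a namespace full of small files.
// ASSUMPTION (not from the slides): ~150 bytes of heap per file object
// and per block object, a commonly quoted approximation.
public class SmallFileHeapEstimate {
    public static void main(String[] args) {
        long files = 100_000_000L;      // 100 million small files (illustrative)
        long blocksPerFile = 1;         // each file fits in a single block
        long bytesPerObject = 150;      // approximate heap cost per namespace object

        long objects = files + files * blocksPerFile;   // file objects + block objects
        double heapGiB = objects * bytesPerObject / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("~%.1f GiB of NameNode heap for %,d small files%n", heapGiB, files);
    }
}

At 100 million single-block files this works out to roughly 28 GiB of NameNode heap before any data is read, which is why HDFS strongly favors fewer, larger files.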

4

HDFS Operation

(Diagram: HDFS write/read path – the client asks the NameNode for block locations, writes to a DataNode, and the write is replicated across additional DataNodes; reads are served directly from the DataNodes. HDFS Federation adds further NameNodes to scale the namespace.)

Ways to improve HDFS:
• HDFS Federation
• Faster Disks
• Faster CPU and Memory
• Bigger network pipe
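For reference, the write/read path sketched above is what the stock Hadoop FileSystem API drives. The minimal sketch below (file path and payload are illustrative, not from the slides) writes a file, which HDFS then pipelines to the replica DataNodes, and reads it back; it assumes core-site.xml and hdfs-site.xml point at the cluster:

import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal HDFS write/read using the standard client API.
// The NameNode resolves block locations; the data itself streams between
// the client and the DataNodes (and is replicated, 3x by default).
public class HdfsWriteRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();     // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/rdma-demo.txt");   // illustrative path

        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        try (FSDataInputStream in = fs.open(path)) {
            byte[] buf = new byte[32];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
    }
}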

5

RDMA (Remote Direct Memory Access)

RDMA over InfiniBand or Ethernet

(Diagram: the conventional TCP/IP path vs. the RDMA path between applications in two racks. With TCP/IP, data moves from the application buffer through OS and NIC buffers in the kernel; with RDMA over InfiniBand or Ethernet, the HCAs move data directly between application buffers in user space, bypassing the kernel.)
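To make the TCP/IP side of that comparison concrete, the sketch below shows an ordinary Java socket send (the peer host and port are illustrative): each write crosses the user/kernel boundary and the payload is staged in kernel socket buffers before the NIC transmits it. An RDMA transfer between registered application buffers, as on the other side of the diagram, removes those copies and the kernel from the data path.

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Conventional TCP/IP send: every write() crosses the user/kernel boundary
// and the payload is staged in kernel socket buffers before the NIC
// transmits it. This is the copy-and-syscall overhead that an RDMA transfer
// between registered application buffers avoids.
public class TcpSend {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("rack2-node", 9000)) {   // illustrative peer
            OutputStream out = socket.getOutputStream();
            byte[] payload = "block data".getBytes(StandardCharsets.UTF_8);
            out.write(payload);   // user buffer -> kernel buffer -> NIC
            out.flush();
        }
    }
}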

6

RDMA: Critical for Efficient Data Movement

• Zero-copy remote data transfers
• Low-latency, high-performance data transfers
• Kernel bypass and protocol offload
• Runs over InfiniBand (56Gb/s) or RoCE*

* RDMA over Converged Ethernet

(Diagram: application-to-application buffer transfer in user space, bypassing the kernel.)

7

HDFS Operation with RDMA

(Diagram: the same HDFS write/read/replication path as before, with the client-to-DataNode and DataNode-to-DataNode transfers carried over RDMA.)

8

HDFS RDMA Acceleration – Solution 1

• Hadoop HDFS-RDMA acceleration
  – 100% Java code, written on top of JXIO
  – Same memory footprint as the vanilla client/server
  – First results show double the performance for the HDFS WRITE operation (with 3 replications, compared to vanilla)

9

Accelio – High-Performance Reliable Messaging and RPC Library

• Open source: https://github.com/accelio/accelio/ and www.accelio.org
• Faster RDMA integration into applications
• Maximizes message and CPU parallelism

10

HDFS RDMA Acceleration – Solution 2

• Package available at http://hadoop-rdma.cse.ohio-state.edu/
• Big performance gain with RDMA support

11

MapReduce Workflow
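Since the workflow diagram did not survive transcription, a brief recap: map tasks emit intermediate key/value pairs, the framework shuffles and merges those pairs across the network, and reduce tasks consume the merged streams. The canonical word-count job below uses the standard Hadoop MapReduce API (input and output paths are illustrative) and marks where that shuffle/merge step sits:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every token.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Between map and reduce, the framework shuffles and merges the intermediate
    // pairs over the network; this network-heavy step is what RDMA acceleration targets.

    // Reduce phase: sum the counts per word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/tmp/wc-in"));    // illustrative paths
        FileOutputFormat.setOutputPath(job, new Path("/tmp/wc-out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}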

12

RDMA-Enabled MapReduce

• Unstructured Data Accelerator - UDA

– Uses RDMA to do the Shuffle & Merge

– Plug-in architecture (see the configuration sketch after this list)

– Open-source

• Supported Hadoop Distributions

– Apache 3.0, Apache 2.2.x, Apache 1.3

– Cloudera Distribution Hadoop 4.4 Inbox
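Hadoop 2.x exposes the reduce-side shuffle as a pluggable component through the mapreduce.job.reduce.shuffle.consumer.plugin.class property, which is the kind of hook a shuffle plug-in such as UDA relies on. The sketch below shows the general idea only; the plugin class name is a placeholder, and the actual class name, NodeManager-side settings, and supported versions come from the UDA package documentation:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: swapping the reduce-side shuffle implementation via Hadoop's
// pluggable-shuffle hook. The plugin class name is a PLACEHOLDER; use the
// class shipped with the UDA package you install.
public class EnableRdmaShuffle {
    public static Job configure() throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.reduce.shuffle.consumer.plugin.class",
                 "com.example.uda.UdaShuffleConsumerPlugin");   // placeholder class name
        return Job.getInstance(conf, "rdma-shuffle-job");
    }
}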

13

Storage Limitations for Hadoop

• Hadoop uses local disks to maintain data locality and reduce latency
  – High-value data resides on external storage systems
  – Data has to be copied onto HDFS, analytics run, and the results copied to yet another system
  – This wastes storage space
  – As data sources increase, managing the data becomes a nightmare
• Option: access the external data in place, without the “copying” (see the sketch below)
  – Still needs to provide performance
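One way to access external data in place, assuming the external system is exposed as a POSIX mount on the Hadoop nodes (the mount point and file name below are illustrative), is to address it through the same Hadoop FileSystem abstraction with a file:// URI rather than copying it into HDFS first:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Read a file from an external, POSIX-mounted storage system in place,
// through the same Hadoop FileSystem abstraction that jobs already use.
public class ReadExternalData {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem external = FileSystem.get(URI.create("file:///mnt/external"), conf); // illustrative mount
        try (FSDataInputStream in = external.open(new Path("file:///mnt/external/events/2014-11-05.log"))) {
            byte[] buf = new byte[4096];
            int n = in.read(buf);
            System.out.println("read " + n + " bytes without copying into HDFS");
        }
    }
}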

14

Storage: From Scale-Up to Scale-Out

• Scale-out storage systems use distributed computing architectures
  – Scalable and resilient

15

Sequential Read Performance (single port)

16

Fastest and Lowest-Latency Storage Access with iSER

KIOPs at 4K IO size:
  iSCSI (TCP/IP):                               130
  1 x FC 8Gb port:                              200
  4 x FC 8Gb ports:                             800
  iSER, 1 x 40GbE/IB port:                     1100
  iSER, 2 x 40GbE/IB ports (+ acceleration):   2300
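For scale, converting these 4K-I/O rates into approximate bandwidth (simple arithmetic, not from the slide) shows why the fat RDMA pipes matter: the top configuration's 2300 KIOPs corresponds to roughly 9.4 GB/s, about what two 40Gb/s ports can carry.

// Convert 4K-I/O rates (KIOPs) from the chart above into approximate bandwidth.
public class IopsToBandwidth {
    public static void main(String[] args) {
        int ioSizeBytes = 4 * 1024;                       // 4K I/O size from the chart
        long[] kiops = {130, 200, 800, 1100, 2300};       // values from the chart
        String[] label = {"iSCSI (TCP/IP)", "1 x FC 8Gb", "4 x FC 8Gb",
                          "iSER 1 x 40GbE/IB", "iSER 2 x 40GbE/IB (+accel)"};
        for (int i = 0; i < kiops.length; i++) {
            double gbPerSec = kiops[i] * 1000.0 * ioSizeBytes / 1e9;
            System.out.printf("%-28s ~%.1f GB/s%n", label[i], gbPerSec);
        }
    }
}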

17

Lustre as Hadoop Storage Solution

RDMA enables the highest Lustre performance.
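As a hedged sketch of one common way to run Hadoop over Lustre, assume Lustre is mounted at the same path on every node (the mount point below is illustrative, and vendor-specific Lustre adapters for Hadoop exist as an alternative): point Hadoop's default filesystem at the shared POSIX mount, which is equivalent to setting fs.defaultFS in core-site.xml.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: use a shared, RDMA-backed Lustre mount as Hadoop's filesystem
// instead of HDFS. Equivalent to setting fs.defaultFS in core-site.xml.
public class LustreAsStorage {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "file:///");                // local/POSIX filesystem, no HDFS

        FileSystem fs = FileSystem.get(conf);
        Path jobData = new Path("/mnt/lustre/hadoop/input"); // illustrative Lustre mount point
        System.out.println("Default FS: " + fs.getUri());
        System.out.println("Input present: " + fs.exists(jobData));
    }
}

Because the Lustre mount is shared, map and reduce tasks on any node see the same namespace without first importing the data into HDFS.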

18

Hadoop over Cloud?

Benefits:
• Lowering the cost of innovation
• Procuring large-scale resources quickly
• Running closer to the data
• Simplifying Hadoop operations

Concerns:
• Heavily utilized, rather than being massively provisioned
• Cloud storage is slower and more expensive
• Data locality makes a big difference for performance

The open question: performance?

19

Fastest OpenStack Storage Access

• Uses OpenStack built-in components and management
  – RDMA is already inbox and used by OpenStack
• RDMA enables faster performance, with much lower CPU utilization

(Diagram: using RDMA to accelerate iSCSI storage – compute servers run VMs on a KVM hypervisor and access block storage through Open-iSCSI with iSER; an RDMA-capable interconnect links them to storage servers running an iSCSI/iSER target (tgt) over local disks with an RDMA cache, managed through OpenStack Cinder.)

20

Fast Interconnect with RDMA to Boost Big Data

• 4X faster run time – benchmark: TestDFSIO (1 TB, 100 files)
• 2X higher performance (2X faster run time, 2X higher throughput) – benchmark: 1M-records workload (4M operations)
• 2X faster run time – benchmark: Memcached operations
• 3X faster run time – benchmark: Redis operations

21

RDMA Can Accelerate All Layers

Compute

I/O Nodes

Filesystem

Storage

22

What’s Happening with Big Data Platform

Big Data Meets HPC!

23

All trademarks are property of their respective owners. All information is provided “As-Is” without any kind of warranty. The HPC Advisory Council makes no representation as to the accuracy and completeness of the information contained herein. The HPC Advisory Council and Mellanox undertake no duty and assume no obligation to update or correct any information presented herein.

Questions?
