20+ MILLION RECORDS A SECOND

Running Kafka with Dell EMC Isilon All Flash F800 Scale-out NAS

Author: Boni Bruno, CISSP, CISM, CGEIT
Chief Solutions Architect, Dell EMC

Abstract

This paper describes performance test results for running Kafka with Dell EMC Isilon F800 All-Flash NAS Storage. A comparison against direct attached storage is also provided.

Apache Kafka Performance with Dell EMC Isilon F800 All-Flash NAS

Copyright © August 2018 Dell Inc. or its subsidiaries. All rights reserved.

Dell believes the information in this publication is accurate as of its publication date. The

information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS-IS.“ DELL MAKES NO REPRESENTATIONS

OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND

SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A

PARTICULAR PURPOSE. USE, COPYING, AND DISTRIBUTION OF ANY DELL SOFTWARE DESCRIBED IN

THIS PUBLICATION REQUIRES AN APPLICABLE SOFTWARE LICENSE.

Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks

may be the property of their respective owners.

EMC Corporation

Hopkinton, Massachusetts 01748-9103

1-508-435-1000 In North America 1-866-464-7381

www.EMC.com

Contents

Overview
Kafka Introduction
Dell EMC Isilon F800 All-Flash NAS
Performance Test Environment
Performance Test Results
    Test 1 - Single producer (1 Broker), no replication, 50M records, 100 Byte record size
    Test 2 - Single producer (5 brokers), 3x asynchronous replication, 50M records, 100 Byte record size
    Test 3 - Single producer (5 brokers), 3x synchronous replication, 50M records, 100 Byte record size
    Test 4 - 5 producers, no replication, 250M records, 100 Byte record size
    Test 5 - 5 producers, 3x asynchronous replication, 250M records, 100 Byte record size
    Test 6 – Effect of Record Size on Producer Throughput
    Test 7 - Single consumer, 50M records, 100 Byte record size
    Test 8 - 5 consumers, 250M records, 100 Byte record size
    Test 9 – 1 producer & 1 consumer, 50M records written, 50M records read
    Test 10 – Stress testing Isilon F800 All-Flash Scale-out NAS
Conclusions
Appendix
    Kafka Server Properties
    Zookeeper Properties
    Producer Properties
    Consumer Properties
    NFS Client Configuration
    Isilon Configuration
    OneFS TCP Tuning
    Kafka End-to-End Latency Test

Overview

Kafka is a distributed, horizontally scalable, fault-tolerant stream processing system used in many enterprises. Kafka lets you publish and subscribe to streams of data, and it also stores and processes the data. It is now part of the Apache Software Foundation, with a commercial version available through Confluent that includes Kafka software enhancements and enterprise-level support.

Kafka runs as a cluster and can scale to handle millions of records a second. This paper covers Kafka performance test results for a Kafka DAS (Direct Attached Storage) cluster using PowerEdge R730xd servers and a Kafka Isilon F800 NAS (Network Attached Storage) cluster using the same servers.

As you read through this paper, you will see that the Dell EMC Isilon F800 scale-out NAS solution provides excellent performance and better storage utilization with fewer drives and a smaller storage footprint compared to DAS. This is great news for customers looking to simplify their Kafka cluster deployments and improve storage efficiency.

Kafka Introduction

A Kafka cluster receives records from Producers, stores these records, and makes them available to Consumers. A general Kafka cluster diagram is shown below for reference.

A key concept to understand with Kafka is the Topic. Producers publish their records to a specific topic, and consumers can subscribe to one or more topics. A Kafka topic is just a partitioned write-ahead log: producers append records to these logs and consumers simply subscribe to the changes. Each record consists of a key/value pair, and the key is used to assign the record to a log partition. Below is an example of a topic with four partitions, with writes being appended to the end of each partition.
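The key-to-partition mapping can be illustrated with a minimal Python sketch. It is not Kafka's implementation: the hash used here (CRC32) is a stand-in chosen only because it is in the standard library, and the function names are hypothetical.

```python
import zlib


def assign_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index.

    Assumption: CRC32 is an illustrative stand-in, not the hash Kafka uses.
    """
    return zlib.crc32(key) % num_partitions


def append_records(keys, num_partitions=4):
    """Append each record to the end of the log for its partition."""
    logs = [[] for _ in range(num_partitions)]
    for key in keys:
        logs[assign_partition(key, num_partitions)].append(key)
    return logs


if __name__ == "__main__":
    # Records with the same key always land in the same partition, in order.
    for index, log in enumerate(append_records([b"user-1", b"user-2", b"user-1", b"user-3"])):
        print(index, log)
```

Because the mapping depends only on the key and the partition count, all records for one key stay in a single partition and keep their append order.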

Partitions also provide redundancy and scalability. Each partition can be hosted on a different server, allowing a single topic to be scaled horizontally to increase cluster performance. In Kafka, the term stream refers to a single topic of data, regardless of the number of partitions.

Consumers work as part of a consumer group, in which one or more consumers work together to consume a topic. Each partition is consumed by only one member of the consumer group. Below is an example of three consumers in a single group consuming a single topic. Here, two consumers are each working from one partition and the third consumer is working from two partitions.

Consumers can scale to consume topics with a large number of messages. If a single consumer fails, the remaining members of the group will rebalance the partitions being consumed to take over for the failed consumer.
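The consumer group behavior described above can be sketched in a few lines of Python. This is a simplified round-robin model, not Kafka's actual assignment strategy, and the consumer names (c0, c1, c2) are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner, spreading them round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def rebalance_after_failure(partitions, consumers, failed):
    """Reassign all partitions among the consumers that are still alive."""
    survivors = [consumer for consumer in consumers if consumer != failed]
    return assign_partitions(partitions, survivors)


if __name__ == "__main__":
    partitions = [0, 1, 2, 3]
    group = ["c0", "c1", "c2"]
    # One consumer owns two partitions, the other two own one each.
    print(assign_partitions(partitions, group))
    # After c1 fails, its partition is taken over by the remaining members.
    print(rebalance_after_failure(partitions, group, failed="c1"))
```

With four partitions and three consumers, one consumer owns two partitions and the others own one each, which matches the example in the text.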

A single Kafka server is called a Broker. Brokers receive messages from producers, assign offsets to them, and commit the messages to disk. Brokers also service requests from consumers. Brokers are part of the Kafka cluster, and one broker is elected as the cluster controller to assign partitions to brokers and detect broker failures. A partition is owned by a single broker, which becomes the leader of the partition. If a partition is assigned to multiple brokers, the partition is replicated to provide redundancy if a broker were to fail.

The diagram below shows replication of partitions in Kafka.

Kafka allows retention policies to be configured where Kafka Brokers retain messages for some period of

time or until a topic reaches a certain size in bytes. Once these limits are reached, messages are

expired and deleted.

Storage management gets complicated with Kafka as retention requirements increase or as the Kafka cluster itself grows. Centralizing storage with Dell EMC Isilon can greatly simplify storage management as more space is needed: with Isilon you simply add more Isilon nodes to the back end, and capacity is instantly available with no need to change any of the Kafka configuration, except perhaps to increase the retention policy.

The next section covers Dell EMC Isilon in more detail.

Dell EMC Isilon F800 All-Flash NAS

Dell EMC Isilon F800 all-flash scale-out NAS storage provides up to 250,000 IOPS and 15 GB/s bandwidth

per chassis. With a choice of SSD drive capacities, all-flash storage ranges from 96 TB to 924 TB per

chassis making the Isilon F800 ideal for demanding storage requirements in high volume messaging

systems like Kafka.

In addition to the all-flash, high-performance, scale-out hardware design of the Isilon F800, the embedded storage operating system (Isilon OneFS) provides a unifying clustered file system with built-in scalable data protection that simplifies storage management and administration. OneFS is a fully symmetric file system with no single point of failure. It takes advantage of clustering not just to scale performance and capacity, but also to allow for any-to-any failover and multiple levels of redundancy that go far beyond the capabilities of RAID.

OneFS allows hardware to be incorporated or removed from the cluster at will and at any time,

abstracting the data and applications away from the hardware. Data is given infinite longevity and the

cost and pain of data migrations and hardware refreshes are eliminated.

Isilon nodes

OneFS works exclusively with the Isilon scale-out NAS nodes, referred to as a “cluster”. A single Isilon

cluster consists of multiple nodes, which are rack-mountable enterprise appliances containing: memory,

CPU, networking, Ethernet or low-latency InfiniBand interconnects, disk controllers and storage media.

As such, each node in the distributed cluster has compute as well as storage capabilities.

With the new generation of Isilon hardware (“Gen6”), a single chassis of 4 nodes in a 4U form factor is required to create a cluster, which currently scales up to 144 nodes. Previous Isilon hardware platforms need a minimum of three nodes and 6U of rack space to form a cluster. There are several different types of nodes, all of which can be incorporated into a single cluster, where different nodes provide varying ratios of capacity to throughput or input/output operations per second (IOPS).

Each node or chassis added to a cluster increases aggregate disk, cache, CPU, and network capacity.

OneFS leverages each of the hardware building blocks, so that the whole becomes greater than the

sum of the parts. The RAM is grouped together into a single coherent cache, allowing I/O on any part of

the cluster to benefit from data cached anywhere. A file system journal ensures that writes are safe

across power failures. Spindles and CPU are combined to increase throughput, capacity and IOPS as

the cluster grows, for access to one file or for multiple files. A cluster’s storage capacity can range from

a minimum of 18 terabytes (TB) to a maximum of greater than 68 petabytes (PB). The maximum

capacity will continue to increase as disk drives and node chassis continue to get denser.

Isilon nodes are broken into several classes, or tiers, according to their functionality:

This paper focuses on the F800 node type for Kafka. A good alternative to the F800 is the H600 node

type if storage capacity requirements are lower.

Network

There are two types of networks associated with a cluster: internal and external.

Back-end (internal) network

All inter-node communication in a cluster is performed across a dedicated back-end network comprising either 10 GbE or 40 GbE Ethernet, or low-latency QDR InfiniBand (IB). This back-end network, which is configured with redundant switches for high availability, acts as the backplane for the cluster. This enables each node to act as a contributor in the cluster and isolates node-to-node communication on a private, high-speed, low-latency network. The back-end network uses Internet Protocol (IP) for node-to-node communication.

Front-end (external) network

Clients connect to the cluster using Ethernet connections (1GbE, 10GbE or 40GbE) that are available on

all nodes. Because each node provides its own Ethernet ports, the amount of network bandwidth

available to the cluster scales linearly with performance and capacity. The Isilon cluster supports

standard network communication protocols to a customer network, including NFS, SMB, HTTP, FTP, HDFS,

and OpenStack Swift. Additionally, OneFS provides full integration with both IPv4 and IPv6 environments.

The Kafka Isilon F800 cluster tested in this paper uses NFS v3 as the network communication protocol.

Complete cluster view

The following view combines the hardware, software, and networks of the complete cluster:

File system structure

The OneFS file system is based on the UNIX file system (UFS) and, hence, is a very fast distributed file

system. Each cluster creates a single namespace and file system. This means that the file system is

distributed across all nodes in the cluster and is accessible by clients connecting to any node in the

cluster. There is no partitioning, and no need for volume creation.

Because all information is shared among nodes across the internal network, data can be written to or

read from any node, thus optimizing performance when multiple users or applications are concurrently

reading and writing to the same set of data.

For more details on Isilon and OneFS please see Isilon Technical Overview.

Performance Test Environment

Kafka

The Kafka version tested for this paper is 1.1.0 (the Scala 2.12 build, kafka_2.12-1.1.0). Kafka configuration properties (including ZooKeeper) are shown in the Appendix.

Compute Nodes

All the compute nodes are identical Dell PowerEdge R730xd servers with 40 cores, 256 GB RAM, 25 x 1.1 TB SAS disks (directly mounted JBOD, no RAID), and a 10GbE NIC, running CentOS Linux release 7.4.1708 (Core).

Up to 12 PowerEdge R730xd servers were used for the various test scenarios described in the Performance Test Results section of this paper. A total of 5 ZooKeeper servers are configured in the test environment and run on the first five Kafka servers: k0, k1, k2, k3, and k4. The remaining Kafka servers are named k5 through k11.

Isilon F800 (NFS Mounted from each Kafka compute node)

A single Isilon F800 chassis with 60 x 1.6 TB SSD drives was available for Kafka-Isilon testing.

The specific Isilon Model tested: Isilon F800-4U-Single-256GB-1x1GE-2x40GE SFP+-24TB SSD

The Isilon OneFS release tested: OneFS v 8.1.0.4. NFS configuration details are listed in the Appendix.

Kafka Clusters

Two Kafka clusters were tested - a DAS cluster with 300 x SAS drives vs a single 4U Isilon F800 cluster with

only 60 drives.

The DAS cluster uses PowerEdge R730xd servers for both compute and storage. A total of 12 PowerEdge servers with 300 SAS disks (~300 TB capacity) were available for the DAS Kafka cluster. All the compute nodes were connected over 10GbE with jumbo frames (MTU 9014) enabled.

The Isilon cluster uses a single 4U Isilon F800 for all Kafka storage, with the PowerEdge R730xd servers used for compute only. Each Kafka server NFS mounts a corresponding kafka-logs directory from Isilon, and each server has a unique mount point. Details of the NFS setup are shown in the Appendix. Only the OS drive and one additional 1.1 TB drive were used on each PowerEdge server for the Isilon cluster. The Isilon F800 was connected to the 10GbE PowerEdge environment over 40GbE ports on the core switch with jumbo frames (MTU 9000) enabled.

Note: Only half the 40GbE front-end ports available on Isilon were connected due to the lack of

available 40GbE ports on the core switch during testing.

Performance Test Results

Producer Throughput Tests

The producer throughput tests stress the throughput of the producer on each cluster (DAS & Isilon). No

consumers are run during these tests so all messages are persisted but not read. Results below show the

average of three test runs.

Note: The optimum batch size is used for each cluster. It was determined by running various tests on each cluster and selecting the batch size that yielded the best performance.

Test 1 - Single producer (1 Broker), no replication, 50M records, 100 Byte record size

Producer Test 1 Setup:

DAS:

kafka-topics.sh --zookeeper k0:2181 --create --topic DASr1 --partitions 8 --replication-factor 1

ISILON:

kafka-topics.sh --zookeeper k0:2181 --create --topic F800r1 --partitions 8 --replication-factor 1

Producer Test 1 Commands:

DAS:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr1 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196

ISILON:

bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic F800r1 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196

DAS Producer Throughput Result: 50,000,000 records sent, 1,231,861 records/sec (117 MB/sec)

25ms average latency, 42ms 95th percentile latency

ISILON Producer Throughput Result: 50,000,000 records sent, 1,401,424 records/sec (134 MB/sec)

7ms average latency, 7ms 95th percentile latency
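The relationship between the records/sec and MB/sec figures can be checked with a short Python sketch. It assumes that 1 MB is counted as 1024 x 1024 bytes, which is consistent with the Test 1 numbers reported above; the function name is hypothetical.

```python
def mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Convert a record rate to MB/sec, assuming 1 MB = 1024 * 1024 bytes."""
    return records_per_sec * record_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Test 1 record rates from the results above, with 100-byte records.
    for label, rate in (("DAS", 1_231_861), ("ISILON", 1_401_424)):
        print(f"{label}: {mb_per_sec(rate, 100):.0f} MB/sec")
```

For the Test 1 rates this gives about 117 MB/sec for DAS and about 134 MB/sec for Isilon.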

Test 2 - Single producer (5 brokers), 3x asynchronous replication, 50M records, 100 Byte record size

Test 2 is the same as Test 1 except that each partition now has three replicas, so the total data written to the cluster is three times greater. Each server both handles writes from the producer for the partitions it leads and fetches and writes data for the partitions it follows.

Replication in this test is asynchronous: the leader acknowledges a write as soon as it has written it to its local log, without waiting for the other replicas to acknowledge it. If the leader were to crash, it would likely lose the last few messages that had been written but not yet replicated. This improves message acknowledgement latency slightly at the cost of some risk in the case of server failure.

When using a JBOD configuration on DAS, replication is important for redundancy. However, the total cluster write capacity is 3x lower with 3x replication, since each write is done three times. Isilon uses erasure coding to provide redundancy with better storage efficiency (near 80%). In addition, the Isilon high-speed interconnect makes data accessible across all nodes, so even when an Isilon node fails, data can be retrieved from the remaining nodes without any leader re-elections in Kafka.
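The storage trade-off can be made concrete with a rough Python calculation. It is a sketch under stated assumptions: the raw capacities (~300 TB for DAS, 60 x 1.6 TB for Isilon) come from the test environment description, the 80% figure is the approximate efficiency quoted above, and file system overhead is ignored. The function name is hypothetical.

```python
def usable_capacity_tb(raw_tb: float, replication_factor: int = 1, efficiency: float = 1.0) -> float:
    """Estimate capacity left for unique Kafka data.

    efficiency models erasure-coding overhead; replication_factor models
    Kafka-level copies of every record.
    """
    return raw_tb * efficiency / replication_factor


if __name__ == "__main__":
    das = usable_capacity_tb(300, replication_factor=3)      # JBOD with 3x Kafka replication
    isilon = usable_capacity_tb(60 * 1.6, efficiency=0.8)    # erasure coding, replication factor 1
    print(f"DAS: ~{das:.0f} TB usable of 300 TB raw")
    print(f"Isilon: ~{isilon:.1f} TB usable of {60 * 1.6:.0f} TB raw")
```

Under these assumptions, 3x replication leaves about one third of the DAS raw capacity for unique data, while Isilon keeps roughly 80% of its raw capacity usable.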

Producer Test 2 Setup:

DAS:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic DASr3 --partitions 8 --replication-factor 3

ISILON:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic F800r3 --partitions 8 --replication-factor 3

Producer Test 2 Commands:

DAS:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196

ISILON:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic F800r3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196

DAS Producer Throughput Result: 50,000,000 records sent, 1,217,777 records/sec (116 MB/sec)

40ms average latency, 62ms 95th percentile latency

ISILON Producer Throughput Result: 50,000,000 records sent, 1,224,380 records/sec (117 MB/sec)

17ms average latency, 27ms 95th percentile latency

Asynchronous replication (acks=1) decreases throughput, increases latency, and uses more storage space. With Isilon, 3x data replication is not needed because of the redundancy and efficiency built into Isilon/OneFS, but it is recommended for DAS Kafka cluster deployments.

For Isilon, a replication factor of 1 is fine and offers the best throughput and storage efficiency. Use a replication factor of 2 for better Kafka server fault tolerance if Kafka compute node failure is a concern when deploying Kafka with Isilon.

Test 3 - Single producer (5 brokers), 3x synchronous replication, 50M records, 100 Byte record size

Test 3 is the same as Test 2 except that the leader for a partition now waits for acknowledgement from the full set of in-sync replicas before acknowledging back to the producer. With synchronous replication, Kafka ensures that messages will not be lost as long as one in-sync replica remains.

Synchronous replication (acks=-1) in Kafka is not fundamentally very different from asynchronous replication. The leader for a partition always tracks the progress of the follower replicas, and Kafka will not send messages out to consumers until they are fully acknowledged by the replicas. The difference is that with synchronous replication Kafka waits to respond to the producer request until the followers have replicated it.
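The difference between the two acknowledgement modes can be pictured with a toy latency model in Python. This is a deliberate simplification and not a description of Kafka internals: it assumes followers start replicating only after the leader's local write, and the millisecond values in the example are hypothetical.

```python
def ack_latency_ms(leader_write_ms, follower_replicate_ms, acks):
    """Toy model of when the producer receives its acknowledgement.

    acks=1  : the leader replies after its own local write.
    acks=-1 : the leader also waits for the slowest in-sync follower.
    """
    if acks == 1:
        return leader_write_ms
    if acks == -1:
        return leader_write_ms + max(follower_replicate_ms, default=0)
    raise ValueError("this sketch only models acks=1 and acks=-1")


if __name__ == "__main__":
    followers = [3.0, 8.0]  # hypothetical replication times for two followers
    print("acks=1 :", ack_latency_ms(2.0, followers, acks=1), "ms")
    print("acks=-1:", ack_latency_ms(2.0, followers, acks=-1), "ms")
```

In this model the acks=-1 latency is bounded by the slowest in-sync follower, which is one way to see why slow replica writes show up directly in producer latency.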

Producer Test 3 Commands:

DAS:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=-1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196

ISILON:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic F800r3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=-1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196

DAS Producer Throughput Result: 50,000,000 records sent, 269,879 records/sec (26 MB/sec)

2096ms average latency, 6850ms 95th percentile latency

ISILON Producer Throughput Result: 50,000,000 records sent, 1,046,703 records/sec (100 MB/sec)

33ms average latency, 43ms 95th percentile latency

Synchronous replication (acks=-1) decreases throughput significantly and also introduces significant latency on the DAS cluster. In all three single-producer test cases, the throughput and latency results were better with the Isilon NAS cluster even though Isilon had fewer disks.

Test 4 - 5 producers, no replication, 250M records, 100 Byte record size

Test 4 is the same as Test 1 except that the number of producers is increased to 5 and the generated record load to 250M records for each of the DAS and Isilon NAS Kafka clusters.

Producer Test 4 Setup:

DAS:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5DASr1 --partitions 8 --replication-factor 1

ISILON:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5F800r1 --partitions 8 --replication-factor 1

Producer Test 4 Commands:

Note: The commands are run simultaneously on each of the 5 Kafka producer servers (k0-k4). The only change to each command is the bootstrap.servers parameter: each command references the local server, i.e. server k0 uses bootstrap.servers=k0:9092, k1 uses k1:9092, etc. This has nothing to do with where partitions are written; it only tells Kafka where to pull the bootstrap information from.

DAS:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5DASr1 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196

ISILON:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5F800r1 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
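The per-server variation described in the note above can be sketched as a small Python helper that builds one command line per producer host. It assumes the five producer hosts are k0 through k4 and only assembles strings; it does not run anything. The helper name is hypothetical, while the flags and property values are those shown in the commands above.

```python
def producer_command(host, topic, batch_size, buffer_memory=67108864,
                     num_records=50_000_000, record_size=100, acks="1"):
    """Build the ProducerPerformance command line for one producer host."""
    return (
        "kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance "
        f"--topic {topic} --num-records {num_records} --throughput -1 "
        f"--record-size {record_size} --producer-props acks={acks} "
        f"bootstrap.servers={host}:9092 buffer.memory={buffer_memory} "
        f"batch.size={batch_size}"
    )


if __name__ == "__main__":
    hosts = [f"k{i}" for i in range(5)]  # assumption: producers run on k0..k4
    for host in hosts:
        # Only bootstrap.servers differs from one host to the next.
        print(f"[{host}] " + producer_command(host, "5DASr1", batch_size=8196))
```

Each producer sends 50M records, so five producers generate the 250M record load used in this test.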

DAS Producer Throughput Result: 250,000,000 records sent, 4,428,438 records/sec (422 MB/sec)

592ms average latency, 2502ms 95th percentile latency

ISILON Producer Throughput Result: 250,000,000 records sent, 5,045,512 records/sec (481 MB/sec)

105ms average latency, 400ms 95th percentile latency

Note: The records/sec and MB/sec results shown represent the aggregate sum across the 5 producers. The latency results shown are from the slowest producer.
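The aggregation described in this note can be sketched in Python as follows. The per-producer numbers in the example are hypothetical placeholders, since the paper reports only the aggregated values, and "slowest" is assumed here to mean the producer with the highest average latency.

```python
def aggregate(results):
    """Combine per-producer results into one summary.

    Throughput is summed across producers; latency is taken from the
    slowest producer (assumed: the one with the highest average latency).
    """
    slowest = max(results, key=lambda r: r["avg_latency_ms"])
    return {
        "records_per_sec": sum(r["records_per_sec"] for r in results),
        "mb_per_sec": sum(r["mb_per_sec"] for r in results),
        "avg_latency_ms": slowest["avg_latency_ms"],
        "p95_latency_ms": slowest["p95_latency_ms"],
    }


if __name__ == "__main__":
    # Hypothetical per-producer values, for illustration only.
    per_producer = [
        {"records_per_sec": 1_000_000, "mb_per_sec": 95.0, "avg_latency_ms": 80, "p95_latency_ms": 300},
        {"records_per_sec": 900_000, "mb_per_sec": 86.0, "avg_latency_ms": 105, "p95_latency_ms": 400},
    ]
    print(aggregate(per_producer))
```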

Test 5 - 5 producers, 3x asynchronous replication, 250M records, 100 Byte record size

Test 5 is the same as Test 4 except that 3x asynchronous data replication is now configured. This configuration is neither needed nor recommended for Isilon; it only makes sense for a Kafka DAS cluster. Isilon provides much better data redundancy than DAS, but it does not protect against a Kafka broker failure. Use 2x asynchronous replication to protect against Kafka broker failures when using Kafka with Isilon; otherwise, stay with a replication factor of 1 to ensure optimum performance and storage efficiency.

Producer Test 5 Setup:

DAS:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5DASr3 --partitions 8 --replication-factor 3

ISILON:

kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5F800r3 --partitions 8 --replication-factor 3

Producer Test 5 Commands:

DAS:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5DASr3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196

ISILON:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5F800r3 --num-records 50000000 \
    --throughput -1 --record-size 100 \
    --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196

DAS Producer Throughput Result: 250,000,000 records sent, 5,729,978 records/sec (546 MB/sec)

98ms average latency, 590ms 95th percentile latency

ISILON Producer Throughput Result: 250,000,000 records sent, 3,838,539 records/sec (367 MB/sec)

186ms average latency, 1,081ms 95th percentile latency

As expected, 3x asynchronous replication increases latency on both clusters. A 3x replication factor configuration with Isilon puts both the original data and the replicated data on Isilon, since each Kafka server NFS mounts its kafka-logs directory from Isilon (see the NFS configuration in the Appendix for details). This configuration adds a lot of unnecessary network traffic and I/O requests to the Isilon cluster and should be avoided. In contrast, the DAS cluster only replicates data to a select number of servers. Isilon does not need this amount of replication to provide data redundancy. Stick to a replication factor of 1, or at most a replication factor of 2, when using Isilon NAS with Kafka.
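To make the extra load concrete, the following minimal Python sketch estimates how many payload bytes reach storage for a given replication factor. It counts record payload only (Kafka batch and index overhead and OneFS protection overhead are ignored), and the function name is illustrative rather than part of any Kafka tooling.

```python
def payload_bytes_written(num_records: int, record_size: int, replication_factor: int) -> int:
    """Payload bytes written to storage across all replicas (overheads ignored)."""
    return num_records * record_size * replication_factor


# Test 5 workload: 5 producers x 50M records of 100 bytes each.
records = 5 * 50_000_000
for rf in (1, 2, 3):
    gb = payload_bytes_written(records, 100, rf) / 1e9
    print(f"replication factor {rf}: {gb:.0f} GB of payload written")
```

With Isilon, every one of those replica writes travels over NFS to the same storage cluster, which is why a lower replication factor reduces both network traffic and I/O requests.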


Test 6 – Effect of Record Size on Producer Throughput

So far, all producer tests have used a small record size of 100 Bytes. This is not representative of most production deployments of Kafka, but it is a good size to use for stress testing purposes.

To get a better idea of individual producer throughput performance, different record sizes from 10 Bytes

to 100,000 Bytes were tested to see the effect on producer throughput.

Note: Producer throughput can be measured in two ways – number of records processed per second

or the byte throughput per second (MB/sec). Both results are shown below for reference.
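The two measures are related by the record size. The short Python sketch below converts records/sec into MB/sec, assuming that "MB" means 1,048,576 bytes; that assumption is not stated in this paper, but it reproduces the reported figures to within about one unit. The function name is illustrative.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: "MB" here means 2**20 bytes


def records_to_mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Convert a record rate into byte throughput in MB/sec."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


# Test 5 DAS result: 5,729,978 records/sec at 100 bytes per record,
# reported in the paper as 546 MB/sec.
print(round(records_to_mb_per_sec(5_729_978, 100)))
```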

[Figure: Impact of Record Size on Throughput (Records/sec). X-axis: Record Size (Bytes) at 10, 100, 1000, 10000, and 100000; Y-axis: Records/sec (0 to 1,800,000). Series: DAS and F800. Data labels as extracted, first series: 1516203, 1281251, 217367, 30166, 3272; second series: 1585389, 1294378, 180323, 32912, 3836.]


The graphs show that the number of records Kafka can send per second decreases as the records get larger in size. If we look at the MB/sec results, the total byte throughput increases as messages get bigger. It is important to understand the typical record size in your environment so you can size your Kafka cluster to meet your throughput requirements.

The results shown are specific to PowerEdge R730XD servers with the stated specifications; different compute specifications will produce different throughput results. Test your specific compute model to get a good understanding of the Kafka throughput capabilities of your particular compute nodes.

[Figure: Impact of Record Size on Throughput (MB/sec). X-axis: Record Size (Bytes) at 10, 100, 1000, 10000, and 100000; Y-axis: MB/sec (0 to 400). Series: DAS and F800. Data labels as extracted, first series: 14, 122, 207, 288, 312; second series: 15, 123, 172, 314, 366.]


Consumer Throughput Tests

So far only producer testing has been conducted. Producer tests provide insight into the write capabilities of Kafka. As shown in the Kafka producer tests above, both the DAS and Isilon clusters can write over 5M records a second with only 5 producers.

A key value proposition to note so far is that Isilon provides very good performance with fewer spindles, less power, and a smaller storage footprint. The 5 producers tested ran on 2U PowerEdge R730xd servers with 25 drives each, so the 5-node DAS Kafka cluster tested has a total of 125 disks and takes 10U of rack space. Adhering to Kafka best practices for DAS clusters and using a replication factor of 3, the approximate usable storage for the DAS cluster is only 42 TB (125 TB / 3).

On the other hand, Isilon offers similar performance with about half the number of disk drives, uses only 4U of rack space, and provides over 80 TB of usable storage. This is great news for Kafka administrators and organizations looking to reduce the footprint of their Kafka clusters without sacrificing performance.
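The DAS arithmetic above can be reproduced with a small Python sketch. The 1 TB drive capacity is an assumption inferred from the 125 TB raw figure quoted for 125 drives, and the function name is illustrative.

```python
def das_footprint(servers: int, drives_per_server: int, rack_units_per_server: int,
                  tb_per_drive: float, replication_factor: int) -> dict:
    """Drive count, rack space, and raw/usable capacity of a DAS Kafka cluster."""
    drives = servers * drives_per_server
    raw_tb = drives * tb_per_drive
    return {
        "drives": drives,
        "rack_units": servers * rack_units_per_server,
        "raw_tb": raw_tb,
        # Each record is stored replication_factor times, so usable space shrinks.
        "usable_tb": raw_tb / replication_factor,
    }


# 5 x 2U servers with 25 drives each; 1 TB per drive is an assumption.
das = das_footprint(servers=5, drives_per_server=25, rack_units_per_server=2,
                    tb_per_drive=1.0, replication_factor=3)
print(das)
```

Changing replication_factor to 1 or 2 in the same call shows how much usable capacity a lower replication factor recovers from the same raw storage.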

Now let’s look at the read capabilities of Kafka by running various consumer throughput tests against

both DAS and Isilon NAS Kafka clusters. Note that the replication factor will not affect the outcome of

the consumer tests as the consumer only reads from one replica regardless of the replication factor.

Likewise, the acknowledgement level of the producer also doesn't matter as the consumer only ever

reads fully acknowledged messages.

Test 7 - Single consumer, 50M records, 100 Byte record size

Consumer Test 7 commands:

DAS:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic DASr3 --threads 1

ISILON:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic F800r3 --threads 1

DAS Consumer Throughput Result: 50,000,000 records read, 1,763,544 records/sec (168 MB/sec)

ISILON Consumer Throughput Result: 50,000,000 records read, 1,795,139 records/sec (171 MB/sec)


Test 8 - 5 consumers, 250M records, 100 Byte record size

Consumer Test 8 commands: (Executed on 5 Kafka consumers simultaneously)

DAS:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 5DASr3 --threads 1

ISILON:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 5F800r3 --threads 1

DAS Consumer Throughput Result: 250,000,000 records read, 8,428,965 records/sec (803 MB/sec)

ISILON Consumer Throughput Result: 250,000,000 records read, 8,743,123 records/sec (834 MB/sec)


Test 9 – 1 producer & 1 consumer, 50M records written, 50M records read

The performance tests conducted thus far ran the Kafka producers and the Kafka consumers in isolation, whereas a typical Kafka deployment runs producers and consumers together. Technically, the tests above have already run producers and consumers together, because Kafka replication works by having the servers themselves act as consumers.

For this test, one producer and one consumer are run against an eight-partition, 3x replicated topic that begins empty. The producer uses asynchronous replication. The throughput reported is the consumer throughput.

Test 9 setup commands:

DAS:

kafka-topics.sh --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --partitions 8 --replication-factor 3 --topic r3DASnew

ISILON:

kafka-topics.sh --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --partitions 8 --replication-factor 3 --topic r3F800new

Test 9 test commands:

DAS Producer:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic r3DASnew --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k1:9092 buffer.memory=67108864 batch.size=8196

DAS Consumer:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic r3DASnew --threads 1

ISILON Producer:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic r3F800new --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k1:9092 buffer.memory=67108864 batch.size=194196

ISILON Consumer:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic r3F800new --threads 1

DAS Consumer Throughput Result: 50,000,000 records read, 1,623,640 records/sec (155 MB/sec)

ISILON Consumer Throughput Result: 50,000,000 records read, 1,745,932 records/sec (166 MB/sec)


Test 10 – Stress testing Isilon F800 All-Flash Scale-out NAS

A single Isilon F800, as tested in this paper, only comes with 4 nodes. Isilon normally recommends having

one Isilon node for each compute node in a high performance distributed cluster. The previous results

already show that the Isilon F800 can easily handle a 5 node Kafka cluster.

This test increases the compute count to 12 nodes to see how well a 4-node Isilon F800 cluster can support 12 Kafka servers simultaneously generating 50M records each and then consuming 50M records each from the same Kafka topic. This equates to generating and consuming 600M records in total with just a single Isilon F800 chassis that has 4 nodes and 60 drives. Three record sizes are tested: 10 Bytes, 100 Bytes, and 512 Bytes.

Test 10 Setup:

kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes10 --partitions 8 --replication-factor 1

kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes100 --partitions 8 --replication-factor 1

kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes512 --partitions 8 --replication-factor 1

Test 10 commands: (Executed on 12 Kafka server nodes simultaneously)

Producer commands:

10 Byte Test:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes10 --num-records 50000000 --throughput -1 --record-size 10 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196

100 Byte Test:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes100 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196

512 Byte Test:

kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes512 --num-records 50000000 --throughput -1 --record-size 512 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196

Consumer commands:

10 Byte Test:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes10 --threads 1

100 Byte Test:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes100 --threads 1

512 Byte Test:

kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes512 --threads 1
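Because the three record-size runs differ only in topic name and record size, the command lines can be generated rather than typed by hand, which keeps each producer and consumer pair pointed at the same topic. The Python sketch below is a hypothetical helper, not part of the test harness used in this paper; it only assembles strings from the flags and values shown in the commands above.

```python
RECORD_SIZES = (10, 100, 512)


def topic_name(record_size: int) -> str:
    """Topic naming scheme used in Test 10 (12nodes10, 12nodes100, 12nodes512)."""
    return f"12nodes{record_size}"


def producer_command(record_size: int, num_records: int = 50_000_000) -> str:
    return (
        "kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance "
        f"--topic {topic_name(record_size)} --num-records {num_records} "
        f"--throughput -1 --record-size {record_size} "
        "--producer-props acks=1 bootstrap.servers=k0:9092 "
        "buffer.memory=55108864 batch.size=194196"
    )


def consumer_command(record_size: int, num_messages: int = 50_000_000) -> str:
    return (
        "kafka-consumer-perf-test.sh --zookeeper k0:2181 "
        f"--messages {num_messages} --topic {topic_name(record_size)} --threads 1"
    )


for size in RECORD_SIZES:
    print(producer_command(size))
    print(consumer_command(size))
```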


Test 10 Results:

Producer results:

ISILON 10B Throughput Result: 600,000,000 records, 17,719,448 records/sec (168 MB/sec)

ISILON 100B Throughput Result: 600,000,000 records, 12,161,606 records/sec (1,158 MB/sec)

ISILON 512B Throughput Result: 600,000,000 records, 4,335,062 records/sec (2,115 MB/sec)

Consumer results:

ISILON 10B Throughput Result: 600,000,000 records read, 19,141,011 records/sec (1,825 MB/sec)

ISILON 100B Throughput Result: 600,000,000 records read, 20,030,055 records/sec (1,911 MB/sec)

ISILON 512B Throughput Result: 600,000,000 records read, 10,607,182 records/sec (5,179 MB/sec)

A single Isilon F800 performed very well under high load. The Kafka results show that producers can write more than 17 million records a second and consumers can read more than 20 million records a second with a single F800 Isilon chassis.

All the performance results shown thus far come directly from Kafka. We can also get statistics directly from Isilon while under load to see the utilization of the individual Isilon nodes, network throughput rates, disk throughput rates, CPU utilization, active number of clients, etc.

Below are Isilon specific performance reporting charts during the 10 Byte, 100 Byte, and 512 Byte tests

with 12 Kafka servers.


Isilon Performance Report during 10 Byte Record Size Test

Isilon Performance Report during 100 Byte Record Size


Isilon Performance Report during 512 Byte Record Size

You can see from the Isilon performance reports above that all four nodes show even load distribution for network and disk throughput rates. Load balancing network connections and disk I/O is a key value proposition Isilon provides, and it is automatic. All the Kafka nodes just NFS mount Isilon by its FQDN (fully qualified domain name), and OneFS transparently handles all the load balancing for you.

Also note that the CPU utilization stayed low during all test runs. This means the F800 had resources to spare while the stress test was running, which is a testament to the engineering work that went into the product.

Conclusions

This paper shows that a single Isilon F800 performed very well under high load with Kafka. The Kafka results show that producers can write more than 17 million records a second and consumers can read more than 20 million records a second using a single Isilon F800 NAS system. During all performance tests, the CPU utilization across the entire Isilon cluster stayed very low.

The Isilon F800 Scale-out NAS storage system performed just as well as a Kafka DAS (direct attached storage) cluster that had roughly twice the number of disks (125 versus 60). The embedded erasure code design of Isilon OneFS also provides much better storage efficiency and data protection than Kafka DAS clusters that use a 3x replication factor. This allows Kafka administrators to safely lower the Kafka replication factor and increase storage capacity and efficiency with ease when using Isilon.


Appendix


Kafka Server Properties

Below is the running server properties file used for this paper. The only difference between the Kafka DAS configuration and the Kafka Isilon configuration is the log.dirs setting, shown below in both variants; everything else is the same on both clusters.

broker.id=0

listeners=PLAINTEXT://k0:9092

num.network.threads=24

num.io.threads=8

socket.send.buffer.bytes=13107200

socket.receive.buffer.bytes=13107200

socket.request.max.bytes=104857600

# Just a single Isilon NFS mount needed with the Isilon config

log.dirs=/mnt/k0/kafka-logs

# Kafka DAS config uses all 24 direct attached data drives; the remaining drive is for the OS.

log.dirs=/data1/kafka/kafka-logs,/data2/kafka/kafka-logs,/data3/kafka/kafka-logs,/data4/kafka/kafka-logs,/data5/kafka/kafka-logs,/data6/kafka/kafka-logs,/data7/kafka/kafka-logs,/data8/kafka/kafka-logs,/data9/kafka/kafka-logs,/data10/kafka/kafka-logs,/data11/kafka/kafka-logs,/data12/kafka/kafka-logs,/data13/kafka/kafka-logs,/data14/kafka/kafka-logs,/data15/kafka/kafka-logs,/data16/kafka/kafka-logs,/data17/kafka/kafka-logs,/data18/kafka/kafka-logs,/data19/kafka/kafka-logs,/data20/kafka/kafka-logs,/data21/kafka/kafka-logs,/data22/kafka/kafka-logs,/data23/kafka/kafka-logs,/data24/kafka/kafka-logs

num.partitions=8

num.recovery.threads.per.data.dir=1

offsets.topic.replication.factor=1

transaction.state.log.replication.factor=1

transaction.state.log.min.isr=1

log.retention.hours=168

log.segment.bytes=1073741824

log.retention.check.interval.ms=300000

zookeeper.connect=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181

zookeeper.connection.timeout.ms=6000

group.initial.rebalance.delay.ms=0

delete.topic.enable=true
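The DAS log.dirs value in the properties above follows a regular pattern, one kafka-logs directory per data drive, so it can be generated instead of typed. The Python sketch below assumes the /dataN/kafka/kafka-logs layout and the 24 data drives described in the configuration comment; the function name is illustrative.

```python
def das_log_dirs(num_drives: int = 24) -> str:
    """Comma-separated log.dirs value with one directory per data drive."""
    return ",".join(f"/data{i}/kafka/kafka-logs" for i in range(1, num_drives + 1))


value = das_log_dirs()
print(f"log.dirs={value}")
```

Generating the list avoids skipped or duplicated drive numbers, which are easy to introduce when editing a 24-entry value by hand.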


Zookeeper Properties

Below is the zookeeper properties file used for both Kafka DAS and Kafka Isilon clusters.

clientPort=2181

maxClientCnxns=0

server.0=k0:2888:3888

server.1=k1:2888:3888

server.2=k2:2888:3888

server.3=k3:2888:3888

server.4=k4:2888:3888

initLimit=5

syncLimit=2

Producer Properties

bootstrap.servers=k0:9092,k1:9092,k2:9092,k3:9092,k4:9092

compression.type=lz4

Consumer Properties

bootstrap.servers=k0:9092,k1:9092,k2:9092,k3:9092,k4:9092

group.id=test-consumer-group


NFS Client Configuration

The Kafka Isilon cluster uses NFS to mount the remote Isilon OneFS file system. Using NFS centralizes all the Kafka data on Isilon, which gives Kafka administrators an easy way to increase storage capacity on the fly by simply adding more Isilon nodes. There is no need to add servers or modify any of the Kafka configuration: as soon as new Isilon nodes are added to the Isilon cluster, the Kafka cluster sees an increase in storage capacity and performance.

Since NFS is being used as the protocol to connect the Kafka servers to Isilon, it is important to optimize the NFS client settings to obtain the performance detailed in this paper. Below are the NFS mount options used in /etc/fstab on each Kafka server for the Kafka Isilon cluster.

Note: The mount options are the same on each Kafka server; the only difference on each server is the mount point itself. The example below is the configuration for Kafka server k0 only, which mounts /ifs/k0 from Isilon. Kafka server k1 mounts /ifs/k1 from Isilon, and so on. Each Kafka server needs its own unique NFS export from Isilon/OneFS. The Isilon configuration is described in detail in the next section for reference. A single export on Isilon could have been used as well; in that case, each Kafka server would just mount a different subdirectory.

Example Kafka Server K0 NFS Client Configuration:

/etc/fstab

isilon.example.com:/ifs/k0 /mnt/k0 nfs nolock,noacl,nocto,noatime,async,nodiratime,nfsvers=3,tcp,rw,hard,intr,timeo=600,retrans=2,rsize=524288,wsize=524288 0 0

The isilon.example.com entry in /etc/fstab above corresponds to the OneFS SmartConnect Zone Name.

SmartConnect is what provides load balancing via DNS, so you must delegate this zone name to Isilon

on your DNS server to ensure a proper load balancing configuration for Kafka.

See the SmartConnect Whitepaper for further information.
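Since only the export path and mount point change from server to server, the /etc/fstab entries for all Kafka servers can be generated from one template. The Python sketch below is illustrative: the function name is hypothetical, isilon.example.com stands in for your SmartConnect zone name, and the option string is the one from the k0 example above, written without spaces because fstab fields are separated by whitespace.

```python
MOUNT_OPTIONS = (
    "nolock,noacl,nocto,noatime,async,nodiratime,nfsvers=3,tcp,rw,hard,intr,"
    "timeo=600,retrans=2,rsize=524288,wsize=524288"
)


def fstab_entry(server: str, zone: str = "isilon.example.com") -> str:
    """One /etc/fstab line mounting /ifs/<server> from Isilon on /mnt/<server>."""
    return f"{zone}:/ifs/{server} /mnt/{server} nfs {MOUNT_OPTIONS} 0 0"


# Servers k0 through k11, matching the 12 exports described in this paper.
for n in range(12):
    print(fstab_entry(f"k{n}"))
```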

The NFS mount options used above are described below for reference. These settings increase NFS performance.


async mode allows Isilon to reply to the NFS client as soon as it has processed the I/O request and sent it to the local filesystem.

nfsvers=3 specifies NFS version 3 to be used.

noacl disables Access Control List (ACL) processing.

noatime option specifies that inode access times are not updated on the filesystem.

nocto option suppresses the retrieval of new attributes when creating a file.

nodiratime option specifies that the directory inode is not updated on the filesystem when it is accessed.

nolock option prevents the exchange of file lock information between the NFS server and this NFS client.

retrans specifies the number of times the NFS client will retry a request.

rsize and wsize options specify the number of bytes per NFS read and write request.

rw option mounts the remote NFS file system in read/write mode.

tcp option specifies NFS over TCP instead of UDP.

timeo option is the amount of time, in tenths of a second, that the NFS client waits on the NFS server before retransmitting a request.
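To sanity-check the numeric options, the Python sketch below parses the mount option string into a dictionary and converts the values into more familiar units. The parser is a hypothetical helper written for this illustration; it relies on the Linux NFS client convention that timeo is expressed in tenths of a second.

```python
def parse_mount_options(options: str) -> dict:
    """Parse 'flag,key=value,...' into a dict; bare flags map to True."""
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.strip().partition("=")
        parsed[key] = int(value) if sep else True
    return parsed


opts = parse_mount_options(
    "nolock,noacl,nocto,noatime,async,nodiratime,nfsvers=3,tcp,rw,hard,intr,"
    "timeo=600,retrans=2,rsize=524288,wsize=524288"
)

print(f"timeout: {opts['timeo'] / 10:.0f} s")           # timeo is in tenths of a second
print(f"read size: {opts['rsize'] // 1024} KiB")        # bytes per NFS read request
print(f"write size: {opts['wsize'] // 1024} KiB")       # bytes per NFS write request
print(f"retries: {opts['retrans']}")
```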


Isilon Configuration

The following features were configured on the Isilon cluster. The Smart features shown below are

product differentiators that significantly enhance data storage performance and resiliency.

Enable SmartPools settings across all Isilon nodes

Enable SmartConnect to provide automatic client connection load balancing and failover capabilities

Enable SmartCache for write performance and Streaming Access for Data Access Optimization

Use optimization for streaming data access pattern

Use 40 Gb/s external network ports for NFS connections and internal 40 Gb/s ports for the data interconnect network

Increase network MTU to 9000 (Jumbo Frames) for both internal and external networks

The SmartCache and Streaming Access optimizations are easily enabled in the Isilon OneFS GUI through

a File Pool Policy tab. The Kafka Isilon cluster screen-shot is shown below for reference.


The Storage Pool is created on the SmartPools tab, which allows you to specify the Isilon nodes and protection settings. Below is a screen-shot of the storage pool configured for the Kafka Isilon cluster.

The internal and external network configuration is set in the network configuration tab, where you specify MTU size, IP info, and DNS server info.


The SmartConnect info is configured within the pool properties. In the Kafka Isilon cluster, the pool name is pool0, and the SmartConnect info (IP and DNS info blacked out) is shown in the screen-shot below:

Note: Isilon provides 2 x 40 GbE front-end and 2 x 40 GbE back-end ports with each node. The Kafka Isilon cluster was only configured with one 40 GbE front-end port on each node during performance testing. With production deployments, use both front-end 40 GbE ports.

Lastly, the NFS exports are configured for each Kafka server, namely /ifs/k0 to /ifs/k11 as shown below:


The export settings for NFSv3 are in the Global Settings tab:


OneFS TCP Tuning

The default TCP stack of OneFS needs some tuning for Kafka and 40GbE connectivity. The tuning needs to be done within the CLI directly on Isilon. A tcptune.sh script is available on GitHub.

Run sh ./tcptune.sh Max to make the changes. An example script run is shown below:

Before changes:

isilon# sh ./tcptune.sh Max

Tuning TCP stack to Max

TCP sysctls before...

kern.ipc.maxsockbuf=2097152

net.inet.tcp.sendbuf_max=2097152

net.inet.tcp.recvbuf_max=2097152

net.inet.tcp.sendbuf_inc=8192

net.inet.tcp.recvbuf_inc=16384

net.inet.tcp.sendspace=131072

net.inet.tcp.recvspace=131072

efs.bam.coalescer.insert_hwm=209715200

efs.bam.coalescer.insert_lwm=178257920

After changes:

Apply tuning...

Value set successfully

Value set successfully

Value set successfully

Value set successfully

Value set successfully

Value set successfully

Value set successfully

Value set successfully

TCP sysctls after...

kern.ipc.maxsockbuf=104857600

net.inet.tcp.sendbuf_max=52428800


net.inet.tcp.recvbuf_max=52428800

net.inet.tcp.sendbuf_inc=16384

net.inet.tcp.recvbuf_inc=32768

net.inet.tcp.sendspace=26214400

net.inet.tcp.recvspace=26214400

efs.bam.coalescer.insert_hwm=209715200

efs.bam.coalescer.insert_lwm=178257920

net.inet.tcp.mssdflt=8948
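The tuned output above sets net.inet.tcp.mssdflt=8948, and the Isilon configuration uses an MTU of 9000. The paper does not state how the value was chosen; one derivation that reproduces it, offered here as an assumption, subtracts the IPv4 header, the TCP header, and the TCP timestamp option from the MTU. The Python sketch below shows that arithmetic; the function name is illustrative.

```python
IPV4_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20
TCP_TIMESTAMP_OPTION_BYTES = 12  # 10-byte timestamp option padded to 12 bytes


def mss_for_mtu(mtu: int, timestamps: bool = True) -> int:
    """Largest TCP payload per segment that fits in one frame of the given MTU."""
    mss = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if timestamps:
        mss -= TCP_TIMESTAMP_OPTION_BYTES
    return mss


print(mss_for_mtu(9000))  # matches the mssdflt value in the tuned output
```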

That’s basically it for the Isilon configuration. With Dell EMC Isilon Scale-out NAS, you can now deploy

your own Kafka cluster and centrally store all your data while supporting millions of Kafka write and read

operations a second.


Kafka End-to-End Latency Test

Latency information is provided in the results of some of the test runs; however, those Kafka results were not end-to-end latency results.

Kafka provides a latency tool to test end-to-end latency between Kafka Producer and Kafka

Consumer. Below are the results of the end-to-end latency test for a single Producer and single

Consumer for both Kafka DAS and Kafka Isilon cluster. The 99.9th percentile latency is better with Isilon

F800. This tool does not provide throughput information. The test is for a 100 byte record size and 5000

records.

DAS:

bin/kafka-run-class.sh kafka.tools.EndToEndLatency k0:9092 DAS-latency 5000 all 100

0 204.818597

1000 2.255263

2000 1.697824

3000 1.760031

4000 1.704499

Avg latency: 2.7261 ms

Percentiles: 50th = 1, 99th = 36, 99.9th = 50

F800:

bin/kafka-run-class.sh kafka.tools.EndToEndLatency k0:9092 F800-latency 5000 all 100

0 158.756064

1000 1.97554

2000 2.1549609999999997

3000 1.612731

4000 1.8153979999999998

Avg latency: 3.7892 ms

Percentiles: 50th = 2, 99th = 36, 99.9th = 41
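For readers who want to summarize their own latency samples the same way, the Python sketch below computes an average and the 50th, 99th, and 99.9th percentiles by sorting the samples and indexing into the sorted list. This is one simple convention and is not necessarily the exact method the Kafka tool uses; the sample data is synthetic and is not taken from the tests in this paper.

```python
def percentile(sorted_values: list, per_mille: int) -> float:
    """Value at the given per-mille rank (500 = 50th, 990 = 99th, 999 = 99.9th)."""
    index = min(len(sorted_values) * per_mille // 1000, len(sorted_values) - 1)
    return sorted_values[index]


def summarize(latencies_ms: list) -> dict:
    ordered = sorted(latencies_ms)
    return {
        "avg": sum(ordered) / len(ordered),
        "p50": percentile(ordered, 500),
        "p99": percentile(ordered, 990),
        "p99.9": percentile(ordered, 999),
    }


# Synthetic sample: mostly fast round trips with a few slow outliers.
sample = [1.0] * 990 + [36.0] * 9 + [50.0]
print(summarize(sample))
```

The synthetic sample illustrates why averages and tail percentiles should be read together: a handful of slow round trips barely moves the average but dominates the 99th and 99.9th percentiles.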