single node processes 1 million tps in pure ... - amazon s3 · • latency vs. transactions per...

Post on 06-Aug-2020


THE FIRST, FLASH-OPTIMIZED, IN-MEMORY, NoSQL DATABASE

Aerospike, a next-generation, open source, flash-optimized, in-memory NoSQL database with ACID properties, was recently benchmarked at 1 million transactions per second (TPS) on an Amazon C3.8xlarge instance at a cost of only $1.68 per hour.

Aerospike team members Anshu Prateek and Rajkumar Iyer recently performed two price/performance tests to determine how Aerospike performs in the cloud. Knowing the power of Aerospike on bare metal servers, they took the product to the cloud to determine whether cloud equals high performance and which cloud configuration best supports Aerospike.

Single Node processes 1 Million TPS in pure RAM for $1.68/hour

In the first performance test, a single in-memory Aerospike server was deployed on a single Amazon instance. Multiple instance sizes were tested, and the instance size was determined based on its ability to hold an entire dataset in memory and the ability of the network and CPU to support 10 million objects, each approximately 100 bytes. The small object size was chosen to accommodate data in memory for all selected instances.
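As a rough check on why the small object size lets the dataset fit in memory, the raw payload can be estimated from the two figures quoted above. This is a minimal sketch: the function name is illustrative, and the estimate deliberately ignores index and per-record overhead, which the text does not quantify.

```python
# Back-of-the-envelope payload size for the single-node test.
NUM_OBJECTS = 10_000_000   # object count quoted in the text
OBJECT_BYTES = 100         # "approximately 100 bytes" per the text


def raw_dataset_bytes(num_objects: int, object_bytes: int) -> int:
    """Return payload size in bytes, ignoring index and record overhead."""
    return num_objects * object_bytes


payload = raw_dataset_bytes(NUM_OBJECTS, OBJECT_BYTES)
payload_gb = payload / 1e9  # decimal gigabytes
print(f"Raw payload: {payload_gb:.1f} GB")  # prints "Raw payload: 1.0 GB"
```

The real memory footprint will be higher than this figure once index and replication overhead are included.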

Data was loaded using the Aerospike Java benchmark client. The client process, run on non-server instances, was set to maximize the load pushed in parallel to the Aerospike server. The number of client servers allocated varied based on the server instance type. To maximize load on the Amazon infrastructure, an in-memory (RAM) workload of 100% read requests was distributed over the entire key space.

Hardware Virtual Machine (HVM) instances were selected for the benchmark as they perform better than Para Virtual instances at the same price point. A placement group (a logical grouping of instances within a single Availability Zone) was selected as it enables applications to participate in a low-latency, full bisection network and is a prerequisite for using HVM instances.

Diagram: DATA CENTER, AWS, VPC, HVM ENHANCED NETWORK

Some additional Aerospike tuning was required to set the service thread configuration appropriately and to resolve the bottleneck resulting from interrupt processing on a single core. The end result was superior performance: five clients (C3.2xlarge) pushed load to a single Aerospike node running on a C3.8xlarge (with 60 GB RAM and 2 x 320 GB SSD), which processed 1 million TPS at a cost of only $1.68/hour.
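The price/performance figure can be restated as a cost per transaction. The sketch below uses only the two numbers quoted in the text and assumes the node sustains the quoted rate for a full hour; the function name is illustrative and the cost of the client instances is not included.

```python
# Cost per billion transactions for the single-node result.
HOURLY_COST_USD = 1.68   # C3.8xlarge price quoted in the text
TPS = 1_000_000          # throughput quoted in the text


def cost_per_billion(hourly_cost: float, tps: int) -> float:
    """USD per 1e9 transactions, assuming the rate is sustained for an hour."""
    transactions_per_hour = tps * 3600
    return hourly_cost / transactions_per_hour * 1e9


print(f"${cost_per_billion(HOURLY_COST_USD, TPS):.2f} per billion transactions")
# prints "$0.47 per billion transactions"
```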

Four Node Cluster Performance

A second test evaluated the performance of a 4-node Aerospike in-RAM cluster deployed on 4 Amazon instances using 5 different read/write workloads: 100% writes, 50/50 balanced reads/writes, 80/20 and 95/5 reads/writes, and 100% reads. The number of objects was increased to 40 million to test that the cluster was doing more work than a single node, and to ensure synchronous replication and immediate consistency. The nodes were in the same Availability Zone, supported enhanced networking and were set up in a Virtual Private Cloud (VPC).
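To illustrate what a read/write mix spread over a key space looks like, here is a minimal, self-contained sketch of an operation generator. It is not the Aerospike Java benchmark client used in the tests; the function name, the uniform key distribution and the fixed seed are assumptions made for illustration.

```python
import random

KEY_SPACE = 40_000_000                      # objects in the 4-node test, per the text
READ_FRACTIONS = [0.0, 0.5, 0.8, 0.95, 1.0]  # the five workloads, as read shares


def generate_ops(read_fraction, count, key_space=KEY_SPACE, seed=0):
    """Return `count` (operation, key) pairs with keys drawn uniformly."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    ops = []
    for _ in range(count):
        key = rng.randrange(key_space)  # uniform over the whole key space
        op = "read" if rng.random() < read_fraction else "write"
        ops.append((op, key))
    return ops


for fraction in READ_FRACTIONS:
    ops = generate_ops(fraction, 10_000)
    reads = sum(1 for op, _ in ops if op == "read")
    print(f"target read share {fraction:.2f}, observed {reads / len(ops):.2f}")
```

A real load generator would issue these operations from many parallel client threads; this sketch only shows how the mix and key selection could be chosen.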

The Results:

• Aerospike Scales on large and xlarge Instances

• Latency Vs. Transactions Per Instance

• Aerospike TPS Achieve Linear Scalability on a Handful of Nodes

• Aerospike Shows Optimal Price/Performance on Amazon r3.2xlarge

Aerospike Scales on large and xlarge Instances

Figure 1 compares the number of transactions processed by the 4 selected instances for each of the workloads. The results indicate that the Amazon r3.large and r3.2xlarge instances deliver the highest throughput across all workloads and provide linear scalability as the percentage of reads increases.

Fig 1 TPS per instance


Latency Vs. Transactions Per Instance

The following 2 graphs show the relationship between latency and the number of transactions processed for the r3.large, m3.xlarge and m1.large instances. Each instance can process only a limited number of requests with low latency. Once that limit is reached, latency deteriorates even though increased throughput is still possible.

Figure 2 shows that the percentage of transactions with read latency below 1ms, with respect to the number of transactions processed, is the same for all instances up to about 25,000 transactions. From 25,000 to 60,000 transactions, the read latency for both the r3.large and the m3.xlarge instances remains low while the read latency for the m1.large rapidly deteriorates. As the number of transactions increases beyond 60,000, the read latency for the r3.large deteriorates, but the m3.xlarge continues to remain under 1ms far beyond the 100,000-transaction level.

Figure 3 shows that the percentage of transactions with write latency below 2ms, with respect to the number of transactions processed, is the same for all instances below the 60,000 write transaction level. When the write transaction level exceeds 60,000, the latency for the r3.large instance deteriorates while the m3.xlarge and the m1.large remain flat.
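The metric plotted in these graphs, the share of transactions that complete under a latency threshold, can be computed from raw latency samples as follows. This is a minimal sketch; the function name and the sample values are hypothetical and are not measurements from the benchmark.

```python
def pct_below(latencies_ms, threshold_ms):
    """Percentage of samples strictly below the threshold (in milliseconds)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    under = sum(1 for value in latencies_ms if value < threshold_ms)
    return 100.0 * under / len(latencies_ms)


# Hypothetical latency samples in milliseconds, for illustration only.
samples = [0.4, 0.7, 0.9, 1.2, 2.5]
print(pct_below(samples, 1.0))  # 60.0 (read threshold used in the text: 1ms)
print(pct_below(samples, 2.0))  # 80.0 (write threshold used in the text: 2ms)
```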

Aerospike TPS Achieve Linear Scalability on a Handful of Nodes

Figure 4 compares the number of transactions processed for the 100% read and 80% read/20% write workloads with the number of nodes in a cluster of r3.large instances. Aerospike TPS show linear scalability from 27k TPS on two nodes to 140k TPS on just 8 nodes.
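One simple way to read the scaling figures is to divide cluster throughput by node count. The sketch below uses only the two data points quoted in the text; the text does not say which workload they belong to, so the per-node numbers are indicative only, and the function name is illustrative.

```python
def tps_per_node(cluster_tps: int, nodes: int) -> float:
    """Average throughput contributed by each node in the cluster."""
    return cluster_tps / nodes


# (nodes -> cluster TPS) pairs quoted in the text for r3.large.
points = {2: 27_000, 8: 140_000}

for nodes, cluster_tps in points.items():
    print(f"{nodes} nodes: {tps_per_node(cluster_tps, nodes):.0f} TPS per node")
# prints "2 nodes: 13500 TPS per node" then "8 nodes: 17500 TPS per node"
```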

Fig 2 %>1ms Read Latency (axes: Latency (milliseconds) vs. Cluster TPS)

Fig 3 %>2ms Read Latency (axes: Latency (milliseconds) vs. Cluster TPS)

Fig 4 TPS for cluster r3.large (axes: Cluster TPS vs. Number of Nodes)


© 2014 Aerospike, Inc. All rights reserved. Aerospike and the Aerospike logo are trademarks or registered trademarks of Aerospike. All other names and trademarks are for identification purposes and are the property of their respective owners.

2525 E. Charleston Rd #201, Mountain View, CA 94043 +1 408.462.AERO (2376) | [email protected] | www.aerospike.com


Try it today: http://www.aerospike.com/download/

Aerospike Shows Optimal Price/Performance on Amazon r3.2xlarge

When comparing the number of transactions per second processed to the cost per month for each of the selected Amazon instances, we see that a range of 2 to 67 nodes is required to deliver performance ranging from 15k TPS to 1 million TPS, for prices ranging from $252 to $8552 per month. However, the best price/performance instance for Aerospike is the r3.large or r3.2xlarge, with r3.2xlarge requiring fewer nodes and lower costs for the same performance.

Fig 5 $/month vs TPS (axes: Cost per Month vs. Cluster TPS)

Aerospike Soars on Bare Metal and Cloud Servers

The price/performance tests confirm that cloud does equal high performance and that Aerospike performs well on both bare metal and cloud configurations.

“There has been no need for maintenance with Aerospike database; it just works out of the box.” – Amitabh Misra, Vice President of Engineering, Snapdeal

Here’s a happy customer