Aerospike Next-Generation Hyperscale Database

Tim Faulkes, VP Solutions Architecture, Aerospike

Post on 30-Jul-2022

TRANSCRIPT

Page 1: Aerospike Next-Generation Hyperscale Database

1 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.

Next-Generation Hyperscale Database

Tim Faulkes, VP Solutions Architecture, Aerospike

Page 2: Aerospike Next-Generation Hyperscale Database

NoSQL – An Emerging Market with Multiple Technologies

Aerospike Delivers Speed at Scale, Predictable Performance, Highest Availability, and Lowest TCO

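[Figure: two charts of the NoSQL market, one plotting TCO ($) against Scale (TB) and one plotting Speed (TPS) against Scale (TB). Region labels: "Significant functional overlap - Commodity DB problem set" and "Unique Functional Capabilities and High Value Problem Set". Curve labels: "Alternative TCO" and "Aerospike TCO".]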

Page 3: Aerospike Next-Generation Hyperscale Database

Use Cases

Page 4: Aerospike Next-Generation Hyperscale Database

[Diagram: sensor data pipeline. Components shown: Sensor Data sources; Telemetry Gateways; Messaging Bus / Direct Insert; Aerospike Edge Databases at SoE Locations 1, 2 and 3; Core Database – Real Time System of Record; System of Record Query & Reporting Store; Long term Object Store. The Aerospike databases are linked by XDR.]

Pipeline based on design work done by HPE

Page 5: Aerospike Next-Generation Hyperscale Database

Performance at Scale – Independent 3rd Party Test Results

Aerospike vs. Redis:
✓ 2500% increase

Aerospike vs. Cassandra:
✓ 1667% increase @ 15K ops/sec (sustained)
✓ 270% increase @ 92K ops/sec (not sustained)

Aerospike vs. RocksDB:
✓ 300% increase

YCSB Workload A - Overall Throughput (ops/sec)

Aerospike    249,980
RocksDB       79,948
Cassandra     14,996**
Redis         10,000*

*Redis could not ‘load’ 500M records in the required timeframe, so 50M records were used for its test.

**Cassandra read performance dropped ‘off a cliff’; writes could be sustained at >90K ops/sec.

“The only solution that can get close with today's technology hardware and software is Aerospike running on Intel p-mem DIMMs, and that is about 280,000 sustained read and write operations per second. Which is about 2,000 percent more than anything else out there.”
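The percentages quoted on this page can be checked against the chart figures. The short sketch below only divides the throughput numbers shown on the slide; the function name `ratio_vs` is made up for illustration. It suggests that the slide's "increase" figures are throughput ratios expressed as percentages (for example, 25x is quoted as 2500%).

```python
# Overall throughput (ops/sec) as read from the YCSB Workload A chart on this slide.
throughput = {
    "Aerospike": 249_980,
    "RocksDB": 79_948,
    "Cassandra": 14_996,
    "Redis": 10_000,
}


def ratio_vs(baseline: str, subject: str = "Aerospike") -> float:
    """Return the subject's throughput divided by the baseline's throughput."""
    return throughput[subject] / throughput[baseline]


for name in ("Redis", "Cassandra", "RocksDB"):
    r = ratio_vs(name)
    # Print the ratio both as a multiple and as a percentage of the baseline.
    print(f"Aerospike vs. {name}: {r:.2f}x ({r * 100:.0f}% of baseline)")
```

The RocksDB ratio comes out slightly above 3x, which the slide rounds down to 300%.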

Page 6: Aerospike Next-Generation Hyperscale Database

ADTECH ECOMMERCE FINANCIAL NEWTECH TELCO GAMING

Powering Industry Leading Innovation Around the World

Page 7: Aerospike Next-Generation Hyperscale Database

Real-time Fraud Prevention for Digital Payments

Powering Global Fraud Prevention Network
✓ $280B+ payments annually

Replaced Oracle + Terracotta
✓ Reduced server footprint 15x

Improved SLAs
✓ 30x reduction in false positives

Increased revenue
✓ 10x improvement in fraud calculation data used

“PayPal is innovating deep analytics to rapidly respond to emerging fraud patterns, then deploying into an event-driven, fast data, in-memory architecture to accelerate detection, reduce losses and achieve near-continuous availability.”

Mikhail Kourjanski, PhD. - Lead Data Architect, Risk and Compliance Management Platform, PayPal

Page 8: Aerospike Next-Generation Hyperscale Database

USE CASE:

Platform modernization: Global consolidation and streamlining of real-time data infrastructure

2B user accounts: personalization, serving consumer data
✓ For Mail, Finance, Sports across multiple namespaces, e.g. Yahoo, AOL, TechCrunch
✓ 3M reads per second & 5K writes per second

Spanning 5 datacenters
✓ 2-3 PBs
✓ Intel Optane DC Persistent Memory & Aerospike Cross-Datacenter Replication (XDR)

Aerospike selected for many reasons
✓ Operational simplicity, scalability, availability, performance and latency

“We are excited to be working with Verizon Media to help them consolidate and reduce complexity of their back-end data systems”

John Dillon, CEO, Aerospike

[Diagram: three tiers linked by XDR. Edge (ms): SoE Locations 1, 2 and 3, with Streaming. Core (ms): System of Record, with AI/ML. Query & Reporting (sec-to-mins): Query & Reporting Store. A Legacy Data Store is also shown.]

Page 9: Aerospike Next-Generation Hyperscale Database

Real-time Settlement of Instant Payments

European Instant Payments Network
✓ 43 million transactions per day
✓ 24x7x365 availability

Cost a Challenge
✓ Credit systems charge 2-3%
✓ Blockchain per transaction: minutes and Euros

Privacy a Challenge
✓ Private systems sell information
✓ Blockchain is public record

Benefits with Aerospike
✓ Costs within €0.0020 per payment
✓ Scalable to 1000’s of European banks

“In hours I had my dev environment.”

Vitangelo Lasorella - Technical Officer, IT Innovation and Development Department, Banca D’Italia

Page 10: Aerospike Next-Generation Hyperscale Database

Unfair Competitive Advantage

Aerospike Hybrid Memory Architecture™ (patented)

Flash Optimized Storage Layer
✓ Significantly higher performance & IOPS

Multi-threaded, Massively Parallel
✓ ‘Scale up’ and ‘Scale out’

Self-healing clusters
✓ Superior Uptime, Availability and Reliability

Storage: indices in DRAM, data on optimized SSDs
✓ Predictable Performance regardless of scale
✓ Single-hop to data

Page 11: Aerospike Next-Generation Hyperscale Database

Indexes in DRAM, Data on SSD

Small amount of DRAM
✓ Avoid cost and server sprawl

No cache so no cache misses
✓ Predictable, low latency performance on NVMe/SSD

Optimized for SSDs
✓ Reads done in parallel
✓ Writes done optimally for SSD to reduce wear-and-tear
✓ Partnerships with many SSD vendors

Continual Defragmentation
✓ Predictable performance

Direct Device Access
✓ Removes file access penalties

P4610 S3710

P4800X
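As a rough illustration of why keeping only the indexes in DRAM needs a "small amount of DRAM", the sketch below estimates cluster-wide index memory from a record count. The 64-byte index entry size and the replication factor of 2 are assumptions that do not appear on this slide, and `index_dram_gib` is a hypothetical helper name.

```python
INDEX_ENTRY_BYTES = 64  # assumption: size of one primary index entry per record copy


def index_dram_gib(records: int, replication_factor: int = 2) -> float:
    """Estimate cluster-wide DRAM used by the primary index, in GiB."""
    total_bytes = records * replication_factor * INDEX_ENTRY_BYTES
    return total_bytes / 2**30


# Example: one billion records, each stored twice across the cluster.
print(f"{index_dram_gib(1_000_000_000):.1f} GiB of index memory")
```

Under these assumptions the index size depends only on the number of records, not on the size of each record, since the record data itself sits on SSD.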

Page 12: Aerospike Next-Generation Hyperscale Database

Aerospike Deployment Models

Pure In-memory
✓ Indexes in DRAM
✓ Data in DRAM
✓ Optional persistence to Flash
Use Cases
✓ Exceptionally fast reads/writes
✓ Small data volume to avoid memory constrained server sprawl

Hybrid DRAM/Flash
✓ Indexes in DRAM
✓ Data in Flash
Use Cases
✓ Very fast reads/writes
✓ Medium density nodes to prevent long restart times on cold restart

Pure PMEM
✓ Indexes in PMEM
✓ Data in PMEM
Use Cases
✓ Exceptionally fast reads/writes
✓ Small-ish data volume to avoid memory constrained server sprawl
✓ Persistence even with server restart

Hybrid PMEM/Flash
✓ Indexes in PMEM
✓ Data in Flash
Use Cases
✓ Very fast reads/writes
✓ High density nodes (10-100TB / node)
✓ Node restart times are important

All Flash
✓ Indexes in Flash
✓ Data in Flash
Use Cases
✓ Vast data volumes (trillions of records) or small objects
✓ Lower performance (5-10ms) is acceptable

Latency (fastest to slowest models): 99% < 1ms | 95% < 1ms | 95% < 10ms

Scale (smallest to largest models): Up to 10s of TB | Up to PB | 10s of PB or more

A single Aerospike cluster can contain one, some or all of these deployment models.

PMEM Advantages over DRAM:
• Higher density per node
• Lower cost per GB
• Improved reliability
• Minimal impact on performance
• Dramatic improvement on restart times
• Commit-to-device penalties virtually eliminated for data in PMEM

Page 13: Aerospike Next-Generation Hyperscale Database

Massively Parallel

[Diagram: cluster data spread evenly across nodes and devices. Each node holds 25% of cluster data; within a node, each of SSD 1 through SSD 5 holds 5% of cluster data.]

Linear Scaling
✓ Scale UP – take full advantage of hardware
✓ Scale OUT – linear scaling with number of nodes

Automatic Distribution of Data using Smart Partitions™ Algorithm
✓ Even amount of data on every node and flash device
✓ All hardware used equally
✓ Load on all servers is balanced
✓ No “hot spots”
✓ No config changes as workload or use case changes

Smart Clients
✓ Single “hop” from client to server
✓ Cluster-spanning operations (scan, query, batch) sent to all nodes for parallel processing
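The idea of spreading data evenly by hashing keys into partitions can be sketched in a few lines. This is not the Smart Partitions algorithm itself, which the slide does not describe: the partition count of 4096, the use of SHA-1 as a stand-in hash, the round-robin assignment of partitions to nodes, and the node names are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

N_PARTITIONS = 4096  # assumption for this sketch
NODES = ["node-1", "node-2", "node-3", "node-4"]  # hypothetical node names


def partition_id(key: str) -> int:
    """Hash the key and reduce the digest to a partition number."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()  # stand-in hash function
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


def owner(pid: int) -> str:
    """Assign partitions to nodes round-robin (a deliberate simplification)."""
    return NODES[pid % len(NODES)]


def distribution(n_keys: int) -> Counter:
    """Count how many of n_keys synthetic keys land on each node."""
    return Counter(owner(partition_id(f"user-{i}")) for i in range(n_keys))


if __name__ == "__main__":
    for node, count in sorted(distribution(100_000).items()):
        print(node, count)
```

Because the partition is derived from a hash of the key rather than from the key's value, each node receives close to a quarter of the keys regardless of how the keys are named, which is the "no hot spots" property the slide refers to. A client that knows the partition-to-node map can also go straight to the owning node, which matches the single "hop" described under Smart Clients.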

Page 14: Aerospike Next-Generation Hyperscale Database

Actual Customer Comparisons when moving to Aerospike

Cassandra to Aerospike
• Digital identity management

Relational + Cache to Aerospike (Fraud detection)

Page 15: Aerospike Next-Generation Hyperscale Database

Total Cost of Ownership

Ad Tech Use Case - Signal
Gaming Use Case - Playtika

TCO savings of app $1.4M/year & $4.2M over 3 yrs
200 Couchbase Servers → Only 30 Aerospike Servers
200K/s Reads+Writes → 600K/s Reads+Writes

TCO savings of 1 app $7.75M over 3 yrs
450 Cassandra Servers → Only 60 Aerospike Servers

“Simplicity of IT Ops frees teams up for future-looking initiatives.”
- Guy Almog, Director of Engineering, Playtika

“Free is even too expensive compared to Aerospike.”
- Jason Yanowitz, EVP and CTO, Signal

Aerospike’s cacheless architecture is simpler, faster and cheaper than cache-based architectures
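The figures on this page reduce to a few ratios. The sketch below works them out from the numbers on the slide only; the helper name `reduction_factor` is made up for illustration, and no cost model beyond the quoted totals is assumed.

```python
def reduction_factor(before: float, after: float) -> float:
    """Return how many times larger the 'before' figure is than the 'after' figure."""
    return before / after


couchbase_to_aerospike = reduction_factor(200, 30)   # server count, Couchbase migration
cassandra_to_aerospike = reduction_factor(450, 60)   # server count, Cassandra migration
throughput_gain = reduction_factor(600_000, 200_000)  # reads+writes per second, after vs. before

# The quoted $1.4M/year saving over three years, in millions of dollars.
three_year_savings_musd = 1.4 * 3

print(f"Couchbase migration: {couchbase_to_aerospike:.1f}x fewer servers")
print(f"Cassandra migration: {cassandra_to_aerospike:.1f}x fewer servers")
print(f"Throughput: {throughput_gain:.0f}x")
print(f"Three-year savings: ${three_year_savings_musd:.1f}M")
```

The three-year total is consistent with the $4.2M quoted next to the $1.4M/year figure.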

Page 16: Aerospike Next-Generation Hyperscale Database

Aerospike is a Database for System of Record Applications

Single Record / Primary Key

Strong Consistency for Single Record / Primary Key
✓ No acknowledged write can be lost, even in the face of network partitions

Linearizable
✓ Provides a real-time (i.e., wall-clock) guarantee on reads/writes on a single object (no stale reads)

Sequential Consistency
✓ All processes see shared accesses in the same order.
✓ Accesses are not ordered in real-time

Rack Aware Reads
✓ Favor reads from a local copy of the data
✓ Minimize inter-AZ costs

Unavailable Reads
✓ Read a copy of the data, even if possibly incorrect

Read Consistency

Most Consistent

Most Available

Write Consistency
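The rack-aware read option on this page can be illustrated with a toy replica selector: prefer a copy in the client's own rack, and otherwise fall back to the master copy. This is a sketch of the idea only, not Aerospike client code; the `Replica` type, the `pick_read_replica` function, and the example placement are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Replica:
    node: str
    rack: int
    is_master: bool


def pick_read_replica(replicas: Sequence[Replica], client_rack: int) -> Replica:
    """Prefer a copy in the client's rack; otherwise read from the master copy."""
    for replica in replicas:
        if replica.rack == client_rack:
            return replica
    return next(r for r in replicas if r.is_master)


# Hypothetical placement of one record's copies across three racks.
REPLICAS = [
    Replica("west-node-1", rack=1, is_master=False),
    Replica("east-node-3", rack=2, is_master=True),
    Replica("central-node-2", rack=3, is_master=False),
]

print(pick_read_replica(REPLICAS, client_rack=3).node)  # local copy in rack 3
print(pick_read_replica(REPLICAS, client_rack=9).node)  # no local copy, so the master
```

Reading a local non-master copy avoids a cross-rack round trip, which is where the inter-AZ cost saving comes from, at the price of a weaker guarantee than a linearizable read.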

Page 17: Aerospike Next-Generation Hyperscale Database

Continental Deployment Example

Aerospike Database 5 Multi-site Clustering

[Diagram: one cluster of three racks sharing a roster. USA West (Rack 1): Node 1 (R1), Node 2, Node 3. USA East (Rack 2): Node 1, Node 2, Node 3 (M). USA Central (Rack 3): Node 1, Node 2 (R2), Node 3.]

• Racks 1, 2, and 3 form cluster
• Strong consistency
• Linearizable isolation
• Synchronous active-active replication

Latency for writes in the 100 milliseconds range
Reads at sub millisecond

Page 18: Aerospike Next-Generation Hyperscale Database

Continental Deployment Example: Resiliency

Aerospike Database 5 Multi-site Clustering

[Diagram: the same three-rack cluster and roster as on Page 17. USA West (Rack 1): Node 1 (R1), Node 2, Node 3. USA East (Rack 2): Node 1, Node 2, Node 3 (M). USA Central (Rack 3): Node 1, Node 2 (R2), Node 3.]

• Connections to Central Site lost
• Rack 1 and 2 automatically reform cluster
• Rack 3 unavailable
• No writes lost
• Clients of Rack 3 automatically redirected to Rack 1 or Rack 2

Latency for writes in the 100 milliseconds range
Reads at sub millisecond
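The outcome described on this page is consistent with a simple majority rule: the side of a network split that can still see more than half of the roster keeps serving, and the other side stops. The sketch below shows only that intuition. Aerospike's actual availability rules for strong consistency are not described in this deck and are more detailed; the node names and the `has_majority` helper are hypothetical.

```python
# Nine-node roster: three racks with three nodes each, as in the diagram.
ROSTER = {f"rack{r}-node{n}" for r in (1, 2, 3) for n in (1, 2, 3)}


def has_majority(roster: set, reachable: set) -> bool:
    """Simplified rule: a sub-cluster stays available only if it can see
    a strict majority of the roster nodes."""
    return 2 * len(set(roster) & set(reachable)) > len(set(roster))


# The Central site (Rack 3) loses its connections to the other two racks.
racks_1_and_2 = {n for n in ROSTER if not n.startswith("rack3")}
rack_3_only = ROSTER - racks_1_and_2

print("Racks 1+2 available:", has_majority(ROSTER, racks_1_and_2))
print("Rack 3 available:", has_majority(ROSTER, rack_3_only))
```

Under this rule at most one side of any split can hold a majority, so two isolated groups can never both accept conflicting writes, which is in line with the "no writes lost" claim.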

Page 19: Aerospike Next-Generation Hyperscale Database

Aerospike Connect for Kafka

Example Use Cases

✓ Edge-to-core data movement in fraud prevention systems and financial trading systems

✓ Edge-to-core data integration in retail user behavior at the POS, checkout with Product data for propensity & dynamic pricing

✓ Integration of operational data stored in Aerospike with DataLake / EDW

Page 20: Aerospike Next-Generation Hyperscale Database

Aerospike Connect for Spark

Example Use Cases

✓ Fraud prevention: transaction data arrives via streaming and must be analyzed against historical data in real time

✓ Recommendation Engines: Real-time recommendations and targeting based on user behavior

✓ Ad Tech: Ad Fraud and real-time retargeting based on user behavior

✓ Digital Identity Management

✓ Industrial Internet of Things (IIoT): Real-time & closed loop business decisions

Page 21: Aerospike Next-Generation Hyperscale Database

Thank You

Questions?