aerospike hybrid memory architecture

High Performance NoSQL Database Powering New Opportunities at Scale Ask The Experts: Architectural Overview

Upload: aerospike-inc

Post on 21-Jan-2018

Category:

Technology


TRANSCRIPT

Page 1: Aerospike Hybrid Memory Architecture

High Performance NoSQL Database

Powering New Opportunities at Scale

Ask The Experts: Architectural Overview

Page 2: Aerospike Hybrid Memory Architecture

Overview

■  Overview ■ Database landscape ■ Use cases

■  Architecture ■  Storage ■  Indexes and Operations ■  Cross Datacenter Replication ■  Questions

Page 3: Aerospike Hybrid Memory Architecture

Database Landscape

Axes: TRANSACTIONS (OLTP) vs. ANALYTICS (OLAP); STRUCTURED DATA vs. UNSTRUCTURED DATA

Quadrant labels: BIG DATA ANALYTICS; REAL-TIME BIG DATA

■  Response time: Hours, Weeks; TB to PB; Read Intensive
■  Response time: Seconds; Gigabytes of data; Balanced Reads/Writes
■  Response time: Seconds; Terabytes of data; Read Intensive
■  Real-time Transactions; Response time: < 5 ms; 1-100 TB; Balanced Reads/Writes; 24x7x365 Availability

Page 4: Aerospike Hybrid Memory Architecture

Next Generation Systems of Engagement – An Emerging Market with Multiple Technologies

Aerospike Delivers Predictable Performance, Highest Availability, and Lowest TCO

Systems of Engagement - TCO (chart: TCO ($) vs. Scale TB; series: Alternative TCO, Aerospike TCO)

Systems of Engagement – Many Choices (chart: Speed TPS vs. Scale TB; regions: Significant functional overlap - Commodity DB problem set; Unique Functional Capabilities and High Value Problem Set)

Page 5: Aerospike Hybrid Memory Architecture

High Performance NoSQL +

■  Unlimited Key Value pairs, record size up to 128KB - 1MB.

■  Complex & Scalar Types - integer, double, string, blob, list, map, geospatial.

■  Distributed Queries on secondary indices (exact match, integer range, geospatial queries).

■  User Defined Functions extend the database.

■  Patented Indexed Map-Reduce – distributed queries can be filtered, transformed, aggregated, and reduced.

Page 6: Aerospike Hybrid Memory Architecture

Use cases


Page 7: Aerospike Hybrid Memory Architecture

MILLIONS OF CONSUMERS BILLIONS OF DEVICES

APP SERVERS

DATA WAREHOUSE INSIGHTS

Advertising Technology Stack

WRITE CONTEXT

In-memory NoSQL

WRITE REAL-TIME CONTEXT
READ RECENT CONTENT
PROFILE STORE: Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms...
REAL-TIME ANALYTICS: Best sellers, top scores, trending tweets
BATCH ANALYTICS: Discover patterns, segment data: location patterns, audience affinity

Currently about 3.0M / sec in North America

Page 8: Aerospike Hybrid Memory Architecture

AdTech – Targeting, Bidding, Programmatic

Challenge
•  Billions of users & cookies across the internet
•  Accessible using provisioning applications (self-serve and through support personnel)
•  Real-time algorithms used for targeting, offers

Need for Extremely High Availability, Reliability, Low Latency
•  10s of TBs of data
•  1B ~ 10B objects
•  1M ~ 10M TPS

Selected NoSQL
•  Clustered HA system
•  Predictable low latency at high throughput
•  Highly-available and reliable on failure
•  Cross data center (XDR) support

Diagram labels: INTERNET AD EXCHANGE; BIDDING APPLICATION; SEARCHES; VISITS; TIME ON PAGE; AUDIENCE; HISTORICAL DATA; BEHAVIOR MODELS; MACHINE LEARNING

Page 9: Aerospike Hybrid Memory Architecture

Travel Portal

PRICING DATABASE (RATE LIMITED)

Poll for Pricing Changes

PRICING DATA

Store Latest Price

SESSION MANAGEMENT

Session Data

Read Price

XDR

Airlines forced interstate banking
Legacy mainframe technology
Multi-company reservation and pricing
Requirement: 1M TPS allowing overhead

Travel App

Page 10: Aerospike Hybrid Memory Architecture

Financial Services – Intraday Positions

10M+ user records Primary key access 1M+ TPS

•  Challenge
   –  DB2 stores positions for 10 Million customers
   –  Value-at-risk calculations in minutes, not hours
   –  Consistent view of trade state across all applications
   –  Must update stock prices, show balances on 300 positions, process 250M transactions, 2M updates/day
   –  Cache uneconomical – 150 servers growing to 1000

•  Need to scale reliably
   –  3 → 13 TB
   –  100 → 400 Million objects
   –  200k → 1 Million TPS

•  Selected NoSQL
   –  Flash
   –  Predictable low latency at high throughput
   –  Immediate consistency
   –  Cross data center (XDR) support
   –  10 Server Cluster

IBM DB2 (MAINFRAME)

Read/Write

Start of Day Data Loading

End of Day Reconciliation

Query REAL-TIME DATA FEED

ACCOUNT POSITIONS

XDR

Page 11: Aerospike Hybrid Memory Architecture

QOS & Real-Time Billing for Telcos

Challenge
•  Per-account routing rules within edge systems
•  Traffic shaping to implement account policies
•  Accessible using provisioning applications (self-serve and through support personnel)

Need for Extremely High Availability, Reliability, Low Latency
•  TBs of data
•  10-100M objects
•  10-200K TPS

Selected NoSQL
•  Clustered system
•  Predictable low latency at high throughput
•  Highly-available and reliable on failure
•  Cross data center (XDR) support

SOURCE DEVICE/USER DESTINATION Real-Time

Auth. QoS Billing

Request Execute Request

Real-Time Checks Config Module App

Update Device User Setting

Hot-Standby

XDR

Page 12: Aerospike Hybrid Memory Architecture

Traditional SOE Architecture Has Significant Limitations

Challenges: • Complex • Maintainability • Durability • Consistency • Scalability • Cost ($) • Data Lag

Caching Layer

Operational Database

Legacy RDBMS HDFS BASED

Fast speed – Consumer Scale

Real-time Consumer Facing

Pricing / Inventory/Billing

Real-time Decisioning

Streaming Data

Legacy Database (Mainframe)

RDBMS Database

Transactional Systems

Enterprise Environment

Page 13: Aerospike Hybrid Memory Architecture

XDR

Aerospike

Hybrid Memory Systems - Enabling a New Class of Real-time Applications

Aerospike Delivers Predictable Performance, Highest Availability, and Lowest TCO

Legacy Database (Mainframe)

RDBMS Database

Transactional Systems

Enterprise Environment

Powered by High Performance NoSQL

Fast speed – Consumer Scale

Hybrid Memory Database

Benefits: • Simplicity • Maintainability • Durability • Consistency • Scalability • Cost ($) • Data Lag Reduced

Real-time Consumer Facing

Pricing / Inventory/Billing

Real-time Decisioning

Streaming Data

Legacy RDBMS HDFS BASED

Page 14: Aerospike Hybrid Memory Architecture

Architecture

Page 15: Aerospike Hybrid Memory Architecture

Architecture – The Big Picture

1)  No Hotspots – Distributed Hash Table simplifies data partitioning

2)  Smart Client – 1 hop to data, no load balancers

3)  Shared Nothing Architecture, every node is identical

4)  Smart Cluster, Zero Touch – auto-failover, rebalancing, rack aware, rolling upgrades

5)  Transactions and long-running tasks prioritized in real-time

6)  XDR – async replication across data centers ensures Zero Downtime

Page 16: Aerospike Hybrid Memory Architecture

How Data is Organized

Aerospike      RDBMS
Namespace      Tablespace or Database
Set            Table
Record         Row
Bin            Column

Bin types: Integer, Double, String, BLOB, List, Map / SortedMap, GeoJSON
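
As a rough illustration of the mapping above, a record can be pictured as a set of typed bins addressed by namespace, set, and key. The sketch below uses plain Python objects only. The names `test`, `users`, `user-1001`, the bin names, and the helper `read_bin` are made up for illustration, and nothing here calls a real client library.

```python
# Toy picture of the data model using plain Python objects.
# All names ("test", "users", bin names) are hypothetical.

# A record is addressed by (namespace, set, user key).
key = ("test", "users", "user-1001")

# Bins are named, typed values; one record may mix the bin types listed above.
bins = {
    "age": 34,                                   # Integer
    "score": 97.5,                               # Double
    "name": "Ada",                               # String
    "avatar": bytes([0x89, 0x50, 0x4E, 0x47]),   # BLOB
    "segments": ["sports", "travel"],            # List
    "prefs": {"lang": "en", "tz": "UTC"},        # Map
    "home": {"type": "Point", "coordinates": [-122.0, 37.4]},  # GeoJSON-shaped value
}

# namespace -> set -> key -> bins, mirroring database -> table -> row -> columns.
database = {key[0]: {key[1]: {key[2]: bins}}}


def read_bin(db, namespace, set_name, user_key, bin_name):
    """Look up a single bin (column) of a single record (row)."""
    return db[namespace][set_name][user_key][bin_name]
```

For example, `read_bin(database, "test", "users", "user-1001", "age")` walks the same path an RDBMS would describe as database, table, row, column.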

Page 17: Aerospike Hybrid Memory Architecture

Smart Client™

■  The Aerospike Client is implemented as a library, JAR or DLL, and consists of 2 parts:
    ■ Operation APIs – These are the operations that you can execute on the cluster – CRUD+ etc.
    ■ First class observer of the Cluster – Monitoring the state of each node and aware of new nodes or node failures.

Page 18: Aerospike Hybrid Memory Architecture

Smart Client - Distributed Hash table

■  Distributed Hash Table with No Hotspots
    ■ Every key hashed with RIPEMD160 into an ultra efficient 20 byte (fixed length) string
    ■ Hash + additional (fixed 64 bytes) data forms index entry in RAM
    ■ Some bits from hash value are used to calculate the Partition ID (4096 partitions)
    ■ Partition ID maps to Node ID in the cluster
■  1 Hop to data
    ■ Smart Client simply calculates Partition ID to determine Node ID
    ■ No Load Balancers required
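
The key-to-node path described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the client's actual code: which 12 bits of the digest form the Partition ID is an assumption here, only the raw key bytes are hashed, the round-robin partition map is invented for the demo, and SHA-1 is swapped in as a 20-byte stand-in when the local Python build does not expose RIPEMD-160.

```python
import hashlib

N_PARTITIONS = 4096  # from the slide: 4096 partitions (12 bits)


def digest20(key: bytes) -> bytes:
    """Return a 20-byte digest of the key.

    RIPEMD-160 is used when the local OpenSSL build exposes it; otherwise
    SHA-1 (also 20 bytes) is a stand-in so the sketch still runs.
    """
    try:
        h = hashlib.new("ripemd160")
    except ValueError:
        h = hashlib.sha1()  # stand-in only, not the hash named on the slide
    h.update(key)
    return h.digest()


def partition_id(digest: bytes) -> int:
    """Take 12 bits of the digest as the Partition ID (bit choice is an assumption)."""
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def node_for(key: bytes, partition_map: list) -> str:
    """One-hop lookup: key -> digest -> Partition ID -> owning node."""
    return partition_map[partition_id(digest20(key))]


# Hypothetical 3-node cluster; round-robin ownership just for the demo.
nodes = ["A", "B", "C"]
partition_map = [nodes[p % len(nodes)] for p in range(N_PARTITIONS)]
owner = node_for(b"user-1001", partition_map)
```

Because the client holds the partition map itself, the lookup is a local computation followed by one network hop, which is why no load balancer sits in the path.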

Page 19: Aerospike Hybrid Memory Architecture

Even record distribution

Node A Node B Node C

Z

Z’

Y

Y’

X

X’

AerospikeClient

Application

Page 20: Aerospike Hybrid Memory Architecture

Automatic rebalancing

When a node is added or removed, the cluster automatically rebalances:
1.  Cluster discovers new node via gossip protocol
2.  Paxos vote determines new data organization
3.  Partition migrations scheduled
4.  When a partition migration starts, write journal starts on destination
5.  Partition moves atomically
6.  Journal is applied and source data deleted
After migration is complete, the cluster is evenly balanced.
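
To make step 3 concrete, the sketch below shows one way partition ownership could be recomputed when a node joins, and which partitions would then need to migrate. It uses rendezvous-style ranking (each node gets a deterministic score per partition, highest score wins) purely as a stand-in; the slides do not describe the actual assignment algorithm, and gossip, the Paxos vote, and the write journal are not modelled. All function names are hypothetical.

```python
import hashlib

N_PARTITIONS = 4096


def score(node: str, pid: int) -> int:
    """Deterministic pseudo-random score for a (node, partition) pair."""
    h = hashlib.sha256(f"{node}:{pid}".encode()).digest()
    return int.from_bytes(h[:8], "big")


def assign(nodes, replicas=2):
    """For each partition, rank nodes by score: first is master, next are replicas."""
    table = []
    for pid in range(N_PARTITIONS):
        ranked = sorted(nodes, key=lambda n: score(n, pid), reverse=True)
        table.append(ranked[:replicas])
    return table


def migrations(old, new):
    """Partitions whose master changed, as (pid, source, destination)."""
    return [
        (pid, o[0], n[0])
        for pid, (o, n) in enumerate(zip(old, new))
        if o[0] != n[0]
    ]


before = assign(["A", "B", "C"])
after = assign(["A", "B", "C", "D"])
moves = migrations(before, after)  # roughly a quarter of the partitions
```

With this ranking scheme, a master only changes when the new node outranks every existing node for that partition, so every scheduled move has the new node as its destination and the other nodes do not shuffle data among themselves.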

Page 21: Aerospike Hybrid Memory Architecture

Predictable high performance

Page 22: Aerospike Hybrid Memory Architecture

Even Data Distribution

Data is distributed evenly across nodes in a cluster using the Aerospike Smart Partitions™ algorithm.
■  RIPEMD160 (no collisions yet found)
■  4096 Data Partitions
■  Even distribution of
    ■ Partitions across nodes
    ■ Records across Partitions
    ■ Data across Flash devices
■  Primary and Replica Partitions

Page 23: Aerospike Hybrid Memory Architecture

Massively Parallel

Automatic Distribution of Data
•  Even amount of data on all nodes and all drives
•  All hardware used equally
•  Load on all servers is balanced
•  No “hot spots”
•  No configuration changes as workload or use case changes

Smart Clients
•  Single “hop” from client to server
•  Cluster-spanning operations (scan, query, batch) sent to all processing nodes for parallel processing.
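
The last bullet describes a scatter-gather pattern: the same request goes to every node at once, each node works only on its local data, and the client merges the partial results. The sketch below imitates that with threads and in-memory lists. `NODE_DATA`, `query_node`, and `scatter_gather` are hypothetical names, and no networking or real client API is involved.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node data: each node holds only its own records.
NODE_DATA = {
    "A": [{"user": "u1", "age": 31}, {"user": "u4", "age": 52}],
    "B": [{"user": "u2", "age": 45}],
    "C": [{"user": "u3", "age": 29}, {"user": "u5", "age": 47}],
}


def query_node(node, lo, hi):
    """Runs on one node: filter that node's local records only."""
    return [r for r in NODE_DATA[node] if lo <= r["age"] <= hi]


def scatter_gather(lo, hi):
    """Send the same range query to every node in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(NODE_DATA)) as pool:
        parts = pool.map(lambda n: query_node(n, lo, hi), NODE_DATA)
        merged = [r for part in parts for r in part]
    return sorted(merged, key=lambda r: r["user"])


result = scatter_gather(40, 60)
```

Here `result` contains the records for `u2`, `u4`, and `u5`, gathered from three different nodes.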

Page 24: Aerospike Hybrid Memory Architecture

Scale up Architecture - Server internals

TCP/IP Socket

Flash Storage

Service Threads

Service Queues

Transaction Threads

Page 25: Aerospike Hybrid Memory Architecture

Predictable Performance

Reads: single hop to OWNING SERVER → PRIMARY INDEX (digest & tree info, record metadata, storage pointer) → DRAM read → STORAGE

Writes: single hop to OWNING SERVER → PRIMARY INDEX (digest & tree info, record metadata, storage pointer) → DRAM write to MEMORY BUFFER → async flush to STORAGE

Synchronous Replica Write, single hop: REPLICA SERVER → PRIMARY INDEX (digest & tree info, record metadata, storage pointer) → DRAM write to MEMORY BUFFER → async flush to STORAGE

Page 26: Aerospike Hybrid Memory Architecture

Predictable Performance

Performance Built In
•  Written in C with memory-optimized libraries => No garbage collection
•  Continual defragmentation of storage => No compactions
•  Known master for any piece of data => No quorum reads
•  Designed as a distributed database => Networking primary consideration

Storage Optimizations
•  Writes done to memory buffer => Avoid storage slowdown
•  Storage used in “block” mode => No file system overhead
•  Reads and writes striped across devices => Concurrent use of hardware

Smart Clients
•  Single “hop” from client to server
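
The write path described on this and the previous slide (DRAM write into a memory buffer, synchronous copy to the replica, later flush to storage) can be pictured with a toy model. This is a sketch under simplifying assumptions: the class `Node`, the function `replicated_write`, and the `buffer_limit` parameter are invented, and the flush is triggered inline when the buffer fills rather than by a background thread as "async" implies.

```python
class Node:
    """Toy node: an index in 'DRAM', a write buffer, and block 'storage'."""

    def __init__(self, name, buffer_limit=3):
        self.name = name
        self.index = {}      # key -> ("buffer" | "storage", position)
        self.buffer = []     # in-memory write buffer
        self.storage = []    # stands in for flash blocks
        self.buffer_limit = buffer_limit

    def local_write(self, key, value):
        """Write to the memory buffer and update the index; flush when full."""
        self.buffer.append((key, value))
        self.index[key] = ("buffer", len(self.buffer) - 1)
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        """Move buffered records to storage as one batch and repoint the index."""
        for key, value in self.buffer:
            self.storage.append((key, value))
            self.index[key] = ("storage", len(self.storage) - 1)
        self.buffer.clear()

    def read(self, key):
        """Follow the index pointer to wherever the latest copy lives."""
        where, pos = self.index[key]
        source = self.buffer if where == "buffer" else self.storage
        return source[pos][1]


def replicated_write(master, replica, key, value):
    """Acknowledge only after master and replica both hold the write in memory."""
    master.local_write(key, value)
    replica.local_write(key, value)  # synchronous: completes before the ack
    return "ok"
```

The point of the model is that the acknowledgement depends only on two memory writes, so storage latency stays off the critical path while the index always points at the newest copy.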

Page 27: Aerospike Hybrid Memory Architecture

Data Consistency

•  Written data should be immediately consistent within a cluster without introducing additional latency

•  Mixed workloads (true concurrent reads/writes) should not cause issues

•  Written data should be asynchronously written to remote clusters

Page 28: Aerospike Hybrid Memory Architecture

Data Consistency

OWNING SERVER REPLICA SERVER

Local Cluster

Remote Cluster

ASYNC REPLICATION

SYNCHRONOUS REPLICATION

XDR

WRITE

READ

Page 29: Aerospike Hybrid Memory Architecture

Storage

Page 30: Aerospike Hybrid Memory Architecture

Data Storage Layer – Hybrid Architecture

Page 31: Aerospike Hybrid Memory Architecture

Data in RAM

Data in RAM is very fast – at a price
■  Indexes and Data both in-memory
■  $$$ (great < 100G, Cloud)
■  More servers
■  Super fast
■  Optional HDD as backing store

Page 32: Aerospike Hybrid Memory Architecture

Data on Flash / SSD

■ Record data stored contiguously
■ 1 read per record
■ Automatic continuous defragmentation
■ Data written in flash optimal blocks
■ Automatic distribution across drives
■ Writes buffered

BLOCK INTERFACE

SSD SSD SSD

AEROSPIKE

HYBRID MEMORY SYSTEM™

Page 33: Aerospike Hybrid Memory Architecture

10x LOWER TCO

10x Fewer

Page 34: Aerospike Hybrid Memory Architecture

Indexes and Operations

Page 35: Aerospike Hybrid Memory Architecture

Indexes in DRAM, Data on SSD

•  Small amount of DRAM => avoid cost and server sprawl

•  No concept of cache misses => Predictable, low latency performance on NVMe/SSD

Page 36: Aerospike Hybrid Memory Architecture

Primary Index

■  Primary index
    ■ DHT of rbTrees (one per partition)
■  Index entry
    ■ 64 bytes
    ■ Write generation
    ■ Time To Live
    ■ Last Update Time
    ■ Storage address
    ■ Uses shared memory for Fast Restart
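
Because every index entry has a fixed size of 64 bytes, the DRAM needed for the primary index is simple arithmetic. The sketch below works that out; the replication factor of 2, the 10-node cluster, and the function names are assumptions for illustration, and the figures cover index entries only, not secondary indexes or other overhead.

```python
INDEX_ENTRY_BYTES = 64  # from the slide: each primary index entry is 64 bytes


def index_ram_bytes(records: int, replication_factor: int = 2) -> int:
    """Cluster-wide DRAM needed for primary index entries alone."""
    return records * replication_factor * INDEX_ENTRY_BYTES


def per_node_gib(records: int, nodes: int, replication_factor: int = 2) -> float:
    """Per-node share in GiB, assuming perfectly even distribution."""
    return index_ram_bytes(records, replication_factor) / nodes / 2**30


total = index_ram_bytes(1_000_000_000)     # 128_000_000_000 bytes
share = per_node_gib(1_000_000_000, 10)    # about 11.9 GiB per node
```

So one billion records kept in two copies need about 128 GB of index memory across the cluster, regardless of how large the records themselves are on flash.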

Page 37: Aerospike Hybrid Memory Architecture

Key Value operations using the Primary Index

■  Put ■  Exists ■  Get ■  CAS ■  Increment (counters) ■  Append/Prepend ■  List Operations ■  SortedMap Operations ■  Touch ■  Delete ■  Batch Read/Exists ■  Scan
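
Two of the operations above, CAS and Increment, rely on per-record state such as the write generation kept in the index entry. The toy store below illustrates the semantics only: a write that carries an expected generation succeeds just when nobody else has modified the record in the meantime. `ToyStore`, `GenerationError`, and the `expected_gen` parameter are hypothetical names and are not the Aerospike client API.

```python
class GenerationError(Exception):
    """Raised when the caller's expected generation is stale."""


class ToyStore:
    def __init__(self):
        self._records = {}  # key -> (generation, bins)

    def get(self, key):
        """Return (generation, copy of bins)."""
        gen, bins = self._records[key]
        return gen, dict(bins)

    def put(self, key, bins, expected_gen=None):
        """Write bins; with expected_gen set, succeed only on a generation match (CAS)."""
        gen, current = self._records.get(key, (0, {}))
        if expected_gen is not None and expected_gen != gen:
            raise GenerationError(f"expected {expected_gen}, found {gen}")
        self._records[key] = (gen + 1, {**current, **bins})
        return gen + 1

    def increment(self, key, bin_name, delta=1):
        """Add delta to a counter bin, creating it at 0 if absent."""
        gen, current = self._records.get(key, (0, {}))
        value = current.get(bin_name, 0) + delta
        self._records[key] = (gen + 1, {**current, bin_name: value})
        return value
```

A typical read-modify-write loop would `get` the record, compute the new bins, and `put` with the generation it read, retrying if `GenerationError` is raised.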

Page 38: Aerospike Hybrid Memory Architecture

Secondary Indexes

■  Bin (Column) indices
■  Declarative index
    ■ String, Integer, List, Map Keys, Map Values, GeoJSON
■  In RAM – fast
■  Multi-node
    ■ Co-located with primary index
■  Reference local data only
■  Index creation
    ■ Tools: AQL, ascli
    ■ Client API – developer only

Page 39: Aerospike Hybrid Memory Architecture

Queries on Secondary Indexes

A query is a value-based lookup using a secondary index, similar to a SQL SELECT statement. The query is sent to all nodes in the cluster in parallel.
■  Scatter-gather
■  Multi-threaded

Best for “low selectivity” indices
Good for “high selectivity” indices

Selectivity = Cardinality / Rows * 100
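
A short worked example of the selectivity formula, using made-up numbers: a bin with few distinct values relative to the row count (such as a country code) scores low, while a bin that is unique per row (such as an email address) scores 100. The function name and the sample figures are assumptions for illustration.

```python
def selectivity(cardinality: int, rows: int) -> float:
    """Selectivity (%) = distinct indexed values / total rows * 100."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    return cardinality / rows * 100


country = selectivity(200, 10_000_000)        # about 0.002 (low selectivity)
email = selectivity(10_000_000, 10_000_000)   # 100.0 (high selectivity)
```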

SECONDARY INDEX

PRIMARY INDEX

UDF UDF UDF

RECORD RECORD RECORD RECORD

SSD

SSD

DRAM

… … …

Page 40: Aerospike Hybrid Memory Architecture

Cross Datacenter Replication - XDR

Page 41: Aerospike Hybrid Memory Architecture

XDR Architecture

Each node in the cluster Distributed clusters

Page 42: Aerospike Hybrid Memory Architecture

XDR Topologies

Star Replication

Simple Active-Passive Simple Active-Active

More Complex Topology

Page 43: Aerospike Hybrid Memory Architecture

Failure Handling

Node failure within a cluster – nodes with replica data will continue.
Link failure – XDR keeps track of link failures and data to be shipped over that link. It will recover when the link comes up.

Node failure in a Cluster Link failure between Clusters
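
The link-failure behaviour can be pictured as a queue of writes waiting to be shipped: while the link is down the queue grows, and when the link returns it drains in order. The sketch below is a toy model of that idea only. `Shipper` and its methods are invented names, the remote cluster is a plain dictionary, and how XDR actually tracks pending data is not described in the slides.

```python
from collections import deque


class Shipper:
    """Toy XDR-style shipper: queue writes locally, ship when the link is up."""

    def __init__(self):
        self.link_up = True
        self.pending = deque()  # records not yet shipped
        self.remote = {}        # stands in for the remote cluster

    def on_local_write(self, key, value):
        """Record a local write for shipping and try to ship right away."""
        self.pending.append((key, value))
        self.drain()

    def drain(self):
        """Ship pending records in order while the link is up."""
        while self.link_up and self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value

    def set_link(self, up: bool):
        """Mark the link up or down; coming back up triggers catch-up."""
        self.link_up = up
        self.drain()
```

Local writes are never blocked by the remote side in this model, which matches the asynchronous nature of cross-datacenter replication described earlier.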

Page 44: Aerospike Hybrid Memory Architecture

Aerospike – Enabling Your Digital Transformation

Powered by High Performance NoSQL

Aerospike – The Next Generation Operational Database

TRUE HYBRID MEMORY ARCHITECTURE
•  No cache required – simpler architecture! Smaller Server Footprint
•  Patented Flash Optimization – Log structured File System
•  Record Oriented, Schema Free NoSQL KV Store

PREDICTABLE PERFORMANCE
•  True Real Time DB engine, multi threaded, massively parallel
•  DRAM or Hybrid DRAM/Flash for Persistence
•  Stable, Low Latency and high throughput under any condition
•  Deployable on Bare Metal, virtualized, containerized, or Cloud

DYNAMIC CLUSTER MANAGEMENT
•  Highest Uptime & Availability (5 nines plus), Scalable
•  Automatic DB Cluster formation, self healing and dynamic sharding
•  Cross Data Center Replication (XDR)

INTELLIGENT CLIENTS
•  Machine Learning
•  Broad language support (C/C++, Java, C#, Python, Go, Node.js, PHP)
•  Patented functionality, DB aware Clients, No load balancers required
•  Rich APIs - Accelerated development

TCO
•  Optimized for Flash and DRAM
•  Demonstrated 10:1 price performance savings
•  Up to 10x reduction in servers deployed
•  Huge operational efficiency – “Set it and Forget it”

$

Page 45: Aerospike Hybrid Memory Architecture

High Performance NoSQL Database

Powering New Opportunities at Scale

@aerospikedb

NEXT STEPS:
See how much you can save with Aerospike: http://www.aerospike.com/tco-calculator/
Ready to get started? http://www.aerospike.com/quick-start/
If you have any questions or want to further explore whether Aerospike is right for you, contact us: [email protected]