Introduction to Aerospike


Upload: aerospike-inc

Post on 15-Jan-2015


DESCRIPTION

What's the buzz about? What do some of the most experienced developers know about NoSQL that makes them select Aerospike over any other NoSQL database? Find the full webinar with audio here - http://www.aerospike.com/webinars This presentation reviews how real-time, big-data-driven applications are changing consumer expectations and enterprise requirements for operational databases that enable powerful and personalized customer experiences. We describe common use cases and typical customer deployments, and present an overview of Aerospike's hybrid in-memory (DRAM + Flash) and scale-out architecture.

TRANSCRIPT

Page 1: Introduction to Aerospike

© 2014 Aerospike. All rights reserved. Confidential 1

Aerospike: aer·o·spike [air-oh-spahyk], noun. 1. Tip of a rocket that enhances speed and stability

ROCKET ENGINE FOR CONTEXT DRIVEN APPS THAT PERSONALIZE THE INTERNET

BY YOUNG PAIK, DIRECTOR SALES ENGINEERING, AEROSPIKE

Page 2: Introduction to Aerospike


Aerospike NoSQL Database

Page 3: Introduction to Aerospike


AGENDA

1. How the game has changed, driving need for next-gen NoSQL

2. Who uses Aerospike and why

3. Architecture Overview

Page 4: Introduction to Aerospike


Internet Enterprises have changed the game…

Simple, Personalized, Instant

Complex, Standardized, Silo-ed

Page 5: Introduction to Aerospike


Consumers Expect and “Want it All”

1. Instant Response
■ “Every 100ms latency costs Amazon 1% in sales” – Greg Linden, Amazon
■ “An extra ½ sec in search page generation dropped traffic 20%” – Google (average 1.5 sec)
■ “A 1 sec delay can cause 7% decline in conversion” – Walmart

Page 6: Introduction to Aerospike


Consumers Expect and “Want it All”

1. Instant Response

2. Intuitive Service
■ Personalized & Consistent across channels
  ■ Mobile devices, tablet, car…
  ■ Web, mobile, social media…

Page 7: Introduction to Aerospike


Consumers Expect and “Want it All”

1. Instant Response

2. Intuitive Service
■ Personalized & Consistent across channels
  ■ Mobile devices, tablet, car…
  ■ Web, mobile, social media…
■ Seamless across the business
  ■ Marketing, sales, support…

Page 8: Introduction to Aerospike


Consumers Expect and “Want it All”

1. Instant Response

2. Intuitive Service

3. Always-On
■ How much does down-time cost?

Page 9: Introduction to Aerospike


Enterprises must “Deliver it All”

■ Use every swipe, search, share to delight - Instantly, Intuitively, Always-On

■ (DIY or SaaS)

CONTEXT

■ IDENTITY
  ■ SessionIDs, Cookies, DeviceIDs, ip-Addr
■ ATTRIBUTES
  ■ Demographic, geographic
■ BEHAVIOR (REAL-TIME)
  ■ Presence, swipe, search, share..
  ■ Channels – web, phone, in-store..
  ■ Services – frequency, sophistication
■ SEGMENTS (PRE-CALCULATED)
  ■ Attitudes, values, lifestyle, history..
■ TRANSACTIONS
  ■ Payments, campaigns

Page 10: Introduction to Aerospike


How Big is Real-time Big Context?

■ How many Objects?
  ■ # People * # Devices * # Browsers
  ■ People move around, cookies get cleared..
  ■ * 2x Replication
■ “100M people ≈ 2 Billion cookies” - eBay

# People              Per Profile      Size
10 M Customers      * 25 kb            250 GB
500 M Prospects     * 1 kb           + 500 GB
Real-time Context                    = 750 GB

CONTEXT

■ IDENTITY
  ■ SessionIDs, Cookies, DeviceIDs, ip-Addr
■ ATTRIBUTES
  ■ Demographic, geographic
■ BEHAVIOR (REAL-TIME)
  ■ Presence, swipe, search, share..
  ■ Channels – web, phone, in-store..
  ■ Services – frequency, sophistication
■ SEGMENTS (PRE-CALCULATED)
  ■ Attitudes, values, lifestyle, history..
■ TRANSACTIONS
  ■ Payments, campaigns
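
The sizing arithmetic on this slide can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is hypothetical, decimal units (1 kB = 1,000 bytes) are assumed, and applying the 2x replication factor to storage as well as to object counts is an assumption rather than something the slide states.

```python
# Back-of-the-envelope sizing using the figures from the slide.
# Assumption: decimal units (1 kB = 1,000 bytes, 1 GB = 1e9 bytes).
KB = 1_000
GB = 1_000_000_000


def profile_store_bytes(people: int, bytes_per_profile: int) -> int:
    """Raw size of one population: number of people times profile size."""
    return people * bytes_per_profile


customers = profile_store_bytes(10_000_000, 25 * KB)   # 10 M customers * 25 kB
prospects = profile_store_bytes(500_000_000, 1 * KB)   # 500 M prospects * 1 kB
total = customers + prospects

# Assumption: the 2x replication factor also doubles the stored bytes.
replicated = 2 * total

print(customers // GB, prospects // GB, total // GB, replicated // GB)
# prints: 250 500 750 1500
```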

Page 11: Introduction to Aerospike


Aerospike Database – Powering Context-Driven Apps

1. ROCKET ENGINE – In-Memory, Flash-optimized

2. WEB SCALE – Distributed, Shared nothing

3. ACID RELIABILITY – Immediate Consistency, High Availability

4. NoSQL FLEXIBILITY – Distributed Queries, Real-time Analytics

■ Apps that Personalize the Internet Instantly, Intuitively & Always-On

Page 12: Introduction to Aerospike


New Architecture for Context Computing

[Architecture diagram; labels recovered from the slide:]

■ MILLIONS OF CONSUMERS, BILLIONS OF DEVICES
■ APP SERVERS
■ AEROSPIKE CLUSTER
  ■ REAL-TIME CONTEXT (R / W)
    ■ IDENTITY: Cookies, device ID..
    ■ ATTRIBUTES: Age, Gender..
    ■ BEHAVIOR: Click, search…
    ■ SEGMENTS
  ■ REAL-TIME ANALYTICS
    ■ QUERIES & AGGREGATIONS: Risk scores, best sellers, trending now…
■ RDBMS
■ DATA WAREHOUSE
  ■ BATCH ANALYTICS: Calculate models, discover SEGMENTS, e.g. early adopter, bargain hunter, mass affluent…

Page 13: Introduction to Aerospike


Pioneered by Ad-Tech

■ AppNexus
■ eXelate
■ Federated Media
■ [x+1]

Page 14: Introduction to Aerospike


Powering the profile store for AppNexus (RTB)

■ “For the last three years, Aerospike’s database has been managing our vast volumes of user data. With Aerospike, we process many terabytes of data daily across our global data centers at a rate in excess of a million requests per second.” – Mike Nolet, CTO

■ “AppNexus* operates at massive scale while paying close attention to the economics of the platform. Aerospike’s flash optimizations running on top of Intel® SSDs have given us the price, performance, reliability, and serviceability we need to grow our business.” – Timothy G Smith, SVP Technical Operations

• 50 Billion Ad + 300 Billion Bid Requests/day for Microsoft Ad Exchange, Interactive Media (Deutsche Telekom), Collective…

• 6 Billion Mobile Ads/day for Millennial Media Exchange

• 100ms SLA from click to view

Page 15: Introduction to Aerospike


Powering the eXelate Data Exchange (DMP)

■ 200 publishers / marketers access real-time context on 700 Million Consumers
  ■ Demographics, purchase intent, behavioral propensities from online / offline sources, e.g. Nielsen, MasterCard Advisors, Bizo
■ Found SQL DBs an order of magnitude too expensive, considered several NoSQL DBs
■ 200 servers ingest 2 TB clickstream data per day to Aerospike and an analytics DWH
■ Models calculated in DWH, loaded into Aerospike:
  ■ 20 TB data, 60 Billion transactions per month
  ■ 50/50 balanced reads/writes
  ■ 12 node clusters, 7 SSDs/128GB DRAM per node
  ■ Data synchronized across clusters in 4 data centers
■ “Aerospike delivered on all of these requirements.” – Elad Efraim, CTO of eXelate

Page 16: Introduction to Aerospike


Powering [x+1] Origin Digital Marketing Hub (DMP + DSP)

■ Marketing Hub
  ■ Multi-channel analytics and personalization of messages across touchpoints
  ■ Integrated with leading CRM platforms
  ■ Deployed at many Fortune 500 companies
■ “It was a challenge to find an extremely high-performance, high availability database…
  ■ 4 TB of data, 2 Billion profiles
  ■ 5,000-10,000 attributes per profile analyzed in 4ms
  ■ 10 relevant recommendations suggested in 50ms - each time a visitor clicks on a website
  …providing fast reliable access to data in real-time is simple to say, but it’s not easy to do…. Aerospike has proven that our choice to buy, not build was the right decision.” – Patrick DeAngelis, CTO, [x+1]

Page 17: Introduction to Aerospike


Internet-scale Context Computing Platforms

■ RETAIL
■ E-COMMERCE
■ MOBILE
■ OMNICHANNEL
■ GAMING
■ WEB
■ VIDEO
■ SOCIAL
■ SEARCH
■ EMAIL

Page 18: Introduction to Aerospike


Context driven Apps - Use Cases

ADVERTISING
• REAL-TIME BIDDING
• DEMAND SIDE PLATFORM (DSP)
• DATA MGMT PLATFORM (DMP)
• SUPPLY SIDE PLATFORM (SSP)

MARKETING
• MULTI-SCREEN OFFERS
• MULTI-CHANNEL PERSONALIZATION
• REAL-TIME RECOMMENDATIONS
• ONE TIME COUPONS
• LOYALTY REWARDS
• DEALS NEAR YOU
• RELATED ITEMS

SALES
• PRODUCT AVAILABILITY
• DYNAMIC PRICING
• RISK SCORES
• FRAUD PREVENTION
• STREAM ANALYSIS

SUPPORT
• REAL-TIME DASHBOARDS
• PERSONAL FINANCE PORTFOLIOS
• REAL-TIME REPORTS

Page 19: Introduction to Aerospike


Next Gen NoSQL

Page 20: Introduction to Aerospike


Aerospike Database / Context Computing Platform

1. ROCKET ENGINE – In-Memory, Flash-optimized

2. WEB SCALE – Distributed, Shared nothing

3. ACID RELIABILITY – Immediate Consistency, High Availability

4. NoSQL EXTENSIBILITY – Distributed Queries, Real-time Analytics

■ Powering Context driven Apps that Personalize the Internet Instantly, Intuitively & Always-On!

Page 21: Introduction to Aerospike


ROCKET ENGINE

[Storage stack diagram; labels recovered from the slide: OTHER DATABASE; OS FILE SYSTEM; PAGE CACHE; BLOCK INTERFACE; HDD; SSD; OPEN NVM; FLASH OPTIMIZED IN-MEMORY DATABASE]

“Ask and I’ll tell you now.”

“Ask me. I’ll look up the answer and then tell it to you.”

AEROSPIKE HYBRID MEMORY SYSTEM™

• Indexes in DRAM
• Data in DRAM / SSD
• Balanced Reads & Writes
• Highly Parallelized
• Lock-free + ACID

Page 22: Introduction to Aerospike


Aerospike Certification Tool (ACT) for SSDs

■ Industry Standard Flash (SSD / PCI-E) Benchmark
■ Open Source Tool used by Flash Vendors to certify drives

Page 23: Introduction to Aerospike


10X Faster for Balanced Read/Write loads

HIGH THROUGHPUT

[Chart: throughput for Balanced and Read-Heavy workloads; axis 0 to 350,000; series: Aerospike, Cassandra, MongoDB, Couchbase 2.0*]

“*We were forced to exclude Couchbase...since when run with either disk or replica durability on it was unable to complete the test.” – Thumbtack Technology

LOW LATENCY

[Chart: Balanced Workload Read Latency; x-axis: Throughput, ops/sec (0 to 200,000); y-axis: Average Latency, ms (0 to 10); series: Aerospike, Cassandra, MongoDB]

[Chart: Balanced Workload Update Latency; x-axis: Throughput, ops/sec (0 to 200,000); y-axis: Average Latency, ms (0 to 16); series: Aerospike, Cassandra, MongoDB]

4 Node Cluster, each with:
■ CPU: 8 x Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
■ RAM: 31 GB
■ SSD: 4 x INTEL SSDSA2CW120G3, 120 GB (94 GB over-provisioned)
■ HDD: ST500NM0011, 500 GB, SATA III, 7200 RPM
■ OS: Ubuntu Server 12.04.1 64-bit (Linux kernel v.3.2.0)

Page 24: Introduction to Aerospike


Actual customer analysis: 99% < 1ms, 500K TPS, 10 TB Storage, 2x Replication

                                              DRAM & HDD           SSD & DRAM
                                              (OTHER DATABASES)
Storage /server                               180 GB (196 GB)      2.4 TB (4 x 700 GB)
TPS /server                                   500,000              500,000
Cost /server                                  $8,000               $11,000
Server costs                                  $1,488,000           $154,000
Power /server                                 0.9 kW               1.1 kW
Power (2 years), $0.12 per kWh ave. US        $352,000             $32,400
Maintenance (2 years), $3,600 /server         $670,000             $50,400
Total                                         $2,510,000           $236,800
SERVERS REQUIRED                              186                  ONLY 14
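
The totals in this comparison follow from the per-server figures, and a short Python sketch makes the arithmetic explicit. The function and variable names are hypothetical, and the sketch assumes two 365-day years of continuous operation at the slide's $0.12 per kWh; the slide's own figures appear to be rounded.

```python
# Two-year cost comparison rebuilt from the per-server figures on the slide.
# Assumptions: 2 x 365 days of continuous operation; names are illustrative.
HOURS_2_YEARS = 2 * 365 * 24        # 17,520 hours
PRICE_PER_KWH = 0.12                # slide: $0.12 per kWh, average US
MAINTENANCE_PER_SERVER = 3_600      # slide: $3,600 per server over 2 years


def two_year_cost(servers: int, cost_per_server: int, kw_per_server: float) -> dict:
    """Return hardware, power, maintenance and total cost in dollars."""
    hardware = servers * cost_per_server
    power = servers * kw_per_server * HOURS_2_YEARS * PRICE_PER_KWH
    maintenance = servers * MAINTENANCE_PER_SERVER
    return {
        "hardware": hardware,
        "power": power,
        "maintenance": maintenance,
        "total": hardware + power + maintenance,
    }


dram_hdd = two_year_cost(servers=186, cost_per_server=8_000, kw_per_server=0.9)
ssd_dram = two_year_cost(servers=14, cost_per_server=11_000, kw_per_server=1.1)

for name, cost in (("DRAM & HDD", dram_hdd), ("SSD & DRAM", ssd_dram)):
    print(name, {k: round(v) for k, v in cost.items()})
```

With these inputs the unrounded totals land within a few hundred dollars of the slide's $2,510,000 and $236,800.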

Page 25: Introduction to Aerospike


WEB SCALE

■ Distributed Hash Table with No Hotspots
  ■ Every key hashed with RIPEMD160 into a 20 byte (fixed length) string
  ■ Hash + additional (fixed 64 bytes) data stored in DRAM in the index
  ■ Some bits from hash value are used to calculate the Partition ID (4096 partitions)
  ■ Partition ID maps to Node ID in the cluster
■ 1 Hop to data
  ■ Client just calculates Partition ID, determines Node ID
  ■ No Load Balancers required
■ Shared Nothing architecture
  ■ Every node is identical

Example key: cookie-abcdefg-12345678
Example hash: 182023kh15hh3kahdjsh

Partition ID    Master Node ID    Replica Node ID
…               1                 4
1820            2                 3
1821            3                 2
4096            4                 1
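
The key-to-node lookup described on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not Aerospike's implementation: which 12 bits of the digest are used, the round-robin partition map and all function names are hypothetical, and partitions are numbered 0 to 4095 here. Some OpenSSL builds do not expose RIPEMD160 through `hashlib`, so the sketch falls back to SHA-1 (also 20 bytes) purely as a stand-in to stay runnable.

```python
import hashlib

N_PARTITIONS = 4096  # from the slide


def key_digest(key: str) -> bytes:
    """Return a 20-byte digest of the key (RIPEMD160 when available)."""
    data = key.encode("utf-8")
    try:
        return hashlib.new("ripemd160", data).digest()
    except ValueError:
        # Stand-in only: SHA-1 is NOT what the slide describes.
        return hashlib.sha1(data).digest()


def partition_id(digest: bytes) -> int:
    """Take 12 bits of the digest (2**12 == 4096) as the partition ID.

    Assumption: the low 12 bits of the first two bytes are used.
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def build_partition_map(node_ids: list) -> dict:
    """Hypothetical round-robin map: partition ID -> (master, replica)."""
    n = len(node_ids)
    return {
        p: (node_ids[p % n], node_ids[(p + 1) % n])
        for p in range(N_PARTITIONS)
    }


# The client computes the partition locally, then looks up the node: 1 hop.
partition_map = build_partition_map([1, 2, 3, 4])
digest = key_digest("cookie-abcdefg-12345678")
pid = partition_id(digest)
master, replica = partition_map[pid]
print(f"partition={pid} master={master} replica={replica}")
```

Because every client can run the same calculation against the same map, no load balancer or lookup service sits between the application and the node that owns the key.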

Page 26: Introduction to Aerospike


WEB SCALE with ACID RELIABILITY

1) No Hotspots – DHT with RIPEMD160 simplifies data partitioning

2) Smart Client – 1 hop to data, no load balancers

3) Shared Nothing Architecture, every node identical

4) Single row ACID – synch replication in cluster

5) Smart Cluster, Zero Touch – auto-failover, rebalancing, rolling upgrades..

6) Transactions and long running tasks prioritized real-time

7) XDR – asynch replication across data centers ensures Zero Downtime

Page 27: Introduction to Aerospike


XDR ensures Zero Downtime

■ Cross Data Center Replication (XDR) enables geographic redundancy and location proximity

■ Maximum flexibility
  ■ Replication set at namespace level
  ■ Active-Passive / Active-Active modes
  ■ Changes in one data center can be
    ■ replicated to multiple data centers
    ■ forwarded to another data center
  ■ Clusters can have different number of nodes
  ■ Automatic failure handling ensures continuity in spite of node failures

■ Super Storm Sandy 2012
  ■ Power outage, NYC Cluster down for 17 hours
  ■ Once power returned, XDR synched in 1 hour

“Aerospike allows us to handle business continuity and reliability across 4 data centers seamlessly.” - Elad Efraim, CTO

Page 28: Introduction to Aerospike


Architected for Context Computing

Patents pending, Written in ‘C’

1) ROCKET FAST 2) WEB SCALE 3) ACID RELIABILITY 4) NOSQL FLEXIBILITY

APP/WEB SERVER (APP SERVER)

■ APPLICATION
■ AEROSPIKE SMART CLIENT™
  • APIs (C, C#, Java, PHP, Python, Ruby, Erlang…)
  • Transactions, Cluster awareness

AEROSPIKE CLUSTER (AEROSPIKE SERVER)

■ EXTENSIBLE DATA MODEL
  • Str, Int, Lists, Maps
  • Lookups, Queries, Scans
  • User Defined Functions
  • Distributed Aggregations
■ AEROSPIKE SMART CLUSTER™
■ AEROSPIKE HYBRID MEMORY SYSTEM™
■ AEROSPIKE REAL-TIME ENGINE™
■ AEROSPIKE CROSS DATA CENTER REPLICATION™ (XDR)
■ MONITORING & MANAGEMENT
  • Aerospike Monitoring Console™
  • Command Line Tools
  • Plugins - Nagios, Graphite, Zabbix

Page 29: Introduction to Aerospike


NOSQL EXTENSIBILITY

■ Namespaces (policy containers)
  ■ Determine storage - DRAM or Flash
  ■ Determine replication factor
  ■ Contain records and sets
■ Sets (tables) of records
  ■ Arbitrary grouping
■ Records (rows)
  ■ Max 128k, contain key and bins
  ■ Bin with same name can contain values of different types
    ■ String, integer, bytes (raw, blob, etc)
    ■ list (an ordered collection of values)
    ■ map (a collection of keys and values)
  ■ Bins can be added anytime
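
The namespace / set / record / bin hierarchy above can be modelled in a few lines of plain Python. This is a toy in-memory illustration only: the class and method names are hypothetical, it does not use the Aerospike client API, and it does not enforce the 128k record limit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Record:
    """A row: a key plus named bins whose values may be of any type."""
    key: str
    bins: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Namespace:
    """A policy container: storage type and replication factor."""
    name: str
    storage: str                  # "DRAM" or "Flash"
    replication_factor: int
    # Sets are arbitrary groupings, so records are keyed by (set name, key).
    records: Dict[Tuple[str, str], Record] = field(default_factory=dict)

    def put(self, set_name: str, key: str, **bins: Any) -> Record:
        """Create the record if needed and add or overwrite the given bins."""
        record = self.records.setdefault((set_name, key), Record(key))
        record.bins.update(bins)
        return record


ns = Namespace("profiles", storage="Flash", replication_factor=2)

# Bins hold strings, integers, lists or maps.
ns.put("users", "cookie-abcdefg-12345678", age=34, segments=["early adopter"])
# Bins can be added at any time, with no schema change.
ns.put("users", "cookie-abcdefg-12345678", last_search={"q": "shoes"})
# The same bin name may hold a different type in another record.
ns.put("users", "cookie-zzzzzzz-00000000", age="unknown")

print(ns.records[("users", "cookie-abcdefg-12345678")].bins)
```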

Page 30: Introduction to Aerospike


Real-time Analytics on Operational Data

DISTRIBUTED QUERIES

1. “Scatter” requests to all nodes

2. Indexes in DRAM for fast mapping of secondary to primary keys

3. Indexes co-located with data to guarantee ACID, manage migrations

4. Records read in parallel from all SSDs using lock free concurrency control

5. Aggregate results on each node

6. “Gather” results from all nodes on client

STREAM AGGREGATIONS

7. Push Code / Security Policies / Rules to Data with UDFs

8. Pipe Query results through UDFs to Filter, Transform, Aggregate.. Map, Reduce

REAL-TIME ANALYTICS on OPERATIONAL DATA (No ETL)

■ In Database, within the same Cluster
■ On the same Data, on XDR Replicated Clusters
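
The scatter, per-node aggregate and gather steps listed above can be sketched with an in-memory stand-in for a cluster. Everything here is hypothetical illustration (the `NODES` data, the function names, the thread pool standing in for network calls); it shows the shape of the pattern, not Aerospike's query engine or UDF API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for three cluster nodes, each holding its own records.
NODES = [
    [{"segment": "early adopter", "spend": 120},
     {"segment": "bargain hunter", "spend": 15}],
    [{"segment": "early adopter", "spend": 80}],
    [{"segment": "mass affluent", "spend": 300},
     {"segment": "bargain hunter", "spend": 25}],
]


def aggregate_on_node(records: list, min_spend: int) -> Counter:
    """Runs next to the data: filter, then aggregate spend per segment."""
    totals = Counter()
    for record in records:
        if record["spend"] >= min_spend:                    # filter
            totals[record["segment"]] += record["spend"]    # aggregate
    return totals


def scatter_gather(nodes: list, min_spend: int) -> dict:
    """Scatter the request to every node, then reduce the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(
            pool.map(lambda recs: aggregate_on_node(recs, min_spend), nodes)
        )
    result = Counter()
    for partial in partials:   # gather on the client
        result.update(partial)
    return dict(result)


totals = scatter_gather(NODES, min_spend=20)
print(totals)
# prints: {'early adopter': 200, 'mass affluent': 300, 'bargain hunter': 25}
```

Only the small per-node totals cross the wire in this shape, which is the point of aggregating on each node before gathering.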

Page 31: Introduction to Aerospike


Aerospike Database / Context Computing Platform

1. ROCKET ENGINE – In-Memory, Flash-optimized

2. WEB SCALE – Distributed, Shared nothing

3. ACID RELIABILITY – Immediate Consistency, High Availability

4. NoSQL EXTENSIBILITY – Distributed Queries, Real-time Analytics

■ Powering Context driven Apps that Personalize the Internet Instantly, Intuitively & Always-On!

Page 32: Introduction to Aerospike


Recognized as the only Visionary in Gartner's Magic Quadrant for Operational Database Management Systems

Gartner, Magic Quadrant for Operational Database Management Systems, Donald Feinberg et al., October 23, 2013

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available at www.aerospike.com. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.