Aerospike AdTech Gets Hacked in Lower Manhattan


Upload: aerospike

Post on 15-Jan-2015

DESCRIPTION

Presentation slides on Aerospike's highly reliable and scalable database, which uses NoSQL and in-memory technology, given at Stack Exchange on April 10, 2014, alongside NSOne and advertising technology luminaries. Event: AdTech Gets Hacked in Lower Manhattan, Stack Exchange, 110 William St, 28th Floor, New York, NY 10038.

TRANSCRIPT

Page 1: Aerospike AdTech Gets Hacked in Lower Manhattan

© 2013 Aerospike, Inc. All rights reserved. Confidential. | <Title of Presentation> | 1

Aerospike (aer·o·spike) [air-oh-spahyk], noun: 1. tip of a rocket that enhances speed and stability

YOU SNOOZE YOU LOSE OR

HOW TO WIN IN AD TECH?

THE ONLY FLASH-OPTIMIZED DATABASE

BRIAN BULKOWSKI FOUNDER, CTO, PRODUCT

STACK EXCHANGE MEETUP

APRIL 10, 2014

Page 2: Aerospike AdTech Gets Hacked in Lower Manhattan

Aerospike: the gold standard for high throughput, low latency, high reliability transactions

Performance

• Over ten trillion transactions per month

• 99% of transactions faster than 2 ms

• 150K TPS per server

Scalability

• Billions of Internet users

• Clustered Software

• Automatic Data Rebalancing

Reliability

• 50 customers; zero service down-time

• Immediate Consistency

• Rapid Failover; Data Center Replication

Price/Performance

• Makes impossible projects affordable

• Flash-optimized

• 1/10 the servers required

Page 3: Aerospike AdTech Gets Hacked in Lower Manhattan

Aerospike Proven in Production

■  AppNexus - #2 RTB after Google
   ■  45 billion auctions per day
   ■  2M QPS
   ■  3 12-server clusters
   ■  4.8 TB Flash per server
   ■  120K read TPS, 60K write TPS

■  Chango - #2 Search after Google
   ■  Sees more searches than Yahoo! + Bing
   ■  Data on 300 million users

■  TradeDesk - first Ad Exchange
   ■  Facebook Exchange partner
   ■  FBX serves 25% of ads on the Internet

■  Snapdeal
   ■  2 servers replace 10 Mongo servers
   ■  10GB data
   ■  “changed our company”

“Aerospike has operated without interruptions and easily scaled to meet our performance demands.” – Mike Nolet, CTO, AppNexus

Page 4: Aerospike AdTech Gets Hacked in Lower Manhattan

TYPICAL REAL-TIME DATABASE DEPLOYMENT

[Diagram components: MILLIONS OF CONSUMERS, BILLIONS OF DEVICES; APP SERVERS; AEROSPIKE CLUSTER; RDBMS; DATA WAREHOUSE. Flow labels: TRANSACTIONS, WRITE CONTEXT, SEGMENTS.]

WRITE REAL-TIME CONTEXT, READ RECENT CONTENT

PROFILE STORE: Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms...

REAL-TIME ANALYTICS: Best sellers, top scores, trending tweets

BATCH ANALYTICS: Discover patterns, segment data: location patterns, audience affinity

Page 5: Aerospike AdTech Gets Hacked in Lower Manhattan

KEY CHALLENGES

1.  Handle extremely high rates of read/write transactions over persistent data

2.  Avoid hot spots to maintain tight latency SLAs

3.  Provide immediate consistency with replication

4.  Ensure long running tasks do not slow down transactions

5.  Scale linearly as data sizes and workloads increase

6.  Add capacity with no service interruption

Page 6: Aerospike AdTech Gets Hacked in Lower Manhattan

SYSTEM ARCHITECTURE FOR 100% UPTIME

Page 7: Aerospike AdTech Gets Hacked in Lower Manhattan

SHARED-NOTHING SYSTEM: 100% DATA AVAILABILITY

■  Every node in a cluster is identical and handles both transactions and long-running tasks

■  Data is replicated synchronously with immediate consistency within the cluster

■  Data is replicated asynchronously across data centers

OHIO Data Center

Page 8: Aerospike AdTech Gets Hacked in Lower Manhattan

ROBUST DHT TO ELIMINATE HOT SPOTS

How Data Is Distributed (Replication Factor 2)

■  Every key is hashed into a 20 byte (fixed length) string using the RIPEMD160 hash function

■  This hash + additional data (fixed 64 bytes) are stored in RAM in the index

■  Some bits from this hash value are used to compute the partition id

■  There are 4096 partitions

■  Partition id maps to node id based on cluster membership

Example: key cookie-abcdefg-12345678 hashes to 182023kh15hh3kahdjsh

Partition ID    Master node    Replica node
…               1              4
1820            2              3
1821            3              2
4096            4              1
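
The distribution scheme on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Aerospike's actual code: SHA-1 stands in for RIPEMD160 only because it is always available in `hashlib` and also yields a 20-byte digest; the choice of which digest bits form the partition id, and the rotation over a sorted node list used to pick master and replica, are hypothetical. The names `key_digest`, `partition_id`, and `owners` are invented for the example.

```python
import hashlib

N_PARTITIONS = 4096  # from the slide: there are 4096 partitions


def key_digest(key: str) -> bytes:
    """Hash a key to a fixed-length 20-byte digest.

    Stand-in: SHA-1 is used only because hashlib always provides it and
    it is also 20 bytes long; the slide names RIPEMD160.
    """
    return hashlib.sha1(key.encode("utf-8")).digest()


def partition_id(digest: bytes) -> int:
    """Derive a partition id from some bits of the digest.

    Assumption: take the low 12 bits of the first two bytes (2**12 == 4096).
    """
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def owners(pid: int, nodes: list, replication_factor: int = 2) -> list:
    """Map a partition id to [master, replica, ...] for the current membership.

    Assumption: a simple rotation over the sorted node list; a real
    cluster would recompute this map whenever membership changes.
    """
    ordered = sorted(nodes)
    start = pid % len(ordered)
    return [ordered[(start + i) % len(ordered)] for i in range(replication_factor)]
```

Calling `owners(partition_id(key_digest("cookie-abcdefg-12345678")), ["node1", "node2", "node3", "node4"])` returns two distinct nodes, the first acting as master and the second as replica. Because the hash spreads keys evenly over the partitions, no single node becomes a hot spot.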

Page 9: Aerospike AdTech Gets Hacked in Lower Manhattan

REAL-TIME PRIORITIZATION TO MEET SLA

Writing with Immediate Consistency

[Diagram: master and replica nodes]

1.  Write sent to row master
2.  Latch against simultaneous writes
3.  Apply write to master memory and replica memory synchronously
4.  Queue operations to disk
5.  Signal completed transaction (optional storage commit wait)
6.  Master applies conflict resolution policy (rollback/rollforward)

Adding a Node (transactions continue)

1.  Cluster discovers new node via gossip protocol
2.  Paxos vote determines new data organization
3.  Partition migrations scheduled
4.  When a partition migration starts, write journal starts on destination
5.  Partition moves atomically
6.  Journal is applied and source data deleted
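
Steps 2 through 5 of the write path can be sketched as follows. This is a toy, single-process model under stated assumptions, not Aerospike's implementation: `Node` and `PartitionWriter` are hypothetical names, a `threading.Lock` per key stands in for the latch, and a `queue.Queue` stands in for the deferred disk writes. Conflict resolution (step 6) is not modelled.

```python
import queue
import threading


class Node:
    """A toy node with an in-memory store and a queue of pending disk writes."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk_queue = queue.Queue()

    def apply(self, key, value):
        self.memory[key] = value            # step 3: apply to memory
        self.disk_queue.put((key, value))   # step 4: queue the disk operation


class PartitionWriter:
    """Coordinates a write on the master and its replica."""

    def __init__(self, master, replica):
        self.master = master
        self.replica = replica
        self._latches = {}
        self._guard = threading.Lock()

    def _latch_for(self, key):
        # One latch per key so unrelated keys do not block each other.
        with self._guard:
            return self._latches.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._latch_for(key):              # step 2: latch the row
            self.master.apply(key, value)       # step 3: master memory
            self.replica.apply(key, value)      # step 3: replica memory, synchronously
        return True                             # step 5: signal completion
```

The write is acknowledged only after both in-memory copies are updated, which is what makes a subsequent read from either copy see the new value; the disk write happens later from the queue.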

Page 10: Aerospike AdTech Gets Hacked in Lower Manhattan

INTELLIGENT CLIENT TO MAKE APPS SIMPLER

■  Implements Aerospike API
   ■  Optimistic row locking
   ■  Optimized binary protocol

■  Cluster tracking
   ■  Learns about cluster changes, partition map
   ■  Gossip protocol

■  Transaction semantics
   ■  Global transaction ID
   ■  Retransmit and timeout

■  Linear scale
   ■  No extra hop
   ■  No load balancers
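
The "no extra hop" idea can be illustrated with a small sketch: the client caches the partition map, sends each request straight to the owning node, and refreshes the map and retransmits if that node fails. Everything here is hypothetical and for illustration only: `SmartClient`, `NodeDown`, and the two injected callables (`fetch_partition_map`, `send`) are invented names, not part of any Aerospike client library.

```python
class NodeDown(Exception):
    """Raised by a (hypothetical) transport when a node cannot serve a request."""


class SmartClient:
    def __init__(self, fetch_partition_map, send, max_retries=2):
        # fetch_partition_map() -> dict mapping partition id -> master node name
        # send(node, partition_id, payload) -> response, or raises NodeDown
        self._fetch_map = fetch_partition_map
        self._send = send
        self._max_retries = max_retries
        self._partition_map = fetch_partition_map()

    def request(self, partition_id, payload):
        for _ in range(self._max_retries + 1):
            node = self._partition_map[partition_id]
            try:
                # Direct to the owning node: no proxy, no load balancer.
                return self._send(node, partition_id, payload)
            except NodeDown:
                # The cluster changed; learn the new map and retransmit.
                self._partition_map = self._fetch_map()
        raise TimeoutError("request failed after retries")
```

Because routing happens in the client, adding nodes adds capacity without adding a middle tier that would itself need scaling.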

Page 11: Aerospike AdTech Gets Hacked in Lower Manhattan

FLASH OPTIMIZED HIGH PERFORMANCE

[Diagram: storage stacks compared.
 OTHER DATABASE: OS FILE SYSTEM, PAGE CACHE, BLOCK INTERFACE, then SSD / HDD.
 AEROSPIKE FLASH OPTIMIZED IN-MEMORY DATABASE: BLOCK INTERFACE to SSDs, or OPEN NVM to SSD.]

“Ask me and I’ll tell you the answer.”

“Ask me. I’ll look up the answer and then tell it to you.”

AEROSPIKE HYBRID MEMORY SYSTEM™

•  Direct device access
•  Large Block Writes
•  Indexes in DRAM
•  Highly Parallelized
•  Log-structured FS “copy-on-write”
•  Fast restart with shared memory

Page 12: Aerospike AdTech Gets Hacked in Lower Manhattan

THE BOTTOM LINE

Actual customer analysis. Customer requires 500K TPS, 10 TB of storage, with 2x replication factor.

                                            OTHER DATABASES
Storage type                                DRAM & NoSQL              SSD & DRAM
Servers required                            186                       Only 14
Storage per server                          180 GB (196 GB server)    2.4 TB (4 x 700 GB)
TPS per server                              500,000                   500,000
Cost per server                             $8,000                    $11,000
Server costs                                $1,488,000                $154,000
Power per server                            0.9 kW                    1.1 kW
Power (2 years, $0.12 per kWh, avg. US)     $352,000                  $32,400
Maintenance (2 years, $3,600 per server)    $670,000                  $50,400
Total                                       $2,510,000                $236,800
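
The cost rows in this comparison follow from the per-server figures, and the arithmetic can be checked with a short script. The server counts, prices, power draw, electricity rate, and maintenance charge are taken from the slide; the 8,760 hours per year and the function name `two_year_cost` are assumptions made for this sketch.

```python
HOURS_PER_YEAR = 8760          # assumption: 365 days x 24 hours
YEARS = 2                      # from the slide
PRICE_PER_KWH = 0.12           # from the slide, average US price in dollars
MAINTENANCE_PER_SERVER = 3600  # from the slide, dollars per server over 2 years


def two_year_cost(servers, cost_per_server, kw_per_server):
    """Return the hardware, power, maintenance, and total cost in dollars."""
    hardware = servers * cost_per_server
    power = servers * kw_per_server * HOURS_PER_YEAR * YEARS * PRICE_PER_KWH
    maintenance = servers * MAINTENANCE_PER_SERVER
    return {
        "hardware": hardware,
        "power": power,
        "maintenance": maintenance,
        "total": hardware + power + maintenance,
    }


dram = two_year_cost(servers=186, cost_per_server=8000, kw_per_server=0.9)
flash = two_year_cost(servers=14, cost_per_server=11000, kw_per_server=1.1)

for label, cost in (("DRAM & NoSQL", dram), ("SSD & DRAM", flash)):
    print(label, {name: round(value) for name, value in cost.items()})
```

Rounded as on the slide, the results match the table: the totals come to roughly $2.51M against $237K, about a tenfold difference driven almost entirely by the number of servers.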

Page 13: Aerospike AdTech Gets Hacked in Lower Manhattan

Only up in 2013, 2014

Everyone wants that “Facebook architecture”

Facebook and Apple bought at least $200M in FusionIO cards in 2012

Page 14: Aerospike AdTech Gets Hacked in Lower Manhattan

Measure your drives!

Aerospike Certification Tool (ACT)
http://github.com/aerospike/act

Transactional database workload

Reads: 1.5KB (can’t batch / cache reads, random)

Writes: 128K blocks (log based layout) (plus defragmentation)

Turn up the load until latency is over required SLA
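
The "turn up the load" procedure amounts to a simple ramp: step through increasing load levels and keep the last one whose measured latency still meets the SLA. The sketch below shows only that loop. It does not call ACT or reproduce its interface; `max_load_within_sla` is a hypothetical helper, and `toy_drive` is a made-up latency curve standing in for a real measurement.

```python
from typing import Callable, Iterable, Optional


def max_load_within_sla(measure_latency_ms: Callable[[int], float],
                        sla_ms: float,
                        loads: Iterable[int]) -> Optional[int]:
    """Return the highest load (ops/sec) tested before latency exceeded the SLA.

    Loads must be given in increasing order. Returns None if even the
    first load level misses the SLA.
    """
    best = None
    for load in loads:
        if measure_latency_ms(load) > sla_ms:
            break  # latency is over the required SLA: stop ramping
        best = load
    return best


def toy_drive(load: int) -> float:
    """Made-up latency model: flat up to 60k ops/sec, then climbing steeply."""
    if load <= 60_000:
        return 0.5
    return 0.5 + (load - 60_000) / 5_000
```

In practice the measurement at each step would be a sustained run against the actual drive with the read and write sizes listed above, since flash latency often degrades only after the device has been under load for some time.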

Page 15: Aerospike AdTech Gets Hacked in Lower Manhattan

Replication that Works

➤  Superstorm Sandy, 2012
   §  NYC down for 17 hours
   §  Back up and synced in 1 hour via Aerospike Cross-Data Center Replication (XDR)

“Aerospike allows us to handle business continuity and reliability across 4 data centers seamlessly. And we can now expand our deployment to new data centers in less than a week.” - Elad Efraim, CTO

Page 16: Aerospike AdTech Gets Hacked in Lower Manhattan

HOT ANALYTICS BY ROW

Page 17: Aerospike AdTech Gets Hacked in Lower Manhattan

NOSQL EXTENSIBILITY

➤  Namespaces (policy containers)
   §  Determine storage - DRAM or Flash
   §  Determine replication factor
   §  Contain records and sets

➤  Sets (tables) of records
   §  Arbitrary grouping

➤  Records (rows)
   §  Max 128k, contain key and bins
   §  Bin with same name can contain values of different types
      •  String, integer, bytes (raw, blob, etc)
      •  list (an ordered collection of values)
      •  map (a collection of keys and values)
   §  Bins can be added anytime
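
The namespace / set / record / bin hierarchy can be modelled with plain Python data classes. This is a conceptual sketch only, not the Aerospike client API: `Namespace`, `Record`, and `put` are hypothetical names, and the 128k record size limit is not enforced here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Record:
    """A row: a key plus named bins whose values may be of any supported type."""
    key: str
    bins: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Namespace:
    """A policy container: storage type and replication factor apply to all sets."""
    name: str
    storage: str             # "DRAM" or "Flash"
    replication_factor: int
    sets: Dict[str, Dict[str, Record]] = field(default_factory=dict)

    def put(self, set_name: str, key: str, **bins: Any) -> Record:
        """Create the set and record on demand, then add or overwrite bins."""
        records = self.sets.setdefault(set_name, {})
        record = records.setdefault(key, Record(key))
        record.bins.update(bins)
        return record


ns = Namespace("profiles", storage="Flash", replication_factor=2)
ns.put("users", "u1", segment="sports")   # bin "segment" holds a string here
ns.put("users", "u2", segment=42)         # same bin name, different type
ns.put("users", "u1", clicks=[1, 2, 3])   # a new bin added later
```

There is no schema to migrate: a bin exists on a record as soon as it is written, and two records in the same set need not share bin names or types.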

Page 18: Aerospike AdTech Gets Hacked in Lower Manhattan

Real-time Analytics on Operational Data

DISTRIBUTED QUERIES

1.  “Scatter” requests to all nodes
2.  Indexes in DRAM for fast map of secondary → primary keys
3.  Indexes co-located with data to guarantee ACID, manage migrations
4.  Records read in parallel from all SSDs using lock free concurrency control
5.  Aggregate results on each node
6.  “Gather” results from all nodes on client

STREAM AGGREGATIONS

1.  Push Code/Security Policies/Rules to Data with UDFs
2.  Pipe Query results through UDFs to Filter, Transform, Aggregate... Map, Reduce

REAL-TIME ANALYTICS on OPERATIONAL DATA (No ETL)

➤  In Database, within the same Cluster
➤  On the same Data, on XDR Replicated Clusters
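
The scatter, local-aggregate, and gather steps can be sketched in a single process, with threads standing in for cluster nodes. This is an illustration under stated assumptions rather than the real query engine: `QueryNode` and `scatter_gather` are hypothetical names, the secondary index is a plain dictionary, and the per-record function passed in plays the role of a UDF.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class QueryNode:
    """A toy node holding records plus a local secondary index on one bin."""

    def __init__(self, records, indexed_bin):
        self.records = records   # primary key -> bins dict
        self.index = {}          # secondary value -> list of primary keys
        for pk, bins in records.items():
            self.index.setdefault(bins[indexed_bin], []).append(pk)

    def query(self, value, map_fn):
        """Resolve secondary -> primary keys locally, then aggregate on the node."""
        partial = Counter()
        for pk in self.index.get(value, []):
            partial.update(map_fn(self.records[pk]))
        return partial


def scatter_gather(nodes, value, map_fn):
    """Scatter the query to every node in parallel, then reduce on the client."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda node: node.query(value, map_fn), nodes))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


nodes = [
    QueryNode({"a": {"city": "NYC", "segment": "sports"},
               "b": {"city": "SF", "segment": "news"}}, "city"),
    QueryNode({"c": {"city": "NYC", "segment": "sports"},
               "d": {"city": "NYC", "segment": "news"}}, "city"),
]
result = scatter_gather(nodes, "NYC", lambda bins: {bins["segment"]: 1})
```

Only the small per-node partial results cross the network; the records themselves never leave the node that stores them, which is the point of pushing the aggregation to the data.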

Page 19: Aerospike AdTech Gets Hacked in Lower Manhattan

LESSONS LEARNED

Page 20: Aerospike AdTech Gets Hacked in Lower Manhattan

NATIVE FLASH → PERFORMANCE

■  Low Latency at High Throughput

[Four charts comparing Aerospike, Cassandra, and MongoDB. Each plots average latency (ms) against throughput (ops/sec).]

   Balanced Workload Read Latency (Full view): axes 0 to 10 ms, 0 to 200,000 ops/sec
   Balanced Workload Update Latency (Full view): axes 0 to 16 ms, 0 to 200,000 ops/sec
   Read-Heavy Workload Read Latency (Full view): axes 0 to 4 ms, 0 to 300,000 ops/sec
   Read-Heavy Workload Update Latency (Full view): axes 0 to 24 ms, 0 to 300,000 ops/sec

Page 21: Aerospike AdTech Gets Hacked in Lower Manhattan

LESSONS

1.  Keep architecture simple
    ■  No hot spots (e.g., robust DHT)
    ■  Scales up easily (e.g., easy to size)
    ■  Avoids points of failure (e.g., single node type)

2.  Avoid manual operation: automate, automate!
    ■  Self-managed cluster responds to node failures
    ■  Data rebalancing requires no intervention
    ■  Real-time prioritization allows unattended system operation

3.  Keep system asynchronous
    ■  Shared nothing: nodes are autonomous
    ■  Async writes across data centers
    ■  Independent tuning parameters for different classes of tasks

Page 22: Aerospike AdTech Gets Hacked in Lower Manhattan

LESSONS (cont’d)

4.  Monitor the health of the system extensively
    ■  Growth in load sneaks up on you over weeks
    ■  Early detection means better service
    ■  Most failures can be predicted (e.g., capacity, load, …)

5.  Size clusters properly
    ■  Have enough capacity ALWAYS!
    ■  Upgrade SSDs every couple of years
    ■  Reduce cluster sizes to make operations simple

6.  Have geographically distributed data centers
    ■  Size the distributed data centers properly
    ■  Use active-active configurations if possible
    ■  Size bandwidth requirements accurately

Page 23: Aerospike AdTech Gets Hacked in Lower Manhattan

LESSONS (cont’d)

7.  Have a plan for unforeseen situations
    ■  Devise scenarios and practice during normal work time
    ■  Ensure you can do rolling upgrades during high load time
    ■  Make sure that your nodes can restart fast (< 1 minute)

8.  Constantly test and monitor app end-to-end
    ■  Application level metrics are more important than DB metrics
    ■  Most issues in a service are due to a combination of application, network, database, storage, etc.

9.  Separate online and offline workloads
    ■  Reserve real-time edge database for transactions and hot analytics queries (where newest data is important)
    ■  Avoid ad-hoc queries on on-line system
    ■  Perform deep analysis in offline system (Hadoop)

10.  Use the right data management system for the job
    ■  Fast NoSQL DB for real-time transactions and hot analytics on rapidly changing data
    ■  Hadoop or other comparable systems for exhaustive analytics on mostly read-only data

Page 24: Aerospike AdTech Gets Hacked in Lower Manhattan

AEROSPIKE REAL-TIME BIG DATA PLATFORM

Rapid Development, Complete Customizability

➤  Support for popular languages and tools
   §  ASQL and Aerospike Client in Java, C#, Ruby, Python...

➤  Complex data types
   §  Nested documents (map, list, string, integer)
   §  Large (Stack, Set, List) Objects

➤  Queries
   §  Single record
   §  Batch multi-record lookups
   §  Equality and range
   §  Aggregations and MapReduce

➤  User Defined Functions (UDFs)
   §  In-DB processing

➤  Aggregation Framework
   §  UDF Pipeline
   §  MapReduce ++

➤  Time Series Queries
   §  Just 2 IOPs for most r/w, independent of object size

Page 25: Aerospike AdTech Gets Hacked in Lower Manhattan

HOW TO GET AEROSPIKE?

Free Community Edition

➤  For developers looking for speed and stability who want to scale transparently as they grow
➤  No transaction limits
➤  No time limit
➤  No production limit
➤  Data per cluster limit
➤  Community support

Enterprise Edition

➤  For mission critical apps needing to scale right from the start
   §  Unlimited number of nodes, clusters, data centers
   §  Cross data center replication
   §  Premium 24x7 support
   §  Priced by TBs of unique data (not replicas)

Page 26: Aerospike AdTech Gets Hacked in Lower Manhattan

QUESTIONS?

[email protected]

www.aerospike.com
