Leveraging Big Data with Hadoop, NoSQL and RDBMS April 7, 2016 Brian Bulkowski

Upload: aerospike-inc

Post on 23-Jan-2017

TRANSCRIPT

Page 1: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Leveraging Big Data with Hadoop, NoSQL and RDBMS

April 7, 2016
Brian Bulkowski

Page 2: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Proprietary & Confidential | © 2015 Aerospike Inc. All rights reserved.

Sustainable Differentiators

SPEED AT SCALE
• Index in RAM, Data in Flash: Persistence
• Key-value store
• Open Source

SMART CLIENTS
• Ease of use for programmers
• Patented functionality that accelerates development by shielding developers from low level programming

CLUSTERING
• Availability
• Patented clustering algorithms that solve the hardest problems of distributed computing

TCO
• Optimized for Flash
• Demonstrated 10 to 1 price performance savings
• Battle tested, almost no downtime

Page 3: Leveraging Big Data with Hadoop, NoSQL and RDBMS

10X Faster or 10X Fewer

Page 4: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Polyglot Architecture

• Transactions (Compliance, Legacy): Oracle, DB/2, SQLServer, MariaDB, Postgres
• Operational In-memory (Fast, Scalable, HA): Aerospike, Cassandra with Cache, Coherence
• Operational Analytics
• Relational Ad-Hoc: Exadata, Aster, Teradata
• Hadoop Big Data: Cloudera, Hortonworks, MapR
• Column: Vertica, RedShift, SnowFlake, Oracle
• Graph: Neo4J, Objectivity, ...
• Streaming: Spark, Heron
• In-memory SQL: VoltDB, MemSQL, …

… fast analytics but still analytics

App server architecture

Polyglot Analytics systems (multiple are required for performance reasons: data layout matters)

Operational In-memory: Read, Write

Page 5: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Old Style Architecture Has Significant Limitations

Challenges
• Complexity
• Maintainability
• Durability
• Consistency
• Scalability
• Cost ($)
• Data Lag

Diagram labels: Caching Layer; Operational Database; Real-time Consumer Facing; Pricing/Inventory/Billing; Real-time Decisioning; Streaming Data; Legacy Database (Mainframe); RDBMS Database; Transactional Systems; Enterprise Environment; Legacy RDBMS; HDFS Based

Page 6: Leveraging Big Data with Hadoop, NoSQL and RDBMS

High Performance NoSQL Enables Real-time Applications

Diagram labels: Real-time Consumer Facing; Pricing/Inventory/Billing; Real-time Decisioning; Streaming Data; Aerospike Connectors; Legacy Database (Mainframe); RDBMS Database; Transactional Systems; Enterprise Environment; XDR; Aerospike; Legacy RDBMS; HDFS Based; Next Generation Operational Database

500 business transactions/sec × 2,500 reads/writes/calculations per business transaction = 1.25M server transactions/sec
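The throughput figure on this slide is a straight multiplication, and a minimal sketch makes the arithmetic easy to check or re-run with other inputs. The two input numbers come from the slide; the function and variable names are illustrative only.

```python
# Inputs taken from the slide; names are illustrative, not from any Aerospike API.
BUSINESS_TX_PER_SEC = 500
OPS_PER_BUSINESS_TX = 2_500  # reads/writes/calculations per business transaction


def server_ops_per_sec(business_tx_per_sec: int, ops_per_tx: int) -> int:
    """Database-level operations per second implied by the business load."""
    return business_tx_per_sec * ops_per_tx


total = server_ops_per_sec(BUSINESS_TX_PER_SEC, OPS_PER_BUSINESS_TX)
print(f"{total:,} server transactions/sec")  # 1,250,000 server transactions/sec
```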

Speed at Scale, Predictable Performance, Highest Availability & Lowest TCO

Page 7: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Business Challenge
• Meet SLAs of 750 ms per business transaction
• Differentiate between fraudulent and legitimate orders in real time
• Support next-generation Machine Learning
• Stop loss of business due to latency
• Support hundreds of DB reads/writes per credit card transaction
• Increase operational data 10x

Prevent Only Fraudulent Transactions

Page 8: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Selected Aerospike in-Memory NoSQL
• Built for Flash
• Predictable low latency at high throughput
• Immediate consistency, no data loss
• Cross data center (XDR) support
• 20 Server Cluster
• Dell 730xd w/ 4 NVMe SSDs

Prevent Only Fraudulent Transactions

Diagram labels: Credit Card Processing System; Fraud Detection & Protection App; Rules (Rule 1, Rule 2, Rule 3); Historical Data; Rule results (Rule 1: Passed, Rule 2: Passed, Rule 3: Failed); Account Behavior; Static Data; Account Statistics
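The diagram suggests a fraud application that looks up account data and runs each order through a list of rules, recording a pass or fail per rule. The sketch below is one way that flow could look, not the system described in the talk. The three rules, their thresholds, the record fields, and the in-memory `ACCOUNT_STORE` dictionary standing in for the operational database are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for the operational key-value store: account id -> profile.
ACCOUNT_STORE = {
    "acct-1": {"avg_order": 40.0, "home_country": "US", "orders_last_hour": 9},
}


@dataclass(frozen=True)
class Order:
    account_id: str
    amount: float
    country: str


# Each rule returns True when the order passes. Thresholds are made up.
def amount_in_range(order: Order, profile: dict) -> bool:
    return order.amount <= 10 * profile["avg_order"]


def country_matches(order: Order, profile: dict) -> bool:
    return order.country == profile["home_country"]


def velocity_ok(order: Order, profile: dict) -> bool:
    return profile["orders_last_hour"] < 5


RULES: list[tuple[str, Callable[[Order, dict], bool]]] = [
    ("Rule 1", amount_in_range),
    ("Rule 2", country_matches),
    ("Rule 3", velocity_ok),
]


def evaluate(order: Order) -> dict[str, str]:
    """Read the account profile once, then record Passed/Failed per rule."""
    profile = ACCOUNT_STORE[order.account_id]
    return {name: "Passed" if rule(order, profile) else "Failed"
            for name, rule in RULES}


def is_legitimate(results: dict[str, str]) -> bool:
    return all(outcome == "Passed" for outcome in results.values())


results = evaluate(Order("acct-1", 35.0, "US"))
print(results, is_legitimate(results))
```

With the sample data, the first two rules pass and the velocity rule fails, mirroring the outcome shown in the diagram. In a real deployment each profile read would be a database lookup, which is why the slide stresses hundreds of reads and writes per card transaction.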

Page 9: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Intra-day System of Record

Challenge
• DB2 (RDBMS) stores positions for 10 million customers
• Must update stock prices, show balances on 300 positions, process 250M transactions, 2M updates/day
• Risk, System of Engagement and Mobile
• Data inconsistencies, long restarts (1 hour), growing number of servers

Page 10: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Intra-day System of Record

Selected Aerospike in-Memory NoSQL
• Built for Flash
• Predictable low latency at high throughput
• Immediate consistency, no data loss
• Hot standby implementation for extra redundancy
• Cross data center (XDR) support
• 10 Server Cluster

Diagram labels: IBM DB2 (Mainframe); Real-Time App; Record App; Finance App; Real-Time Data Feed; Start of the Day Data Loading; End of Day Reconciliation; Account Positions
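The labels above point to a daily cycle: positions are loaded from the mainframe at the start of the day, updated from a real-time feed during the day, and reconciled back at the end of the day. A minimal sketch of that cycle follows, assuming plain dictionaries in place of both DB2 and the intra-day store. The function names and the account/position shape are hypothetical.

```python
def start_of_day_load(system_of_record: dict[str, int]) -> dict[str, int]:
    """Copy positions from the system of record into the intra-day store."""
    return dict(system_of_record)


def apply_feed(intraday: dict[str, int], updates: list[tuple[str, int]]) -> None:
    """Apply (account, delta) updates from the real-time feed in arrival order."""
    for account, delta in updates:
        intraday[account] = intraday.get(account, 0) + delta


def end_of_day_reconcile(system_of_record: dict[str, int],
                         intraday: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Write changed positions back and return {account: (old, new)}."""
    changed = {}
    for account, position in intraday.items():
        old = system_of_record.get(account, 0)
        if old != position:
            changed[account] = (old, position)
            system_of_record[account] = position
    return changed


db2 = {"A": 100, "B": 50}                        # stand-in for the mainframe
store = start_of_day_load(db2)
apply_feed(store, [("A", -10), ("C", 5)])
print(end_of_day_reconcile(db2, store))          # accounts A and C changed
```

Keeping the intra-day writes out of the mainframe until reconciliation is the design point: the system of record sees one batch of changes per day instead of every feed update.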

Page 11: Leveraging Big Data with Hadoop, NoSQL and RDBMS

AdTech – Predictive Analytics at Scale

Challenge
• Low read latency (milliseconds)
• 100K to 5M operations/second
• Ensure 100% uptime
• Provide global data replication

Performance achieved
• 1 to 6 billion cookies tracked
• 5.0M auctions per second
• 100 ms ad rendering, 50 ms real-time bidding, 1 ms database access
• 1.5 KB median object size

Selected Aerospike NoSQL over competition
• 10X fewer nodes
• 10X better TCO
• 20X better read latency
• High throughput at low latency
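The latency figures above imply a budget: database reads have to fit inside the 50 ms bidding window alongside everything else the bidder does. The sketch below estimates how many sequential reads fit. The 50 ms and 1 ms values come from the slide; the 30 ms reserved for network and bid computation is a made-up assumption, as is the 20 ms comparison latency.

```python
import math

RTB_WINDOW_MS = 50.0    # real-time bidding window, from the slide
DB_ACCESS_MS = 1.0      # database access time, from the slide
NON_DB_WORK_MS = 30.0   # hypothetical: network hops, model scoring, serialization


def max_sequential_lookups(window_ms: float, non_db_ms: float,
                           db_access_ms: float) -> int:
    """Number of back-to-back database reads that fit in the remaining budget."""
    remaining = window_ms - non_db_ms
    if remaining <= 0:
        return 0
    return math.floor(remaining / db_access_ms)


print(max_sequential_lookups(RTB_WINDOW_MS, NON_DB_WORK_MS, DB_ACCESS_MS))
# Hypothetical comparison: a store with 20 ms reads under the same budget.
print(max_sequential_lookups(RTB_WINDOW_MS, NON_DB_WORK_MS, 20.0))
```

Under these assumptions a 1 ms read leaves room for 20 sequential lookups per bid, while a 20 ms read leaves room for one.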

Diagram labels: Ad is Displayed; Publishers; Ad Networks & SSPs; Ad Exchanges; Demand Side Platform; Data Management Platforms; Brands; Agencies; Buyers; timeline from 0 ms to 100 ms

Page 12: Leveraging Big Data with Hadoop, NoSQL and RDBMS

Telco – Real-Time Billing and Charging Systems

Challenge
• Edge access to regulate traffic
• Accessible using provisioning applications (self-serve and through support personnel)

Need for extremely high availability, reliability, low latency
• > TBs of data
• 10-100M objects
• 10-200K TPS

Selected Aerospike in-Memory NoSQL
• Clustered system
• Predictable low latency at high throughput
• Highly available and reliable on failure
• Cross data center (XDR) support

Diagram labels: Source Device/User; Destination; Real-Time Auth., QoS, Billing; Request; Execute Request; Real-Time Checks; Config Module; App; Update Device User Setting; Hot-Standby; XDR