o'reilly webinar: simplicity scales - big data

SIMPLICITY SCALES

Big data application management & operations

WHOAMI?

Tyler Hannan

Director, Technical Marketing

tyler@basho.com

@tylerhannan

SQL NoSQL

SQL NoSQL NewSQL

CHANGE IN ARCHITECTURAL DESIGN

App App App App

Virtualization

Server

Aggregation

Server Server Server Server

SMALL APPS, BIG SERVERS

ONE LOCATION

BIG APPS, COMMODITY SERVERS

MANY LOCATIONS

In 2014, 20% of enterprise data projects add distributed processes into production

THE BENEFITS OF RIAK

Riak is an operationally friendly database that is:

• Fault-tolerant

• Highly-available

• Scalable

• Self-healing

THE PROPERTIES OF A DISTRIBUTED DB

Riak is a multi-model database that is:

• Open Source & Commercial

• Distributed

• Masterless

• Eventually Consistent

Why THOSE properties?

This is NOT about Riak.

This is about design decisions in distributed systems.

This IS about Riak.

And learning from Basho’s architectural decisions.

DISTRIBUTED SYSTEMS – A DEFINITION

“A distributed system is a software system in which components located on network computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal.”

--Wikipedia

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable”

--Leslie Lamport

“Everything works at small scale. Understand failure modalities to understand your realities.”

--Tyler Hannan

THINKING DISTRIBUTED

What we consider, when we think distributed:

• Availability

• Fault Tolerance

• Latency

UPTIME IS A POOR METRIC…

Availability

AVAILABILITY

“…widespread underestimation of the specific difficulties of size seems one of the major underlying causes of current software failure.”

--Edsger W. Dijkstra

THE CAP THEOREM

HARVEST AND YIELD

Harvest

• a fraction

• data available / complete data

Yield

• a probability

• queries completed / queries requested

Failure will cause a known, linear reduction in one of these
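
The two ratios above can be sketched in a few lines of Python. This is only an illustration: the `Observation` type, the 4-node cluster, and the query counts are hypothetical and not from the talk. It assumes queries are spread evenly over the data, so losing one of four nodes costs a quarter of either harvest or yield.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical counters collected over some time window."""
    queries_requested: int
    queries_completed: int
    data_available: int  # units of data reflected in answers, e.g. partitions
    data_total: int      # units of data in the complete dataset


def harvest(obs: Observation) -> float:
    """Fraction: data available / complete data."""
    return obs.data_available / obs.data_total


def yield_(obs: Observation) -> float:
    """Probability: queries completed / queries requested."""
    return obs.queries_completed / obs.queries_requested


# Assumed scenario: 4 nodes, each holding 1/4 of the data, one node down.
# Option A: answer every query from the 3 surviving nodes (partial answers).
degrade_harvest = Observation(1000, 1000, data_available=3, data_total=4)
# Option B: refuse the queries that need the missing quarter of the data.
degrade_yield = Observation(1000, 750, data_available=4, data_total=4)

for name, obs in [("A", degrade_harvest), ("B", degrade_yield)]:
    print(f"option {name}: harvest={harvest(obs):.2f} yield={yield_(obs):.2f}")
```

Either option loses the same quarter; the design choice is which of the two ratios absorbs the failure.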

UNDERSTANDING “CONSISTENCY”

HARVEST

Queries Issued

Queries Offered

Data Available

Total Dataset

HARVEST AND YIELD

Traditional design demands 100% HARVEST…

but success of modern applications is often measured in YIELD.

HARVEST AND YIELD: USE CASE

RELATIONAL AVAILABILITY

primary

replica replica replica

coordination

Write/Read

RELATIONAL AVAILABILITY

primary replica replica

coordination

Write/Read

RIAK AVAILABILITY

Riak has a masterless architecture in which every node in a cluster is capable of serving read and write requests.

Each node: 1/Nth the data, 1/Nth the performance

RIAK AVAILABILITY

Availability Requires Scalability

RELATIONAL SCALABILITY

A - K L - P Q - Z

Designed to scale vertically

Cost of vertical scaling

Sharding

ADD CAPACITY AS NEEDED

Node 3

Node 4

Node 5

Node 0

Node 1

Node 2

Designed for Horizontal Scale

Deployed on commodity Hardware

Consistent Hashing

PERFECTION IS UNATTAINABLE…

FAILURES WILL HAPPEN

Fault Tolerance

FAULT TOLERANCE

How many hosts/replicas do you need to survive “F” failures?

• F + 1 – fundamental minimum

• 2F + 1 – a majority are alive

• 3F + 1 – Byzantine Fault Tolerance
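
The three bounds above are plain arithmetic, sketched here as a lookup. The function name and the model labels ("crash", "majority", "byzantine") are made up for this example.

```python
def replicas_needed(failures: int, model: str) -> int:
    """Minimum hosts/replicas needed to survive `failures` faults.

    The model labels are illustrative names for the three rows of the slide.
    """
    if failures < 0:
        raise ValueError("failures must be non-negative")
    multipliers = {
        "crash": 1,      # F + 1: at least one copy survives
        "majority": 2,   # 2F + 1: a majority are still alive
        "byzantine": 3,  # 3F + 1: Byzantine fault tolerance
    }
    return multipliers[model] * failures + 1


for f in (1, 2):
    row = {m: replicas_needed(f, m) for m in ("crash", "majority", "byzantine")}
    print(f"F={f}: {row}")
```

For F = 1 this gives 2, 3, and 4 hosts respectively.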

NAÏVE HASHING

NH(Ka) = 0, NH(Kb) = 1, NH(Kc) = 2, NH(Kd) = 0, …

Node # = HASH(KEY) % NUM_NODES
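
A minimal sketch of the modulo rule, assuming a toy hash that maps Ka to 0, Kb to 1, and so on, so that the result mirrors the NH(...) values on the slide. `toy_hash` is hypothetical; a real system would use a proper hash function.

```python
def toy_hash(key: str) -> int:
    """Hypothetical hash mirroring the slide: Ka -> 0, Kb -> 1, Kc -> 2, ..."""
    return ord(key[1]) - ord("a")


def node_for(key: str, num_nodes: int) -> int:
    """Node # = HASH(KEY) % NUM_NODES"""
    return toy_hash(key) % num_nodes


keys = ["K" + chr(ord("a") + i) for i in range(18)]  # Ka .. Kr

placement = {node: [] for node in range(3)}
for key in keys:
    placement[node_for(key, 3)].append(key)

for node, owned in placement.items():
    print(f"Node {node}: {' '.join(owned)}")
```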

NAÏVE HASHING

Node 0: Ka Kd Kg Kj Km Kp

Node 1: Kb Ke Kh Kk Kn Kq

Node 2: Kc Kf Ki Kl Ko Kr

NAÏVE HASHING

Node 0: Ka Ke Ki

Node 1: Kb Kf Kj

Node 2: Kc Kg Kk

Node 4: Kd Kh Kl

NAÏVE HASHING

• K = # of Keys

• NN = # of Nodes

As NN grows, the factor (NN – 1) / NN essentially becomes 1, thus ALL keys move

K * (NN – 1) / NN => K
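
The effect can be checked empirically. The sketch below counts how many keys change nodes when one node is added under modulo placement. It assumes SHA-1 as a stand-in hash and synthetic key names, neither of which comes from the talk, and it reads NN as the node count after the addition.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash (SHA-1 is an assumption, chosen for repeatability)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose node changes under HASH(KEY) % NUM_NODES."""
    moved = sum(
        1 for k in keys
        if stable_hash(k) % old_nodes != stable_hash(k) % new_nodes
    )
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]  # synthetic keys

for old in (3, 9, 99):
    new = old + 1
    print(f"{old} -> {new} nodes: "
          f"measured {moved_fraction(keys, old, new):.3f}, "
          f"(NN - 1) / NN = {(new - 1) / new:.3f}")
```

With a uniform hash, the measured fraction should sit close to (NN – 1) / NN, which approaches 1 as the cluster grows.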

CONSISTENT HASHING

• # of Partitions remains CONSTANT

• Key always maps to the SAME Partition

• Node owns Partitions

• Partitions contain keys

• Extra Level of Indirection

Partition # = HASH(KEY) % Partitions
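
A minimal sketch of that extra level of indirection, following the slide's modulo formula. The SHA-1 hash, the partition count of 9, and the round-robin ownership table are assumptions for illustration only and should not be read as Riak's actual ring implementation.

```python
import hashlib

NUM_PARTITIONS = 9  # fixed for the life of the cluster (assumed value)


def partition_for(key: str) -> int:
    """Partition # = HASH(KEY) % Partitions; never depends on node count."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


# Nodes own partitions. Here 3 nodes claim partitions round-robin.
owners = {p: p % 3 for p in range(NUM_PARTITIONS)}


def node_for(key: str) -> int:
    """Two-step lookup: key -> partition -> owning node."""
    return owners[partition_for(key)]


for key in ("Ka", "Kb", "Kc"):
    print(f"{key}: partition {partition_for(key)}, node {node_for(key)}")
```

Changing cluster membership only edits the `owners` table; `partition_for` stays untouched, so a key never changes partition.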

CONSISTENT HASHING

P1 P4 P7 P2 P5 P8 P3 P6 P9

Ka Kd Kg

Kj Km Kp

Kb Ke Kh

Kk Kn Kq

Kc Kf Ki

Kl Ko Kr

CONSISTENT HASHING

P4 P7 P2 P5 P8 P3 P6 P9 P1

Ka Kd Kg

Kj Km Kp

Kb Ke Kh

Kk Kn Kq

Kc Kf Ki

Kl Ko Kr

Node 4

CONSISTENT HASHING

• K = # of Keys

• NN = # of Nodes

• Q = # of Partitions

As K grows NN becomes constant, thus K/Q keys move

NN * K/Q => K/Q
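
To make the K/Q figure concrete, the sketch below adds a fourth node to a 9-partition, 3-node layout and counts the keys that change owner. The hand-over policy in `add_node` (take partitions from whichever node currently owns the most) is invented for this example, as are the SHA-1 hash and the synthetic keys. The point is only that each reassigned partition carries roughly K/Q keys and that every other key stays put.

```python
import hashlib
from collections import Counter

Q = 9  # number of partitions, constant (assumed value)


def partition_for(key: str) -> int:
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % Q


def add_node(owners: dict, new_node: int) -> dict:
    """Hypothetical policy: the new node takes Q // node_count partitions,
    each from whichever existing node currently owns the most."""
    after = dict(owners)
    node_count = len(set(owners.values())) + 1
    for _ in range(Q // node_count):
        counts = Counter(n for n in after.values() if n != new_node)
        donor = max(sorted(counts), key=counts.get)
        handed_over = max(p for p, n in after.items() if n == donor)
        after[handed_over] = new_node
    return after


before = {p: p % 3 for p in range(Q)}  # 3 nodes, round-robin ownership
after = add_node(before, new_node=3)
moved_partitions = {p for p in before if before[p] != after[p]}

keys = [f"key-{i}" for i in range(9_000)]  # synthetic keys, K = 9000
moved_keys = [k for k in keys if partition_for(k) in moved_partitions]

print(f"partitions reassigned: {sorted(moved_partitions)}")
print(f"keys moved: {len(moved_keys)} of {len(keys)} "
      f"(about {len(moved_partitions)} * K/Q = "
      f"{len(moved_partitions) * len(keys) // Q})")
```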

RIAK AVAILABILITY

Node 0

Node 1

Node 2

BRIEF MILLISECONDS
PHOTONS FLYING THROUGH GLASS
TIME STOPS FOR NO ONE

Latency

UNDERSTANDING LATENCY

299,792,458 meters/second in a vacuum
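
That constant sets a hard floor on round-trip time. The sketch below computes the floor for a given distance. The 5,000 km distance is a made-up example, and the 2/3 factor for light in optical fiber is an approximation, not a figure from the talk.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in a vacuum
FIBER_FACTOR = 2 / 3            # assumption: light in glass travels at roughly 2/3 c


def min_round_trip_ms(distance_km: float, medium_factor: float = 1.0) -> float:
    """Lower bound on round-trip time, ignoring routing, queuing and processing."""
    one_way_s = (distance_km * 1000) / (C_VACUUM_M_PER_S * medium_factor)
    return 2 * one_way_s * 1000


distance_km = 5_000  # hypothetical distance between a user and a datacenter
print(f"vacuum: {min_round_trip_ms(distance_km):.1f} ms")
print(f"fiber:  {min_round_trip_ms(distance_km, FIBER_FACTOR):.1f} ms")
```

No amount of tuning gets below this bound, which is the physical argument for placing data close to users.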

LATENCY: GOOGLE’S BIGTABLE

95th percentile: 24 ms

99.9th percentile: 994 ms

REDUNDANCY REDUCES LATENCY

IF no response within 10 ms THEN send 2nd request

5% increase in total requests

99.9th percentile latency = 50 ms
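
The rule on this slide is often called a hedged (or backup) request; that name is this note's label, not the slide's. The sketch below shows one way it might look with Python's asyncio. The `hedged` helper, the fake replica, and its latencies are all hypothetical; only the 10 ms threshold comes from the slide.

```python
import asyncio


async def hedged(make_request, hedge_after_s: float = 0.010):
    """Send one request; if it has not answered within the threshold,
    send a second and return whichever finishes first."""
    first = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({first}, timeout=hedge_after_s)
    if done:
        return first.result()

    second = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the slower request is no longer needed
    return done.pop().result()


async def demo():
    """Simulated replicas: the first call is slow, the second is fast."""
    delays = iter([0.200, 0.005])
    calls = 0

    async def fake_replica_read():
        nonlocal calls
        calls += 1
        delay = next(delays)
        await asyncio.sleep(delay)
        return f"answered in {delay * 1000:.0f} ms"

    result = await hedged(fake_replica_read)
    return result, calls


result, calls = asyncio.run(demo())
print(result, f"({calls} requests sent)")
```

The trade is a small increase in total requests for a large cut in tail latency, since only the slow outliers trigger a second request.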

Overall latency is determined by latency of the SLOWEST machine.
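
A short sketch of why the slowest machine dominates when one user request fans out to many machines. The per-machine latencies, the 1% slow-response rate, and the fan-out of 100 are illustrative assumptions, and machines are treated as independent.

```python
def fan_out_latency_ms(latencies_ms):
    """A request that waits on every machine finishes with the slowest one."""
    return max(latencies_ms)


def prob_any_slow(p_slow: float, fan_out: int) -> float:
    """Chance that at least one of `fan_out` independent machines is slow."""
    return 1 - (1 - p_slow) ** fan_out


print(fan_out_latency_ms([3, 4, 250, 5]), "ms for a 4-way fan-out")
for n in (1, 10, 100):
    print(f"fan-out {n:>3}: P(at least one slow) = {prob_any_slow(0.01, n):.3f}")
```

Under these assumptions, a response that is slow only 1% of the time on a single machine affects well over half of the requests that touch 100 machines.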

Get data close to your users.

MULTI-CLUSTER REPLICATION

Replicate data across datacenters or across the world

UNDERSTANDING PERFORMANCE

You get fast read and write performance

What does this mean?

THE PLAN, THE ENVIRONMENT

• How do we measure performance?

• What do we measure when we measure performance?

• basho_bench

• Google Cloud

CLUSTER EXPANSION

NODE FAILURE

THINKING DISTRIBUTED

What we consider, when we think distributed:

• Availability

• Fault Tolerance

• Latency

“Everything works at small scale. Understand failure modalities to understand your realities.”

--Tyler Hannan

NoSQL Database

Unstructured Data

No pre-defined Schema

Small and Large Data Sets on Commodity HW

Many Models: K/V, document store, graph

Variety of Query Methods

RELATIONAL & NOSQL

What’s the difference?

Relational Database

Structured Data

Defined Schema

Tables with Rows/Columns

Indexed w/ Table Joins

THE EVOLUTION OF NOSQL

Unstructured Data Platforms

Multi-Model Solutions

Point Solutions

“42% of database decision makers admit they struggle to manage the NoSQL solutions deployed in their environments”

COMPLEX TECHNOLOGY STACK

Simplify the Complexity

Ensure High Availability

Scale Horizontally

Enter Basho Data Platform

BASHO DATA PLATFORM?

Riak KV Client

Basho Redis Proxy

Client Application

Redis Redis Riak KV

Riak KV

Service Manager

Spark Spark

Spark Connector

Spark Client

Riak KV Ensemble

Basho Data Platform

Redis Services

Spark Services

Leader Election

Read/Solr

MANAGING COMPLEXITY AT SCALE

THOUSANDS OF USERS

XfinityTV

MILLIONS OF RECORDS

Information requested and amended more than 2.6 BILLION times a year

42 MILLION Summary Care Records

1.3 BILLION prescription messages

BILLIONS OF MOBILE DEVICES

10 BILLION data transactions a day – 150,000 a second

Forecasting 2.8 BILLION locations around the world

Generates 4GB OF DATA every second

?

Tyler Hannan

@tylerhannan

learn more at ricon.io
