O'Reilly Webinar: Simplicity Scales - Big Data


Upload: basho-technologies

Post on 07-Aug-2015


TRANSCRIPT

Page 1: O'Reilly Webinar: Simplicity Scales - Big Data


SIMPLICITY SCALES

Big data application management & operations

Page 2: O'Reilly Webinar: Simplicity Scales - Big Data

WHOAMI?

Tyler Hannan
Director, Technical Marketing

[email protected]
@tylerhannan

Page 3: O'Reilly Webinar: Simplicity Scales - Big Data

SQL

Page 4: O'Reilly Webinar: Simplicity Scales - Big Data

SQL, NoSQL

Page 5: O'Reilly Webinar: Simplicity Scales - Big Data

SQL, NoSQL, NewSQL

Page 6: O'Reilly Webinar: Simplicity Scales - Big Data

CHANGE IN ARCHITECTURAL DESIGN

Virtualization: App, App, App, App on one Server
SMALL APPS, BIG SERVERS, ONE LOCATION

Aggregation: one App across Server, Server, Server, Server
BIG APPS, COMMODITY SERVERS, MANY LOCATIONS

Page 7: O'Reilly Webinar: Simplicity Scales - Big Data

In 2014, 20% of enterprise data projects add distributed processes into production

Page 8: O'Reilly Webinar: Simplicity Scales - Big Data

THE BENEFITS OF RIAK

Riak is an operationally friendly database that is:

• Fault-tolerant

• Highly available

• Scalable

• Self-healing

Page 9: O'Reilly Webinar: Simplicity Scales - Big Data

THE PROPERTIES OF A DISTRIBUTED DB

Riak is a multi-model database that is:

• Open Source & Commercial

• Distributed

• Masterless

• Eventually Consistent

Page 10: O'Reilly Webinar: Simplicity Scales - Big Data

Why THOSE properties?

Page 11: O'Reilly Webinar: Simplicity Scales - Big Data

This is NOT about Riak.

This is about design decisions in distributed systems.

Page 12: O'Reilly Webinar: Simplicity Scales - Big Data

This IS about Riak.

And learning from Basho’s architectural decisions.

Page 13: O'Reilly Webinar: Simplicity Scales - Big Data

DISTRIBUTED SYSTEMS – A DEFINITION

“A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages. The components interact with each other in order to achieve a common goal.”

--Wikipedia

Page 14: O'Reilly Webinar: Simplicity Scales - Big Data

DISTRIBUTED SYSTEMS – A DEFINITION

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable”

--Leslie Lamport

Page 15: O'Reilly Webinar: Simplicity Scales - Big Data

DISTRIBUTED SYSTEMS – A DEFINITION

“Everything works at small scale. Understand failure modalities to understand your realities.”

--Tyler Hannan

Page 16: O'Reilly Webinar: Simplicity Scales - Big Data

THINKING DISTRIBUTED

What we consider when we think distributed:

• Availability

• Fault Tolerance

• Latency

Page 17: O'Reilly Webinar: Simplicity Scales - Big Data

UPTIME IS A POOR METRIC…

Availability

Page 18: O'Reilly Webinar: Simplicity Scales - Big Data

AVAILABILITY

“…widespread underestimation of the specific difficulties of size seems one of the major underlying causes of current software failure.”

--Edsger W. Dijkstra

Page 19: O'Reilly Webinar: Simplicity Scales - Big Data

THE CAP THEOREM

Page 20: O'Reilly Webinar: Simplicity Scales - Big Data

HARVEST AND YIELD

Harvest
• a fraction
• data available / complete data

Yield
• a probability
• queries completed / queries requested

A failure causes a known, linear reduction in one of these.
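The two ratios can be sketched in a few lines. This is a minimal illustration, not anything from the talk: the function names and the four-shard scenario are assumptions chosen to show how one failure can be absorbed by either metric.

```python
def harvest(data_available: int, complete_data: int) -> float:
    """Fraction of the complete dataset reflected in a response."""
    return data_available / complete_data


def query_yield(queries_completed: int, queries_requested: int) -> float:
    """Probability that a requested query completes."""
    return queries_completed / queries_requested


# Hypothetical cluster: 4 equal shards, 1 of them is down.
# Choice 1: answer every query from the 3 shards that remain.
print(harvest(3, 4), query_yield(100, 100))   # reduced harvest, full yield
# Choice 2: refuse the queries that need the lost shard
# (assuming queries are spread evenly across shards).
print(harvest(4, 4), query_yield(75, 100))    # full harvest, reduced yield
```

Either way the loss is proportional to the failed share of the system; the design decision is which of the two numbers takes the hit.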

Page 21: O'Reilly Webinar: Simplicity Scales - Big Data

UNDERSTANDING “CONSISTENCY”

HARVEST = Data Available / Total Dataset

YIELD = Queries Issued / Queries Offered

Page 22: O'Reilly Webinar: Simplicity Scales - Big Data

HARVEST AND YIELD

Traditional design demands 100% HARVEST…

but success of modern applications is often measured in YIELD.

Page 23: O'Reilly Webinar: Simplicity Scales - Big Data

HARVEST AND YIELD: USE CASE

Page 24: O'Reilly Webinar: Simplicity Scales - Big Data

HARVEST AND YIELD: USE CASE

Page 25: O'Reilly Webinar: Simplicity Scales - Big Data

RELATIONAL AVAILABILITY

primary

replica replica replica

coordination

Write/Read

Page 26: O'Reilly Webinar: Simplicity Scales - Big Data

RELATIONAL AVAILABILITY

primary replica replica

coordination

X

Write/Read

Page 27: O'Reilly Webinar: Simplicity Scales - Big Data

RIAK AVAILABILITY

Riak has a masterless architecture in which every node in a cluster is capable of serving read and write requests.

Page 28: O'Reilly Webinar: Simplicity Scales - Big Data

RIAK AVAILABILITY

Each node = 1/Nth the data, 1/Nth the performance

Page 29: O'Reilly Webinar: Simplicity Scales - Big Data

Availability Requires Scalability

Page 30: O'Reilly Webinar: Simplicity Scales - Big Data

RELATIONAL SCALABILITY

• Designed to scale vertically
• Cost of vertical scaling
• Sharding: A - K | L - P | Q - Z

Page 31: O'Reilly Webinar: Simplicity Scales - Big Data

ADD CAPACITY AS NEEDED

Node 3

Node 4

Node 5

Node 0

Node 1

Node 2

Designed for Horizontal Scale

Deployed on Commodity Hardware

Consistent Hashing

Page 32: O'Reilly Webinar: Simplicity Scales - Big Data

PERFECTION IS UNATTAINABLE…

FAILURES WILL HAPPEN

Fault Tolerance

Page 33: O'Reilly Webinar: Simplicity Scales - Big Data

FAULT TOLERANCE

How many hosts/replicas do you need to survive “F” failures?

• F + 1 – fundamental minimum

• 2F + 1 – a majority are alive

• 3F + 1 – Byzantine Fault Tolerance
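The three bounds above can be written as a small lookup. This is only a sketch: the function name and the model labels ("crash", "majority", "byzantine") are illustrative names, not terms from the talk.

```python
def replicas_needed(failures: int, model: str) -> int:
    """Replicas required to survive `failures` faults under a given model."""
    if model == "crash":       # F + 1: at least one copy survives
        return failures + 1
    if model == "majority":    # 2F + 1: a majority is still alive
        return 2 * failures + 1
    if model == "byzantine":   # 3F + 1: Byzantine fault tolerance
        return 3 * failures + 1
    raise ValueError(f"unknown model: {model}")


for f in (1, 2):
    print(f, [replicas_needed(f, m) for m in ("crash", "majority", "byzantine")])
```

For one failure this gives 2, 3, and 4 replicas respectively, which shows how quickly the cost of stronger guarantees grows.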

Page 34: O'Reilly Webinar: Simplicity Scales - Big Data

NAÏVE HASHING

NH(Ka) = 0
NH(Kb) = 1
NH(Kc) = 2
NH(Kd) = 0
…

Node # = HASH(KEY) % NUM_NODES
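A minimal sketch of the formula, assuming a toy hash that maps Ka, Kb, Kc, ... to 0, 1, 2, ... so that it reproduces the slide's example values. The names `toy_hash` and `naive_node` are hypothetical; a real system would use a proper hash function.

```python
def toy_hash(key: str) -> int:
    """Toy hash (assumption): 'Ka' -> 0, 'Kb' -> 1, ... by the second letter."""
    return ord(key[1]) - ord("a")


def naive_node(key: str, num_nodes: int) -> int:
    """Node # = HASH(KEY) % NUM_NODES."""
    return toy_hash(key) % num_nodes


for key in ["Ka", "Kb", "Kc", "Kd"]:
    print(key, "-> node", naive_node(key, 3))
```

Note that the node number depends directly on the node count, so changing the count changes the answer for most keys.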

Page 35: O'Reilly Webinar: Simplicity Scales - Big Data

NAÏVE HASHING

Node 0: Ka Kd Kg Kj Km Kp
Node 1: Kb Ke Kh Kk Kn Kq
Node 2: Kc Kf Ki Kl Ko Kr

Page 36: O'Reilly Webinar: Simplicity Scales - Big Data

NAÏVE HASHING

Node 0: Ka Ke Ki Km Kq
Node 1: Kb Kf Kj Kn Kr
Node 2: Kc Kg Kk Ko
Node 4: Kd Kh Kl Kp

Page 37: O'Reilly Webinar: Simplicity Scales - Big Data

NAÏVE HASHING

• K = # of Keys
• NN = # of Nodes

As NN grows, the factor (NN – 1) / NN approaches 1, so nearly ALL keys move

K * (NN – 1) / NN => K
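The fraction can be checked by brute force. In this sketch, consecutive integers stand in for hash values (an assumption made so the count is exact), and `keys_moved` is a hypothetical helper name.

```python
def keys_moved(num_keys: int, old_nodes: int, new_nodes: int) -> int:
    """Count keys whose node changes when the modulus changes."""
    return sum(1 for h in range(num_keys) if h % old_nodes != h % new_nodes)


K = 12_000
moved = keys_moved(K, old_nodes=3, new_nodes=4)
print(moved, moved / K)   # 9000 keys, i.e. (NN - 1) / NN = 3/4 with NN = 4
```

With the 18 keys Ka through Kr, the same function reports 12 moved keys: only Ka, Kb, Kc, Km, Kn, and Ko stay where they were.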

Page 38: O'Reilly Webinar: Simplicity Scales - Big Data

CONSISTENT HASHING

• # of Partitions remains CONSTANT
• Key always maps to the SAME Partition
• Node owns Partitions
• Partitions contain keys
• Extra Level of Indirection

Partition # = HASH(KEY) % Partitions
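The extra level of indirection can be sketched as two lookups: key to partition, then partition to owning node. Everything below is illustrative: the toy hash, the nine partitions numbered from 0, and the ownership tables are assumptions, not Riak's actual ring implementation.

```python
from typing import Dict

NUM_PARTITIONS = 9  # fixed for the life of the cluster


def toy_hash(key: str) -> int:
    """Toy hash (assumption): 'Ka' -> 0, 'Kb' -> 1, ... by the second letter."""
    return ord(key[1]) - ord("a")


def partition_for(key: str) -> int:
    """Partition # = HASH(KEY) % Partitions. Never depends on the node count."""
    return toy_hash(key) % NUM_PARTITIONS


def node_for(key: str, owners: Dict[int, int]) -> int:
    """Second lookup: which node currently owns the key's partition."""
    return owners[partition_for(key)]


# Three nodes own the partitions round-robin.
three_nodes = {p: p % 3 for p in range(NUM_PARTITIONS)}
# A fourth node (id 3) joins and takes over two partitions (arbitrary choice).
four_nodes = dict(three_nodes)
four_nodes[6] = 3
four_nodes[7] = 3

print(partition_for("Kg"), node_for("Kg", three_nodes), node_for("Kg", four_nodes))
```

Only the ownership table changes when a node joins; `partition_for` gives the same answer before and after.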

Page 39: O'Reilly Webinar: Simplicity Scales - Big Data

CONSISTENT HASHING

Node 0: P1 P4 P7 (Ka Kd Kg Kj Km Kp)
Node 1: P2 P5 P8 (Kb Ke Kh Kk Kn Kq)
Node 2: P3 P6 P9 (Kc Kf Ki Kl Ko Kr)

Page 40: O'Reilly Webinar: Simplicity Scales - Big Data

CONSISTENT HASHING

[Diagram: Node 4 joins Node 0, Node 1, and Node 2; partitions P1 to P9 and keys Ka to Kr are unchanged.]

Page 41: O'Reilly Webinar: Simplicity Scales - Big Data

CONSISTENT HASHING

• K = # of Keys
• NN = # of Nodes
• Q = # of Partitions

As K grows, NN is effectively constant, so about K/Q keys move

NN * K/Q => K/Q
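A brute-force count shows why the unit of movement is the partition. In this sketch each partition that changes owner carries about K/Q keys and every other key stays put. The claim strategy (the new node takes every fourth partition) and the use of consecutive integers as hash values are assumptions for illustration, not Riak's actual claim algorithm.

```python
def keys_moved_on_join(num_keys: int, num_partitions: int, old_nodes: int) -> int:
    """Count keys that change node when one node joins a partitioned cluster."""
    # Ownership before the join: round-robin across the existing nodes.
    before = [p % old_nodes for p in range(num_partitions)]
    # Ownership after: the new node claims every (old_nodes + 1)-th partition.
    after = list(before)
    new_node = old_nodes
    for p in range(old_nodes, num_partitions, old_nodes + 1):
        after[p] = new_node
    moved_partitions = {p for p in range(num_partitions) if before[p] != after[p]}
    # A key moves only if its (unchanging) partition changed owner.
    return sum(1 for h in range(num_keys) if h % num_partitions in moved_partitions)


K, Q = 6400, 64
print(keys_moved_on_join(K, Q, old_nodes=3))   # 1600: 16 partitions of K/Q = 100 keys
```

Under the same 3-to-4 node change, modulo-by-node-count hashing would move about three quarters of the keys; here a quarter move, and all of them land on the new node.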

Page 42: O'Reilly Webinar: Simplicity Scales - Big Data

RIAK AVAILABILITY

Node 0

Node 1

Node 2

Page 43: O'Reilly Webinar: Simplicity Scales - Big Data

BRIEF MILLISECONDS
PHOTONS FLYING THROUGH GLASS
TIME STOPS FOR NO ONE

Latency

Page 44: O'Reilly Webinar: Simplicity Scales - Big Data

UNDERSTANDING LATENCY

299,792,458 meters/second in a vacuum
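That constant sets a hard floor on network latency. A small sketch, assuming a hypothetical helper `min_round_trip_ms` and an illustrative 5,000 km distance; the `velocity_factor` parameter is an assumption that lets you model a medium slower than vacuum (light in optical fiber travels at roughly two thirds of this speed).

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in a vacuum


def min_round_trip_ms(distance_m: float, velocity_factor: float = 1.0) -> float:
    """Lower bound on round-trip time over a straight path, in milliseconds."""
    one_way_s = distance_m / (C_VACUUM_M_PER_S * velocity_factor)
    return 2 * one_way_s * 1000


print(round(min_round_trip_ms(5_000_000), 1))        # vacuum, 5,000 km each way
print(round(min_round_trip_ms(5_000_000, 2 / 3), 1))  # approximate fiber
```

No amount of hardware removes this floor, which is why physical distance to the user matters.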

Page 45: O'Reilly Webinar: Simplicity Scales - Big Data

UNDERSTANDING LATENCY

Page 46: O'Reilly Webinar: Simplicity Scales - Big Data

LATENCY: GOOGLE’S BIGTABLE

95th percentile: 24 ms

99.9th percentile: 994 ms

Page 47: O'Reilly Webinar: Simplicity Scales - Big Data

REDUNDANCY REDUCES LATENCY

IF response > 10 ms THEN send 2nd request

5% increase in total requests

99.9th percentile latency = 50 ms
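The rule above is often called a hedged request. Below is a minimal sketch using Python's asyncio: it waits up to 10 ms for the first attempt, then issues a second and returns whichever finishes first. The names `hedged_request` and `make_fake_backend` are hypothetical, and the fake backend with scripted delays stands in for a real replica call.

```python
import asyncio


async def hedged_request(send, hedge_after: float = 0.010):
    """Run send(); if it has not finished within hedge_after seconds, run a second copy."""
    first = asyncio.ensure_future(send())
    done, _ = await asyncio.wait({first}, timeout=hedge_after)
    if done:
        return first.result()          # fast path: no second request needed
    second = asyncio.ensure_future(send())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:               # drop the slower attempt
        task.cancel()
    return done.pop().result()


def make_fake_backend(delays):
    """Fake request function (assumption): each call sleeps for the next scripted delay."""
    remaining = iter(delays)

    async def send():
        delay = next(remaining)
        await asyncio.sleep(delay)
        return delay

    return send


# First attempt is slow (200 ms), so the hedge (10 ms) answers instead.
print(asyncio.run(hedged_request(make_fake_backend([0.2, 0.01]))))
```

Because the second request is sent only when the first is already slow, the extra load stays small while the slowest responses are cut off.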

Page 48: O'Reilly Webinar: Simplicity Scales - Big Data

UNDERSTANDING LATENCY

Overall latency is determined by latency of the SLOWEST machine.

Get data close to your users.
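One way to see why the slowest machine dominates: if a request must wait for several machines, it is slow whenever any one of them is slow. The sketch below assumes each machine is slow independently with the same probability; the function name and the numbers are illustrative, not figures from the talk.

```python
def p_request_slow(p_machine_slow: float, fan_out: int) -> float:
    """Probability that at least one of fan_out machines responds slowly."""
    return 1 - (1 - p_machine_slow) ** fan_out


# Each machine is slow 1% of the time (its 99th percentile).
for n in (1, 10, 100):
    print(n, round(p_request_slow(0.01, n), 3))
```

With a fan-out of 100, well over half of all requests wait on at least one slow machine, so rare per-machine delays become the common case.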

Page 49: O'Reilly Webinar: Simplicity Scales - Big Data

MULTI-CLUSTER REPLICATION

Replicate data across datacenters or across the world

Page 50: O'Reilly Webinar: Simplicity Scales - Big Data

UNDERSTANDING PERFORMANCE

You get fast read and write performance

What does this mean?

Page 51: O'Reilly Webinar: Simplicity Scales - Big Data

THE PLAN, THE ENVIRONMENT

• How do we measure performance?

• What do we measure when we measure performance?

• basho_bench
• Google Cloud

Page 52: O'Reilly Webinar: Simplicity Scales - Big Data

CLUSTER EXPANSION

Page 53: O'Reilly Webinar: Simplicity Scales - Big Data

NODE FAILURE

Page 54: O'Reilly Webinar: Simplicity Scales - Big Data

THINKING DISTRIBUTED

What we consider when we think distributed:

• Availability

• Fault Tolerance

• Latency

Page 55: O'Reilly Webinar: Simplicity Scales - Big Data

DISTRIBUTED SYSTEMS – A DEFINITION

“Everything works at small scale. Understand failure modalities to understand your realities.”

--Tyler Hannan

Page 56: O'Reilly Webinar: Simplicity Scales - Big Data

RELATIONAL & NOSQL
What’s the difference?

Relational Database
• Structured Data
• Defined Schema
• Tables with Rows/Columns
• Indexed w/ Table Joins
• SQL

NoSQL Database
• Unstructured Data
• No pre-defined Schema
• Small and Large Data Sets on Commodity HW
• Many Models: K/V, document store, graph
• Variety of Query Methods

Page 57: O'Reilly Webinar: Simplicity Scales - Big Data

THE EVOLUTION OF NOSQL

Unstructured Data Platforms

Multi-Model Solutions

Point Solutions

Page 58: O'Reilly Webinar: Simplicity Scales - Big Data

“42% of database decision makers admit they struggle to manage the NoSQL solutions deployed in their environments”

Riak

Spark

COMPLEX TECHNOLOGY STACK

Page 59: O'Reilly Webinar: Simplicity Scales - Big Data

• Simplify the Complexity
• Ensure High Availability
• Scale Horizontally

Page 60: O'Reilly Webinar: Simplicity Scales - Big Data

Enter Basho Data Platform

Page 61: O'Reilly Webinar: Simplicity Scales - Big Data

BASHO DATA PLATFORM?

Page 62: O'Reilly Webinar: Simplicity Scales - Big Data

MANAGING COMPLEXITY AT SCALE

[Architecture diagram: Basho Data Platform. Labeled components: Client Application, Riak KV Client, Basho Redis Proxy, Spark Client, Spark Connector, Redis Services (Redis), Riak KV, Spark Services (Spark), Service Manager, Riak KV Ensemble (Leader Election). Labeled flows: Read, Write, Query, Read/Solr.]

Page 63: O'Reilly Webinar: Simplicity Scales - Big Data

THOUSANDS OF USERS

XfinityTV

Page 64: O'Reilly Webinar: Simplicity Scales - Big Data

MILLIONS OF RECORDS

Information requested and amended more than 2.6 BILLION times a year

42 MILLION Summary Care Records

1.3 BILLION prescription messages

Page 65: O'Reilly Webinar: Simplicity Scales - Big Data

BILLIONS OF MOBILE DEVICES

10 BILLION data transactions a day – 150,000 a second

Forecasting 2.8 BILLION locations around the world

Generates 4GB OF DATA every second

Page 66: O'Reilly Webinar: Simplicity Scales - Big Data

?

Tyler Hannan
@tylerhannan

Page 67: O'Reilly Webinar: Simplicity Scales - Big Data

learn more at ricon.io