How & When to Use NoSQL at Web Summit Dublin

Post on 01-Jul-2015

Category: Technology

DESCRIPTION

In this talk from the Dublin Web Summit 2014, AWS Technical Evangelist Danilo Poccia discusses NoSQL technology. The talk includes an introduction to NoSQL databases and a discussion of when it is time to consider NoSQL. Danilo also introduces Amazon DynamoDB as a NoSQL solution and talks through several case studies of customers that are using Amazon DynamoDB today.

TRANSCRIPT

How and When to Use NoSQL DBs Danilo Poccia | Technical Evangelist danilop@amazon.com @danilop

We will talk about

•  What is a NoSQL DB
•  When it is time to consider NoSQL
•  Amazon DynamoDB as a NoSQL solution
•  Case studies

What is a “Database”?

•  Data Persistence
•  Data Model
•  Data Retrieval

Traditional RDBMS vs NoSQL

                  RDBMS         NoSQL
Data Model        Relational    Not Relational
Data Retrieval    SQL           Not SQL

Relational Databases

[Chart: database query performance (vertical axis) against scale (horizontal axis). Desired: consistency, predictability. Actual: degraded performance with scale.]

Management problems

•  Data sharding
•  Data caching
•  Provisioning
•  Cluster management
•  Fault management

NoSQL DB typical features

•  Non-relational
•  Distributed
•  Eventually consistent
•  Horizontally scalable
•  Support replication
•  Schema-free

Typical use cases for NoSQL

•  Fast reads/writes with very low jitter
•  Huge amounts of data

NoSQL on AWS

•  Deploy popular third-party solutions via EC2
–  MongoDB
–  Cassandra
–  Redis

OR

•  Use a managed AWS service
–  Amazon DynamoDB

Tools Positioning Matrix

[Figure: AWS data services positioned by size (small to large) against latency / complexity (low to high): S3, Glacier, Redshift, RDS, Elastic MapReduce, DynamoDB, ElastiCache.]

Amazon DynamoDB

NoSQL Database

Fast & predictable performance

Seamless Scalability

Easy administration

“Even though we have years of experience with large, complex NoSQL architectures, we are happy to be finally out of the business of managing it ourselves.” - Don MacAskill, CEO

Access and Query Model

•  Two primary key options
–  Hash key: key lookups: “Give me the status for user ‘foo’”
–  Composite key (hash with range): “Give me all the status updates for user ‘foo’ that occurred within the past 24 hours”

•  Support for multiple data types
–  Scalar: number, string, binary
–  Multi-valued: number set, string set, binary set

•  Consistency: supports both strong and eventual consistency
–  Pick the consistency mode per API call
–  Different parts of an application can make different choices
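The two key options and the per-call consistency choice can be sketched as request parameters. This is a minimal sketch that is not taken from the talk: it only builds the parameter dictionaries accepted by the DynamoDB GetItem and Query operations (for example, passed as keyword arguments to a boto3 client) and never contacts AWS. The table names `UserStatus` and `StatusUpdates` and the attribute names `User` and `Timestamp` are hypothetical, and the expression-style parameters follow the current API rather than anything shown on the slides.

```python
import time


def get_status_params(user, consistent=False):
    """GetItem parameters for a hash-key lookup: the status for one user."""
    return {
        "TableName": "UserStatus",        # hypothetical table name
        "Key": {"User": {"S": user}},     # hash key only
        "ConsistentRead": consistent,     # consistency is chosen per call
    }


def recent_updates_params(user, now=None, window_seconds=24 * 3600, consistent=False):
    """Query parameters for a composite key: one user's updates in a time window."""
    now = int(time.time()) if now is None else now
    return {
        "TableName": "StatusUpdates",     # hypothetical table name
        # The hash key must match exactly; the range key accepts a comparison.
        "KeyConditionExpression": "#u = :user AND #ts >= :since",
        "ExpressionAttributeNames": {"#u": "User", "#ts": "Timestamp"},
        "ExpressionAttributeValues": {
            ":user": {"S": user},
            ":since": {"N": str(now - window_seconds)},  # numbers are sent as strings
        },
        "ConsistentRead": consistent,
    }


# One code path tolerates eventual consistency (the default)...
feed = recent_updates_params("foo", now=1_400_000_000)
# ...while another asks for a strongly consistent read.
status = get_status_params("foo", consistent=True)
```

Keeping the consistency flag as a function argument mirrors the point on the slide: the choice is made per request, not per table.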

Operations

•  Table Operations
–  CreateTable, UpdateTable, DeleteTable
–  ListTables, DescribeTable

•  Item Operations
–  GetItem, BatchGetItem
–  PutItem, UpdateItem, DeleteItem, BatchWriteItem

•  Query & Scan
–  Query, Scan

•  Atomic Counters and Conditional updates
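Atomic counters and conditional updates are both expressed through the item operations listed above. The following is a hedged sketch, not code from the talk: it builds parameters for UpdateItem and PutItem without calling AWS. The `Users` table and `User` and `Birthday` attributes echo the example used later in the deck, while the `ProfileViews` counter attribute is hypothetical.

```python
def increment_counter_params(table, user, counter, by=1):
    """UpdateItem parameters that add `by` to a numeric attribute on the server."""
    return {
        "TableName": table,
        "Key": {"User": {"S": user}},
        # ADD is applied by the service, so the client does no read-modify-write.
        "UpdateExpression": "ADD #c :inc",
        "ExpressionAttributeNames": {"#c": counter},
        "ExpressionAttributeValues": {":inc": {"N": str(by)}},
        "ReturnValues": "UPDATED_NEW",
    }


def create_if_absent_params(table, user, birthday):
    """PutItem parameters that succeed only if no item with this key exists yet."""
    return {
        "TableName": table,
        "Item": {"User": {"S": user}, "Birthday": {"S": birthday}},
        # The write is rejected if an item with the same hash key is already stored.
        "ConditionExpression": "attribute_not_exists(#u)",
        "ExpressionAttributeNames": {"#u": "User"},
    }


bump = increment_counter_params("Users", "Alice", "ProfileViews")  # hypothetical counter
create = create_if_absent_params("Users", "Bob", "11.02")
```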

Social Network hash + range schemas

Social Network

•  Store info about users
•  Query for a user’s friends

Users Table (Primary Key: Hash)

User     Birthday
Bob      11.02
Alice    15.10
Carol    30.06
Dan      05.01

Each row is an item. Each value in a row is an attribute (string, number, binary, set).

Friends Table (Hash-and-Range Primary Key Schema)

User     Friend
Bob      Alice
Alice    Bob
Alice    Carol
Alice    Dan

Query for Alice’s friends
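The friends lookup can be illustrated with a toy in-memory model. This is only a sketch of the hash-and-range idea, not how DynamoDB is implemented and not code from the talk; the `HashRangeTable` class and its methods are hypothetical names. It assumes the Friends table uses User as the hash key and Friend as the range key.

```python
from bisect import bisect_left
from collections import defaultdict


class HashRangeTable:
    """Toy table: items are grouped by hash key and kept sorted by range key."""

    def __init__(self):
        self._partitions = defaultdict(list)  # hash key -> sorted range keys

    def put(self, hash_key, range_key):
        keys = self._partitions[hash_key]
        i = bisect_left(keys, range_key)
        # The (hash, range) pair is the primary key, so duplicates are ignored.
        if i == len(keys) or keys[i] != range_key:
            keys.insert(i, range_key)

    def query(self, hash_key):
        """Return every range key stored under one hash key, in sorted order."""
        return list(self._partitions.get(hash_key, []))


friends = HashRangeTable()
for user, friend in [("Bob", "Alice"), ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "Dan")]:
    friends.put(user, friend)

alice_friends = friends.query("Alice")
print(alice_friends)  # ['Bob', 'Carol', 'Dan']
```

A query touches only the items that share one hash key, which is why "all friends of Alice" does not require scanning the whole table.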

DynamoDB Indexes

•  Local Secondary Index (LSI)
–  Same Hash Key as Primary Key
–  Different Range Key

•  Global Secondary Index (GSI)
–  New Hash and Range Key
–  Eventually Consistent
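The difference between the two index types shows up in their key schemas. Below is a minimal sketch of CreateTable parameters, built as a plain dictionary without calling AWS. The table name, the index names `ByLikes` and `ByTopic`, the attributes `Timestamp`, `Likes`, and `Topic`, and the capacity values are all hypothetical and are not taken from the talk.

```python
def status_table_params():
    """CreateTable parameters with one local and one global secondary index."""
    return {
        "TableName": "StatusUpdates",
        "AttributeDefinitions": [
            {"AttributeName": "User", "AttributeType": "S"},
            {"AttributeName": "Timestamp", "AttributeType": "N"},
            {"AttributeName": "Likes", "AttributeType": "N"},
            {"AttributeName": "Topic", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "User", "KeyType": "HASH"},
            {"AttributeName": "Timestamp", "KeyType": "RANGE"},
        ],
        # LSI: same hash key as the table, different range key.
        "LocalSecondaryIndexes": [{
            "IndexName": "ByLikes",
            "KeySchema": [
                {"AttributeName": "User", "KeyType": "HASH"},
                {"AttributeName": "Likes", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }],
        # GSI: its own hash and range key, with its own provisioned throughput.
        "GlobalSecondaryIndexes": [{
            "IndexName": "ByTopic",
            "KeySchema": [
                {"AttributeName": "Topic", "KeyType": "HASH"},
                {"AttributeName": "Timestamp", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        }],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


params = status_table_params()
```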

DynamoDB Highlights: Performance

•  Typical service-side latency: single-digit milliseconds
•  Solid State Drive (SSD)-backed service
•  Latency is consistent
–  As throughput increases
–  As storage grows
•  No need for tuning

DynamoDB Highlights: Reliability

•  Durability
–  Synchronous replication for high durability
–  A write is only acknowledged (committed) once it exists in at least two physical data centers
–  All writes occur to disk, not memory

•  Availability
–  Regional service
–  Spans multiple Availability Zones (AZs)
–  All data is continuously replicated to multiple AZs

Provisioned Throughput

•  Reserve the throughput needed for each table
–  Set at table creation
–  Example: My table needs 1,000 writes/second and 5,000 reads/second of capacity
•  Increase / decrease via API call
•  Pay for throughput and storage (not instances)
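The slide's example can be turned into capacity units with a little arithmetic. This is a sketch under stated assumptions: one write capacity unit covers one write per second of an item up to 1 KB, one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, and eventually consistent reads cost half as much. The 512-byte item size is an assumption for illustration and does not come from the talk.

```python
import math


def write_capacity_units(writes_per_second, item_size_bytes):
    """Units needed when one unit covers one write/second of up to 1 KB."""
    return writes_per_second * max(1, math.ceil(item_size_bytes / 1024))


def read_capacity_units(reads_per_second, item_size_bytes, consistent=True):
    """Units needed when one unit covers one strongly consistent read/second
    of up to 4 KB; eventually consistent reads need half as many."""
    units = reads_per_second * max(1, math.ceil(item_size_bytes / 4096))
    return units if consistent else math.ceil(units / 2)


# 1,000 writes/second and 5,000 reads/second, assuming 512-byte items.
provisioned = {
    "WriteCapacityUnits": write_capacity_units(1000, 512),
    "ReadCapacityUnits": read_capacity_units(5000, 512),
}
print(provisioned)  # {'WriteCapacityUnits': 1000, 'ReadCapacityUnits': 5000}
```

Larger items multiply the cost: under the same assumptions, a 1,500-byte item needs two write units per write.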

Provisioned Throughput

•  Why?
–  Simpler capacity planning
–  Easy to translate calls to apps => calls to database
–  Not in terms of servers and disk IOPS
–  Do not be locked into your peak for a month or a year!

[Chart: actual traffic compared with the capacity we can provision with DynamoDB and the capacity we needed before DynamoDB]

DynamoDB Customer Highlights

Amazon DynamoDB Demo

•  Quick look at the Console
•  Sample use case
–  Geohash indexed by a GSI
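The transcript does not include the demo itself, so the following is only a guess at the general shape of the idea. It is a minimal sketch of the standard geohash encoding plus one plausible item layout in which a GSI could use a short geohash prefix as its hash key and the full geohash as its range key. The attribute names `GeohashPrefix` and `Geohash`, the `to_item` helper, and the prefix length of 4 are assumptions, not details from the talk.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value = [], 0, 0
    use_lon = True  # bits alternate between axes, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        value <<= 1
        if coord >= mid:
            value |= 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def to_item(name, lat, lon, prefix_len=4):
    """Hypothetical item layout: the prefix groups nearby points under one key."""
    gh = geohash(lat, lon)
    return {"Name": name, "GeohashPrefix": gh[:prefix_len], "Geohash": gh}


item = to_item("example", 57.64911, 10.40744)
print(item["GeohashPrefix"])  # u4pr
```

Because nearby points tend to share a geohash prefix, a query on the prefix returns candidates in the same cell. Points close to a cell boundary can fall under a different prefix, so a real implementation would also need to check neighboring cells.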
