HRX Meetup Group 8/20/2014: Cassandra and How to Scale Your Database


Upload: planet-cassandra

Post on 15-Jan-2015


DESCRIPTION

HR5 alum Stephen Portanova will be presenting on the highly scalable database Cassandra, which is used by Reddit, Netflix, CERN, and The Weather Channel. 'nuff said.

TRANSCRIPT

Page 1: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Cassandra

Pretty Cool

Page 2: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

History

Google Bigtable

Amazon Dynamo

Page 3: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Today

Page 4: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Why Should You Care

● Horizontal Scaling (basically auto sharding)

● Multiple Nodes - Highly Available

● Really Fast Writes

● Not too shabby at reads either - SLICES!!

● Bright Future

Page 5: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

The Cluster

● replication factor (rf)

● read consistency (r)

● write consistency (w)

● clustering - shard on partition key
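How rf, r and w interact can be sketched with a little arithmetic. This is a minimal illustration, not Cassandra code: the function names are hypothetical, and it relies only on the standard rule that a read is guaranteed to overlap the latest write when r + w > rf.

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1


def reads_see_latest_write(rf, r, w):
    """True when every read replica set must overlap every write replica set."""
    return r + w > rf


rf = 3
print(quorum(rf))                                          # 2
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))  # True
print(reads_see_latest_write(rf, 1, 1))                    # False
```

With rf = 3, quorum reads and quorum writes overlap on at least one replica, while reading and writing at a single replica trades that guarantee for lower latency.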

Page 6: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

The One Ring

Page 7: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Storage - Vnodes
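The ring and vnode slides can be approximated with a small sketch. Everything here is an assumption made for illustration: the `Ring` class, the md5-based `token` function (Cassandra's default partitioner uses Murmur3) and the replica walk, which mimics a simple clockwise placement and ignores racks and data centers.

```python
import bisect
import hashlib


def token(key, space=2**32):
    """Hash a key to a ring position (md5 stand-in, not Murmur3)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % space


class Ring:
    def __init__(self, nodes, vnodes_per_node=4):
        # Each physical node claims several tokens: its vnodes.
        self.entries = sorted(
            (token("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key, rf):
        """Walk clockwise from the key's token, collecting rf distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        found = []
        for step in range(len(self.entries)):
            node = self.entries[(start + step) % len(self.entries)][1]
            if node not in found:
                found.append(node)
            if len(found) == rf:
                break
        return found


ring = Ring(["node-a", "node-b", "node-c"])
print(ring.replicas("111", rf=2))
```

Because each node holds several small token ranges instead of one large one, adding or removing a node moves many small slices of data rather than one big contiguous range.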

Page 8: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model

● Wide rows

● Slices Queries

● Denormalization

● Index tables

Page 9: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Key

CREATE TABLE email_app.emails ( user_id text, subject text, to_add text, cc text, body text, PRIMARY KEY (user_id) );

ROW KEY: user_id

Page 10: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('999', 'wat', '[email protected]', '[email protected]', 'is going on?');

Page 11: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Inserts Result

SELECT * FROM email_app.emails;

Row key   subject   to_add   cc         body
111       party     cat@     hippo@     at my place
999       wat       horse@   giraffe@   is going on?

Page 12: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Hash

Row Keys: 111, 999

Column Values (under each row key): subject, to, cc, body
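The nested-hash picture can be written out as a plain dictionary of dictionaries. This is only a mental-model sketch: the `select_by_key` helper is hypothetical, and the addresses use the abbreviated forms shown on the result slide.

```python
# Outer hash: row key (user_id) -> inner hash: column name -> value.
emails = {
    "111": {"subject": "party", "to_add": "cat@", "cc": "hippo@",
            "body": "at my place"},
    "999": {"subject": "wat", "to_add": "horse@", "cc": "giraffe@",
            "body": "is going on?"},
}


def select_by_key(table, user_id):
    """Look up one row by its row key; None if the key is absent."""
    return table.get(user_id)


print(select_by_key(emails, "111")["subject"])  # party
```

A lookup by row key is one hash access, which is why queries that name the row key are cheap.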

Page 13: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Insert - Again

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Row key   subject   to_add   cc         body
111       party     cat@     hippo@     at my place
999       wat       horse@   giraffe@   is going on?

IDEMPOTENT: repeating the insert leaves the table unchanged.
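The idempotence above follows from inserts behaving as upserts on the row key. A minimal sketch, assuming the nested-hash model and a hypothetical `upsert` helper (abbreviated addresses as on the slides):

```python
def upsert(table, user_id, **columns):
    """Write columns under a row key, overwriting any existing values."""
    table.setdefault(user_id, {}).update(columns)
    return table


table = {}
row = dict(subject="party", to_add="cat@", cc="hippo@", body="at my place")

upsert(table, "111", **row)
snapshot = {key: dict(cols) for key, cols in table.items()}

upsert(table, "111", **row)   # the same insert, a second time
print(table == snapshot)      # True
```

The same mechanism means an insert with an existing key and a different value silently overwrites that column instead of raising a duplicate-key error.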

Page 14: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Key 1

CREATE TABLE email_app.emails ( user_id text, subject text, to_add text, cc text, body text, PRIMARY KEY (user_id, subject) );

ROW KEY: user_id

CLUSTERING KEY: subject

Page 15: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 1

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Same as Before. Right???

Page 16: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model Composite Insert Result

SELECT * FROM emails WHERE user_id = '111';

Row key: 111

party|to_add   party|cc   party|body
cat@           hippo@     at my place

Each column name is prefixed with the clustering value (subject).

Page 17: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Hash

Row Key (user_id): 111

Clustering Column (subject): party

Column Values: to_add, cc, body

Page 18: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 2

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'swim', '[email protected]', '[email protected]', 'in the pool');

Page 19: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite Insert 2 Result

SELECT * FROM emails WHERE user_id = '111';

Row key: 111

party|to_add   party|cc   party|body
cat@           hippo@     at my place

swim|to_add    swim|cc    swim|body
cat@           hippo@b    in the pool

Sorted by clustering column - "subject"

Page 20: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Sorted Hash

Row Key (user_id): 111

Clustering Column (subject), kept in sorted order: party, swim

Column Values (under each subject): to, cc, body
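A nested sorted hash can be sketched by keeping each row key's clustering values in a sorted list. The `Partition` class below is a hypothetical illustration of the mental model, not Cassandra's storage engine; addresses are abbreviated as on the slides.

```python
import bisect


class Partition:
    """One row key's data: clustering value -> columns, in sorted order."""

    def __init__(self):
        self.keys = []   # sorted clustering values (subjects)
        self.rows = {}   # clustering value -> {column name -> value}

    def upsert(self, clustering, **columns):
        if clustering not in self.rows:
            bisect.insort(self.keys, clustering)  # keep keys sorted on write
            self.rows[clustering] = {}
        self.rows[clustering].update(columns)

    def scan(self):
        """Return (clustering value, columns) pairs in clustering order."""
        return [(key, self.rows[key]) for key in self.keys]


emails = {"111": Partition()}
emails["111"].upsert("swim", to_add="cat@", cc="hippo@b", body="in the pool")
emails["111"].upsert("party", to_add="cat@", cc="hippo@", body="at my place")

print([subject for subject, _ in emails["111"].scan()])  # ['party', 'swim']
```

The sort happens at write time, so reads return rows in clustering order regardless of insertion order.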

Page 21: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Why sorted?

SELECT * FROM emails WHERE user_id = '111' AND (subject) >= ('s') AND (subject) < ('t');

Row key: 111

party|to_add   party|cc   party|body
cat@           giraffe@   At my place

swim|to_add    swim|cc    swim|body      <- inside the slice 's' <= subject < 't'
cat@           hippo@b    in the pool

SLICE QUERIES!!
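Sorting is what makes the slice cheap: with the clustering values in order, a range is found with two binary searches and one contiguous read. A minimal sketch, assuming a hypothetical `slice_keys` helper over an already sorted list of subjects:

```python
import bisect


def slice_keys(sorted_keys, start, end):
    """Return keys k with start <= k < end using two binary searches."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, end)
    return sorted_keys[lo:hi]


subjects = ["party", "swim"]            # clustering values for row key 111
print(slice_keys(subjects, "s", "t"))   # ['swim']
```

This mirrors the query above: `subject >= 's' AND subject < 't'` selects `swim` and skips `party` without scanning the whole partition.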

Page 22: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Compound Composite Key

CREATE TABLE email_app.emails ( user_id text, subject text, to_add text, cc text, body text, PRIMARY KEY ((user_id, subject), to_add) );

ROW KEY: (user_id, subject)

CLUSTERING KEY: to_add

Page 23: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite / Compound Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'wat', '[email protected]', '[email protected]', 'is going on?');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Page 24: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

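Composite Insert 2 Result

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111:party

cat@|cc   cat@|body
hippo@    At my place

Each column name is prefixed with the clustering value (to_add).

SELECT * FROM emails WHERE user_id = '111';

This second query omits subject, which is now part of the row key, so it cannot be routed to a single partition.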
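With a compound row key, the outer hash is keyed by the pair (user_id, subject). The sketch below is a hypothetical model: the helper names are made up, and the addresses are assumed from the abbreviated forms on earlier slides.

```python
# Outer hash keyed by the full partition key: (user_id, subject).
partitions = {
    ("111", "party"): {"cat@": {"cc": "hippo@", "body": "at my place"}},
    ("111", "wat"): {"horse@": {"cc": "giraffe@", "body": "is going on?"}},
}


def select(user_id, subject):
    """Direct lookup: both parts of the partition key are known."""
    return partitions.get((user_id, subject), {})


def scan_by_user(user_id):
    """No direct lookup is possible; every partition has to be visited."""
    return {key: rows for key, rows in partitions.items() if key[0] == user_id}


print(select("111", "party"))
print(sorted(scan_by_user("111")))  # [('111', 'party'), ('111', 'wat')]
```

In a cluster those partitions may live on different nodes, which is why a query that names only user_id is not a cheap single-partition read.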

Page 25: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 1

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'all the time');

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111:party

cat@|cc    cat@|body      dog@|cc   dog@|body
giraffe@   At my place    hippo@b   all the time

Each column name is prefixed with the clustering value (to_add).

Sorting / slice on - "to_add"

Page 26: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Compound Composite Key 2

CREATE TABLE email_app.emails ( user_id text, subject text, to_add text, cc text, body text, PRIMARY KEY ((user_id, subject), to_add, cc) );

ROW KEY: (user_id, subject)

CLUSTERING KEYS: to_add, cc

Page 27: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite / Clustered Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'all the time');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'At my place');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'At my place');

Page 28: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Composite / Clustered Inserts

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111|party

cat@|hippo@|body   cat@|mouse@|body   dog@|hippo@|body
at my place        at my place        all the time

Slice on (to_add) OR (to_add, cc)

Page 29: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Sorted Hash

Row Key (user_id + subject): 111|party

Clustering Columns:

to_add: cat, dog

cc: hippo and mouse (under cat), hippo (under dog)

Column Values: body (one under each to_add / cc pair)
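With two clustering columns, the sort key is effectively the tuple (to_add, cc), so a slice can fix `to_add` alone or both columns. The following is a rough sketch with a hypothetical `slice_prefix` helper and abbreviated addresses from the result slide:

```python
import bisect

# Partition "111|party": cells keyed by (to_add, cc), sorted by that tuple.
cells = sorted([
    (("dog@", "hippo@"), "all the time"),
    (("cat@", "hippo@"), "at my place"),
    (("cat@", "mouse@"), "at my place"),
])
keys = [key for key, _ in cells]


def slice_prefix(prefix):
    """Return the contiguous run of cells whose key starts with `prefix`."""
    lo = bisect.bisect_left(keys, prefix)   # a shorter tuple sorts first
    hi = lo
    while hi < len(keys) and keys[hi][:len(prefix)] == prefix:
        hi += 1
    return cells[lo:hi]


print(slice_prefix(("cat@",)))            # both cat@ cells
print(slice_prefix(("cat@", "mouse@")))   # the single cat@ / mouse@ cell
```

Slicing on `cc` alone is not a prefix of the sort key, so the matching cells would not be contiguous; that is why the slide lists only (to_add) and (to_add, cc).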

Page 30: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Part 2 / 8 of this 7 hour talk

● Denormalization

● Index Column Families

● Cassandra Internals (memtables, SSTables, compaction, repair)

Page 31: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Part 8 / 8: The Future

● Continually improving

● More and more adoption

● Awesome projects

● http://www.datastax.com/documentation/cassandra/2.0/pdf/cassandra20.pdf

● http://planetcassandra.org/