cassandra nice use cases and worst anti patterns no sql-matters barcelona

79
@doanduyhai Cassandra nice use-cases and worst anti- patterns DuyHai DOAN, Technical Advocate

Upload: duyhai-doan

Post on 16-Apr-2017

578 views

Category:

Technology


0 download

TRANSCRIPT

@doanduyhai

Cassandra nice use-cases and worst anti-patternsDuyHai DOAN, Technical Advocate

@doanduyhai

Shameless self-promotion

2

Duy Hai DOAN Cassandra technical advocate•  talks, meetups, confs•  open-source devs (Achilles, …)•  technical point of contact

[email protected]•  production troubleshooting

@doanduyhai

Agenda

3

Anti-patterns •  Queue-like designs•  CQL null values •  Intensive updates on same column•  Design around dynamic schema

@doanduyhai

Agenda

4

Nice use-cases •  Rate-limiting•  Anti Fraud•  Account validation•  Sensor data timeseries

Data Model Crash Course

@doanduyhai

Last Write Win (LWW)

6

jdoe age name

33 John DOE

INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);

#partition

@doanduyhai

Last Write Win (LWW)

jdoe age (t1) name (t1)

33 John DOE

7

INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);

auto-generated timestamp (μs)

.

@doanduyhai

Last Write Win (LWW)

8

UPDATE users SET age = 34 WHERE login = jdoe;

jdoe age (t1) name (t1)

33 John DOE jdoe

age (t2)

34

SSTable1 SSTable2

@doanduyhai

Last Write Win (LWW)

9

DELETE age FROM users WHERE login = jdoe;

jdoe age (t3)

ý

tombstone

jdoe age (t1) name (t1)

33 John DOE jdoe

age (t2)

34

SSTable1 SSTable2 SSTable3

@doanduyhai

Last Write Win (LWW)

10

SELECT age FROM users WHERE login = jdoe;

? ? ?

SSTable1 SSTable2 SSTable3

jdoe age (t3)

ý jdoe

age (t1) name (t1)

33 John DOE jdoe

age (t2)

34

@doanduyhai

Last Write Win (LWW)

11

SELECT age FROM users WHERE login = jdoe;

✓ ✕ ✕

SSTable1 SSTable2 SSTable3

jdoe age (t3)

ý jdoe

age (t1) name (t1)

33 John DOE jdoe

age (t2)

34

@doanduyhai

Compaction

12

SSTable1 SSTable2 SSTable3

jdoe age (t3)

ý jdoe

age (t1) name (t1)

33 John DOE jdoe

age (t2)

34

New SSTable

jdoe age (t3) name (t1)

ý John DOE

@doanduyhai

Simple Table

13

CREATE TABLE users (login text,name text,age int,…PRIMARY KEY(login));

partition key (#partition)

@doanduyhai

Clustered table (1 – N)

14

CREATE TABLE sensor_data (sensor_id text,date timestamp,raw_data blob,PRIMARY KEY((sensor_id), date));

partition key clustering column (sorted)

unicity

@doanduyhai

Sorted on disk

sensor_id date1(t1) date2(t2) date3(t3) date4(t4) date5(t5) …

… … … … …

SSTable1

Worst anti-patterns

Queue-like designsCQL null

Intensive updates on same columnDesign around dynamic schema

@doanduyhai

Failure level

17

☠☠

☠☠☠

☠☠☠☠

@doanduyhai

Queue-like designs

18

Adding new message ☞ 1 physical insert

@doanduyhai

Queue-like designs

19

Adding new message ☞ 1 physical insert

Consuming message = deleting it ☞ 1 physical insert (tombstone)

@doanduyhai

Queue-like designs

20

Adding new message ☞ 1 physical insert

Consuming message = deleting it ☞ 1 physical insert (tombstone)

Transactional queue = re-inserting messages ☞ physical insert * <many>

@doanduyhai

Queue-like designs

21

A

FIFO queue

{ A }

@doanduyhai

Queue-like designs

22

A B

FIFO queue

{ A, B }

@doanduyhai

Queue-like designs

23

A B C

FIFO queue

{ A, B, C }

@doanduyhai

Queue-like designs

24

A B C A

FIFO queue

{ B, C }

@doanduyhai

Queue-like designs

25

A B C A D

FIFO queue

{ B, C, D }

@doanduyhai

Queue-like designs

26

A B C A D B

FIFO queue

{ C, D }

@doanduyhai

Queue-like designs

27

A B C A D B C

FIFO queue

{ D }

@doanduyhai

Queue-like designs

28

A A A A A A A A A A

FIFO queue, worst case

{ }

@doanduyhai

Failure level

29

☠☠☠

@doanduyhai

Queue-like designs

30

Solution: event-sourcing •  write ahead, never delete•  read = move a cursor forward (or backward in time for history)

A B C D A E

Write cursor

Read cursor. Next read will give {A, E}

@doanduyhai

CQL null semantics

31

Reading null value means•  value does not exist (has never bean created)•  value deleted (tombstone)

SELECT age FROM users WHERE login = jdoe; à NULL

@doanduyhai

CQL null semantics

32

Writing null means•  delete value (creating tombstone)•  even though it does not exist

UPDATE users SET age = NULL WHERE login = jdoe;

@doanduyhai

CQL null semantics

33

Seen in production: prepared statement

UPDATE users SET age = ?, …geo_location = ?,mood = ?, …WHERE login = ?;

@doanduyhai

CQL null semantics

34

Seen in production: bound statement

preparedStatement.bind(33, …, null, null, null, …);

null ☞ tombstone creation on each update …

jdoe age name geo_loc mood status

33 John DOE ý ý ý

@doanduyhai

Failure level

35

@doanduyhai

Intensive update

36

Context•  small start-up•  cloud-based video recording & alarm•  internet of things (sensor)•  10 updates/sec for some sensors

@doanduyhai

Intensive update on same column

37

Data model

sensor_id value

45.0034

CREATE TABLE sensor_data (sensor_id long,value double,PRIMARY KEY(sensor_id));

@doanduyhai

Intensive update on same column

38

Updates

sensor_id value (t1)

45.0034

UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …;UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …;UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;

sensor_id value (t13)

47.4182 sensor_id

value (t36)

48.0300

@doanduyhai

Intensive update on same column

39

Read

SELECT sensor_value from sensor_data WHERE sensor_id = …;

read N physical columns, only 1 useful … (until compaction)

sensor_id value (t1)

45.0034 sensor_id

value (t13)

47.4182 sensor_id

value (t36)

48.0300

@doanduyhai

Failure level

40

☠☠

@doanduyhai

Intensive update on same column

41

Solution 1: leveled compaction! (if your I/O can keep up)

sensor_id value (t1)

45.0034 sensor_id

value (t13)

47.4182 sensor_id

value (t36)

48.0300

sensor_id value (t36)

48.0300

@doanduyhai

Intensive update on same column

42

Solution 2: reversed timeseries & DateTiered compaction strategy

CREATE TABLE sensor_data (sensor_id long,date timestamp,value double,PRIMARY KEY((sensor_id), date))

WITH CLUSTERING ORDER (date DESC);

@doanduyhai

Intensive update on same column

43

Data cleaning by configuration the strategy (base_time_seconds)

SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1;

sensor_id date3(t3) date2(t2) date1(t1) ...

48.0300 47.4182 45.0034 …

@doanduyhai

Design around dynamic schema

44

Customer emergency call•  3 nodes cluster almost full•  impossible to scale out•  4th node in JOINING state for 1 week•  disk space is filling up, production at risk!

@doanduyhai

Design around dynamic schema

45

After investigation•  4th node in JOINING state because streaming is stalled•  NPE in logs

@doanduyhai

Design around dynamic schema

46

After investigation•  4th node in JOINING state because streaming is stalled•  NPE in logs

Cassandra source-code to the rescue

@doanduyhai

Design around dynamic schema

47

public class CompressedStreamReader extends StreamReader{ … @Override public SSTableWriter read(ReadableByteChannel channel) throws IOException { … Pair<String, String> kscf = Schema.instance.getCF(cfId); ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);

NPE here

@doanduyhai

Design around dynamic schema

48

The truth is•  the devs dynamically drop & recreate table every day•  dynamic schema is in the core of their design

Example:

DROP TABLE catalog_127_20140613;CREATE TABLE catalog_127_20140614( … );

@doanduyhai

Design around dynamic schema

49

Failure sequence

n1

n2

n4

n3

catalog_x_y

catalog_x_y

catalog_x_y

catalog_x_y

1 4

2

3

5

6

@doanduyhai

Design around dynamic schema

50

Failure sequence

n1

n2

n4

n3

catalog_x_y

catalog_x_y

catalog_x_y

catalog_x_y

1 4

2

3

5

6

catalog_x_z

catalog_x_z

catalog_x_z

catalog_x_z

@doanduyhai

Design around dynamic schema

51

Failure sequence

n1

n2

n4

n3

1 4

2

3

5

6

catalog_x_z

catalog_x_z

catalog_x_z

catalog_x_z

catalog_x_y ????

@doanduyhai

Design around dynamic schema

52

Nutshell•  dynamic schema change as normal prod operation is not

recommended •  schema AND topology change at the same time is an anti-pattern

@doanduyhai

Failure level

53

☠☠☠☠

Q & R

! "

Nice Examples

Rate limitingAnti Fraud

Account Validation

@doanduyhai

Rate limiting

56

Start-up company, reset password feature1) /password/reset

2) SMS with token A0F83E63DB935465CE73DFE….

Phone number Random token

3) /password/new/<token>/<password>

@doanduyhai

Rate limiting

57

Problem 1•  account created with premium phone number

@doanduyhai

Rate limiting

58

Problem 1•  account created with premium phone number•  /password/reset x 100

@doanduyhai

Rate limiting

59

« money, money, money, give money, in the richman’s world » $$$

@doanduyhai

Rate limiting

60

Problem 2•  massive hack

@doanduyhai

Rate limiting

61

Problem 2•  massive hack •  106 /password/reset calls from few accounts

@doanduyhai

Rate limiting

62

Problem 2•  massive hack •  106 /password/reset calls from few accounts•  SMS messages are cheap

@doanduyhai

Rate limiting

63

Problem 2•  ☞ but not at the 106/per user/per day scale

@doanduyhai

Rate limiting

64

Solution•  premium phone number ☞ Google libphonenumber

@doanduyhai

Rate limiting

65

Solution•  premium phone number ☞ Google libphonenumber •  massive hack ☞ rate limiting with Cassandra

@doanduyhai

Cassandra Time To Live

66

Time to live•  built-in feature•  insert data with a TTL in sec•  expires server-side automatically •  ☞ use as sliding-window

@doanduyhai

Rate limiting in action

67

Implementation•  threshold = max 3 reset password per sliding 24h per

user

@doanduyhai

Rate limiting in action

68

Implementation•  when /password/reset called•  check threshold•  reached ☞ error message/ignore

•  not reached ☞ log the attempt with TTL = 86400

Rate Limiting Demo

@doanduyhai

Anti Fraud

70

Real story•  many special offers available•  30 mins international calls (50 countries)

•  unlimited land-line calls to 5 countries •  …

@doanduyhai

Anti Fraud

71

Real story•  each offer has a duration (week/month/year)•  only one offer active at a time

@doanduyhai

Anti Fraud

72

Cassandra TTL•  when granting new offer

INSERT INTO user_special_offer(login, offer_code, …) VALUES(‘jdoe’, ’30_mins_international’,…)IF NOT EXISTS USING TTL <offer_duration>;

Anti Fraud Demo

@doanduyhai

Account Validation

74

Requirement•  user creates new account•  sends sms/email link with token to validate account•  10 days to validate

@doanduyhai

Account Validation

75

How to ?•  create account with 10 days TTL

INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33)USING TTL 864000;

@doanduyhai

Account Validation

76

How to ?•  create random token for validation with 10 days TTL

INSERT INTO account_validation(token, login, name, age) VALUES(‘A0F83E63DB935465CE73DFE…’, ‘jdoe’, ‘John DOE’, 33)USING TTL 864000;

@doanduyhai

Account Validation

77

On token validation•  check token exist & retrieve user details

SELECT login, name, age FROM account_validationWHERE token = ‘A0F83E63DB935465CE73DFE…’;

•  re-insert durably user details without TTL INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);

Q & R

! "

Thank You @doanduyhai

[email protected]

https://academy.datastax.com/