Cassandra: Nice Use Cases and Worst Anti-Patterns (NoSQL Matters Barcelona)
TRANSCRIPT
@doanduyhai
Shameless self-promotion

Duy Hai DOAN, Cassandra technical advocate
• talks, meetups, confs
• open-source dev (Achilles, …)
• technical point of contact ☞ [email protected]
• production troubleshooting
Agenda

Anti-patterns
• Queue-like designs
• CQL null values
• Intensive updates on the same column
• Design around dynamic schema
Agenda

Nice use cases
• Rate limiting
• Anti-fraud
• Account validation
• Sensor data timeseries
Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

#partition: jdoe | age = 33 | name = John DOE
Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

jdoe | age (t1) = 33 | name (t1) = John DOE

t1 = auto-generated timestamp (μs)
Last Write Wins (LWW)

UPDATE users SET age = 34 WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
Last Write Wins (LWW)

DELETE age FROM users WHERE login = 'jdoe';

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)
Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

Which cell should be returned? All three SSTables hold a value for age:

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)
Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

The cell with the highest timestamp wins, here the tombstone at t3:

SSTable1 ✕: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2 ✕: jdoe | age (t2) = 34
SSTable3 ✓: jdoe | age (t3) = ✕ (tombstone)
Compaction

SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)

Compaction merges them into a new SSTable, keeping only the winning cell per column:

New SSTable: jdoe | age (t3) = ✕ | name (t1) = John DOE
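The merge rule above can be sketched in a few lines of Python. This is an illustration of Last-Write-Wins only, not Cassandra code: SSTables are modeled as dicts of `column -> (timestamp, value)` and a tombstone as a sentinel value.

```python
# Sketch of Cassandra's LWW resolution: per column, the cell with the
# highest write timestamp wins; a tombstone wins like any other cell
# and hides the older values it shadows.

TOMBSTONE = object()  # marker standing in for a deleted cell

def merge_cells(*sstables):
    """Merge {column: (timestamp, value)} maps from several SSTables,
    keeping the newest cell per column, then dropping tombstones
    (as compaction eventually does)."""
    merged = {}
    for sstable in sstables:
        for col, (ts, value) in sstable.items():
            if col not in merged or ts > merged[col][0]:
                merged[col] = (ts, value)
    return {c: v for c, (ts, v) in merged.items() if v is not TOMBSTONE}

# The jdoe example from the slides:
sstable1 = {"age": (1, 33), "name": (1, "John DOE")}   # INSERT at t1
sstable2 = {"age": (2, 34)}                             # UPDATE at t2
sstable3 = {"age": (3, TOMBSTONE)}                      # DELETE at t3

print(merge_cells(sstable1, sstable2, sstable3))  # {'name': 'John DOE'}
```

Running the merge with only SSTable1 and SSTable2 would instead return age = 34, matching the UPDATE scenario two slides back.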
Simple Table

CREATE TABLE users (
  login text,
  name text,
  age int,
  …
  PRIMARY KEY(login)
);

login = partition key (#partition)
Clustered table (1 – N)

CREATE TABLE sensor_data (
  sensor_id text,
  date timestamp,
  raw_data blob,
  PRIMARY KEY((sensor_id), date)
);

sensor_id = partition key, date = clustering column (sorted); together they guarantee uniqueness
Sorted on disk

SSTable1: sensor_id → date1(t1) | date2(t2) | date3(t3) | date4(t4) | date5(t5) | …
Worst anti-patterns

• Queue-like designs
• CQL null
• Intensive updates on the same column
• Design around dynamic schema
Queue-like designs

Adding a new message ☞ 1 physical insert
Consuming a message = deleting it ☞ 1 physical insert (tombstone)
Queue-like designs

Adding a new message ☞ 1 physical insert
Consuming a message = deleting it ☞ 1 physical insert (tombstone)
Transactional queue = re-inserting messages ☞ <many> physical inserts
Queue-like designs

Solution: event sourcing
• write ahead, never delete
• read = move a cursor forward (or backward in time for history)

Log: [A, B, C, D, A, E], write cursor at the end, read cursor just after D.
Next read will give {A, E}
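The event-sourcing alternative sketched above fits in a few lines. This is a minimal in-memory sketch of the idea, not a Cassandra data model: messages are only appended, and "consuming" just advances a cursor, so no tombstone is ever written.

```python
# Event-sourcing sketch: an append-only log plus a read cursor.
# Deleting consumed messages (and the tombstones that come with it)
# is replaced by moving the cursor forward.

class EventLog:
    def __init__(self):
        self.events = []      # append-only log; the write cursor is len(events)
        self.read_cursor = 0  # index of the next event to consume

    def append(self, event):
        self.events.append(event)

    def consume(self, n):
        """Return up to n events past the read cursor, then advance it."""
        batch = self.events[self.read_cursor:self.read_cursor + n]
        self.read_cursor += len(batch)
        return batch

log = EventLog()
for e in ["A", "B", "C", "D", "A", "E"]:
    log.append(e)
log.consume(4)         # consumes A, B, C, D
print(log.consume(2))  # ['A', 'E'] -- the slide's "next read"
```

In Cassandra this maps naturally to a clustered table ordered by insertion time, with the cursor position stored per consumer.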
CQL null semantics

Reading a null value means either
• the value does not exist (has never been created), or
• the value was deleted (tombstone)

SELECT age FROM users WHERE login = 'jdoe'; → NULL
CQL null semantics

Writing null means
• delete the value (creating a tombstone)
• even though it does not exist

UPDATE users SET age = NULL WHERE login = 'jdoe';
CQL null semantics

Seen in production: prepared statement

UPDATE users SET age = ?, …
  geo_location = ?,
  mood = ?, …
WHERE login = ?;
CQL null semantics

Seen in production: bound statement

preparedStatement.bind(33, …, null, null, null, …);

null ☞ tombstone creation on each update …

jdoe | age = 33 | name = John DOE | geo_loc = ✕ | mood = ✕ | status = ✕
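One way out of this trap is to never bind a null at all: build the SET clause only from the columns that actually have a value. The sketch below just generates CQL text (it does not talk to a cluster), and the table and column names simply follow the slides.

```python
# Sketch: instead of one fixed prepared UPDATE that always binds every
# column (turning each missing value into a tombstone), generate the
# statement from the non-None columns only.

def build_update(table, key_col, key, values):
    """Return (cql, params) updating only the columns with a value."""
    present = {c: v for c, v in values.items() if v is not None}
    if not present:
        return None, ()  # nothing to write, and no tombstones either
    set_clause = ", ".join(f"{c} = ?" for c in present)
    cql = f"UPDATE {table} SET {set_clause} WHERE {key_col} = ?"
    return cql, (*present.values(), key)

cql, params = build_update(
    "users", "login", "jdoe",
    {"age": 33, "geo_location": None, "mood": None, "status": None},
)
print(cql)     # UPDATE users SET age = ? WHERE login = ?
print(params)  # (33, 'jdoe')
```

Note that since Cassandra 2.2 (native protocol v4) drivers can also leave a bind marker "unset", which skips the column without writing a tombstone; on older versions, generating the statement as above is the safe route.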
Intensive update

Context
• small start-up
• cloud-based video recording & alarm
• internet of things (sensors)
• 10 updates/sec for some sensors
Intensive update on same column

Data model

CREATE TABLE sensor_data (
  sensor_id bigint,
  value double,
  PRIMARY KEY(sensor_id)
);

sensor_id → value = 45.0034
Intensive update on same column

Updates

UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …;
UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …;
UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;

sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
Intensive update on same column

Read

SELECT value FROM sensor_data WHERE sensor_id = …;

Reads N physical columns, only 1 of them useful … (until compaction)

sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
Intensive update on same column

Solution 1: leveled compaction! (if your I/O can keep up)

sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
        ↓ compacted to
sensor_id | value (t36) = 48.0300
Intensive update on same column

Solution 2: reversed timeseries & DateTiered compaction strategy

CREATE TABLE sensor_data (
  sensor_id bigint,
  date timestamp,
  value double,
  PRIMARY KEY((sensor_id), date)
) WITH CLUSTERING ORDER BY (date DESC);
Intensive update on same column

Data cleaning by configuring the strategy (base_time_seconds)

SELECT value FROM sensor_data WHERE sensor_id = … LIMIT 1;

sensor_id → date3(t3) = 48.0300 | date2(t2) = 47.4182 | date1(t1) = 45.0034 | …
Design around dynamic schema

Customer emergency call
• 3-node cluster almost full
• impossible to scale out
• 4th node in JOINING state for 1 week
• disk space is filling up, production at risk!
Design around dynamic schema

After investigation
• 4th node stuck in JOINING state because streaming is stalled
• NPE in the logs
Design around dynamic schema

Cassandra source code to the rescue:
public class CompressedStreamReader extends StreamReader {
    …
    @Override
    public SSTableWriter read(ReadableByteChannel channel) throws IOException {
        …
        Pair<String, String> kscf = Schema.instance.getCF(cfId);
        // NPE here ↓
        ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);
Design around dynamic schema

The truth is
• the devs dynamically drop & recreate tables every day
• dynamic schema is at the core of their design

Example:

DROP TABLE catalog_127_20140613;
CREATE TABLE catalog_127_20140614( … );
Design around dynamic schema

Failure sequence (diagrams, slides 49-51): a 4-node cluster (n1-n4) where every node holds table catalog_x_y. While the 4th node is still joining, the application drops catalog_x_y and creates catalog_x_z on all nodes. The joining node then tries to stream data for catalog_x_y, whose schema no longer exists anywhere (catalog_x_y ????), hence the NPE.
Design around dynamic schema

In a nutshell
• dynamic schema change as a normal production operation is not recommended
• changing schema AND topology at the same time is an anti-pattern
Rate limiting

Start-up company, reset-password feature:
1) /password/reset
2) SMS with token A0F83E63DB935465CE73DFE… (phone number → random token)
3) /password/new/<token>/<password>
Rate limiting

Problem 1
• account created with a premium phone number
• /password/reset × 100
Rate limiting

Problem 2
• massive hack
• 10⁶ /password/reset calls from a few accounts
• SMS messages are cheap
Rate limiting

Solution
• premium phone number ☞ Google libphonenumber
• massive hack ☞ rate limiting with Cassandra
Cassandra Time To Live

Time to live
• built-in feature
• insert data with a TTL in seconds
• expires server-side automatically
• ☞ use as a sliding window
Rate limiting in action

Implementation
• threshold = max 3 password resets per sliding 24 h per user
Rate limiting in action

Implementation
• when /password/reset is called, check the threshold
• reached ☞ error message / ignore
• not reached ☞ log the attempt with TTL = 86400
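The steps above can be sketched as follows. This is an in-memory sketch of the sliding-window idea, with the server-side expiry that Cassandra's `USING TTL 86400` provides simulated by expiry timestamps; class and method names are made up for the illustration.

```python
# TTL-based rate limiter sketch: each reset attempt is stored with an
# expiry; counting the not-yet-expired attempts gives the sliding
# 24 h window. In Cassandra, expired attempts vanish by themselves.

import time

class ResetRateLimiter:
    def __init__(self, threshold=3, ttl_seconds=86400):
        self.threshold = threshold
        self.ttl = ttl_seconds
        self.attempts = {}  # login -> list of expiry timestamps

    def try_reset(self, login, now=None):
        now = time.time() if now is None else now
        # keep only attempts whose TTL has not expired yet
        live = [t for t in self.attempts.get(login, []) if t > now]
        if len(live) >= self.threshold:
            return False                # threshold reached: reject / ignore
        live.append(now + self.ttl)     # log the attempt with its TTL
        self.attempts[login] = live
        return True

limiter = ResetRateLimiter()
print([limiter.try_reset("jdoe", now=0) for _ in range(4)])
# [True, True, True, False]
print(limiter.try_reset("jdoe", now=90000))  # True: old attempts expired
```

The Cassandra version writes one row per attempt with `USING TTL 86400` and checks the threshold with a `SELECT count(*)` on the user's partition.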
Anti-fraud

Real story
• many special offers available
• 30 min international calls (50 countries)
• unlimited land-line calls to 5 countries
• …
Anti-fraud

Real story
• each offer has a duration (week/month/year)
• only one offer active at a time
Anti-fraud

Cassandra TTL: when granting a new offer

INSERT INTO user_special_offer(login, offer_code, …)
VALUES('jdoe', '30_mins_international', …)
IF NOT EXISTS USING TTL <offer_duration>;
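The combination of `IF NOT EXISTS` and a TTL is what enforces "only one offer at a time": the insert succeeds only while no unexpired offer row exists. The sketch below simulates those semantics in memory (the class and its methods are made up for the illustration; Cassandra handles both the conditional insert and the expiry server-side).

```python
# Sketch of INSERT ... IF NOT EXISTS USING TTL semantics for offers:
# granting succeeds only if no non-expired offer is present, and the
# granted offer expires on its own after its duration.

class OfferStore:
    def __init__(self):
        self.offers = {}  # login -> (offer_code, expiry timestamp)

    def grant(self, login, offer_code, duration, now):
        current = self.offers.get(login)
        if current and current[1] > now:
            return False  # IF NOT EXISTS fails: an offer is still active
        self.offers[login] = (offer_code, now + duration)
        return True

store = OfferStore()
WEEK, MONTH = 7 * 86400, 30 * 86400
print(store.grant("jdoe", "30_mins_international", WEEK, now=0))   # True
print(store.grant("jdoe", "unlimited_landline", MONTH, now=3600))  # False
print(store.grant("jdoe", "unlimited_landline", MONTH, now=8 * 86400))  # True
```

The second grant is refused because the week-long offer is still live; the third succeeds because by day 8 it has expired, exactly as the TTL row would have in Cassandra.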
Account Validation

Requirement
• user creates a new account
• send an SMS/email link with a token to validate the account
• 10 days to validate
Account Validation

How to?
• create the account with a 10-day TTL

INSERT INTO users(login, name, age)
VALUES('jdoe', 'John DOE', 33)
USING TTL 864000;
Account Validation

How to?
• create a random token for validation with a 10-day TTL

INSERT INTO account_validation(token, login, name, age)
VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33)
USING TTL 864000;
Account Validation

On token validation
• check the token exists & retrieve the user details

SELECT login, name, age FROM account_validation
WHERE token = 'A0F83E63DB935465CE73DFE…';

• re-insert the user details durably, without TTL

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
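The whole validation flow can be sketched end to end. This is an in-memory sketch under the slides' assumptions (expiry timestamps stand in for Cassandra's server-side TTL, and the class, methods, and token string are made up for the illustration): both rows start with a 10-day TTL, and validating the token re-inserts the user without a TTL, making the account permanent.

```python
# Account-validation sketch: user row and token row both expire after
# 10 days; a successful validation re-inserts the user with no TTL.

TEN_DAYS = 864000  # 10 days in seconds, matching USING TTL 864000

class AccountStore:
    def __init__(self):
        self.users = {}   # login -> (details, expiry or None = permanent)
        self.tokens = {}  # token -> (login, details, expiry)

    def create_account(self, token, login, details, now):
        self.users[login] = (details, now + TEN_DAYS)
        self.tokens[token] = (login, details, now + TEN_DAYS)

    def validate(self, token, now):
        entry = self.tokens.get(token)
        if entry is None or entry[2] <= now:
            return False                    # token missing or expired
        login, details, _ = entry
        self.users[login] = (details, None)  # re-insert without TTL
        return True

store = AccountStore()
store.create_account("token-123", "jdoe", {"name": "John DOE", "age": 33}, now=0)
print(store.validate("token-123", now=86400))  # True: validated on day 1
print(store.users["jdoe"][1])                  # None: account now permanent
```

If the user never clicks the link, both rows simply disappear after 10 days; no cleanup job is needed, which is the whole point of the TTL-based design.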