cassandra training

By:András Fehér Certified Cassandra [email protected]

AGENDA

Agenda

● Only the basics

● Topics that are not covered today:• Performance tuning• Security with integrated authentication, encryption, auditing (see docs)• Data compression (see docs)• Backups: snapshot, incremental, point-in-time recovery (see docs)• Amazon’s Dynamo and Google’s BigTable concepts (see the Dynamo and BigTable papers)• Benefits of the enterprise version• Full text searching• Using Cassandra with other DBMS, Big Data tools• Choosing the appropriate hardware• …..

https://docs.datastax.com/en/cql/3.3/cql/cql_using/useSecureTOC.html

http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAboutConfigCompress.html

http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupRestore.html

http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf

RECOMMENDED RESOURCES

Overview

DataStax docsFree Self-Paced Video Curses Cassandra: The Definitive Guide, 2nd Edition

Comparing Oracle with Cassandra / DataStax EnterpriseHow To Run a Multi-Node Cluster Database with Cassandra on Ubuntu 14.04

Patrick McFadin’s video about most common CQL data modeling use cases

https://docs.datastax.com/en/landing_page/doc/landing_page/current.html

https://academy.datastax.com/courses

http://shop.oreilly.com/product/0636920043041.do

http://www.datastax.com/wp-content/uploads/2013/11/WP-DataStax-Oracle.pdf

https://www.digitalocean.com/community/tutorials/how-to-run-a-multi-node-cluster-database-with-cassandra-on-ubuntu-14-04

https://youtu.be/g5zfpUylx4w

REQUIREMENTS

Challenges that database management systems supporting websites must meet today• Need to be in more than one datacenter, active-active

• Scaling from 0 to ...

• Process incoming data from millions of Web and mobile users around the globe

• Extreme read performance

• Near 100% uptime - maintenace/version upgrade doesn’t count!

• Should not cost mega $ ...

Overview

WHAT IS CASSANDRA?● open source● distributed● decentralized: “shared nothing” architecture● elastically scalable: linear scale performance● highly available● fault tolerant: multi-data center replication● tuneably consistent: AP from the CAP theorem● flexible, dynamic schema data modeling: supports structured, semi-structured

and unstructured data● guaranteed data safety: commit log design● powerful query language (CQL): subset of SQL● active everywhere design

Overview

CASSANDRA ADOPTION

Overview

Source: http://db-engines.com/en/ranking

http://db-engines.com/en/ranking

PROVENCompanies that have large, active data sets:

● Constant Contact● CERN● Comcast● eBay● GitHub● GoDaddy● Hulu● Instagram● Intuit● Netflix● Reddit ● The Weather Channel● ... over 1500 more

Some of the largest production deployments:● Apple: over 75,000 nodes storing over 10 PB of data● Netflix: 2,500 nodes, 420 TB, over 1 trillion requests per day● Chinese search engine Easou (270 nodes, 300 TB, over 800 million requests per day)● eBay:over 100 nodes, 250 TB

Overview

COMPETITORS - SPEED BENCHMARK

Overview

ARCHITECTURE

• Basic elements

• Replication

• Communication between the nodes

• Single data center – Multi data center topologies

• Tunable consistency

• Node recovery

• Read / Write process on a node

Architecture

CASSANDRA OBJECTS VS. RDBMS

Data Modeling

● Keyspace: ○ can be compared to an Oracle Schema or a SQL Server database○ defines replication of data

● Table: ○ same concept○ high number of columns○ column level reads

● Column: can be loosely compared to a relational table column● Primary Key: same concept● Partition Key: identifies which node in the cluster will store the row, must be the first part

of the primary key● Clustering Key: orders rows based on the column’s value● Index: secondary indexes only● Partitioner: hashing algorithm that generates a hash value token from the partition key

THE TOKEN RING

Architecture

● Cluster: a ring of Cassandra nodes● Node: a Cassandra instance, responsible for

its token ranges● Tokens: hash values assigned to partitions

REPLICATION

Architecture

Replication factor: • total number of copies• automatically maintained

even when nodes are removed, added or go down

• Primary range: range that the node is responsible for

• Secondary range: replica stored on the node

VIRTUAL NODES

Architecture

GOSSIP PROTOCOL

Architecture

• For avoiding routing to nodes that are down• Each node initiates a gossip round every few seconds• Picks one to three nodes to gossip with• Nodes can gossip with ANY other node in the cluster• Probabilistically (slightly favour) seed and downed

nodes• Nodes do not track which nodes they gossiped with

prior• Reliably and efficiently spreads node metadata

through the cluster• Fault tolerant – continues to spread when nodes fail

SINGLE DATA CENTER

Architecture

• Replication strategy: SimpleStrategy• Clients can read from or write to any node ->

coordinator• Any node can be coordinator -> masterless • Data read or replicated in parallel• RF=3 in this example

MULTIPLE DATA CENTER SUPPORT

Architecture

Implementation• Replication strategy: NetworkTopologyStrategy• Client writes local• Data syncs across WAN• Replication Factor per Data Center (DC)• Different number of nodes per data center

Reasons for multiple data centers• Maximum chance for uptime• Reduce latency by bringing the data closer to the user• Workload isolation (onlyine – analytics, reporting)

TUNABLE CONSISTENCY

Architecture

• Consistency Level (CL)• Client specifies per operation• Handles multi-data center operations

Most commonly used consistency levels:• ALL = all replicas must acknowledge, erases

fault tolerance• QUORUM => 51% of replicas must acknowlege,

strong consistency but performance penalty• LOCAL_QUORUM => 51% of replicas in local DC

must acknowlege, strong consistency but only in local DC

• ONE: only one replica must acknowlege, good for log data

Plus a lot more... (see docs)

CONSISTENCY LEVEL: QUORUM

Architecture

• Recommended for production• Good balance between performance and

consistency• Reads allways return the latest version of

the data

NODE FAILURE

Architecture

• Replication Factor + Consistency Level = Success• This example:• RF = 3• CL = QUORUM

NODE RECOVERY – HINTED HANDOFF

Architecture

• Coordinator stores the hint locally (3 hours by default)• When node recovers, the coordinator replays the missed writes• Hint does not count the consistency level• If node is unavailable for more than 3 hours you must run

repair manually• Consistency level of ANY means

storing a hint suffices• Consistency level of ONE means

at least one replica must successfully write, hint does not suffice

FAST WRITE AT NODE

Architecture

• Write acknowledged after appending to commit log

• Data is written to both memory (Memtable) and commit log

• Memtable is flushed to disk SSTable• SSTables are compacted to eliminate

obsolate data

READING DATA FROM A NODE

Architecture

Bloom filter: it is absolutely not in the SSTable or could be? False positive is very rare.

GRAPHICAL INTERFACES

Operation

Datastax DevCenter Datastax OpsCenter

DRIVERS

Operation

Commercially and community supported : • Java• Python• Ruby• C# / .NET• Node.js• PHP• C++• Apache Spark• Clojure• Erlang• Go• Haskell• Rust• Scala

DATA STORAGE INTERNALS

• Row – Partition• Table• Partition key and primary key

Data storage internals

ROW (PARTITION)

• Partition is the smallest atomic storage; 1 partition - 1 node• If where is not on partition key, all nodes would have to be searched• Hashtables have constant time lookup


CREATE TABLE test.videos ( id int PRIMARY KEY, name text, runtime int, year int)

In this case partition key = primary key

TABLES


A table is just an artificial construct generated for the users from hash of hashes

COMPOSITE PARTITION KEYS


Primary key = partition key

CLUSTERING COLUMNS


• Group by and order by altogether• RDBMS primary key: uniqueness• Cassandra primary key: we would like to query on AND uniqueness

Sorted in ascending order (default)

Partition key

Clustering column

KEY (PARTITION KEY)


CREATE TABLE discography ( title text, year int, genre text, performer text, tracks map<int, text>, PRIMARY KEY ((title, year)))

CREATE TABLE beatles_discography ( disc_title text, year int, track_no int, title text, PRIMARY KEY ((title, year), track_no))

Partition key

Partition key

Clustering column

CASSANDRA QUERY LANGUAGE (CQL)

Cassandra Query Language (CQL)

• Basic syntax• Data types• Tuples and user-defined types• Batch• Transactions• Scripts• Indexing• Materialized views

CQL BASICS


create keyspace league WITH replication = {'class': 'SimpleStrategy', replication_factor':1};

use league;

create table teams (team_name varchar,player_name varchar,jersey int,primary key((team_name), player_name)

);

insert into teams (team_name, player_name, jersey) values ('Mighty Mutts', 'Felix',90);

select * from teams;

• Subset of SQL e.g. SELECT * FROM users• No JOINs!• Usual statements: CREATE / DROP / ALTER TABLE / SELECT

DATA TYPES


CQL Type Constants Descriptionascii strings US-ASCII character stringbigint integers 64-bit signed longblob blobs Arbitrary bytes (no validation), expressed as hexadecimalboolean booleans true or falsecounter integers Distributed counter value (64-bit long)decimal integers, floats Variable-precision decimaldouble integers 64-bit IEEE-754 floating pointfloat integers, floats 32-bit IEEE-754 floating pointinet strings IP address string in IPv4 or IPv6 format*int integers 32-bit signed integerlist n/a A collection of one or more ordered elementsmap n/a A JSON-style array of literals: { literal : literal, literal : literal ... }set n/a A collection of one or more elementstext strings UTF-8 encoded stringtimestamp integers, strings Date plus time, encoded as 8 bytes since epochuuid uuids A UUID in standard UUID formattimeuuid uuids Type 1 UUID only (CQL 3)varchar strings UTF-8 encoded stringvarint integers Arbitrary-precision integer

Collection types

Distributed increment /decrement

Distributed uniqueids

+ tuples and user defined types (UDT)

http://en.wikipedia.org/wiki/Universally_unique_identifier

UUID AND TIMEUUID


128 bit number to Replace integer Ids because on distributed systems it is nearly impossible to generate correct sequence.

Universally Unique Identifier (UUID):• Example: 99051fe9-6a9c-46c2-b949-38ef78858dd0

Time UUID:• Timestamp value + MAC address• Sortable• Example: a9af60b0-58a5-11e6-8f4b-1f871fe45bc4

COUNTERS


• Counter data type should be used for counting because int data type has concurrency issues Can be incremented/decremented, but cannot directly assign them.

• It must be the only non-partitioning column inside of a table.

• Can be multiple counter columns in the table. In this case all columns must be counter type except the primary key.

• Not always 100% accurate

COLLECTION COLUMNS


• Multi valued columns• Designed to store small amount of data• Retrieved in its entirety• Cannot nest a collection inside another collection

• Types• Set: unique values, ordered by values• List: non-unique values, ordered by position• Map: key-value pairs, ordered by unique keys

SET EXAMPLES


CREATE TABLE users ( user_id text PRIMARY KEY, first_name text, last_name text, emails set<text> );

INSERT INTO users (user_id, first_name, last_name, emails) VALUES('frodo', 'Frodo', 'Baggins', {'[email protected]', '[email protected]'});

UPDATE users SET emails = emails + {'[email protected]'} WHERE user_id = 'frodo';

UPDATE users SET emails = {} WHERE user_id = 'frodo'; DELETE emails FROM users WHERE user_id = 'frodo';

Observe the syntax

Add new element to a set:

Remove all elements:

LIST EXAMPLES


CREATE TABLE users ( user_id text PRIMARY KEY, top_places list<text> );

UPDATE users SET top_places = [ 'rivendell', 'rohan' ] WHERE user_id = 'frodo';

UPDATE users SET top_places = [ 'the shire' ] + top_places WHERE user_id = 'frodo';

UPDATE users SET top_places = top_places + [ 'mordor' ] WHERE user_id = 'frodo';

UPDATE users SET top_places[2] = 'riddermark' WHERE user_id = 'frodo';

DELETE top_places[3] FROM users WHERE user_id = 'frodo';

UPDATE users SET top_places = top_places - ['riddermark'] WHERE user_id = 'frodo';

Set multiple values:

Prepend an element:

Append an element:

Set an element to a particular position:

Delete an element at a particular position:

Delete all elements having a particular value:

MAP EXAMPLES


CREATE TABLE users ( user_id text PRIMARY KEY, todo map<timestamp, text>);

Set map data:

Set a specific element:

UPDATE users SET todo = { '2012-9-24' : 'enter mordor', '2012-10-2 12:00' : 'throw ring into mount doom' } WHERE user_id = 'frodo';

UPDATE users SET todo['2012-10-2 12:00'] = 'throw my precious into mount doom' WHERE user_id = 'frodo';

Delete an element from the map

DELETE todo['2012-9-24'] FROM users WHERE user_id = 'frodo';

Compute the TTL to use to expire todo list element on the day of the timestamp, and set the elements to expire

UPDATE users USING TTL <computed_ttl> SET todo['2012-10-1'] = 'find water' WHERE user_id = 'frodo';

TUPLES AND USER-DEFINDED TYPES


User-defined type: attach multiple data fields, each named and typed, to a single column, cannot be nested

Tuple: holds fixed-length sets of typed positional fields, can be nested

CREATE TABLE collect_things ( k int PRIMARY KEY, v tuple<int, text, float> );

INSERT INTO collect_things (k, v) VALUES(0, (3, 'bar', 2.1))

CREATE TYPE basic_info ( birthday timestamp, nationality text, weight text, height text );

CREATE TABLE users ( id uuid PRIMARY KEY, lastname text, basics FROZEN<basic_info>);

FROZEN: • treated like a blob• after Cassandra version 3.6 not required if the

UDT contains only non-collection fields

When is UDT a good choice?• Reuse the structure in multiple tables• Nesting is not necessary• Update fields individually if not frozen

BATCH STATEMENT


• Use it for keeping table in sync, NOT for performance.• Performance penalty

BEGIN BATCH;UPDATE users SET name=’Jim’ where id = 1;UPDATE users_by_ssn set name=’Jim’ where ssn=’888–99–9999';APPLY BATCH;

Good example:

Bad example:

BEGIN BATCH;INSERT INTO tester.users (userID, firstName, lastName) VALUES (1, 'Jim', 'James')INSERT INTO tester.users (userID, firstName, lastName) VALUES (2, 'Ernie', 'Orosco')INSERT INTO tester.users (userID, firstName, lastName) VALUES (3, 'Jose', 'Garza')INSERT INTO tester.users (userID, firstName, lastName) VALUES (4, 'Sammy', 'Mason')INSERT INTO tester.users (userID, firstName, lastName) VALUES (5, 'Larry', 'Bird')INSERT INTO tester.users (userID, firstName, lastName) VALUES (6, 'Jim', 'Smith')APPLY BATCH;

IMPORTING AND EXPORTING DATA


users.csv

cqlsh -k killrvideo -e 'copy videos(video_id, added_date, description, title, user_id) to STDOUT' | head -n -1 | cqlsh -k mykspc -e 'copy videos(video_id, added_date, description, title, user_id) from STDIN'

Copying data from one table to another:

EXECUTING SCRIPTS


Execute from cqlsh:

/home/datastax/script.cql

Execute as command line parameter:

LIGHTWEIGHT TRANSACTIONS


• Not the RDBMS concept!• Also known as Compare and Set (CAS)• Normally if the row exists, it is updated („UPSERT”)• Executed only if the „IF” condition is met at INSERT or UPDATE

operation• Use only if necessary, because it is slow!

INSERT INTO cyclist_name (id, lastname, firstname) VALUES (4647f6d3-7bd2-4085-8d6c-1229351b5498, 'KNETEMANN', 'Roxxane') IF NOT EXISTS;

UPDATE cycling.cyclist_name SET firstname = ‘Roxane’ WHERE id = 4647f6d3-7bd2-4085-8d6c-1229351b5498 IF firstname = ‘Roxxane’;

MATERIALIZED VIEWS


• Remember: joins are not supported• Cannot add any static columns from the source table• Columns of source table primary key must be part of the materialized view’s

primary key• Only one new column can be added to the primary key

CREATE MATERIALIZED VIEW cyclist_by_age AS SELECT age, birthday, name, country FROM cyclist_mv WHERE age IS NOT NULL AND cid IS NOT NULL PRIMARY KEY (age, cid);

INDEXING (SECONDARY INDEXES)


• Provides means to access data using attributes other than the partition key• Can be used for any column except for counter columns• Hidden table that contains only the indexed values • SSTable Attached Secondary Index (SASI): improved the performance drastically from Cassandra version 3.4• Do NOT use index:▪ High-cardinality coulmns▪ In tables that use counter column▪ On frequently updated or deleted column▪ To look for a row in a large partition unless narrowly queried

INDEXING AND QUERYING EXAMPLES

Cassandra Query Language (CQL

CREATE INDEX ryear ON rank_by_year_and_name (race_year);

CREATE INDEX team_idx ON cyclist_career_teams ( teams ); SELECT * FROM cycling.cyclist_career_teams WHERE teams CONTAINS 'Nederland bloeit';

CREATE INDEX team_year_idx ON cyclist_teams ( KEYS (teams) ); SELECT * From cycling.cyclist_teams WHERE teams CONTAINS KEY 2015;

CREATE INDEX blist_idx ON cycling.birthday_list (ENTRIES(blist)); SELECT * FROM cycling.birthday_list WHERE blist['age'] = '23';

CREATE INDEX rnumbers_idx ON cycling.race_starts (FULL(rnumbers)); SELECT * FROM cycling.race_starts WHERE rnumbers = [39,7,14];

Simple fields

Set and list collections

Map keys

Map entries

CREATE INDEX blist_idx ON cycling.birthday_list (VALUES(blist)); SELECT * FROM cycling.birthday_list CONTAINS 'NETHERLANDS';

Map values

Full content of a frozen map

LABWORK


• CQLSH: Command line interface to execute CQL statements• Create the simplest contact application keyspace and a table• Virtual Machine username and password: datastax/datastax

LABWORK CONTINUATION

In terminal start the CQL shell:


Get the list of existing keyspaces:

If contact_app keyspace exists, drop it:

Create the keyspace contact_app:

cqlsh

describe keyspaces;

drop keyspace if exists contact_app;

create keyspace contact_app WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

Check the created keyspace:

describe keyspace contact_app ;


Make the contact_app keyspace default for all operations:


Create the table contacts, name primary key, phone numbers set of varchar:

Get the list of existing tables

Check the created table:

use contact_app;

create table contacts(name varchar primary key, phone_numbers set<varchar>);

describe tables;

describe table contacts;

Insert a record into the table (you can enter any values for the name and phone fields):

insert into contacts(name, phone_numbers) values ('andras', {'239847','3249872398','348576'});


Check the content of the table


Add a phone number to the set:


Remove any existing phone number

select * from contacts;

update contacts set phone_numbers = phone_numbers + {'8475847587'} where name = 'andras';


update contacts set phone_numbers = phone_numbers - {'8475847587'} where name = 'andras';




Remove all phone numbers:


Check the content of the table:

update contacts set phone_numbers = {} where name = 'andras';


Drop the table:

drop table contacts;

DATA MODELING

Data Modeling

• Principles

• Examples

PRINCIPLES

Data Modeling

There is no...:• Foreign keys• Joins• Referential integrity• ACID transactions

But we have:• Wide rows• Collection columns• AID transactions (Consistency is missing as there is no foreign key or referencial integrity, instead we

have tunable consistency)

Do not model in entity-relationship-attribute fashion (RDBMS).

Understand the questions you will need to ask the database

SHOPPING CART

Data Modeling

• Each customer can have or more shopping carts• De-normalize data for fast access• Shopping cart == One partition (Row Level Isolation)• Each item is a new column

CREATE TABLE user {username varchar,firstname varchar,lastname varchar,shopping_carts set<varchar>,PRIMARY KEY (username)

};

CREATE TABLE shopping_cart {username varchar,cart_name text,item_id int,item_name varchar,price float,item_detail map<varchar, varchar>PRIMARY KEY ((username, cart_name),

item_id)};

USER ACTIVITY

Data Modeling

• React to user input in real time• Scale for speed• Store history for batch processing later

CREATE TABLE user_activity {username varchar,interaction_time timeuuid,activity_code varchar,detail varchar,PRIMARY KEY (username, interaction time)

} WITH CLUSTERING ORDER BY (interaction_time DESC);

CREATE TABLE user_activity_history {username varchar,interaction_date varchar,interaction_time timeuuid,activity_code varchar,detail varchar,PRIMARY KEY ((username, interaction date), interaction

time);

INSERT INTO user_activity (..) VALUES (...) USING TTL 2592000

Reverse based on timestamp

Delete after 30 day

LOG COLLECTION

Data Modeling

• Collect log data at high speed• Dice data for various uses. Dashboard, lookup, etc.• Batch analysis of logs is too slow in some cases• Flume: distributed service to collect and fan out data

CREATE TABLE log_lookup {source varchar,date_to_minute varchar,timestamp timeuuid,raw_log blob,PRIMARY KEY ((source, date_to_minute),

timestamp)};

Common practice to GZIP

raw logs

CREATE TABLE login_failure {source varchar,date_to_minute varchar,successful_logins counter,PRIMARY KEY ((source), date_to_minute)

} WITH CLUSTERING ORDER BY (date_to_minute DESC);

CREATE TABLE login_success {source varchar,date_to_minute varchar,successful_logins counter,PRIMARY KEY ((source), date_to_minute)

} WITH CLUSTERING ORDER BY (date_to_minute DESC);

see definition in notes

LOG DASHBOARD

Data Modeling

SELECT date_to_minute, successful_loginsFROM login_successLIMIT 20;

SELECT date_to_minute, failed_loginsFROM login_failureLIMIT 20;

REAL TIME DATA SOURCE FOR INTERNET BANK

Cassandra Data Modeling Principles

• Real use case• Several TBs of historical transaction data• Search the top N transactions within given date interval in time real time:

• for a specific account• for a specific user

acc_id tran_date amount tran_id

acc001 2016-09-01 100 Tr301

acc001 2016-08-16 200 Tr201

acc002 2016-09-21 60 Tr101

acc003 2016-09-16 300 Tr001

For a specific user by account number in ascending and transaction date in descending order

tran_date amount tran_id

2016-09-21 100 tr301

2016-09-16 200 tr201

2016-09-01 60 tr101

2016-08-16 300 tr001

For a specific account order by transaction date in descending order (=newest first)

CLOSING

• Overview of the homework• Questions

DATABASE FOR A SIMPLE RECOMMENDATION ENGINE

Homework

See Amazon’s website...The database has to support recommendations based on • previous purchases• sponsored products

The homework must contain the CQL script that:• creates the „recommendation” keyspace• the table(s) that contain description of the products,

number of stars, price, recommended products based on previous purchases, sponsored products

• the insert scripts that load the some sample data to the table(s) - short description is enough, pictues not needed

• query or queries that return the recommended products for a given product id

Hints:• Denormalize!• Consider the collection types• Use the CQL reference documentation

http://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlReferenceTOC.html

cassandra training

Data & Analytics