cassandra training
TRANSCRIPT
By:András Fehér Certified Cassandra [email protected]
AGENDA
Agenda
● Only the basics
● Topics that are not covered today:• Performance tuning• Security with integrated authentication, encryption, auditing (see docs)• Data compression (see docs)• Backups: snapshot, incremental, point-in-time recovery (see docs)• Amazon’s Dynamo and Google’s BigTable concepts (see the Dynamo and BigTable papers)• Benefits of the enterprise version• Full text searching• Using Cassandra with other DBMS, Big Data tools• Choosing the appropriate hardware• …..
RECOMMENDED RESOURCES
Overview
DataStax docsFree Self-Paced Video Curses Cassandra: The Definitive Guide, 2nd Edition
Comparing Oracle with Cassandra / DataStax EnterpriseHow To Run a Multi-Node Cluster Database with Cassandra on Ubuntu 14.04
Patrick McFadin’s video about most common CQL data modeling use cases
REQUIREMENTS
Challenges that database management systems supporting websites must meet today• Need to be in more than one datacenter, active-active
• Scaling from 0 to ...
• Process incoming data from millions of Web and mobile users around the globe
• Extreme read performance
• Near 100% uptime - maintenace/version upgrade doesn’t count!
• Should not cost mega $ ...
Overview
WHAT IS CASSANDRA?● open source● distributed● decentralized: “shared nothing” architecture● elastically scalable: linear scale performance● highly available● fault tolerant: multi-data center replication● tuneably consistent: AP from the CAP theorem● flexible, dynamic schema data modeling: supports structured, semi-structured
and unstructured data● guaranteed data safety: commit log design● powerful query language (CQL): subset of SQL● active everywhere design
Overview
CASSANDRA ADOPTION
Overview
Source: http://db-engines.com/en/ranking
PROVENCompanies that have large, active data sets:
● Constant Contact● CERN● Comcast● eBay● GitHub● GoDaddy● Hulu● Instagram● Intuit● Netflix● Reddit ● The Weather Channel● ... over 1500 more
Some of the largest production deployments:● Apple: over 75,000 nodes storing over 10 PB of data● Netflix: 2,500 nodes, 420 TB, over 1 trillion requests per day● Chinese search engine Easou (270 nodes, 300 TB, over 800 million requests per day)● eBay:over 100 nodes, 250 TB
Overview
COMPETITORS - SPEED BENCHMARK
Overview
ARCHITECTURE
• Basic elements
• Replication
• Communication between the nodes
• Single data center – Multi data center topologies
• Tunable consistency
• Node recovery
• Read / Write process on a node
Architecture
CASSANDRA OBJECTS VS. RDBMS
Data Modeling
● Keyspace: ○ can be compared to an Oracle Schema or a SQL Server database○ defines replication of data
● Table: ○ same concept○ high number of columns○ column level reads
● Column: can be loosely compared to a relational table column● Primary Key: same concept● Partition Key: identifies which node in the cluster will store the row, must be the first part
of the primary key● Clustering Key: orders rows based on the column’s value● Index: secondary indexes only● Partitioner: hashing algorithm that generates a hash value token from the partition key
THE TOKEN RING
Architecture
● Cluster: a ring of Cassandra nodes● Node: a Cassandra instance, responsible for
its token ranges● Tokens: hash values assigned to partitions
REPLICATION
Architecture
Replication factor: • total number of copies• automatically maintained
even when nodes are removed, added or go down
• Primary range: range that the node is responsible for
• Secondary range: replica stored on the node
VIRTUAL NODES
Architecture
GOSSIP PROTOCOL
Architecture
• For avoiding routing to nodes that are down• Each node initiates a gossip round every few seconds• Picks one to three nodes to gossip with• Nodes can gossip with ANY other node in the cluster• Probabilistically (slightly favour) seed and downed
nodes• Nodes do not track which nodes they gossiped with
prior• Reliably and efficiently spreads node metadata
through the cluster• Fault tolerant – continues to spread when nodes fail
SINGLE DATA CENTER
Architecture
• Replication strategy: SimpleStrategy• Clients can read from or write to any node ->
coordinator• Any node can be coordinator -> masterless • Data read or replicated in parallel• RF=3 in this example
MULTIPLE DATA CENTER SUPPORT
Architecture
Implementation• Replication strategy: NetworkTopologyStrategy• Client writes local• Data syncs across WAN• Replication Factor per Data Center (DC)• Different number of nodes per data center
Reasons for multiple data centers• Maximum chance for uptime• Reduce latency by bringing the data closer to the user• Workload isolation (onlyine – analytics, reporting)
TUNABLE CONSISTENCY
Architecture
• Consistency Level (CL)• Client specifies per operation• Handles multi-data center operations
Most commonly used consistency levels:• ALL = all replicas must acknowledge, erases
fault tolerance• QUORUM => 51% of replicas must acknowlege,
strong consistency but performance penalty• LOCAL_QUORUM => 51% of replicas in local DC
must acknowlege, strong consistency but only in local DC
• ONE: only one replica must acknowlege, good for log data
Plus a lot more... (see docs)
CONSISTENCY LEVEL: QUORUM
Architecture
• Recommended for production• Good balance between performance and
consistency• Reads allways return the latest version of
the data
NODE FAILURE
Architecture
• Replication Factor + Consistency Level = Success• This example:• RF = 3• CL = QUORUM
NODE RECOVERY – HINTED HANDOFF
Architecture
• Coordinator stores the hint locally (3 hours by default)• When node recovers, the coordinator replays the missed writes• Hint does not count the consistency level• If node is unavailable for more than 3 hours you must run
repair manually• Consistency level of ANY means
storing a hint suffices• Consistency level of ONE means
at least one replica must successfully write, hint does not suffice
FAST WRITE AT NODE
Architecture
• Write acknowledged after appending to commit log
• Data is written to both memory (Memtable) and commit log
• Memtable is flushed to disk SSTable• SSTables are compacted to eliminate
obsolate data
READING DATA FROM A NODE
Architecture
Bloom filter: it is absolutely not in the SSTable or could be? False positive is very rare.
GRAPHICAL INTERFACES
Operation
Datastax DevCenter Datastax OpsCenter
DRIVERS
Operation
Commercially and community supported : • Java• Python• Ruby• C# / .NET• Node.js• PHP• C++• Apache Spark• Clojure• Erlang• Go• Haskell• Rust• Scala
DATA STORAGE INTERNALS
• Row – Partition• Table• Partition key and primary key
Data storage internals
ROW (PARTITION)
• Partition is the smallest atomic storage; 1 partition - 1 node• If where is not on partition key, all nodes would have to be searched• Hashtables have constant time lookup
Data storage internals
CREATE TABLE test.videos ( id int PRIMARY KEY, name text, runtime int, year int)
In this case partition key = primary key
TABLES
Data storage internals
A table is just an artificial construct generated for the users from hash of hashes
COMPOSITE PARTITION KEYS
Data storage internals
Primary key = partition key
CLUSTERING COLUMNS
Data storage internals
• Group by and order by altogether• RDBMS primary key: uniqueness• Cassandra primary key: we would like to query on AND uniqueness
Sorted in ascending order (default)
Partition key
Clustering column
KEY (PARTITION KEY)
Data storage internals
CREATE TABLE discography ( title text, year int, genre text, performer text, tracks map<int, text>, PRIMARY KEY ((title, year)))
CREATE TABLE beatles_discography ( disc_title text, year int, track_no int, title text, PRIMARY KEY ((title, year), track_no))
Partition key
Partition key
Clustering column
CASSANDRA QUERY LANGUAGE (CQL)
Cassandra Query Language (CQL)
• Basic syntax• Data types• Tuples and user-defined types• Batch• Transactions• Scripts• Indexing• Materialized views
CQL BASICS
Cassandra Query Language (CQL)
create keyspace league WITH replication = {'class': 'SimpleStrategy', replication_factor':1};
use league;
create table teams (team_name varchar,player_name varchar,jersey int,primary key((team_name), player_name)
);
insert into teams (team_name, player_name, jersey) values ('Mighty Mutts', 'Felix',90);
select * from teams;
• Subset of SQL e.g. SELECT * FROM users• No JOINs!• Usual statements: CREATE / DROP / ALTER TABLE / SELECT
DATA TYPES
Cassandra Query Language (CQL)
CQL Type Constants Descriptionascii strings US-ASCII character stringbigint integers 64-bit signed longblob blobs Arbitrary bytes (no validation), expressed as hexadecimalboolean booleans true or falsecounter integers Distributed counter value (64-bit long)decimal integers, floats Variable-precision decimaldouble integers 64-bit IEEE-754 floating pointfloat integers, floats 32-bit IEEE-754 floating pointinet strings IP address string in IPv4 or IPv6 format*int integers 32-bit signed integerlist n/a A collection of one or more ordered elementsmap n/a A JSON-style array of literals: { literal : literal, literal : literal ... }set n/a A collection of one or more elementstext strings UTF-8 encoded stringtimestamp integers, strings Date plus time, encoded as 8 bytes since epochuuid uuids A UUID in standard UUID formattimeuuid uuids Type 1 UUID only (CQL 3)varchar strings UTF-8 encoded stringvarint integers Arbitrary-precision integer
Collection types
Distributed increment /decrement
Distributed uniqueids
+ tuples and user defined types (UDT)
UUID AND TIMEUUID
Cassandra Query Language (CQL)
128 bit number to Replace integer Ids because on distributed systems it is nearly impossible to generate correct sequence.
Universally Unique Identifier (UUID):• Example: 99051fe9-6a9c-46c2-b949-38ef78858dd0
Time UUID:• Timestamp value + MAC address• Sortable• Example: a9af60b0-58a5-11e6-8f4b-1f871fe45bc4
COUNTERS
Cassandra Query Language (CQL)
• Counter data type should be used for counting because int data type has concurrency issues Can be incremented/decremented, but cannot directly assign them.
• It must be the only non-partitioning column inside of a table.
• Can be multiple counter columns in the table. In this case all columns must be counter type except the primary key.
• Not always 100% accurate
COLLECTION COLUMNS
Cassandra Query Language (CQL)
• Multi valued columns• Designed to store small amount of data• Retrieved in its entirety• Cannot nest a collection inside another collection
• Types• Set: unique values, ordered by values• List: non-unique values, ordered by position• Map: key-value pairs, ordered by unique keys
SET EXAMPLES
Cassandra Query Language (CQL)
CREATE TABLE users ( user_id text PRIMARY KEY, first_name text, last_name text, emails set<text> );
INSERT INTO users (user_id, first_name, last_name, emails) VALUES('frodo', 'Frodo', 'Baggins', {'[email protected]', '[email protected]'});
UPDATE users SET emails = emails + {'[email protected]'} WHERE user_id = 'frodo';
UPDATE users SET emails = {} WHERE user_id = 'frodo'; DELETE emails FROM users WHERE user_id = 'frodo';
Observe the syntax
Add new element to a set:
Remove all elements:
LIST EXAMPLES
Cassandra Query Language (CQL)
CREATE TABLE users ( user_id text PRIMARY KEY, top_places list<text> );
UPDATE users SET top_places = [ 'rivendell', 'rohan' ] WHERE user_id = 'frodo';
UPDATE users SET top_places = [ 'the shire' ] + top_places WHERE user_id = 'frodo';
UPDATE users SET top_places = top_places + [ 'mordor' ] WHERE user_id = 'frodo';
UPDATE users SET top_places[2] = 'riddermark' WHERE user_id = 'frodo';
DELETE top_places[3] FROM users WHERE user_id = 'frodo';
UPDATE users SET top_places = top_places - ['riddermark'] WHERE user_id = 'frodo';
Set multiple values:
Prepend an element:
Append an element:
Set an element to a particular position:
Delete an element at a particular position:
Delete all elements having a particular value:
MAP EXAMPLES
Cassandra Query Language (CQL)
CREATE TABLE users ( user_id text PRIMARY KEY, todo map<timestamp, text>);
Set map data:
Set a specific element:
UPDATE users SET todo = { '2012-9-24' : 'enter mordor', '2012-10-2 12:00' : 'throw ring into mount doom' } WHERE user_id = 'frodo';
UPDATE users SET todo['2012-10-2 12:00'] = 'throw my precious into mount doom' WHERE user_id = 'frodo';
Delete an element from the map
DELETE todo['2012-9-24'] FROM users WHERE user_id = 'frodo';
Compute the TTL to use to expire todo list element on the day of the timestamp, and set the elements to expire
UPDATE users USING TTL <computed_ttl> SET todo['2012-10-1'] = 'find water' WHERE user_id = 'frodo';
TUPLES AND USER-DEFINDED TYPES
Cassandra Query Language (CQL)
User-defined type: attach multiple data fields, each named and typed, to a single column, cannot be nested
Tuple: holds fixed-length sets of typed positional fields, can be nested
CREATE TABLE collect_things ( k int PRIMARY KEY, v tuple<int, text, float> );
INSERT INTO collect_things (k, v) VALUES(0, (3, 'bar', 2.1))
CREATE TYPE basic_info ( birthday timestamp, nationality text, weight text, height text );
CREATE TABLE users ( id uuid PRIMARY KEY, lastname text, basics FROZEN<basic_info>);
FROZEN: • treated like a blob• after Cassandra version 3.6 not required if the
UDT contains only non-collection fields
When is UDT a good choice?• Reuse the structure in multiple tables• Nesting is not necessary• Update fields individually if not frozen
BATCH STATEMENT
Cassandra Query Language (CQL)
• Use it for keeping table in sync, NOT for performance.• Performance penalty
BEGIN BATCH;UPDATE users SET name=’Jim’ where id = 1;UPDATE users_by_ssn set name=’Jim’ where ssn=’888–99–9999';APPLY BATCH;
Good example:
Bad example:
BEGIN BATCH;INSERT INTO tester.users (userID, firstName, lastName) VALUES (1, 'Jim', 'James')INSERT INTO tester.users (userID, firstName, lastName) VALUES (2, 'Ernie', 'Orosco')INSERT INTO tester.users (userID, firstName, lastName) VALUES (3, 'Jose', 'Garza')INSERT INTO tester.users (userID, firstName, lastName) VALUES (4, 'Sammy', 'Mason')INSERT INTO tester.users (userID, firstName, lastName) VALUES (5, 'Larry', 'Bird')INSERT INTO tester.users (userID, firstName, lastName) VALUES (6, 'Jim', 'Smith')APPLY BATCH;
IMPORTING AND EXPORTING DATA
Cassandra Query Language (CQL)
users.csv
cqlsh -k killrvideo -e 'copy videos(video_id, added_date, description, title, user_id) to STDOUT' | head -n -1 | cqlsh -k mykspc -e 'copy videos(video_id, added_date, description, title, user_id) from STDIN'
Copying data from one table to another:
EXECUTING SCRIPTS
Cassandra Query Language (CQL)
Execute from cqlsh:
/home/datastax/script.cql
Execute as command line parameter:
LIGHTWEIGHT TRANSACTIONS
Cassandra Query Language (CQL)
• Not the RDBMS concept!• Also known as Compare and Set (CAS)• Normally if the row exists, it is updated („UPSERT”)• Executed only if the „IF” condition is met at INSERT or UPDATE
operation• Use only if necessary, because it is slow!
INSERT INTO cyclist_name (id, lastname, firstname) VALUES (4647f6d3-7bd2-4085-8d6c-1229351b5498, 'KNETEMANN', 'Roxxane') IF NOT EXISTS;
UPDATE cycling.cyclist_name SET firstname = ‘Roxane’ WHERE id = 4647f6d3-7bd2-4085-8d6c-1229351b5498 IF firstname = ‘Roxxane’;
MATERIALIZED VIEWS
Cassandra Query Language (CQL)
• Remember: joins are not supported• Cannot add any static columns from the source table• Columns of source table primary key must be part of the materialized view’s
primary key• Only one new column can be added to the primary key
CREATE MATERIALIZED VIEW cyclist_by_age AS SELECT age, birthday, name, country FROM cyclist_mv WHERE age IS NOT NULL AND cid IS NOT NULL PRIMARY KEY (age, cid);
INDEXING (SECONDARY INDEXES)
Cassandra Query Language (CQL)
• Provides means to access data using attributes other than the partition key• Can be used for any column except for counter columns• Hidden table that contains only the indexed values • SSTable Attached Secondary Index (SASI): improved the performance drastically from Cassandra version 3.4• Do NOT use index:▪ High-cardinality coulmns▪ In tables that use counter column▪ On frequently updated or deleted column▪ To look for a row in a large partition unless narrowly queried
INDEXING AND QUERYING EXAMPLES
Cassandra Query Language (CQL
CREATE INDEX ryear ON rank_by_year_and_name (race_year);
CREATE INDEX team_idx ON cyclist_career_teams ( teams ); SELECT * FROM cycling.cyclist_career_teams WHERE teams CONTAINS 'Nederland bloeit';
CREATE INDEX team_year_idx ON cyclist_teams ( KEYS (teams) ); SELECT * From cycling.cyclist_teams WHERE teams CONTAINS KEY 2015;
CREATE INDEX blist_idx ON cycling.birthday_list (ENTRIES(blist)); SELECT * FROM cycling.birthday_list WHERE blist['age'] = '23';
CREATE INDEX rnumbers_idx ON cycling.race_starts (FULL(rnumbers)); SELECT * FROM cycling.race_starts WHERE rnumbers = [39,7,14];
Simple fields
Set and list collections
Map keys
Map entries
CREATE INDEX blist_idx ON cycling.birthday_list (VALUES(blist)); SELECT * FROM cycling.birthday_list CONTAINS 'NETHERLANDS';
Map values
Full content of a frozen map
LABWORK
Cassandra Query Language (CQL)
• CQLSH: Command line interface to execute CQL statements• Create the simplest contact application keyspace and a table• Virtual Machine username and password: datastax/datastax
LABWORK CONTINUATION
In terminal start the CQL shell:
Cassandra Query Language (CQL)
Get the list of existing keyspaces:
If contact_app keyspace exists, drop it:
Create the keyspace contact_app:
cqlsh
describe keyspaces;
drop keyspace if exists contact_app;
create keyspace contact_app WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
Check the created keyspace:
describe keyspace contact_app ;
LABWORK CONTINUATION
Make the contact_app keyspace default for all operations:
Cassandra Query Language (CQL)
Create the table contacts, name primary key, phone numbers set of varchar:
Get the list of existing tables
Check the created table:
use contact_app;
create table contacts(name varchar primary key, phone_numbers set<varchar>);
describe tables;
describe table contacts;
Insert a record into the table (you can enter any values for the name and phone fields):
insert into contacts(name, phone_numbers) values ('andras', {'239847','3249872398','348576'});
LABWORK CONTINUATION
Check the content of the table
Cassandra Query Language (CQL)
Add a phone number to the set:
Check the content of the table
Remove any existing phone number
select * from contacts;
update contacts set phone_numbers = phone_numbers + {'8475847587'} where name = 'andras';
select * from contacts;
update contacts set phone_numbers = phone_numbers - {'8475847587'} where name = 'andras';
Check the content of the table
select * from contacts;
LABWORK CONTINUATION
Remove all phone numbers:
Cassandra Query Language (CQL)
Check the content of the table:
update contacts set phone_numbers = {} where name = 'andras';
select * from contacts;
Drop the table:
drop table contacts;
DATA MODELING
Data Modeling
• Principles
• Examples
PRINCIPLES
Data Modeling
There is no...:• Foreign keys• Joins• Referential integrity• ACID transactions
But we have:• Wide rows• Collection columns• AID transactions (Consistency is missing as there is no foreign key or referencial integrity, instead we
have tunable consistency)
Do not model in entity-relationship-attribute fashion (RDBMS).
Understand the questions you will need to ask the database
SHOPPING CART
Data Modeling
• Each customer can have or more shopping carts• De-normalize data for fast access• Shopping cart == One partition (Row Level Isolation)• Each item is a new column
CREATE TABLE user {username varchar,firstname varchar,lastname varchar,shopping_carts set<varchar>,PRIMARY KEY (username)
};
CREATE TABLE shopping_cart {username varchar,cart_name text,item_id int,item_name varchar,price float,item_detail map<varchar, varchar>PRIMARY KEY ((username, cart_name),
item_id)};
USER ACTIVITY
Data Modeling
• React to user input in real time• Scale for speed• Store history for batch processing later
CREATE TABLE user_activity {username varchar,interaction_time timeuuid,activity_code varchar,detail varchar,PRIMARY KEY (username, interaction time)
} WITH CLUSTERING ORDER BY (interaction_time DESC);
CREATE TABLE user_activity_history {username varchar,interaction_date varchar,interaction_time timeuuid,activity_code varchar,detail varchar,PRIMARY KEY ((username, interaction date), interaction
time);
INSERT INTO user_activity (..) VALUES (...) USING TTL 2592000
Reverse based on timestamp
Delete after 30 day
LOG COLLECTION
Data Modeling
• Collect log data at high speed• Dice data for various uses. Dashboard, lookup, etc.• Batch analysis of logs is too slow in some cases• Flume: distributed service to collect and fan out data
CREATE TABLE log_lookup {source varchar,date_to_minute varchar,timestamp timeuuid,raw_log blob,PRIMARY KEY ((source, date_to_minute),
timestamp)};
Common practice to GZIP
raw logs
CREATE TABLE login_failure {source varchar,date_to_minute varchar,successful_logins counter,PRIMARY KEY ((source), date_to_minute)
} WITH CLUSTERING ORDER BY (date_to_minute DESC);
CREATE TABLE login_success {source varchar,date_to_minute varchar,successful_logins counter,PRIMARY KEY ((source), date_to_minute)
} WITH CLUSTERING ORDER BY (date_to_minute DESC);
see definition in notes
LOG DASHBOARD
Data Modeling
SELECT date_to_minute, successful_loginsFROM login_successLIMIT 20;
SELECT date_to_minute, failed_loginsFROM login_failureLIMIT 20;
REAL TIME DATA SOURCE FOR INTERNET BANK
Cassandra Data Modeling Principles
• Real use case• Several TBs of historical transaction data• Search the top N transactions within given date interval in time real time:
• for a specific account• for a specific user
acc_id tran_date amount tran_id
acc001 2016-09-01 100 Tr301
acc001 2016-08-16 200 Tr201
acc002 2016-09-21 60 Tr101
acc003 2016-09-16 300 Tr001
For a specific user by account number in ascending and transaction date in descending order
tran_date amount tran_id
2016-09-21 100 tr301
2016-09-16 200 tr201
2016-09-01 60 tr101
2016-08-16 300 tr001
For a specific account order by transaction date in descending order (=newest first)
CLOSING
• Overview of the homework• Questions
DATABASE FOR A SIMPLE RECOMMENDATION ENGINE
Homework
See Amazon’s website...The database has to support recommendations based on • previous purchases• sponsored products
The homework must contain the CQL script that:• creates the „recommendation” keyspace• the table(s) that contain description of the products,
number of stars, price, recommended products based on previous purchases, sponsored products
• the insert scripts that load the some sample data to the table(s) - short description is enough, pictues not needed
• query or queries that return the recommended products for a given product id
Hints:• Denormalize!• Consider the collection types• Use the CQL reference documentation